Optimization Basics: Local vs Global Minima
1. What is Optimization?
Optimization is the process of finding the best value (minimum or maximum) of a function subject to given constraints.
In machine learning, optimization typically means:

$$\min_{x \in \mathcal{X}} f(x)$$

where:

- $f(x)$ is the loss/cost function
- $\mathcal{X}$ is the feasible set
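As a concrete sketch, the minimization above can be carried out by gradient descent on a simple one-dimensional loss. The function $f(x) = (x-3)^2$, the step size, and the iteration count below are illustrative assumptions, not taken from the text:

```python
# Minimal sketch: minimize an assumed loss f(x) = (x - 3)^2
# by gradient descent.

def grad_f(x):
    return 2.0 * (x - 3.0)   # derivative of (x - 3)^2

x = 0.0                      # starting point
lr = 0.1                     # step size (learning rate)
for _ in range(200):
    x -= lr * grad_f(x)

print(round(x, 4))           # converges to the minimizer x* = 3.0
```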
2. Local Minimum
Definition
A point $x^*$ is called a local minimum of $f$ if there exists a neighborhood $N(x^*)$ such that:

$$f(x^*) \le f(x) \quad \forall x \in N(x^*)$$
📌 The comparison is only with nearby points, not the entire domain.
Intuition
- The point looks like the bottom of a small valley
- There may be lower points elsewhere
Example
Consider $f(x) = \sin(x) + \frac{x}{2}$:

- Local minima at $x = \frac{4\pi}{3} + 2k\pi$
- Not global minima ($f$ is unbounded below, so no global minimum exists)
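To make the "nearby points only" idea concrete, local minima of a sample function can be located numerically by flagging grid points that are lower than both neighbours. The function $f(x) = \sin(x) + x/2$ below is an assumed example:

```python
import math

def f(x):
    # Assumed example: infinitely many local minima, no global minimum
    return math.sin(x) + x / 2.0

# Scan a grid on [-10, 10]; a point lower than both neighbours is a
# (numerical) local minimum.
xs = [i * 0.01 for i in range(-1000, 1001)]
ys = [f(x) for x in xs]
minima = [xs[i] for i in range(1, len(xs) - 1)
          if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]]
print([round(m, 2) for m in minima])  # points near 4*pi/3 + 2k*pi
```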
3. Global Minimum
Definition
A point $x^*$ is a global minimum if:

$$f(x^*) \le f(x) \quad \forall x \in \mathcal{X}$$
📌 The comparison is with every point in the domain.
Intuition
- The lowest possible point of the function
- No point has a smaller function value
Example
Consider $f(x) = (x - 2)^2$:

- Global minimum at $x = 2$
4. Local vs Global: Key Differences
| Aspect | Local Minimum | Global Minimum |
|---|---|---|
| Comparison | Nearby points only | Entire domain |
| Uniqueness | May be many | Value is unique; minimizer may not be |
| Guarantee | No optimal guarantee | Best possible solution |
| ML relevance | May trap algorithms | Desired outcome |
5. Critical Points and Optimality
A critical point $x^*$ satisfies:

$$\nabla f(x^*) = 0$$

However:

- Not all critical points are minima
- A critical point could be:
  - Local minimum
  - Local maximum
  - Saddle point
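A quick numerical illustration of why a vanishing derivative alone is not enough, using the assumed example $f(x) = x^3$:

```python
# f(x) = x^3 has a critical point at x = 0 (f'(0) = 0), yet x = 0 is
# neither a local minimum nor a local maximum: f changes sign nearby.

def f(x):
    return x ** 3

def fprime(x):
    return 3.0 * x ** 2

assert fprime(0.0) == 0.0          # x = 0 is a critical point
assert f(-0.1) < f(0.0) < f(0.1)   # lower on one side, higher on the other
print("critical point, but not a minimum")
```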
6. Tests for Minima
(a) First Derivative Test (1D)
- Find points where $f'(x) = 0$
- Check the sign change of $f'$ around each such point:
  - $-$ to $+$: local minimum
  - $+$ to $-$: local maximum
(b) Second Derivative Test (1D)
- $f''(x^*) > 0$: local minimum
- $f''(x^*) < 0$: local maximum
- $f''(x^*) = 0$: test is inconclusive
(c) Hessian Test (Multivariate)
- Hessian $\nabla^2 f(x^*)$ positive definite → local minimum
- Hessian negative definite → local maximum
- Hessian indefinite → saddle point
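The Hessian test can be sketched with NumPy by inspecting the signs of the eigenvalues. The classifier and the sample Hessians below are illustrative:

```python
import numpy as np

def classify(hessian):
    """Classify a critical point from its (symmetric) Hessian."""
    eig = np.linalg.eigvalsh(hessian)
    if np.all(eig > 0):
        return "local minimum"     # positive definite
    if np.all(eig < 0):
        return "local maximum"     # negative definite
    if np.any(eig > 0) and np.any(eig < 0):
        return "saddle point"      # indefinite
    return "inconclusive"          # semi-definite: the test fails

# f(x, y) = x^2 + y^2: Hessian 2I at the critical point (0, 0)
print(classify(np.array([[2.0, 0.0], [0.0, 2.0]])))   # local minimum
# f(x, y) = x^2 - y^2: indefinite Hessian at (0, 0)
print(classify(np.array([[2.0, 0.0], [0.0, -2.0]])))  # saddle point
```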
7. Convex Functions: The Key Advantage
Fundamental Theorem
For a convex function, every local minimum is also a global minimum.
This is why convex optimization is so important.
Example
Consider $f(x) = x^2$:

- Convex function
- Unique minimum at $x = 0$
- Local = Global
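This is easy to check numerically: for the convex $f(x) = x^2$, gradient descent reaches the same global minimum from any starting point (step size and iteration count below are illustrative choices):

```python
def minimize(x, lr=0.1, steps=200):
    # Gradient descent on the convex f(x) = x^2, whose gradient is 2x
    for _ in range(steps):
        x -= lr * 2.0 * x
    return x

# Wildly different starts, same global minimum x* = 0
print(abs(minimize(-7.0)) < 1e-6, abs(minimize(5.0)) < 1e-6)  # True True
```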
8. Non-Convex Functions
- Can have many local minima
- Algorithms may get stuck
- Common in deep learning
Example

$f(x) = x^4 - 3x^2 + x$ has two local minima, near $x \approx -1.3$ and $x \approx 1.13$; only the first is global.
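Gradient descent on an assumed non-convex example, $f(x) = x^4 - 3x^2 + x$, shows the trap: the outcome depends entirely on the starting point (step size and starts are illustrative):

```python
def grad(x):
    return 4.0 * x ** 3 - 6.0 * x + 1.0   # derivative of x^4 - 3x^2 + x

def descend(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

print(round(descend(-2.0), 2))  # global minimum near x = -1.3
print(round(descend(2.0), 2))   # stuck in the local minimum near x = 1.13
```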
9. Visual Comparison
| Convex Function | Non-Convex Function |
|---|---|
| One bowl shape | Multiple valleys |
| One minimum | Many minima |
| Easy optimization | Hard optimization |
10. Optimization in Machine Learning
| Algorithm | Behavior |
|---|---|
| Gradient Descent | May converge to a local minimum |
| Stochastic GD | Gradient noise can escape shallow local minima |
| Newton’s Method | Fast, but converges to the nearest critical point |
| Convex solvers | Global minimum guaranteed (convex problems only) |
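As a sketch of the Newton's Method row: on a quadratic, a single Newton step $x \leftarrow x - f'(x)/f''(x)$ jumps straight to the critical point, which is what makes the method fast, and also purely local. The quadratic $f(x) = (x-3)^2$ is an assumed example:

```python
def newton_step(x):
    g = 2.0 * (x - 3.0)   # f'(x)  for f(x) = (x - 3)^2
    h = 2.0               # f''(x) is constant for a quadratic
    return x - g / h

# One step from anywhere lands exactly on the minimizer x* = 3
print(newton_step(100.0))  # 3.0
print(newton_step(-50.0))  # 3.0
```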