Saddle Point
1. Intuitive Meaning
A saddle point is a point where the function behaves like:
- a minimum in one direction, and
- a maximum in another direction.
📌 It is not a local minimum or a local maximum.
The name comes from the shape of a horse saddle — curved up in one direction and down in another.
2. Formal Definition
A point $x^*$ is called a saddle point of a function $f$ if:
- $\nabla f(x^*) = 0$ (it is a critical point), and
- $x^*$ is neither a local minimum nor a local maximum of $f$.
3. One-Dimensional vs Multi-Dimensional
In 1D
- Saddle points do not occur
- Every critical point is a min, max, or flat point

In 2D and higher
- Saddle points are very common
4. Classic Example
Example 1: Two-Variable Function

$$f(x, y) = x^2 - y^2$$

Step 1: Compute the Gradient

$$\nabla f = (2x, \; -2y)$$

Setting $\nabla f = 0$ gives $x = 0$ and $y = 0$.

So, $(0, 0)$ is the only critical point.

Step 2: Analyze Behavior
- Along $y = 0$: $f(x, 0) = x^2$ (minimum at $x = 0$)
- Along $x = 0$: $f(0, y) = -y^2$ (maximum at $y = 0$)

📌 Therefore, $(0, 0)$ is a saddle point.
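The two-direction behavior above is easy to confirm numerically. A minimal Python sketch (not from the original article) checking that the origin is a minimum along the $x$-axis and a maximum along the $y$-axis:

```python
# Numerical check that (0, 0) is a saddle point of f(x, y) = x^2 - y^2.

def f(x, y):
    return x**2 - y**2

# Along y = 0: f(t, 0) = t^2 >= f(0, 0), so the origin is a minimum in x
assert all(f(t, 0.0) >= f(0.0, 0.0) for t in [-1.0, -0.5, 0.5, 1.0])

# Along x = 0: f(0, t) = -t^2 <= f(0, 0), so the origin is a maximum in y
assert all(f(0.0, t) <= f(0.0, 0.0) for t in [-1.0, -0.5, 0.5, 1.0])

# The gradient (2x, -2y) vanishes at the origin: it is a critical point
assert (2 * 0.0, -2 * 0.0) == (0.0, 0.0)
```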
5. Hessian Test for Saddle Points
For a twice-differentiable function:
- If the Hessian matrix is indefinite, the point is a saddle point.

Hessian of $f(x, y) = x^2 - y^2$:

$$H = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$$

- Eigenvalues: $2$ and $-2$
- Mixed signs ⇒ indefinite
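The eigenvalue test can be run directly with NumPy. A small sketch (illustrative, not from the original article) for this specific Hessian:

```python
import numpy as np

# Hessian of f(x, y) = x^2 - y^2 is constant: [[2, 0], [0, -2]].
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])

eigvals = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix, ascending
print(eigvals)  # [-2.  2.]

# Mixed signs -> indefinite Hessian -> the critical point (0, 0) is a saddle
indefinite = (eigvals.min() < 0) and (eigvals.max() > 0)
assert indefinite
```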
6. Why Saddle Points Matter in Optimization
In Convex Optimization
- Saddle points do not exist
- Every critical point is a global minimum

In Non-Convex Optimization
- Saddle points are abundant
- Gradient descent may:
  - Slow down near them
  - Stall temporarily
  - Escape with the help of noise (SGD)
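The slow-down near a saddle can be seen in a few lines. In this minimal sketch (an illustration, not from the original article), plain gradient descent on $f(x, y) = x^2 - y^2$ starts almost exactly on the $x$-axis; the iterate is drawn toward the saddle at $(0, 0)$ and only escapes slowly along $y$:

```python
# Gradient descent on f(x, y) = x^2 - y^2, started nearly on the x-axis.
def grad(x, y):
    return 2 * x, -2 * y          # gradient of x^2 - y^2

lr = 0.1
x, y = 1.0, 1e-8                  # almost on the attracting direction of the saddle

for step in range(60):
    gx, gy = grad(x, y)
    x -= lr * gx                  # x shrinks by a factor (1 - 2*lr) each step
    y -= lr * gy                  # y grows by a factor (1 + 2*lr) each step
    # SGD-style noise, e.g. y += random.gauss(0.0, 1e-3), would kick the
    # iterate off the saddle's attracting direction much faster.

print(abs(x) < 1e-5, abs(y) > 1e-4)  # True True: x collapsed, y finally escaping
```

Because the starting $y$ is tiny, the iterate spends many steps hovering near $(0, 0)$ before the unstable direction takes over, which is exactly the "slow down, stall, then escape" behavior described above.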
7. Saddle Point vs Local Minimum
| Feature | Saddle Point | Local Minimum |
|---|---|---|
| Gradient | Zero | Zero |
| Hessian | Indefinite | Positive definite |
| Optimal? | ❌ No | ✅ Yes |
| ML Impact | Optimization difficulty | Desired |
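The table's Hessian criterion translates directly into a classifier. A minimal sketch (the function name `classify` is my own, not from the original article):

```python
import numpy as np

# Classify a critical point from its Hessian, following the table:
# positive definite -> local minimum, indefinite (mixed signs) -> saddle point.
def classify(H):
    ev = np.linalg.eigvalsh(H)
    if np.all(ev > 0):
        return "local minimum"
    if np.all(ev < 0):
        return "local maximum"
    if ev.min() < 0 < ev.max():
        return "saddle point"
    return "degenerate (test inconclusive)"

print(classify(np.array([[2.0, 0.0], [0.0, 2.0]])))   # local minimum
print(classify(np.array([[2.0, 0.0], [0.0, -2.0]])))  # saddle point
```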
8. Saddle Points in Machine Learning
- Neural network loss surfaces contain many saddle points
- High-dimensional problems have:
  - Few bad local minima
  - Many saddle points

📌 This explains why:
- SGD works well in practice
- Noise helps escape saddle regions
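One way to build intuition for why saddles dominate in high dimensions: if the Hessian at a random critical point is modeled as a random symmetric matrix (a common heuristic, and my assumption here, not a claim from the original article), the chance that all eigenvalues share one sign collapses as dimension grows, so mixed signs (a saddle) become the typical case:

```python
import numpy as np

rng = np.random.default_rng(0)

def frac_saddle(n, trials=2000):
    """Fraction of random symmetric n x n 'Hessians' with mixed-sign eigenvalues."""
    saddles = 0
    for _ in range(trials):
        A = rng.standard_normal((n, n))
        H = (A + A.T) / 2                # random symmetric matrix
        ev = np.linalg.eigvalsh(H)
        if ev.min() < 0 < ev.max():      # mixed signs -> indefinite -> saddle
            saddles += 1
    return saddles / trials

for n in [1, 2, 5, 10]:
    print(n, frac_saddle(n))             # the fraction climbs toward 1 with n
```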