Saddle Point

1. Intuitive Meaning

A saddle point is a point where the function behaves like:

  • a minimum in one direction, and

  • a maximum in another direction.

📌 It is not a local minimum or a local maximum.

The name comes from the shape of a horse saddle — curved up in one direction and down in another.


2. Formal Definition

A point $x^\ast$ is called a saddle point of a function $f$ if:

  • $\nabla f(x^\ast) = 0$ (it is a critical point), and

  • $x^\ast$ is neither a local minimum nor a local maximum.


3. One-Dimensional vs Multi-Dimensional

In 1D

  • Saddle points in the min-in-one-direction, max-in-another sense do not occur

  • Every critical point is a minimum, a maximum, or a flat inflection point (e.g. $x = 0$ for $f(x) = x^3$)

In 2D and higher

  • Saddle points are very common


4. Classic Example

Example 1: Two-Variable Function

$$f(x, y) = x^2 - y^2$$

Step 1: Compute the Gradient

$$\nabla f = \begin{bmatrix} 2x \\ -2y \end{bmatrix}$$

Setting $\nabla f = 0$:

$$x = 0, \quad y = 0$$

So, $(0,0)$ is a critical point.
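This critical point can also be found symbolically; a minimal sketch, assuming sympy is available:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 - y**2

# Gradient: partial derivatives with respect to x and y
grad = [sp.diff(f, v) for v in (x, y)]   # [2*x, -2*y]

# Solve grad = 0 for the critical points
critical = sp.solve(grad, (x, y), dict=True)
print(critical)  # [{x: 0, y: 0}]
```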


Step 2: Analyze Behavior

  • Along $y = 0$:

    $$f(x, 0) = x^2 \quad \text{(minimum at } x = 0\text{)}$$
  • Along $x = 0$:

    $$f(0, y) = -y^2 \quad \text{(maximum at } y = 0\text{)}$$

📌 Therefore, $(0,0)$ is a saddle point.
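The two axis slices can be checked numerically; a minimal sketch in plain Python:

```python
# Behaviour of f(x, y) = x**2 - y**2 sampled along each axis near the origin
f = lambda x, y: x**2 - y**2

ts = [-0.2, -0.1, 0.0, 0.1, 0.2]
along_x = [f(t, 0.0) for t in ts]   # x**2: non-negative, smallest at the origin (minimum)
along_y = [f(0.0, t) for t in ts]   # -y**2: non-positive, largest at the origin (maximum)

print(along_x)
print(along_y)
```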


5. Hessian Test for Saddle Points

For a twice-differentiable function:

  • If the Hessian matrix is indefinite, the point is a saddle point.

Hessian of $f(x, y) = x^2 - y^2$:

$$\nabla^2 f = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}$$

  • Eigenvalues: $2$ and $-2$

  • Mixed signs ⇒ indefinite
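The eigenvalue check can be reproduced with numpy (assumed available):

```python
import numpy as np

# Hessian of f(x, y) = x**2 - y**2 (constant, since f is quadratic)
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])

eigvals = np.linalg.eigvalsh(H)      # symmetric matrix -> real eigenvalues, ascending
print(eigvals)                       # [-2.  2.]

# Mixed signs => indefinite Hessian => saddle point
is_saddle = (eigvals.min() < 0) and (eigvals.max() > 0)
print(is_saddle)  # True
```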


6. Why Saddle Points Matter in Optimization

In Convex Optimization

  • Saddle points do not exist

  • Every critical point is a global minimum

In Non-Convex Optimization

  • Saddle points are abundant

  • Gradient descent may:

    • Slow down

    • Stall temporarily

    • Escape with noise (SGD)
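These behaviours can be sketched with a toy gradient-descent loop on $f(x, y) = x^2 - y^2$ (plain Python; the step size and noise scale are illustrative choices):

```python
import random

# Gradient of f(x, y) = x**2 - y**2
def grad(x, y):
    return 2 * x, -2 * y

def descend(x, y, lr=0.1, steps=200, noise=0.0, seed=0):
    """Gradient descent with optional Gaussian noise (a crude SGD stand-in)."""
    rng = random.Random(seed)
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= lr * gx + noise * rng.gauss(0, 1)
        y -= lr * gy + noise * rng.gauss(0, 1)
    return x, y

f = lambda x, y: x**2 - y**2

# Start exactly on the y = 0 axis: plain gradient descent slides into the
# saddle at the origin and stalls there, because the escape direction (y)
# never receives a gradient signal.
x1, y1 = descend(1.0, 0.0)
print(x1, y1)            # both essentially 0: stuck at the saddle

# The same start with a little noise drifts off the axis; the unstable
# y-direction then amplifies the perturbation and the iterate escapes.
x2, y2 = descend(1.0, 0.0, noise=1e-3)
print(f(x2, y2))         # negative: below the saddle value f(0, 0) = 0
```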


7. Saddle Point vs Local Minimum

| Feature   | Saddle Point            | Local Minimum     |
|-----------|-------------------------|-------------------|
| Gradient  | Zero                    | Zero              |
| Hessian   | Indefinite              | Positive definite |
| Optimal?  | ❌ No                   | ✅ Yes            |
| ML Impact | Optimization difficulty | Desired           |
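The Hessian test behind this comparison can be wrapped in a small helper (a sketch, assuming numpy; the name `classify_critical_point` is illustrative):

```python
import numpy as np

def classify_critical_point(H, tol=1e-9):
    """Classify a critical point from the signs of its Hessian's eigenvalues.
    Degenerate cases (eigenvalues near zero) need higher-order tests."""
    ev = np.linalg.eigvalsh(np.asarray(H, dtype=float))
    if np.all(ev > tol):
        return "local minimum"        # positive definite
    if np.all(ev < -tol):
        return "local maximum"        # negative definite
    if ev.min() < -tol and ev.max() > tol:
        return "saddle point"         # indefinite
    return "inconclusive"             # some eigenvalue ~ 0

print(classify_critical_point([[2, 0], [0, -2]]))  # saddle point
print(classify_critical_point([[2, 0], [0, 2]]))   # local minimum
```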

8. Saddle Points in Machine Learning

  • Neural network loss surfaces contain many saddle points

  • High-dimensional problems have:

    • Few bad local minima

    • Many saddle points

📌 This explains why:

  • SGD works well in practice

  • Noise helps escape saddle regions
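The prevalence claim can be illustrated empirically: the eigenvalues of a random symmetric matrix (a stand-in for the Hessian at a random critical point) almost always have mixed signs in high dimension. A rough sketch with numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 50, 100

saddles = 0
for _ in range(trials):
    A = rng.standard_normal((n, n))
    H = (A + A.T) / 2                  # random symmetric "Hessian"
    ev = np.linalg.eigvalsh(H)
    if ev.min() < 0 < ev.max():        # indefinite -> saddle
        saddles += 1

print(saddles / trials)  # ~1.0: essentially every sample is indefinite
```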
