Optimization Basics: Local vs Global Minima

1. What is Optimization?

Optimization is the process of finding the best value (minimum or maximum) of a function subject to given constraints.

In machine learning, optimization typically means:

\min_{x \in \mathcal{D}} f(x)

where:

  • f(x) is the loss/cost function

  • \mathcal{D} is the feasible set
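As a concrete toy sketch, the problem \min_{x \in \mathcal{D}} f(x) can be approximated by evaluating f on a fine grid over \mathcal{D}. The function f(x) = (x - 2)^2 + 1 and the set \mathcal{D} = [0, 5] below are illustrative choices, not from any particular library:

```python
# Brute-force minimization of f over a discretized feasible set D.
# f and D are illustrative choices for this sketch.

def f(x):
    return (x - 2.0) ** 2 + 1.0  # loss function, minimized at x = 2

# Feasible set D = [0, 5], discretized into a fine grid
grid = [i * 0.001 for i in range(5001)]

best_x = min(grid, key=f)  # grid point with the smallest loss
print(best_x, f(best_x))   # near x = 2, where f = 1
```

Grid search is only practical in very low dimensions, but it makes the definition concrete: the minimizer is the feasible point with the smallest function value.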


2. Local Minimum

Definition

A point x^\ast is called a local minimum of f if there exists a neighborhood N(x^\ast) such that:

f(x^\ast) \le f(x), \quad \forall x \in N(x^\ast)

📌 The comparison is only with nearby points, not the entire domain.


Intuition

  • The point looks like the bottom of a small valley

  • There may be lower points elsewhere


Example

f(x) = 3x^4 + 4x^3 - 12x^2
  • Local minimum at x = 1, where f(1) = -5

  • Not a global minimum: f(-2) = -32 is lower
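To see how the starting point decides which minimum a descent method finds, here is a minimal gradient-descent sketch on the illustrative function g(x) = 3x^4 + 4x^3 - 12x^2, which has a local minimum at x = 1 and a global minimum at x = -2. The step size and iteration count are arbitrary choices:

```python
# Gradient descent on g(x) = 3x^4 + 4x^3 - 12x^2:
# local minimum at x = 1 (g = -5), global minimum at x = -2 (g = -32).

def g_prime(x):
    return 12 * x**3 + 12 * x**2 - 24 * x  # derivative of g

def gradient_descent(x, lr=0.01, steps=500):
    for _ in range(steps):
        x -= lr * g_prime(x)  # move against the gradient
    return x

print(gradient_descent(2.0))   # converges to the local minimum near x = 1
print(gradient_descent(-2.5))  # converges to the global minimum near x = -2
```

The same algorithm reaches different answers from different starts: gradient descent only sees local slope information, so it stops at whichever valley it falls into.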





3. Global Minimum

Definition

A point x^\ast is a global minimum if:

f(x^\ast) \le f(x), \quad \forall x \in \mathcal{D}

📌 The comparison is with every point in the domain.


Intuition

  • The lowest possible point of the function

  • No point has a smaller function value


Example

f(x) = x^2
  • Global minimum at x = 0
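The defining inequality f(x^\ast) \le f(x) for all x can be spot-checked numerically for this example. A small sketch, with the sample range and count chosen arbitrarily:

```python
# Numerical check of the global-minimum inequality f(x*) <= f(x)
# for f(x) = x^2 with x* = 0, over many randomly sampled points.
import random

def f(x):
    return x * x

x_star = 0.0
samples = [random.uniform(-100, 100) for _ in range(10_000)]

# Every sampled point should have a value at least f(x*) = 0
assert all(f(x_star) <= f(x) for x in samples)
```

Sampling cannot prove global optimality, but it is a quick sanity check that no sampled point beats the claimed minimizer.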




4. Local vs Global: Key Differences

Aspect            Local Minimum           Global Minimum
Comparison        Nearby points only      Entire domain
Uniqueness        There may be many       Minimum value is unique (minimizers may tie)
Guarantee         No global guarantee     Best possible solution
ML relevance      May trap algorithms     Desired outcome

5. Critical Points and Optimality

A critical point satisfies:

\nabla f(x) = 0

However:

  • Not all critical points are minima

  • Could be:

    • Local minimum

    • Local maximum

    • Saddle point


6. Tests for Minima

(a) First Derivative Test (1D)

  • f'(x^\ast) = 0

  • Sign of f' changes across x^\ast:

    • - \to +: local minimum

    • + \to -: local maximum


(b) Second Derivative Test (1D)

  • f''(x^\ast) > 0: local minimum

  • f''(x^\ast) < 0: local maximum

  • f''(x^\ast) = 0: inconclusive (higher-order tests are needed)
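A quick way to apply the second-derivative test numerically is a central finite-difference estimate of f''. The sketch below (the step size h is an arbitrary choice) classifies the critical points of f(x) = x^4 - x^2, which are x = 0 and x = ±1/√2:

```python
# Second-derivative test via central finite differences,
# applied to the critical points of f(x) = x^4 - x^2.

def f(x):
    return x**4 - x**2

def second_derivative(f, x, h=1e-4):
    # Central difference approximation of f''(x)
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

c = 0.5 ** 0.5  # 1/sqrt(2)
print(second_derivative(f, 0.0))  # negative -> local maximum
print(second_derivative(f, c))    # positive -> local minimum
print(second_derivative(f, -c))   # positive -> local minimum
```

The exact values are f''(0) = -2 and f''(±1/√2) = 4, so the sign pattern matches the analytic test.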


(c) Hessian Test (Multivariate)

  • Hessian positive definite → local minimum

  • Hessian negative definite → local maximum

  • Hessian indefinite → saddle point
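A minimal sketch of the Hessian test, classifying two illustrative quadratics by the signs of their Hessian eigenvalues (numpy's eigvalsh handles the symmetric Hessians):

```python
# Classifying critical points by the eigenvalues of the Hessian.
# The two quadratic examples are illustrative choices.
import numpy as np

# f(x, y) = x^2 + 3y^2: Hessian at the critical point (0, 0)
H_min = np.array([[2.0, 0.0], [0.0, 6.0]])

# f(x, y) = x^2 - y^2: Hessian at the critical point (0, 0)
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])

def classify(H):
    eig = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix
    if np.all(eig > 0):
        return "local minimum"   # positive definite
    if np.all(eig < 0):
        return "local maximum"   # negative definite
    if np.any(eig > 0) and np.any(eig < 0):
        return "saddle point"    # indefinite
    return "inconclusive"        # semidefinite

print(classify(H_min))     # local minimum
print(classify(H_saddle))  # saddle point
```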


7. Convex Functions: The Key Advantage

Fundamental Theorem

For a convex function, every local minimum is also a global minimum.

This is why convex optimization is so important.


Example

f(x) = x^2 + 4x + 6
  • Convex function

  • Unique minimum at x = -2

  • Local = Global
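A small sketch illustrating local = global for this function: gradient descent reaches x = -2 from every starting point. The step size, iteration count, and starting points are arbitrary choices:

```python
# Gradient descent on the convex quadratic f(x) = x^2 + 4x + 6.
# Every starting point converges to the same minimum x = -2.

def f_prime(x):
    return 2 * x + 4  # derivative of f

def gradient_descent(x, lr=0.1, steps=200):
    for _ in range(steps):
        x -= lr * f_prime(x)
    return x

for x0 in (-10.0, 0.0, 7.0):
    print(gradient_descent(x0))  # all converge to -2
```

This is the practical payoff of convexity: any local search that finds a local minimum has automatically found the global one.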


8. Non-Convex Functions

  • Can have many local minima

  • Algorithms may get stuck

  • Common in deep learning

Example

f(x) = \sin x + 0.1x^2
  • Non-convex: f''(x) = -\sin x + 0.2 changes sign

  • Has multiple local minima
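A small sketch that makes the multiple minima visible: sample an illustrative non-convex function h(x) = sin x + 0.1x^2 on a grid and count the points that lie lower than both of their neighbours (the function and grid are assumptions of this sketch):

```python
# Counting local minima of h(x) = sin(x) + 0.1 x^2 by scanning
# a grid for points lower than both neighbours.
import math

def h(x):
    return math.sin(x) + 0.1 * x * x

xs = [-6 + 0.01 * i for i in range(1201)]  # grid over [-6, 6]
ys = [h(x) for x in xs]

# Interior grid points that are strict discrete local minima
minima = [xs[i] for i in range(1, len(xs) - 1)
          if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]]
print(minima)  # two local minima, near x ≈ -1.31 and x ≈ 3.84
```

A local optimizer started to the right of the local maximum near x ≈ 2 will settle in the shallower valley and never see the lower one.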

9. Visual Comparison

Convex Function        Non-Convex Function
One bowl shape         Multiple valleys
One minimum            Possibly many minima
Easy optimization      Hard optimization

10. Optimization in Machine Learning

Algorithm            Behavior
Gradient Descent     May converge to a local minimum
Stochastic GD        Noise can help escape shallow local minima
Newton's Method      Fast, but converges only locally
Convex solvers       Global minimum guaranteed (for convex problems)
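One common practical response to local minima is random restarts: run gradient descent from several random starting points and keep the best result. A minimal sketch on an illustrative quartic with a known global minimum at x = -2 (the function, number of restarts, and step size are all arbitrary choices):

```python
# Random-restart gradient descent on g(x) = 3x^4 + 4x^3 - 12x^2,
# whose global minimum is at x = -2. Restarting from several random
# points makes finding the global minimum much more likely.
import random

def g(x):
    return 3 * x**4 + 4 * x**3 - 12 * x**2

def g_prime(x):
    return 12 * x**3 + 12 * x**2 - 24 * x

def gradient_descent(x, lr=0.01, steps=500):
    for _ in range(steps):
        x -= lr * g_prime(x)
    return x

random.seed(0)
starts = [random.uniform(-2.5, 2.5) for _ in range(20)]
best = min((gradient_descent(x0) for x0 in starts), key=g)
print(best)  # close to the global minimum x = -2
```

Restarts give no formal guarantee, but with enough of them at least one start usually lands in the basin of the global minimum.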

