Optimization Basics: Local vs Global Minima

1. What is Optimization?

Optimization is the process of finding the best value (minimum or maximum) of a function subject to given constraints.

In machine learning, optimization typically means:

\min_{x \in \mathcal{D}} f(x)

where:

  • f(x) is the loss/cost function

  • \mathcal{D} is the feasible set
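As a concrete toy sketch, the problem \min_{x \in \mathcal{D}} f(x) can be approximated by evaluating f on a fine grid over \mathcal{D}. The function f(x) = (x - 2)^2 + 1 and the set \mathcal{D} = [0, 5] below are illustrative choices, not from any particular library:

```python
# Brute-force minimization of f over a discretized feasible set D.
# f and D are illustrative choices for this sketch.

def f(x):
    return (x - 2.0) ** 2 + 1.0  # loss function, minimized at x = 2

# Feasible set D = [0, 5], discretized into a fine grid
grid = [i * 0.001 for i in range(5001)]

best_x = min(grid, key=f)  # grid point with the smallest loss
print(best_x, f(best_x))   # near x = 2, where f = 1
```

Grid search is only practical in very low dimensions, but it makes the definition concrete: the minimizer is the feasible point with the smallest function value.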


2. Local Minimum

Definition

A point x^\ast is called a local minimum of f if there exists a neighborhood N(x^\ast) such that:

f(x^\ast) \le f(x), \quad \forall x \in N(x^\ast)

📌 The comparison is only with nearby points, not the entire domain.


Intuition

  • The point looks like the bottom of a small valley

  • There may be lower points elsewhere


Example

f(x) = 3x^4 + 4x^3 - 12x^2
  • Local minimum at x = 1, where f(1) = -5

  • Not a global minimum: f(-2) = -32 is lower
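To see how the starting point decides which minimum a descent method finds, here is a minimal gradient-descent sketch on the illustrative function g(x) = 3x^4 + 4x^3 - 12x^2, which has a local minimum at x = 1 and a global minimum at x = -2. The step size and iteration count are arbitrary choices:

```python
# Gradient descent on g(x) = 3x^4 + 4x^3 - 12x^2:
# local minimum at x = 1 (g = -5), global minimum at x = -2 (g = -32).

def g_prime(x):
    return 12 * x**3 + 12 * x**2 - 24 * x  # derivative of g

def gradient_descent(x, lr=0.01, steps=500):
    for _ in range(steps):
        x -= lr * g_prime(x)  # move against the gradient
    return x

print(gradient_descent(2.0))   # converges to the local minimum near x = 1
print(gradient_descent(-2.5))  # converges to the global minimum near x = -2
```

The same algorithm reaches different answers from different starts: gradient descent only sees local slope information, so it stops at whichever valley it falls into.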





3. Global Minimum

Definition

A point x^\ast is a global minimum if:

f(x^\ast) \le f(x), \quad \forall x \in \mathcal{D}

📌 The comparison is with every point in the domain.


Intuition

  • The lowest possible point of the function

  • No point has a smaller function value


Example

f(x) = x^2
  • Global minimum at x = 0
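The defining inequality f(x^\ast) \le f(x) for all x can be spot-checked numerically for this example. A small sketch, with the sample range and count chosen arbitrarily:

```python
# Numerical check of the global-minimum inequality f(x*) <= f(x)
# for f(x) = x^2 with x* = 0, over many randomly sampled points.
import random

def f(x):
    return x * x

x_star = 0.0
samples = [random.uniform(-100, 100) for _ in range(10_000)]

# Every sampled point should have a value at least f(x*) = 0
assert all(f(x_star) <= f(x) for x in samples)
```

Sampling cannot prove global optimality, but it is a quick sanity check that no sampled point beats the claimed minimizer.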




4. Local vs Global: Key Differences

Aspect            Local Minimum           Global Minimum
Comparison        Nearby points only      Entire domain
Uniqueness        There may be many       Minimum value is unique (minimizers may tie)
Guarantee         No global guarantee     Best possible solution
ML relevance      May trap algorithms     Desired outcome

5. Critical Points and Optimality

A critical point satisfies:

\nabla f(x) = 0

However:

  • Not all critical points are minima

  • Could be:

    • Local minimum

    • Local maximum

    • Saddle point


6. Tests for Minima

(a) First Derivative Test (1D)

  • f'(x^\ast) = 0

  • Sign of f' changes across x^\ast:

    • - \to +: local minimum

    • + \to -: local maximum


(b) Second Derivative Test (1D)

  • f''(x^\ast) > 0: local minimum

  • f''(x^\ast) < 0: local maximum

  • f''(x^\ast) = 0: inconclusive (higher-order tests are needed)
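A quick way to apply the second-derivative test numerically is a central finite-difference estimate of f''. The sketch below (the step size h is an arbitrary choice) classifies the critical points of f(x) = x^4 - x^2, which are x = 0 and x = ±1/√2:

```python
# Second-derivative test via central finite differences,
# applied to the critical points of f(x) = x^4 - x^2.

def f(x):
    return x**4 - x**2

def second_derivative(f, x, h=1e-4):
    # Central difference approximation of f''(x)
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

c = 0.5 ** 0.5  # 1/sqrt(2)
print(second_derivative(f, 0.0))  # negative -> local maximum
print(second_derivative(f, c))    # positive -> local minimum
print(second_derivative(f, -c))   # positive -> local minimum
```

The exact values are f''(0) = -2 and f''(±1/√2) = 4, so the sign pattern matches the analytic test.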


(c) Hessian Test (Multivariate)

  • Hessian positive definite → local minimum

  • Hessian negative definite → local maximum

  • Hessian indefinite → saddle point
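A minimal sketch of the Hessian test, classifying two illustrative quadratics by the signs of their Hessian eigenvalues (numpy's eigvalsh handles the symmetric Hessians):

```python
# Classifying critical points by the eigenvalues of the Hessian.
# The two quadratic examples are illustrative choices.
import numpy as np

# f(x, y) = x^2 + 3y^2: Hessian at the critical point (0, 0)
H_min = np.array([[2.0, 0.0], [0.0, 6.0]])

# f(x, y) = x^2 - y^2: Hessian at the critical point (0, 0)
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])

def classify(H):
    eig = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix
    if np.all(eig > 0):
        return "local minimum"   # positive definite
    if np.all(eig < 0):
        return "local maximum"   # negative definite
    if np.any(eig > 0) and np.any(eig < 0):
        return "saddle point"    # indefinite
    return "inconclusive"        # semidefinite

print(classify(H_min))     # local minimum
print(classify(H_saddle))  # saddle point
```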


7. Convex Functions: The Key Advantage

Fundamental Theorem

For a convex function, every local minimum is also a global minimum.

This is why convex optimization is so important.


Example

f(x) = x^2 + 4x + 6
  • Convex function

  • Unique minimum at x = -2

  • Local = Global
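A small sketch illustrating local = global for this function: gradient descent reaches x = -2 from every starting point. The step size, iteration count, and starting points are arbitrary choices:

```python
# Gradient descent on the convex quadratic f(x) = x^2 + 4x + 6.
# Every starting point converges to the same minimum x = -2.

def f_prime(x):
    return 2 * x + 4  # derivative of f

def gradient_descent(x, lr=0.1, steps=200):
    for _ in range(steps):
        x -= lr * f_prime(x)
    return x

for x0 in (-10.0, 0.0, 7.0):
    print(gradient_descent(x0))  # all converge to -2
```

This is the practical payoff of convexity: any local search that finds a local minimum has automatically found the global one.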


8. Non-Convex Functions

  • Can have many local minima

  • Algorithms may get stuck

  • Common in deep learning

Example

f(x) = \sin x + 0.1x^2
  • Non-convex: f''(x) = -\sin x + 0.2 changes sign

  • Has multiple local minima
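A small sketch that makes the multiple minima visible: sample an illustrative non-convex function h(x) = sin x + 0.1x^2 on a grid and count the points that lie lower than both of their neighbours (the function and grid are assumptions of this sketch):

```python
# Counting local minima of h(x) = sin(x) + 0.1 x^2 by scanning
# a grid for points lower than both neighbours.
import math

def h(x):
    return math.sin(x) + 0.1 * x * x

xs = [-6 + 0.01 * i for i in range(1201)]  # grid over [-6, 6]
ys = [h(x) for x in xs]

# Interior grid points that are strict discrete local minima
minima = [xs[i] for i in range(1, len(xs) - 1)
          if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]]
print(minima)  # two local minima, near x ≈ -1.31 and x ≈ 3.84
```

A local optimizer started to the right of the local maximum near x ≈ 2 will settle in the shallower valley and never see the lower one.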

9. Visual Comparison

Convex Function        Non-Convex Function
One bowl shape         Multiple valleys
One minimum            Possibly many minima
Easy optimization      Hard optimization

10. Optimization in Machine Learning

Algorithm            Behavior
Gradient Descent     May converge to a local minimum
Stochastic GD        Noise can help escape shallow local minima
Newton's Method      Fast, but converges only locally
Convex solvers       Global minimum guaranteed (for convex problems)
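One common practical response to local minima is random restarts: run gradient descent from several random starting points and keep the best result. A minimal sketch on an illustrative quartic with a known global minimum at x = -2 (the function, number of restarts, and step size are all arbitrary choices):

```python
# Random-restart gradient descent on g(x) = 3x^4 + 4x^3 - 12x^2,
# whose global minimum is at x = -2. Restarting from several random
# points makes finding the global minimum much more likely.
import random

def g(x):
    return 3 * x**4 + 4 * x**3 - 12 * x**2

def g_prime(x):
    return 12 * x**3 + 12 * x**2 - 24 * x

def gradient_descent(x, lr=0.01, steps=500):
    for _ in range(steps):
        x -= lr * g_prime(x)
    return x

random.seed(0)
starts = [random.uniform(-2.5, 2.5) for _ in range(20)]
best = min((gradient_descent(x0) for x0 in starts), key=g)
print(best)  # close to the global minimum x = -2
```

Restarts give no formal guarantee, but with enough of them at least one start usually lands in the basin of the global minimum.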

