Convergence Behaviour of Batch Gradient Descent (BGD) vs Stochastic Gradient Descent (SGD)
Batch Gradient Descent (BGD)
- Uses the entire dataset to compute the gradient at each iteration.
- The loss decreases smoothly and monotonically (for a suitably small learning rate).
- The convergence path is stable and deterministic.
- For convex loss functions, BGD converges to the global minimum.
- Each iteration is computationally expensive, so overall convergence can be slow.
- Less suitable for very large datasets.

📌 Behavior: Smooth, steady convergence without oscillations.
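The smooth, monotone behavior can be seen on a toy convex problem. The sketch below runs BGD on a least-squares loss; the data, learning rate, and iteration count are illustrative choices, not prescriptions:

```python
import numpy as np

# Toy convex problem: least-squares loss L(w) = (1/2n) * ||Xw - y||^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w  # noiseless targets, so the minimum loss is exactly 0

w = np.zeros(2)
lr = 0.1  # small enough for stable, monotone descent on this problem
losses = []
for _ in range(200):
    grad = X.T @ (X @ w - y) / len(y)  # exact gradient over the full batch
    w -= lr * grad
    losses.append(0.5 * np.mean((X @ w - y) ** 2))
```

Because the gradient is exact and the step size is suitable, every iteration lowers the loss: `losses` is a strictly non-increasing sequence, with no zig-zag.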
Stochastic Gradient Descent (SGD)
- Uses one randomly selected data point per update.
- The loss fluctuates rather than decreasing smoothly.
- The convergence path is noisy and stochastic.
- Converges to the minimum in expectation, not at every individual step.
- Makes fast initial progress but may oscillate near the minimum.
- The gradient noise can help escape saddle points and shallow local minima.

📌 Behavior: Fast but zig-zag convergence around the minimum.
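The same least-squares problem illustrates the contrast: each update now uses one randomly drawn sample, so the gradient estimate is noisy. The hyperparameters and noise level below are illustrative assumptions:

```python
import numpy as np

# Same convex least-squares problem, but with noisy targets and
# single-sample (stochastic) gradient updates.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=100)  # noisy targets

w = np.zeros(2)
lr = 0.05
losses = []
for _ in range(2000):
    i = rng.integers(len(y))             # pick one sample at random
    grad = (X[i] @ w - y[i]) * X[i]      # noisy single-sample gradient
    w -= lr * grad
    losses.append(0.5 * np.mean((X @ w - y) ** 2))
```

Unlike BGD, `losses` is not monotone: some individual steps increase the full-batch loss, and with a constant learning rate the iterate hovers in a small noise ball around the minimum rather than settling exactly on it.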
Comparison Summary
| Aspect | Batch GD | Stochastic GD |
|---|---|---|
| Gradient estimate | Exact | Noisy |
| Convergence path | Smooth | Zig-zag |
| Stability | High | Low |
| Speed per update | Slow | Fast |
| Final convergence | Exact (convex case) | Approximate |
| Suitability | Small datasets | Large datasets |
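The "speed per update" row is about cost: one BGD step touches all n samples, while one SGD step touches a single sample. A hedged sketch of this trade-off, with an illustrative budget measured in single-sample gradient evaluations (all numbers here are assumptions for demonstration):

```python
import numpy as np

# Compare BGD and SGD given the same budget of single-sample
# gradient evaluations on one convex least-squares problem.
rng = np.random.default_rng(1)
n, d = 100, 2
X = rng.normal(size=(n, d))
true_w = np.array([1.0, -1.0])
y = X @ true_w + 0.1 * rng.normal(size=n)

def full_loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

budget = 500  # total single-sample gradient evaluations allowed

# BGD: each update costs n evaluations -> only budget // n updates.
w_bgd = np.zeros(d)
for _ in range(budget // n):  # 5 full-batch updates
    w_bgd -= 0.1 * X.T @ (X @ w_bgd - y) / n

# SGD: each update costs 1 evaluation -> budget updates.
w_sgd = np.zeros(d)
for _ in range(budget):  # 500 noisy single-sample updates
    i = rng.integers(n)
    w_sgd -= 0.05 * (X[i] @ w_sgd - y[i]) * X[i]
```

Under an equal evaluation budget, SGD's many cheap updates typically drive the loss far lower than BGD's few expensive ones, which is why SGD suits large datasets despite its noisier path.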
Conclusion
Batch Gradient Descent converges smoothly and deterministically to the minimum, while Stochastic Gradient Descent converges faster but with oscillations due to noisy gradient estimates.