Linear Regression: Using Gradient Descent
1. What is Linear Regression?
Linear Regression is a supervised learning algorithm used to model the relationship between an input variable and an output variable by fitting a straight line.
Goal
To find a line that best predicts the output for given inputs.
2. Simple Linear Regression Model
For one input variable x, the model is:

ŷ = wx + b

Where:
- w → slope (weight)
- b → intercept (bias)
- ŷ → predicted output
3. A Simple Example
Problem Statement
Predict a student’s exam score based on hours studied.
| Hours studied (x) | Score (y) |
|---|---|
| 1 | 50 |
| 2 | 55 |
| 3 | 65 |
| 4 | 70 |
| 5 | 75 |
4. Visual Idea
We want to fit a straight line through these points such that the overall error is minimized.
The line might look like:

ŷ ≈ 6x + 45

(an illustrative line: roughly 6 extra marks per hour studied, starting near 45).
5. How Do We Measure Error?
We use Mean Squared Error (MSE):

MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²
This penalizes large errors more heavily.
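As a quick sketch, the MSE of a candidate line can be computed directly. The data below is the hours/scores table from Section 3; the line ŷ = 6x + 45 is an assumed example, not a fitted result:

```python
def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b over the data."""
    n = len(xs)
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / n

hours = [1, 2, 3, 4, 5]
scores = [50, 55, 65, 70, 75]
print(mse(hours, scores, w=6.0, b=45.0))  # → 2.0
```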
6. Learning the Best Line (Concept)
1. Start with random w and b
2. Predict scores using ŷ = wx + b
3. Compute the errors yᵢ − ŷᵢ
4. Adjust w and b to reduce the errors
5. Repeat until the best line is found
This adjustment is done using Gradient Descent.
7. Interpretation of Parameters
- Slope w → increase in score per additional hour studied (e.g., w = 6 means +6 marks per hour)
- Intercept b → expected score with zero hours studied
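This interpretation can be read off directly in code. The values w = 6 and b = 45 here are illustrative, not fitted:

```python
w, b = 6.0, 45.0  # illustrative slope and intercept, not fitted values

def predict(hours):
    """Predicted exam score for a given number of hours studied."""
    return w * hours + b

print(predict(0))               # intercept: expected score with zero hours → 45.0
print(predict(3) - predict(2))  # slope: extra marks per additional hour → 6.0
```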
8. Why Linear Regression Works Here
- The relationship is approximately linear
- The errors are minimized globally (the problem is convex)
- The model is simple and interpretable
9. Definition
Linear Regression is a supervised learning algorithm that models the relationship between dependent and independent variables by fitting a linear equation that minimizes the mean squared error.
10. Key Takeaways
✔ Simple and interpretable
✔ Works well for linear relationships
✔ Foundation for many ML algorithms
✔ Solved using optimization
11. Closed-Form Solution
For advanced students, the parameters can also be computed directly:

w = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
b = ȳ − w·x̄

But in practice, gradient descent is preferred for large datasets.
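A minimal sketch of this closed-form fit on the Section 3 data, assuming the standard simple-least-squares formulas:

```python
def fit_closed_form(xs, ys):
    """Least-squares slope and intercept for simple linear regression."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    w = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) \
        / sum((x - x_mean) ** 2 for x in xs)
    b = y_mean - w * x_mean
    return w, b

hours = [1, 2, 3, 4, 5]
scores = [50, 55, 65, 70, 75]
w, b = fit_closed_form(hours, scores)
print(w, b)  # → 6.5 43.5
```

Note the best-fit slope (6.5) is close to the illustrative "+6 marks per hour" used earlier.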
12. One-Line Intuition
Linear regression finds the straight line that best fits the data by minimizing prediction errors.
How Gradient Descent Finds the Best-Fit Line in Linear Regression
1. Problem Setup (Recalled)
We have n data points:

(x₁, y₁), (x₂, y₂), …, (xₙ, yₙ)

Model (Straight Line)

ŷ = wx + b

Our goal is to find w (slope) and b (intercept) such that the line fits the data as well as possible.
2. What Does “Best Fit” Mean?
“Best fit” means minimizing the total prediction error.
We measure error using Mean Squared Error (MSE):

MSE(w, b) = (1/n) Σᵢ (yᵢ − (wxᵢ + b))²

📌 This is a convex function in w and b.
So it has one global minimum → one best-fit line.
3. Geometry of the Problem (Very Important)
- Each choice of (w, b) gives a different line
- Each line produces a certain MSE value
- If we plot MSE(w, b), we get a bowl-shaped surface
- The lowest point of the bowl corresponds to the best line
👉 Gradient descent moves us down this bowl.
4. Role of Gradient Descent
Key Idea
Gradient descent updates w and b in the direction that reduces the error the fastest.
Mathematically:

w ← w − α · ∂MSE/∂w
b ← b − α · ∂MSE/∂b

where α is the learning rate.
5. Compute Gradients (Direction of Steepest Increase)
Gradient with respect to w:

∂MSE/∂w = −(2/n) Σᵢ xᵢ (yᵢ − ŷᵢ)

Gradient with respect to b:

∂MSE/∂b = −(2/n) Σᵢ (yᵢ − ŷᵢ)

📌 These gradients tell us:
- whether the slope is too steep or too flat
- whether the line is too high or too low
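These two formulas can be sketched directly in code. The function below is a straightforward transcription (the `-2 * sum(...) / n` arrangement keeps the arithmetic exact for this toy data):

```python
def gradients(xs, ys, w, b):
    """Partial derivatives of MSE with respect to w and b."""
    n = len(xs)
    dw = -2 * sum(x * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
    db = -2 * sum(y - (w * x + b) for x, y in zip(xs, ys)) / n
    return dw, db

hours = [1, 2, 3, 4, 5]
scores = [50, 55, 65, 70, 75]
print(gradients(hours, scores, w=0.0, b=0.0))  # → (-404.0, -126.0)
```

Both gradients are large and negative at w = b = 0, signalling that the line is far too low and too flat, so both parameters should increase.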
6. Gradient Descent Update Rules
w ← w − α · ∂MSE/∂w
b ← b − α · ∂MSE/∂b

Each update slightly rotates (through w) or shifts (through b) the line.
7. Simple Numerical Example (Concrete)
Data: the five (hours, score) points from Section 3.

Step 1: Initialize

w = 0, b = 0

Predictions: ŷᵢ = 0 for every point
Errors: yᵢ − ŷᵢ = 50, 55, 65, 70, 75

Step 2: Compute Gradients

∂MSE/∂w = −(2/5) Σ xᵢ(yᵢ − ŷᵢ) = −(2/5)(1010) = −404
∂MSE/∂b = −(2/5) Σ (yᵢ − ŷᵢ) = −(2/5)(315) = −126

Step 3: Update Parameters (α = 0.01)

w = 0 − 0.01 × (−404) = 4.04
b = 0 − 0.01 × (−126) = 1.26

📌 New line: ŷ = 4.04x + 1.26

This line is closer to the data than before.
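One update step can be verified in code, assuming the Section 3 data, an initialization of w = b = 0, and a learning rate of 0.01:

```python
hours = [1, 2, 3, 4, 5]
scores = [50, 55, 65, 70, 75]
w, b, alpha, n = 0.0, 0.0, 0.01, len(hours)

# gradients of MSE at the current (w, b)
dw = -2 * sum(x * (y - (w * x + b)) for x, y in zip(hours, scores)) / n
db = -2 * sum(y - (w * x + b) for x, y in zip(hours, scores)) / n

# one gradient descent step, opposite to the gradient
w -= alpha * dw   # 0 - 0.01 * (-404) = 4.04
b -= alpha * db   # 0 - 0.01 * (-126) = 1.26
print(w, b)
```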
8. Repeating the Process
Each iteration:
- The line moves closer to the data
- The error decreases
- The gradients become smaller
- Eventually the updates stop changing w and b
At convergence:

∂MSE/∂w ≈ 0 and ∂MSE/∂b ≈ 0

so w and b stop changing. This gives the best-fit line.
9. Why Gradient Descent Always Finds the Best Line
✔ MSE loss is convex
✔ Only one minimum exists
✔ Gradient descent cannot get stuck elsewhere
So the final line is globally optimal.
10. Intuitive Explanation
Gradient descent keeps adjusting the slope and intercept until the line cannot be improved any further.
11. Visual Interpretation (Ball-in-a-Bowl Analogy)
- Imagine placing a ball in a bowl
- The ball rolls downhill
- It stops at the lowest point
- That point corresponds to the best-fit line
12. Conclusion
In linear regression, gradient descent works by iteratively updating the slope and intercept in the direction opposite to the gradient of the mean squared error until the global minimum is reached, yielding the best-fit line.
13. Python Code
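A minimal, self-contained sketch of the full training loop on the Section 3 data. The learning rate and iteration count are arbitrary choices that happen to converge for this toy dataset:

```python
def fit_gradient_descent(xs, ys, alpha=0.01, epochs=20000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # gradients of MSE with respect to w and b
        dw = -2 * sum(x * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
        db = -2 * sum(y - (w * x + b) for x, y in zip(xs, ys)) / n
        # step opposite to the gradient
        w -= alpha * dw
        b -= alpha * db
    return w, b

hours = [1, 2, 3, 4, 5]
scores = [50, 55, 65, 70, 75]
w, b = fit_gradient_descent(hours, scores)
print(f"w = {w:.2f}, b = {b:.2f}")  # converges toward the closed-form fit (6.5, 43.5)
```

With enough iterations the result matches the closed-form solution, as expected for a convex loss.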