Bayesian Paradigm: Probability as Belief
1. Two Views of Probability (Context)
Before Bayesian thinking, students usually see the frequentist view.
Frequentist View
- Probability = long-run relative frequency
- Example: “Probability of heads = 0.5” means that in many coin tosses, about half are heads.
Bayesian View
- Probability is a degree of belief or certainty about an event.
- Beliefs are updated when new evidence arrives.
📌 Key idea: Probability quantifies uncertainty, not just frequency.
2. Probability as Belief
In the Bayesian paradigm:
Probability represents how strongly we believe a statement is true, given the available information.
Example
- “It will rain tomorrow”
- There is no repeated experiment.
- Yet we say: “70% chance of rain”.
➡ This is belief-based probability.
3. Prior Belief (Before Seeing Data)
A prior probability represents belief before observing data.
Example:
- Belief that a coin is fair
- Prior: P(heads) = 0.5
In ML:
- Prior belief p(θ) about model parameters
4. Evidence (Observed Data)
Data provides evidence.
Example:
- Coin tossed 10 times → 8 heads
- The data challenge the prior belief.
In ML:
- The training dataset D
5. Likelihood (How Data Supports Beliefs)
The likelihood measures how likely the observed data is, given a hypothesis.
Example:
- How likely are 8 heads if the coin bias is 0.5?
- How likely if the bias is 0.8?
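These two questions can be answered numerically with the binomial formula. A minimal sketch (plain Python, no external libraries) computing both likelihoods:

```python
# Binomial likelihood of the observed data (8 heads in 10 tosses)
# under two candidate values of the coin bias.
from math import comb

def likelihood(theta, heads=8, tosses=10):
    """P(data | theta) for a binomial coin-toss model."""
    return comb(tosses, heads) * theta**heads * (1 - theta)**(tosses - heads)

print(f"P(8 heads | bias 0.5) = {likelihood(0.5):.4f}")  # ≈ 0.0439
print(f"P(8 heads | bias 0.8) = {likelihood(0.8):.4f}")  # ≈ 0.3020
```

The data are roughly seven times more likely under a bias of 0.8 than under a fair coin, which is exactly what the likelihood is designed to quantify.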
6. Posterior Belief (Updated Belief)
Bayesian inference updates belief using Bayes’ theorem:
P(H | D) = P(D | H) · P(H) / P(D)
Where:
- P(H) → Prior
- P(D | H) → Likelihood
- P(H | D) → Posterior (updated belief)
- P(D) → Evidence (normalizing constant)
📌 Posterior combines prior belief + data.
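A minimal numeric sketch of this update, assuming (for illustration) just two competing hypotheses about the coin, “fair” (bias 0.5) and “biased” (bias 0.8), with equal prior belief:

```python
# Discrete Bayes update over two hypotheses, given 8 heads in 10 tosses.
from math import comb

def binom_lik(theta, heads=8, tosses=10):
    """P(D | H): binomial likelihood of the data under coin bias theta."""
    return comb(tosses, heads) * theta**heads * (1 - theta)**(tosses - heads)

prior = {"fair": 0.5, "biased": 0.5}                        # P(H)
lik = {"fair": binom_lik(0.5), "biased": binom_lik(0.8)}    # P(D | H)
evidence = sum(prior[h] * lik[h] for h in prior)            # P(D)
posterior = {h: prior[h] * lik[h] / evidence for h in prior}  # P(H | D)

print(posterior)  # belief shifts strongly toward "biased"
```

Starting from 50/50, the posterior puts roughly 87% of the belief on the biased hypothesis: the prior and the likelihood are combined and renormalized by the evidence.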
7. Simple Example (Coin Toss)
Prior
- Believe the coin is fair: θ ≈ 0.5
Data
- 8 heads out of 10 tosses
Posterior
- Belief shifts toward a biased coin
- But the prior prevents extreme conclusions from small data
➡ This is rational belief updating.
8. Bayesian Paradigm in Machine Learning
Parameter Estimation
- Treat parameters θ as random variables
- Goal: compute the posterior p(θ | D)
Prediction
- Predictions average over the posterior: p(y* | D) = ∫ p(y* | θ) p(θ | D) dθ
📌 Accounts for uncertainty in parameters.
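For the coin, this averaging can be sketched by Monte Carlo: sample parameter values from the posterior and average the resulting predictions. The Beta(10, 4) posterior used here is an assumption consistent with the worked example below (posterior mean ≈ 0.714).

```python
# Bayesian prediction sketch: average the model's prediction over posterior
# samples of the parameter, instead of plugging in one point estimate.
# Assumed setup: coin bias with posterior Beta(10, 4).
import random

random.seed(0)
posterior_samples = [random.betavariate(10, 4) for _ in range(100_000)]

# Posterior predictive: P(next toss is heads | data) = E[theta | data]
p_heads = sum(posterior_samples) / len(posterior_samples)
print(f"P(next head | data) ≈ {p_heads:.3f}")  # analytically 10/14 ≈ 0.714
```

The prediction reflects the full spread of plausible biases, not a single best guess.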
9. Bayesian vs Frequentist (Quick Contrast)
| Aspect | Frequentist | Bayesian |
|---|---|---|
| Parameters | Fixed | Random |
| Probability | Frequency | Belief |
| Uncertainty | From data | From belief + data |
| Prior knowledge | Not used | Explicitly used |
10. Why Bayesian Thinking is Powerful
✔ Works with small data
✔ Incorporates prior knowledge
✔ Quantifies uncertainty
✔ Naturally handles learning over time
11. One-Line Intuition for Students
Bayesian probability measures how strongly we believe something is true and updates that belief when new evidence arrives.
12. Exam-Ready Definition
The Bayesian paradigm interprets probability as a degree of belief and uses Bayes’ theorem to update beliefs in the presence of new data.
Bayesian Paradigm: Worked Example
Coin Toss Experiment (Probability as Belief)
Problem Statement
We want to estimate the bias of a coin, i.e., the probability of getting Heads, using the Bayesian approach.
Step 1: Unknown Quantity (What Are We Learning?)
Let θ = P(Heads).
In the Bayesian paradigm, θ is treated as a random variable, not a fixed unknown constant.
Step 2: Prior Belief
Before observing any data, suppose we believe the coin is fair.
We encode this belief using a prior distribution.
Choose a Prior
A common prior for probabilities is the Beta distribution:
p(θ) = Beta(θ; α, β) ∝ θ^(α−1) (1−θ)^(β−1)
Let α = 2, β = 2.
This prior:
- Is symmetric around 0.5
- Reflects belief that the coin is roughly fair
📌 Prior mean: E[θ] = α / (α + β) = 0.5
Step 3: Observe Data (Evidence)
We toss the coin 10 times and observe:
- Heads = 8
- Tails = 2
This is our data D.
Step 4: Likelihood Function
The likelihood of observing 8 heads and 2 tails given θ is:
P(D | θ) = C(10, 8) · θ^8 · (1 − θ)^2
This tells us how well each value of θ explains the data.
Step 5: Bayes’ Theorem (Belief Update)
Bayes’ theorem:
p(θ | D) = P(D | θ) · p(θ) / P(D)
Substitute:
- Prior: p(θ) ∝ θ (1 − θ)   [Beta(2, 2)]
- Likelihood: P(D | θ) ∝ θ^8 (1 − θ)^2
Step 6: Posterior Distribution
For a Beta prior and binomial likelihood, the posterior is also Beta:
p(θ | D) = Beta(θ; α + heads, β + tails)
So:
p(θ | D) = Beta(θ; 2 + 8, 2 + 2) = Beta(θ; 10, 4)
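Because the update is conjugate, it is just integer addition. A minimal sketch, assuming the Beta(2, 2) prior used in this example (values consistent with the posterior mean of ≈ 0.714 reported below):

```python
# Conjugate Beta-Binomial update: add the observed counts to the
# prior pseudo-counts to get the posterior parameters.
alpha_prior, beta_prior = 2, 2   # assumed Beta(2, 2) prior
heads, tails = 8, 2              # observed data

alpha_post = alpha_prior + heads  # 10
beta_post = beta_prior + tails    # 4

post_mean = alpha_post / (alpha_post + beta_post)
print(f"Posterior: Beta({alpha_post}, {beta_post}), mean = {post_mean:.3f}")
```

No numerical integration is needed; the prior acts like 2 + 2 = 4 pseudo-tosses added to the real data.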
Step 7: Interpret the Posterior (Updated Belief)
Posterior Mean
E[θ | D] = 10 / (10 + 4) ≈ 0.714
📌 Our belief has shifted from 0.5 → 0.714, but not fully to 0.8, because the prior still influences the result.
Posterior Variance
Var(θ | D) = (10 · 4) / ((10 + 4)² · (10 + 4 + 1)) ≈ 0.0136
- Smaller than the prior variance (Beta(2, 2) has variance 0.05)
- Indicates increased confidence
Step 8: MAP Estimate (Most Probable Value)
The Maximum A Posteriori (MAP) estimate is the posterior mode:
θ_MAP = (α − 1) / (α + β − 2)
For Beta(10, 4):
θ_MAP = (10 − 1) / (10 + 4 − 2) = 9/12 = 0.75
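Both point summaries of the posterior can be computed directly from the Beta parameters. A sketch, assuming the Beta(10, 4) posterior derived in this example:

```python
# Point summaries of a Beta(alpha, beta) posterior:
# mode (MAP) and mean, which generally differ for skewed posteriors.
alpha_post, beta_post = 10, 4

map_est = (alpha_post - 1) / (alpha_post + beta_post - 2)  # posterior mode
mean_est = alpha_post / (alpha_post + beta_post)           # posterior mean

print(f"MAP  estimate = {map_est:.3f}")   # 0.750
print(f"Mean estimate = {mean_est:.3f}")  # 0.714
```

The MAP sits between the posterior mean and the raw data frequency of 0.8, since the mode is less pulled toward the prior than the mean.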
Step 9: Bayesian Interpretation (Key Insight)
| Stage | Belief About Coin |
|---|---|
| Before data | Coin is fair |
| After data | Coin is biased |
| Confidence | Moderate (only 10 tosses) |
📌 Bayesian learning balances prior belief and observed evidence.
Step 10: Comparison with Frequentist Estimate
θ̂_MLE = 8/10 = 0.8
Bayesian estimate (posterior mean):
θ̂_Bayes = 10/14 ≈ 0.714
➡ The Bayesian estimate is more conservative, especially with small data.
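The gap between the two estimates shrinks as data accumulates. A sketch, assuming the same Beta(2, 2) prior and a fixed 80% heads rate at growing sample sizes:

```python
# Compare the frequentist MLE with the Bayesian posterior mean
# (Beta(2, 2) prior, assumed) as the number of tosses grows.
def mle(heads, tosses):
    return heads / tosses

def bayes_mean(heads, tosses, a=2, b=2):
    return (a + heads) / (a + b + tosses)

for tosses in (10, 100, 1000):
    heads = int(0.8 * tosses)
    print(f"n={tosses:4d}  MLE={mle(heads, tosses):.3f}  "
          f"Bayes={bayes_mean(heads, tosses):.3f}")
```

With 10 tosses the prior pulls the estimate noticeably toward 0.5; by 1000 tosses the two estimates are nearly identical, because the data overwhelm the prior.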
Step 11: Why This Shows “Probability as Belief”
- We were uncertain about θ
- We expressed that uncertainty as a probability distribution
- We updated beliefs using Bayes’ rule
- Probability here represents belief, not frequency
Exam-Ready Conclusion
In the Bayesian paradigm, probability represents a degree of belief, which is updated using Bayes’ theorem when new data is observed, as illustrated by estimating a coin’s bias using a prior and posterior distribution.
One-Line Intuition for Students
Bayesian inference updates what we believe as evidence accumulates.