Bayesian inference for a coin toss (Beta prior + Binomial likelihood)
We want to estimate the probability of heads, call it θ, using observed coin tosses.
Bayesian inference updates our prior belief about θ using observed data to get a posterior belief.
1️⃣ Model setup
✔ Likelihood: Binomial
If we toss the coin n times and see k heads, then

P(k | θ) = (n choose k) · θ^k · (1 − θ)^(n − k)

This says: given θ, what's the chance of seeing k heads?
✔ Prior: Beta distribution
We assume

θ ~ Beta(α, β)
Beta is ideal because:
- Defined on [0, 1]
- Flexible shape
- Conjugate to the Binomial → posterior is also Beta
Prior density:

p(θ) ∝ θ^(α − 1) · (1 − θ)^(β − 1)
Interpretation:
- α − 1 = prior pseudo-heads
- β − 1 = prior pseudo-tails
2️⃣ Posterior update
Using Bayes’ rule:

p(θ | data) ∝ p(data | θ) · p(θ)

Multiply likelihood and prior:

θ^k (1 − θ)^(n − k) · θ^(α − 1) (1 − θ)^(β − 1) = θ^(α + k − 1) (1 − θ)^(β + n − k − 1)

So the posterior is:

θ | data ~ Beta(α + k, β + n − k)
This is the key result.
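The update above is just parameter arithmetic; a minimal sketch in plain Python (function name is my own, not from the text):

```python
# Beta-Binomial conjugate update: add heads to alpha, tails to beta.
def posterior_params(alpha, beta, k, n):
    """Posterior Beta parameters after observing k heads in n tosses."""
    return alpha + k, beta + (n - k)

# Uniform prior Beta(1, 1), data: 7 heads in 10 tosses
a_post, b_post = posterior_params(1, 1, 7, 10)
print(a_post, b_post)  # 8 4
```

No distribution library is needed: the whole update is two additions.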
🎯 Example 1 — Neutral prior
Prior

θ ~ Beta(1, 1) — uniform, no preference

Data

10 tosses → 7 heads, 3 tails

Posterior

θ | data ~ Beta(1 + 7, 1 + 3) = Beta(8, 4)

Posterior mean

E[θ | data] = 8 / (8 + 4) ≈ 0.667
Observed frequency = 0.7
Bayesian estimate slightly shrinks toward 0.5
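A quick check of Example 1 in code — uniform Beta(1, 1) prior, 7 heads in 10 tosses:

```python
# Example 1: posterior is Beta(1 + 7, 1 + 3) = Beta(8, 4)
alpha, beta = 1 + 7, 1 + 3
posterior_mean = alpha / (alpha + beta)
print(round(posterior_mean, 3))  # 0.667 — pulled slightly from 0.7 toward 0.5
```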
🎯 Example 2 — Strong prior belief coin is fair
Prior

θ ~ Beta(α₀, α₀) with large α₀, e.g. Beta(50, 50) — mean 0.5 with high confidence

Data

10 tosses → 7 heads

Posterior

With Beta(50, 50): θ | data ~ Beta(50 + 7, 50 + 3) = Beta(57, 53)

Posterior mean:

E[θ | data] = 57 / 110 ≈ 0.518
Despite 70% heads, the estimate stays near 0.5
because the prior is strong.
🎯 Example 3 — Prior belief coin biased to heads
Prior

Mean = 0.8, e.g. θ ~ Beta(8, 2)

Data

10 tosses → 3 heads

Posterior

With Beta(8, 2): θ | data ~ Beta(8 + 3, 2 + 7) = Beta(11, 9)

Posterior mean:

E[θ | data] = 11 / 20 = 0.55

Even though the data suggest 0.3, the posterior balances prior and data.
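The exact prior parameters for Examples 2 and 3 aren't pinned down by the text; as an illustration, assume Beta(50, 50) for the strong fair-coin prior and Beta(8, 2) for the heads-biased prior (mean 0.8):

```python
def posterior_mean(alpha, beta, k, n):
    # Mean of the posterior Beta(alpha + k, beta + n - k)
    return (alpha + k) / (alpha + beta + n)

# Example 2: assumed strong fair prior Beta(50, 50), 7 heads in 10 tosses
print(round(posterior_mean(50, 50, 7, 10), 3))  # 0.518 — stays near 0.5

# Example 3: assumed heads-biased prior Beta(8, 2), 3 heads in 10 tosses
print(round(posterior_mean(8, 2, 3, 10), 2))    # 0.55 — between 0.8 and 0.3
```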
🧠 Intuition (most important)
Bayesian updating works like:
| Source | Heads | Tails |
|---|---|---|
| Prior pseudo-counts | α − 1 | β − 1 |
| Data counts | k | n − k |
| Posterior pseudo-counts | (α − 1) + k | (β − 1) + (n − k) |

i.e. the posterior is Beta(α + k, β + n − k).
So the prior behaves like imaginary observations.
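One consequence of the pseudo-observation view: updating one toss at a time gives the same posterior as one batch update. A small sketch:

```python
# Sequential updating: each toss adds 1 to alpha (heads) or beta (tails).
def update(alpha, beta, heads):
    return (alpha + 1, beta) if heads else (alpha, beta + 1)

a, b = 1, 1  # uniform prior Beta(1, 1)
for toss in [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]:  # 7 heads, 3 tails
    a, b = update(a, b, toss)

print(a, b)  # 8 4 — identical to the batch update Beta(1 + 7, 1 + 3)
```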
📊 Predictive probability (next toss)
Chance the next toss is heads:

P(heads next | data) = E[θ | data] = (α + k) / (α + β + n)

Example from Example 1:

P = 8 / 12 ≈ 0.667
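For a Beta posterior, the predictive probability of heads is just the posterior mean; using Example 1's uniform prior:

```python
# Posterior predictive: P(next toss is heads) = E[theta | data]
#                     = (alpha + k) / (alpha + beta + n)
alpha, beta, k, n = 1, 1, 7, 10
p_next_heads = (alpha + k) / (alpha + beta + n)
print(round(p_next_heads, 3))  # 0.667
```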
🧩 Why Beta is conjugate
Because the Beta prior and the Binomial likelihood are both proportional to θ^a (1 − θ)^b, their product is too:

θ^(α − 1) (1 − θ)^(β − 1) × θ^k (1 − θ)^(n − k) = θ^(α + k − 1) (1 − θ)^(β + n − k − 1)

Same functional form after updating → closed-form solution.
No numerical integration needed.
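To see what conjugacy buys us, we can compare the closed-form posterior mean against a brute-force grid approximation of the posterior (a sanity check, not something you would need in practice):

```python
# Grid approximation of the posterior mean vs. the closed-form answer.
from math import comb

alpha, beta, k, n = 1, 1, 7, 10  # Example 1 setup

grid = [i / 10000 for i in range(1, 10000)]
# Unnormalized posterior: likelihood x prior at each grid point
unnorm = [comb(n, k) * t**k * (1 - t)**(n - k)
          * t**(alpha - 1) * (1 - t)**(beta - 1)
          for t in grid]
z = sum(unnorm)
numeric_mean = sum(t * w for t, w in zip(grid, unnorm)) / z

closed_form = (alpha + k) / (alpha + beta + n)
print(abs(numeric_mean - closed_form) < 1e-3)  # True — they agree
```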