Bayesian inference for a coin toss (Beta prior + Binomial likelihood)

We want to estimate the probability of heads, call it θ, using observed coin tosses.

Bayesian inference updates our prior belief about θ using observed data to get a posterior belief.


1️⃣ Model setup

✔ Likelihood: Binomial

If we toss the coin n times and see k heads, then

P(k \mid \theta) = \binom{n}{k}\theta^k(1-\theta)^{n-k}

This says: given θ, what's the chance of seeing k heads?
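As a quick sanity check, the Binomial likelihood can be computed directly with Python's standard library (the function name `binomial_pmf` is just for illustration):

```python
from math import comb

def binomial_pmf(k, n, theta):
    """P(k heads in n tosses | theta) under the Binomial model."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

# Probability of seeing 7 heads in 10 tosses of a fair coin
print(round(binomial_pmf(7, 10, 0.5), 4))  # 0.1172
```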


✔ Prior: Beta distribution

We assume

\theta \sim \text{Beta}(\alpha,\beta)

Beta is ideal because:

  • Defined on [0,1]

  • Flexible shape

  • Conjugate to Binomial → posterior is also Beta

Prior density:

P(\theta) \propto \theta^{\alpha-1}(1-\theta)^{\beta-1}

Interpretation:

  • α − 1 = prior pseudo-heads

  • β − 1 = prior pseudo-tails
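Under this reading, the prior mean α / (α + β) summarizes where the prior puts its weight; a minimal sketch (the helper name is illustrative):

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)

# Uniform, strongly-fair, and heads-biased priors (used in the examples below)
for a, b in [(1, 1), (50, 50), (8, 2)]:
    print(f"Beta({a},{b}) mean = {beta_mean(a, b):.2f}")
```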


2️⃣ Posterior update

Using Bayes’ rule:

P(\theta \mid k) \propto P(k \mid \theta)\,P(\theta)

Multiply likelihood and prior:

\theta^k(1-\theta)^{n-k} \cdot \theta^{\alpha-1}(1-\theta)^{\beta-1} = \theta^{k+\alpha-1}(1-\theta)^{n-k+\beta-1}

So posterior is:

\boxed{\theta \mid k \sim \text{Beta}(\alpha+k,\;\beta+n-k)}

This is the key result.
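The update rule is one line of code; a minimal sketch (the function name `update_beta` is illustrative):

```python
def update_beta(alpha, beta, k, n):
    """Beta(alpha, beta) prior + k heads in n tosses -> posterior Beta parameters."""
    return alpha + k, beta + n - k

print(update_beta(1, 1, 7, 10))    # (8, 4)
print(update_beta(50, 50, 7, 10))  # (57, 53)
```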


🎯 Example 1 — Neutral prior

Prior

\text{Beta}(1,1)

Uniform → no preference

Data

10 tosses → 7 heads, 3 tails

Posterior

\text{Beta}(1+7,\;1+3) = \text{Beta}(8,4)

Posterior mean

E[\theta] = \frac{8}{8+4} \approx 0.667

Observed frequency = 0.7
The Bayesian estimate shrinks slightly toward 0.5.
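This shrinkage is easy to verify numerically (a minimal sketch):

```python
a, b = 1 + 7, 1 + 3              # posterior Beta(8, 4)
posterior_mean = a / (a + b)
mle = 7 / 10                     # observed frequency
print(round(posterior_mean, 3))  # 0.667 -- pulled slightly below 0.7, toward 0.5
```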


🎯 Example 2 — Strong prior belief coin is fair

Prior

\text{Beta}(50,50)

Mean = 0.5 with high confidence

Data

10 tosses → 7 heads

Posterior

\text{Beta}(50+7,\;50+3) = \text{Beta}(57,53)

Posterior mean:

\frac{57}{110} \approx 0.518

Despite 70% observed heads, the estimate stays near 0.5
because the prior is strong.


🎯 Example 3 — Prior belief coin biased to heads

Prior

\text{Beta}(8,2)

Mean = 0.8

Data

10 tosses → 3 heads

Posterior

\text{Beta}(8+3,\;2+7) = \text{Beta}(11,9)

Posterior mean:

\frac{11}{20} = 0.55

Even though the data suggest 0.3, the posterior balances prior and data.
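The pull of the prior in Examples 2 and 3 can be reproduced with the posterior-mean formula (α + k)/(α + β + n); the helper name is illustrative:

```python
def posterior_mean(alpha, beta, k, n):
    """Mean of the posterior Beta(alpha + k, beta + n - k)."""
    return (alpha + k) / (alpha + beta + n)

print(round(posterior_mean(50, 50, 7, 10), 3))  # 0.518 -- strong fair prior resists 70% heads
print(round(posterior_mean(8, 2, 3, 10), 2))    # 0.55  -- heads-biased prior resists 30% heads
```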


🧠 Intuition (most important)

Bayesian updating works like:

Source                  Heads      Tails
Prior (pseudo-counts)   α − 1      β − 1
Data                    k          n − k
Posterior parameters    α + k      β + n − k

So the prior behaves like imaginary observations.


📊 Predictive probability (next toss)

Chance next toss is heads:

P(\text{heads next}) = \frac{\alpha+k}{\alpha+\beta+n}

Using the posterior from Example 1, Beta(8, 4):

\frac{8}{12} \approx 0.667
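In code, the predictive probability for each example's posterior (the helper name is illustrative):

```python
def predict_heads(alpha, beta, k, n):
    """Posterior predictive P(next toss is heads) = (alpha + k) / (alpha + beta + n)."""
    return (alpha + k) / (alpha + beta + n)

print(round(predict_heads(1, 1, 7, 10), 3))    # 0.667 -- Example 1
print(round(predict_heads(50, 50, 7, 10), 3))  # 0.518 -- Example 2
```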

🧩 Why Beta is conjugate

Because:

\text{Beta} \times \text{Binomial} \rightarrow \text{Beta}

Same functional form after updating → closed-form solution.

No numerical integration needed.
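To see what conjugacy saves us, compare the closed-form posterior mean against a brute-force grid approximation of prior × likelihood (a sketch; the grid size is arbitrary):

```python
alpha, beta, k, n = 1, 1, 7, 10  # Example 1

# Unnormalized posterior density theta^(alpha+k-1) * (1-theta)^(beta+n-k-1) on a grid
grid = [i / 10000 for i in range(1, 10000)]
unnorm = [t**(alpha + k - 1) * (1 - t)**(beta + n - k - 1) for t in grid]

grid_mean = sum(t * w for t, w in zip(grid, unnorm)) / sum(unnorm)
closed_form = (alpha + k) / (alpha + beta + n)  # mean of Beta(8, 4)

print(abs(grid_mean - closed_form) < 1e-3)  # True -- they agree
```

The grid version needs thousands of evaluations and a normalization step; the conjugate update is pure arithmetic.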
