Normal–Normal Bayesian Model

(Gaussian Likelihood + Gaussian Prior)


1. What Is the Normal–Normal Model?

The Normal–Normal model is a Bayesian model used when:

  • Data are normally distributed

  • The variance is known

  • The mean is unknown

  • Our prior belief about the mean is Gaussian

It is one of the most important Bayesian models because:

  • The posterior has a closed form

  • It illustrates Bayesian updating

  • It connects directly to regularization in ML


2. Model Assumptions

Likelihood (Data Model)

$$x_i \mid \mu \sim \mathcal{N}(\mu,\sigma^2), \quad i=1,\dots,n$$

  • $\mu$: unknown mean

  • $\sigma^2$: known variance


Prior (Belief about the Mean)

$$\mu \sim \mathcal{N}(\mu_0,\tau^2)$$

  • $\mu_0$: prior mean

  • $\tau^2$: prior variance


3. Why Is This Called “Normal–Normal”?

Component     Distribution
Likelihood    Normal
Prior         Normal
Posterior     Normal

📌 This is an example of a conjugate prior.


4. Posterior Distribution (Key Result)

After observing data $D = \{x_1,\dots,x_n\}$:

$$\boxed{\mu \mid D \sim \mathcal{N}(\mu_n,\tau_n^2)}$$

Where:

Posterior Mean

$$\boxed{\mu_n = \frac{\frac{n}{\sigma^2}\bar{x} + \frac{1}{\tau^2}\mu_0}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}}}$$

Posterior Variance

$$\boxed{\tau_n^2 = \left(\frac{n}{\sigma^2} + \frac{1}{\tau^2}\right)^{-1}}$$
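The two boxed formulas can be sketched as a small Python function (a minimal illustration; the function name and signature are my own):

```python
def normal_normal_posterior(x, sigma2, mu0, tau2):
    """Posterior N(mu_n, tau_n^2) for the mean of a Normal likelihood
    with known variance sigma2 and Normal prior N(mu0, tau2)."""
    n = len(x)
    xbar = sum(x) / n
    precision = n / sigma2 + 1 / tau2            # posterior precision
    mu_n = (n / sigma2 * xbar + mu0 / tau2) / precision
    tau2_n = 1 / precision                       # posterior variance
    return mu_n, tau2_n
```

Precision (inverse variance) is the natural quantity here: the posterior precision is simply the sum of the data precision and the prior precision.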

5. Interpretation of the Posterior Mean

The posterior mean is a weighted average:

$$\mu_n = w_{\text{data}}\,\bar{x} + w_{\text{prior}}\,\mu_0$$

Where the weights are proportional to precision (inverse variance):

$$w_{\text{data}} = \frac{n/\sigma^2}{n/\sigma^2 + 1/\tau^2}, \qquad w_{\text{prior}} = \frac{1/\tau^2}{n/\sigma^2 + 1/\tau^2}$$

📌 More confidence → more influence.
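The precision-based weights can be computed directly. A short sketch, using the illustrative numbers $n=3$, $\sigma^2=1$, $\tau^2=4$:

```python
# Precision (1/variance) determines the mixing weights.
n, sigma2, tau2 = 3, 1.0, 4.0
prec_data = n / sigma2            # precision contributed by the data
prec_prior = 1 / tau2             # precision contributed by the prior
w_data = prec_data / (prec_data + prec_prior)
w_prior = prec_prior / (prec_data + prec_prior)
print(w_data, w_prior)            # the two weights always sum to 1
```

Here the data carry most of the weight because their precision ($3$) dwarfs the prior's ($0.25$).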


6. Worked Numerical Example

Given:

  • Prior: $\mu \sim \mathcal{N}(5,4)$

  • Known variance: $\sigma^2 = 1$

  • Observed data: $x = \{6,5,7\}$

Step 1: Compute Sample Mean

$$\bar{x} = \frac{6+5+7}{3} = 6$$

Step 2: Compute Posterior Mean

$$\mu_n = \frac{\frac{3}{1}\cdot 6 + \frac{1}{4}\cdot 5}{\frac{3}{1} + \frac{1}{4}} = \frac{18 + 1.25}{3.25} = \boxed{5.92}$$

Step 3: Compute Posterior Variance

$$\tau_n^2 = \left(3 + 0.25\right)^{-1} = \boxed{0.308}$$

7. Final Posterior

$$\boxed{\mu \mid D \sim \mathcal{N}(5.92,\,0.308)}$$
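The three steps above can be checked in a few lines of Python (a direct transcription of the worked example, nothing more):

```python
# Reproduce the worked example: prior N(5, 4), sigma^2 = 1, data {6, 5, 7}.
x = [6, 5, 7]
sigma2, mu0, tau2 = 1.0, 5.0, 4.0
xbar = sum(x) / len(x)                         # Step 1: sample mean = 6.0
prec = len(x) / sigma2 + 1 / tau2              # total precision = 3.25
mu_n = (len(x) / sigma2 * xbar + mu0 / tau2) / prec   # Step 2
tau2_n = 1 / prec                              # Step 3
print(round(mu_n, 2), round(tau2_n, 3))        # 5.92 0.308
```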

8. Key Insights

1. Data vs Prior

  • More data → posterior moves toward sample mean

  • Strong prior → posterior stays near prior mean


2. Uncertainty Shrinks

$$\tau_n^2 < \tau^2$$

More data → less uncertainty.
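The shrinkage is easy to see numerically. A small sketch, assuming $\sigma^2 = 1$ and prior variance $\tau^2 = 4$ (illustrative values):

```python
# Posterior variance for increasing n (sigma^2 = 1, prior tau^2 = 4).
sigma2, tau2 = 1.0, 4.0
variances = [1 / (n / sigma2 + 1 / tau2) for n in [0, 1, 10, 100]]
print(variances)   # starts at the prior variance 4.0, strictly decreasing
```

With $n=0$ the posterior variance is just the prior variance; each observation adds $1/\sigma^2$ to the precision, so the variance can only shrink.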


9. MAP and Bayesian Mean

For a Gaussian posterior, which is symmetric and unimodal, the mode coincides with the mean:

$$\mu_{\text{MAP}} = \mu_n$$

📌 MAP estimate equals posterior mean.


10. Limiting Cases (Important for Exams)

Large Data Limit

$$n \to \infty \;\Rightarrow\; \mu_n \to \bar{x}$$

Very Strong Prior

$$\tau^2 \to 0 \;\Rightarrow\; \mu_n \to \mu_0$$
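Both limits can be demonstrated numerically. A sketch with illustrative values ($\mu_0 = 5$, $\bar{x} = 6$, $\sigma^2 = 1$; the helper name is my own):

```python
# Limiting behaviour of the posterior mean.
mu0, sigma2, xbar = 5.0, 1.0, 6.0

def post_mean(n, tau2):
    prec = n / sigma2 + 1 / tau2
    return (n / sigma2 * xbar + mu0 / tau2) / prec

big_n = post_mean(1000, 4.0)        # large data: close to xbar = 6
strong_prior = post_mean(10, 1e-8)  # tiny tau^2: close to mu0 = 5
print(big_n, strong_prior)
```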

11. Connection to Machine Learning

Ridge Regression

  • Normal prior on weights

  • Equivalent to L2 regularization


Kalman Filter

  • Repeated Normal–Normal updates
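The Kalman-filter connection can be sketched by processing observations one at a time, using each posterior as the prior for the next step. Because the model is conjugate, this sequential scheme reproduces the batch posterior exactly (numbers below reuse the worked example):

```python
# Sequential (Kalman-style) updating: each posterior becomes the next prior.
def update(mu0, tau2, x, sigma2):
    prec = 1 / sigma2 + 1 / tau2
    return (x / sigma2 + mu0 / tau2) / prec, 1 / prec

mu, tau2 = 5.0, 4.0                  # start from the prior N(5, 4)
for x in [6, 5, 7]:
    mu, tau2 = update(mu, tau2, x, sigma2=1.0)
print(mu, tau2)                      # matches the batch posterior N(5.92, 0.308)
```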


12. Bayesian vs Frequentist

Aspect             Bayesian        Frequentist
Estimate           Distribution    Point
Uncertainty        Explicit        Asymptotic
Prior knowledge    Included        Ignored

13. Summary

  • Normal–Normal is a conjugate Bayesian model

  • Posterior is Gaussian

  • Mean is precision-weighted average

  • Variance shrinks with data


One-Line Intuition for Students

Bayesian learning averages what you believed before with what the data tells you, weighted by confidence.
