Conjugate Prior
Conjugate Prior — Detailed Explanation
1. Bayesian Recap (Why Priors Matter)
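In Bayesian inference, we update beliefs about an unknown parameter after observing data $D$ using Bayes’ theorem:

$$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}$$

Where:
- $\theta$ = unknown parameter
- $p(\theta)$ = prior
- $p(D \mid \theta)$ = likelihood
- $p(\theta \mid D)$ = posterior
The challenge: computing the posterior can be mathematically hard, because the normalizing constant $p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta$ often has no closed form.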
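To make the update rule concrete, here is a minimal sketch that applies Bayes' theorem numerically on a grid of candidate parameter values. The coin example, the flat prior, and the function names are illustrative assumptions, not part of the original post. The sum in the denominator stands in for the normalizing integral, which is the step that usually has no closed form.

```python
def grid_posterior(prior, likelihood, grid):
    """Apply Bayes' theorem on a grid of candidate parameter values.

    `prior` and `likelihood` are functions of the parameter. The result is a
    list of posterior probabilities, one per grid point, that sums to 1.
    """
    unnormalized = [likelihood(t) * prior(t) for t in grid]
    evidence = sum(unnormalized)  # discrete stand-in for the normalizing integral
    return [u / evidence for u in unnormalized]


def flat_prior(t):
    """Hypothetical prior: every candidate value is equally plausible."""
    return 1.0


def coin_likelihood(t):
    """Hypothetical data: 7 heads and 3 tails from a coin with heads probability t."""
    return t**7 * (1 - t) ** 3


grid = [i / 100 for i in range(1, 100)]  # candidate values 0.01 ... 0.99
posterior = grid_posterior(flat_prior, coin_likelihood, grid)
best = grid[posterior.index(max(posterior))]
print(f"most probable grid value: {best:.2f}")
```

A grid works for one parameter but scales poorly with dimension, which is one reason closed-form updates are attractive.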
2. What Is a Conjugate Prior?
Definition
A conjugate prior for a given likelihood is a prior distribution such that the posterior distribution belongs to the same family as the prior, with only the parameters changing.
📌 Prior and posterior have the same functional form.
3. Why Are Conjugate Priors Important?
- Closed-form posterior
- Easy analytical updates
- Clear interpretation
- Computational efficiency
- Educational clarity
This is why they are widely used in:
- Machine learning
- Bayesian statistics
- Online learning
- Signal processing
4. Simple Example: Beta–Bernoulli
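Likelihood

For $n$ independent binary observations $x_1, \dots, x_n \in \{0, 1\}$ with success probability $\theta$:

$$p(D \mid \theta) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i} = \theta^{k} (1 - \theta)^{n - k}$$

Prior

$$\theta \sim \mathrm{Beta}(\alpha, \beta), \qquad p(\theta) \propto \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}$$

Posterior

$$\theta \mid D \sim \mathrm{Beta}(\alpha + k,\; \beta + n - k)$$

Where:
- $k$ = number of successes
- $n$ = total trials
📌 Prior and posterior are both Beta distributions.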
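The Beta update amounts to adding the observed successes to the first Beta parameter and the observed failures to the second. The sketch below shows this with made-up data and a Beta(2, 2) prior; the function names and the numbers are assumptions chosen for illustration.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Return the posterior Beta parameters after a list of 0/1 observations."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


data = [1, 0, 1, 1, 0, 1, 1, 1]  # hypothetical data: 6 successes in 8 trials
alpha_post, beta_post = beta_bernoulli_update(2.0, 2.0, data)

print(alpha_post, beta_post)            # 8.0 4.0
print(beta_mean(alpha_post, beta_post))  # posterior mean, 8 / 12
```

No integration is needed: the whole update is two additions.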
5. Why Does Conjugacy Work?
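The likelihood and prior have compatible mathematical forms: viewed as functions of the parameter $\theta$, both are products of powers of $\theta$ and $1 - \theta$.
Example:
- Bernoulli likelihood ($k$ successes in $n$ trials): $\theta^{k} (1 - \theta)^{n - k}$
- Beta prior: $\theta^{\alpha - 1} (1 - \theta)^{\beta - 1}$
Multiplying them adds the exponents, which preserves the Beta form:

$$\theta^{k} (1 - \theta)^{n - k} \cdot \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} = \theta^{\alpha + k - 1} (1 - \theta)^{\beta + n - k - 1}$$

This is the kernel of a $\mathrm{Beta}(\alpha + k,\; \beta + n - k)$ density.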
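One way to convince yourself of this is a numerical check: multiply a Bernoulli likelihood by a Beta prior, normalize the product by brute force, and compare it with the Beta density that conjugacy predicts. The sketch below uses only the standard library; the prior parameters, the data (4 successes in 10 trials), and the helper names are arbitrary assumptions.

```python
import math


def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta, computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(
        log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
    )


def normalized_product(a, b, successes, trials, grid):
    """Normalize likelihood * prior numerically using a midpoint rule."""
    width = grid[1] - grid[0]
    unnormalized = [
        t**successes * (1 - t) ** (trials - successes) * beta_pdf(t, a, b)
        for t in grid
    ]
    total = sum(unnormalized) * width
    return [u / total for u in unnormalized]


steps = 2000
grid = [(i + 0.5) / steps for i in range(steps)]  # midpoints of (0, 1)
a, b, successes, trials = 2.0, 3.0, 4, 10

numeric = normalized_product(a, b, successes, trials, grid)
closed_form = [beta_pdf(t, a + successes, b + trials - successes) for t in grid]
max_gap = max(abs(x - y) for x, y in zip(numeric, closed_form))
print(f"largest pointwise difference: {max_gap:.2e}")
```

The two curves agree up to numerical integration error, which is what "the posterior stays in the Beta family" means in practice.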
6. General Pattern of Conjugate Priors
| Likelihood | Conjugate Prior |
|---|---|
| Bernoulli / Binomial | Beta |
| Multinomial | Dirichlet |
| Poisson | Gamma |
| Exponential | Gamma |
| Gaussian (mean unknown, variance known) | Gaussian |
| Gaussian (mean & variance unknown) | Normal–Inverse-Gamma |
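Each row of the table comes with a simple parameter update. As a sketch, the two Gamma rows look like this, assuming the Gamma prior is written in the shape/rate parameterization (the function names and data are hypothetical):

```python
def gamma_poisson_update(shape, rate, counts):
    """Gamma(shape, rate) prior on a Poisson rate, updated with observed counts."""
    return shape + sum(counts), rate + len(counts)


def gamma_exponential_update(shape, rate, waiting_times):
    """Gamma(shape, rate) prior on an Exponential rate, updated with waiting times."""
    return shape + len(waiting_times), rate + sum(waiting_times)


counts = [3, 5, 4, 6, 2]  # hypothetical event counts per interval (sum = 20)
shape_post, rate_post = gamma_poisson_update(2.0, 1.0, counts)
print(shape_post, rate_post)   # 22.0 6.0
print(shape_post / rate_post)  # posterior mean of the Poisson rate

times = [0.5, 1.5, 1.0]  # hypothetical waiting times (sum = 3.0)
print(gamma_exponential_update(2.0, 1.0, times))  # (5.0, 4.0)
```

In both cases the posterior is again a Gamma distribution; only the way the data enter the two parameters differs.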
7. Gaussian–Gaussian Example (Brief)
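Likelihood (variance $\sigma^2$ known)

$$x_i \mid \mu \sim \mathcal{N}(\mu, \sigma^2), \qquad i = 1, \dots, n$$

Prior

$$\mu \sim \mathcal{N}(\mu_0, \tau_0^2)$$

Posterior

$$\mu \mid D \sim \mathcal{N}(\mu_n, \tau_n^2), \qquad \frac{1}{\tau_n^2} = \frac{1}{\tau_0^2} + \frac{n}{\sigma^2}, \qquad \mu_n = \tau_n^2 \left( \frac{\mu_0}{\tau_0^2} + \frac{n \bar{x}}{\sigma^2} \right)$$

Here $\bar{x}$ is the sample mean of the $n$ observations.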
📌 Gaussian prior + Gaussian likelihood → Gaussian posterior
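For a Gaussian likelihood with known variance, the update is easiest to state in terms of precisions (inverse variances): the posterior precision is the prior precision plus one unit of data precision per observation, and the posterior mean is a precision-weighted average of the prior mean and the sample mean. A minimal sketch, with made-up measurements and an assumed known noise variance of 1:

```python
def gaussian_mean_update(prior_mean, prior_var, data, noise_var):
    """Posterior over an unknown Gaussian mean when the noise variance is known."""
    n = len(data)
    sample_mean = sum(data) / n
    post_precision = 1.0 / prior_var + n / noise_var  # precisions add
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + n * sample_mean / noise_var)
    return post_mean, post_var


data = [4.8, 5.2, 5.1, 4.9]  # hypothetical measurements, sample mean 5.0
post_mean, post_var = gaussian_mean_update(
    prior_mean=0.0, prior_var=10.0, data=data, noise_var=1.0
)
print(f"posterior mean = {post_mean:.3f}, posterior variance = {post_var:.3f}")
```

With a vague prior (large prior variance) the posterior mean sits close to the sample mean; a tighter prior would pull it toward the prior mean.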
8. Conjugate Prior vs Non-Conjugate Prior
| Aspect | Conjugate | Non-Conjugate |
|---|---|---|
| Posterior | Closed form | Usually no closed form |
| Computation | Easy | Typically requires MCMC or variational methods |
| Interpretability | High | Lower |
| Flexibility | Limited | High |
9. Interpretation in Terms of “Pseudo-Counts”
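In many conjugate priors:
- Prior behaves like imaginary observations
- Posterior = prior counts + data counts
Example (Beta prior): $\mathrm{Beta}(\alpha, \beta)$ acts roughly like $\alpha$ imagined successes and $\beta$ imagined failures, so after observing $k$ successes in $n$ trials the posterior is $\mathrm{Beta}(\alpha + k,\; \beta + n - k)$.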
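The counting view has two practical consequences that a short sketch can check: updating one observation at a time gives the same result as updating on the whole batch, and the posterior mean is a weighted average of the prior mean and the observed success rate. The prior Beta(3, 1) and the data below are arbitrary assumptions.

```python
def update_one(alpha, beta, x):
    """Add a single 0/1 observation to the running Beta counts."""
    return (alpha + 1, beta) if x == 1 else (alpha, beta + 1)


prior_alpha, prior_beta = 3, 1  # hypothetical prior: 3 imagined successes, 1 failure
data = [0, 1, 0, 0, 1]          # hypothetical data: 2 successes in 5 trials

# Sequential updating, one observation at a time.
alpha, beta = prior_alpha, prior_beta
for x in data:
    alpha, beta = update_one(alpha, beta, x)

# Batch updating, all counts at once.
successes = sum(data)
batch = (prior_alpha + successes, prior_beta + len(data) - successes)
print((alpha, beta), batch)  # (5, 4) (5, 4)

# Posterior mean as a weighted average of prior mean and data mean.
prior_weight = (prior_alpha + prior_beta) / (prior_alpha + prior_beta + len(data))
prior_mean = prior_alpha / (prior_alpha + prior_beta)
data_mean = successes / len(data)
blended = prior_weight * prior_mean + (1 - prior_weight) * data_mean
print(blended, alpha / (alpha + beta))  # both approximately 5 / 9
```

The prior's total count, 3 + 1 = 4 here, controls how much the prior mean weighs against the data.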
10. Conjugate Priors in Machine Learning
Examples
- Naive Bayes
- Bayesian linear regression
- Topic models (LDA)
- A/B testing
- Hidden Markov Models
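As one illustration from this list, a Bayesian A/B test can be sketched with two independent Beta posteriors. The example below assumes a uniform Beta(1, 1) prior for each variant and uses made-up conversion counts; it estimates the probability that variant B has the higher conversion rate by sampling from both posteriors with the standard library.

```python
import random


def prob_b_beats_a(succ_a, n_a, succ_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Conjugate update: posterior is Beta(1 + successes, 1 + failures).
        rate_a = rng.betavariate(1 + succ_a, 1 + n_a - succ_a)
        rate_b = rng.betavariate(1 + succ_b, 1 + n_b - succ_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical experiment: 40/1000 conversions for A, 60/1000 for B.
p = prob_b_beats_a(40, 1000, 60, 1000)
print(f"estimated P(B beats A) = {p:.3f}")
```

Because the posteriors are available in closed form, the only numerical work is drawing samples; no MCMC is needed.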
11. MAP Estimation Connection
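MAP estimate (for parameter $\theta$ and data $D$):

$$\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\, p(\theta \mid D) = \arg\max_{\theta}\, \left[ \log p(D \mid \theta) + \log p(\theta) \right]$$

For conjugate priors:
- MAP often has a closed-form solution
- MAP ≈ regularized MLE: the log-prior term $\log p(\theta)$ acts as a regularizer added to the log-likelihood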
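For the Beta prior with Bernoulli data, the MAP estimate is the mode of the Beta posterior, which has a closed form when both posterior parameters exceed 1. The sketch below compares it with the maximum likelihood estimate; the data (9 successes in 10 trials) and the priors are assumptions chosen to show the shrinkage effect.

```python
def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate of the success probability."""
    return successes / trials


def beta_bernoulli_map(alpha, beta, successes, trials):
    """Mode of the Beta posterior; valid when both posterior parameters exceed 1."""
    a_post = alpha + successes
    b_post = beta + trials - successes
    if a_post <= 1 or b_post <= 1:
        raise ValueError("closed-form mode needs both posterior parameters > 1")
    return (a_post - 1) / (a_post + b_post - 2)


successes, trials = 9, 10  # hypothetical data
mle = bernoulli_mle(successes, trials)
map_flat = beta_bernoulli_map(1, 1, successes, trials)    # uniform prior
map_strong = beta_bernoulli_map(5, 5, successes, trials)  # prior centered on 0.5

print(mle, map_flat)  # 0.9 0.9
print(map_strong)     # 13 / 18, pulled toward 0.5
```

With a uniform prior the MAP estimate coincides with the MLE; a more informative prior pulls the estimate toward the prior's center, which is the sense in which MAP behaves like regularized MLE.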
12. Key Takeaways (Exam-Friendly)
- Conjugate priors preserve distributional form
- They simplify Bayesian updating
- They allow analytical posteriors
- Widely used in ML and statistics
One-Line Intuition for Students
A conjugate prior is chosen so that Bayesian updating stays mathematically simple.