Information Theory & Entropy

 


What is Information Theory?

Information theory studies:

  • how information is measured

  • how much uncertainty exists in data

  • how efficiently information can be encoded or transmitted

Introduced by Claude Shannon, it forms the basis of:

  • Data compression (ZIP, JPEG)

  • Machine learning loss functions

  • Communication systems

  • Cryptography


What is Entropy?

✅ Definition

Entropy measures the uncertainty or randomness in a probability distribution.

  • High entropy → very unpredictable outcome

  • Low entropy → predictable outcome

πŸ“ Formula

For probabilities P = \{p_1, p_2, \dots, p_n\}:

H(P) = -\sum_{i=1}^{n} p_i \log_2 p_i

Unit = bits (when the logarithm is taken base 2).


Intuition

Situation                        Entropy
Fair coin                        High uncertainty → High entropy
Biased coin (always heads)       No uncertainty → Entropy = 0
Uniform distribution             Maximum entropy
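
As a quick numerical check of the last row: a uniform distribution over n outcomes has entropy log2(n), the largest value any distribution over n outcomes can have. The short sketch below (assuming NumPy is available, as in the code later in this post) verifies this for a few values of n.

import numpy as np

# entropy of the uniform distribution over n outcomes equals log2(n), the maximum possible
for n in [2, 4, 8]:
    p = np.full(n, 1.0 / n)
    H = -np.sum(p * np.log2(p))
    print(n, "outcomes:", H, "bits; log2(n) =", np.log2(n))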

Example Calculations

🎲 Example 1 — Fair Coin

P(H) = 0.5,\; P(T) = 0.5
H = -[0.5\log_2 0.5 + 0.5\log_2 0.5] = 1 \text{ bit}

👉 Interpretation: encoding the outcome requires 1 bit.


🎲 Example 2 — Biased Coin

P(H) = 0.9,\; P(T) = 0.1
H = -(0.9\log_2 0.9 + 0.1\log_2 0.1) \approx 0.47 \text{ bits}

👉 Less uncertainty → lower entropy.


🎲 Example 3 — Deterministic Outcome

P(H) = 1,\; P(T) = 0
H = 0 (using the convention 0 \log_2 0 = 0)

👉 No uncertainty → zero information needed.


Significance of Entropy

📌 In Machine Learning

  • Measures impurity in decision trees

  • Used in information gain

  • Basis for cross-entropy loss
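
To connect the first two bullets to practice, here is a minimal sketch of computing information gain for a single decision-tree split. The class labels and the split below are made up purely for illustration, and entropy_of_labels is a hypothetical helper, not a library function.

import numpy as np
from collections import Counter

def entropy_of_labels(labels):
    # empirical entropy (in bits) of a list of class labels
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# hypothetical parent node and one candidate split (illustrative data only)
parent = ["yes"] * 6 + ["no"] * 4
left   = ["yes"] * 5 + ["no"] * 1
right  = ["yes"] * 1 + ["no"] * 3

n = len(parent)
children = (len(left) / n) * entropy_of_labels(left) + (len(right) / n) * entropy_of_labels(right)
print("Information gain:", entropy_of_labels(parent) - children)  # about 0.26 bits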

📌 In Communication

  • Minimum number of bits needed to encode a message

  • Guides optimal compression schemes
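
To make the "minimum number of bits" point concrete, here is a small sketch for a hypothetical 4-symbol source; the probabilities and codeword lengths are chosen only for illustration. For this particular source, an optimal prefix code exactly meets the entropy bound.

import numpy as np

# hypothetical 4-symbol source (illustrative probabilities only)
probs = np.array([0.5, 0.25, 0.125, 0.125])
H = -np.sum(probs * np.log2(probs))       # Shannon entropy: lower bound on average bits/symbol

# codeword lengths 1, 2, 3, 3 form a valid prefix code matched to these probabilities
lengths = np.array([1, 2, 3, 3])
avg_len = np.sum(probs * lengths)

print("Entropy:", H)                      # 1.75 bits/symbol
print("Average code length:", avg_len)    # 1.75 bits/symbol, meeting the entropy bound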

📌 In Probability & Statistics

  • Quantifies unpredictability

  • Helps compare distributions
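
As a small illustration of comparing distributions by their entropy, the snippet below contrasts a uniform distribution with a skewed one over four outcomes; the specific numbers are made up for illustration.

import numpy as np

def entropy_bits(p):
    # Shannon entropy in bits; assumes p sums to 1, ignores zero entries
    p = np.array(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

uniform = [0.25, 0.25, 0.25, 0.25]   # all outcomes equally likely
skewed  = [0.85, 0.05, 0.05, 0.05]   # one outcome dominates

print("Uniform entropy:", entropy_bits(uniform))  # 2.0 bits (the maximum for 4 outcomes)
print("Skewed entropy:", entropy_bits(skewed))    # about 0.85 bits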


Python Code to Compute Entropy

✅ Basic Function

import numpy as np

def entropy(prob):
    # Shannon entropy (in bits) of a discrete probability distribution
    prob = np.array(prob)
    prob = prob[prob > 0]  # remove zero probabilities (0 * log 0 is treated as 0)
    return -np.sum(prob * np.log2(prob))

# examples
P1 = [0.5, 0.5]
P2 = [0.9, 0.1]
P3 = [0.6, 0.3, 0.1]

print("Entropy of fair coin:", entropy(P1))
print("Entropy of biased coin:", entropy(P2))
print("Entropy of [0.6,0.3,0.1]:", entropy(P3))
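
👉 Running this prints approximately 1.0, 0.469, and 1.295 bits for P1, P2, and P3 respectively.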

Visualization Code

import numpy as np
import matplotlib.pyplot as plt

# binary entropy H(p) = -(p log2 p + (1-p) log2 (1-p)) as a function of P(Heads)
p = np.linspace(0.001, 0.999, 100)
H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

plt.plot(p, H)
plt.xlabel("P(Heads)")
plt.ylabel("Entropy (bits)")
plt.title("Entropy of a Coin vs Probability")
plt.show()

👉 Shows that entropy is maximized at p = 0.5.




Summary

Entropy = amount of surprise in the outcome.

  • Predictable event → low entropy

  • Random event → high entropy

