1. Definition
- The Beta distribution is a continuous probability distribution defined on the interval $[0,1]$.
- It is controlled by two positive shape parameters: $\alpha > 0$ and $\beta > 0$.
- Probability density function (PDF):
$f(p \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)} \; p^{\alpha - 1} (1-p)^{\beta - 1}, \quad 0 \leq p \leq 1$
where
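- $B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}$ is the Beta function, the normalizing constant that makes the density integrate to 1.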
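As a minimal sketch of the definition above, the density can be evaluated with only the Python standard library. The function names `beta_function` and `beta_pdf` are illustrative choices, not part of any library, and returning 0 at the endpoints is a simplifying assumption.

```python
import math

def beta_function(alpha: float, beta: float) -> float:
    """B(alpha, beta) = Gamma(alpha) * Gamma(beta) / Gamma(alpha + beta)."""
    # Work in log space so large shape parameters do not overflow.
    return math.exp(math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta))

def beta_pdf(p: float, alpha: float, beta: float) -> float:
    """Density of Beta(alpha, beta) at p, for alpha, beta > 0."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("shape parameters must be positive")
    # Simplifying assumption: return 0.0 outside the open interval (0, 1).
    # This sidesteps the unbounded endpoints when a shape parameter is below 1.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p ** (alpha - 1) * (1 - p) ** (beta - 1) / beta_function(alpha, beta)
```

For example, `beta_pdf(0.5, 2, 2)` evaluates $6 \cdot 0.5 \cdot 0.5 = 1.5$, since $B(2, 2) = 1/6$.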
2. Key Properties
- Support: $[0,1]$ → often used to model probabilities or proportions.
- Mean:
$E[p] = \frac{\alpha}{\alpha + \beta}$
- Variance:
$\mathrm{Var}(p) = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$
- Mode (if $\alpha, \beta > 1$):
$\text{Mode} = \frac{\alpha - 1}{\alpha + \beta - 2}$
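The three summary formulas translate directly into code. The following is a small sketch; the helper names are hypothetical, and the mode helper only covers the case $\alpha, \beta > 1$ stated above.

```python
def beta_mean(alpha: float, beta: float) -> float:
    """E[p] = alpha / (alpha + beta)."""
    return alpha / (alpha + beta)

def beta_variance(alpha: float, beta: float) -> float:
    """Var(p) = alpha * beta / ((alpha + beta)^2 * (alpha + beta + 1))."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

def beta_mode(alpha: float, beta: float) -> float:
    """Mode = (alpha - 1) / (alpha + beta - 2), valid only for alpha, beta > 1."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("the interior-mode formula requires alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)
```

For Beta(8, 4), these give a mean of 2/3 and a mode of 0.7.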
3. Shapes of Beta Distributions
Depending on $\alpha$ and $\beta$:
- $\alpha = \beta = 1$: Uniform(0,1).
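- $\alpha > \beta$: Mass concentrated toward 1 (mean above 0.5, longer tail toward 0).
- $\alpha < \beta$: Mass concentrated toward 0 (mean below 0.5, longer tail toward 1).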
- Both $\alpha, \beta > 1$: Unimodal distribution.
- Both $\alpha, \beta < 1$: U-shaped (peaks near 0 and 1).
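One way to check these shapes numerically is to compare the density near 0, at the center, and near 1. The sketch below does this with the standard library; the parameter choices in `CASES` and the probe points are illustrative assumptions.

```python
import math

def beta_pdf(p: float, alpha: float, beta: float) -> float:
    """Beta(alpha, beta) density at an interior point 0 < p < 1."""
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return math.exp((alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p) - log_norm)

def density_profile(alpha: float, beta: float, points=(0.1, 0.5, 0.9)) -> list:
    """Density values at a few probe points, enough to see the rough shape."""
    return [beta_pdf(p, alpha, beta) for p in points]

# Illustrative parameter choices, one per shape listed above.
CASES = {
    "uniform": (1.0, 1.0),
    "mass toward 1": (5.0, 2.0),
    "mass toward 0": (2.0, 5.0),
    "unimodal": (5.0, 5.0),
    "U-shaped": (0.5, 0.5),
}

if __name__ == "__main__":
    for label, (a, b) in CASES.items():
        near_zero, center, near_one = density_profile(a, b)
        print(f"{label:>14}: f(0.1)={near_zero:.3f}  f(0.5)={center:.3f}  f(0.9)={near_one:.3f}")
```

A U-shaped case should show larger values at 0.1 and 0.9 than at 0.5, while a unimodal symmetric case shows the opposite.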
4. Connection to Binomial Likelihood (Conjugacy)
- The Beta distribution is the conjugate prior for the Binomial likelihood.
- Prior: $p \sim \text{Beta}(\alpha, \beta)$.
- Data: $k$ successes in $n$ trials.
- Posterior:
$p \mid \text{data} \sim \text{Beta}(\alpha + k, \; \beta + n - k)$
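This follows because the posterior is proportional to likelihood times prior, $p^{k}(1-p)^{n-k} \cdot p^{\alpha-1}(1-p)^{\beta-1} = p^{\alpha+k-1}(1-p)^{\beta+n-k-1}$, which is again a Beta kernel. Bayesian updating therefore reduces to adding the success and failure counts to the parameters.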
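In code, the conjugate update is a one-line parameter change. The sketch below uses a hypothetical helper name, `update_beta`, and also illustrates that updating in batches gives the same posterior as updating all at once.

```python
def update_beta(alpha: float, beta: float, successes: int, trials: int) -> tuple:
    """Return the posterior Beta parameters after observing binomial data."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    failures = trials - successes
    # Conjugacy: add successes to alpha and failures to beta.
    return alpha + successes, beta + failures

# All ten trials at once, starting from a Beta(1, 1) prior.
batch = update_beta(1, 1, successes=7, trials=10)

# The same ten trials split into two batches (3 of 4, then 4 of 6).
step_one = update_beta(1, 1, successes=3, trials=4)
sequential = update_beta(*step_one, successes=4, trials=6)
```

Both `batch` and `sequential` end up with the same parameters, because only the total counts of successes and failures matter.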
5. Examples
(a) Uniform Prior
- Beta(1,1) → flat prior on [0,1].
- No preference before data.
(b) Strong Belief Around 0.5
- Beta(20,20) → very peaked near 0.5.
- Encodes strong prior belief in fairness (like a fair coin).
(c) After Observing Data
- Prior: Beta(1,1).
- Data: 7 heads, 3 tails.
- Posterior: Beta(1+7, 1+3) = Beta(8,4).
- Mean = $8 / (8+4) \approx 0.67$.
This is the updated belief about $p$: the data pull the estimate from the prior mean of 0.5 toward the observed frequency of 0.7.
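To go beyond the posterior mean in example (c), one option is to draw samples from the Beta(8, 4) posterior and read off an approximate 95% credible interval. This is a Monte Carlo sketch using `random.betavariate` from the standard library; the function name, sample size, and seed are arbitrary assumptions.

```python
import random

def posterior_summary(alpha: float, beta: float, n_samples: int = 100_000, seed: int = 0):
    """Approximate the mean and central 95% interval of Beta(alpha, beta) by sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(n_samples))
    mean = sum(samples) / n_samples
    lower = samples[int(0.025 * n_samples)]  # empirical 2.5% quantile
    upper = samples[int(0.975 * n_samples)]  # empirical 97.5% quantile
    return mean, lower, upper

# Posterior from example (c): Beta(1, 1) prior, 7 heads, 3 tails.
mean, lower, upper = posterior_summary(8, 4)
print(f"posterior mean ~ {mean:.3f}, 95% interval ~ ({lower:.3f}, {upper:.3f})")
```

The sampled mean should sit close to the analytic value of 2/3, and the interval is wide because ten tosses carry limited information.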
6. Applications
- Bayesian Inference: posterior distributions of probabilities.
- A/B Testing: posterior belief about conversion rates.
- Machine Learning: the Dirichlet distribution generalizes the Beta to more than two categories and is the conjugate prior for the multinomial likelihood.
- Reliability analysis: modeling success/failure rates.
- Proportion modeling: when outcomes are percentages or rates.
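As an illustration of the A/B testing use case, the sketch below places independent Beta posteriors on two conversion rates and estimates the probability that variant B has the higher rate. The function name, the flat Beta(1, 1) prior, and the visitor and conversion counts are hypothetical, chosen only for the example.

```python
import random

def prob_b_beats_a(conv_a: int, n_a: int, conv_b: int, n_b: int,
                   prior=(1.0, 1.0), n_samples: int = 50_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(rate_B > rate_A | data) under independent Beta posteriors."""
    rng = random.Random(seed)
    alpha0, beta0 = prior
    wins = 0
    for _ in range(n_samples):
        # Conjugate update: add conversions to alpha, non-conversions to beta.
        rate_a = rng.betavariate(alpha0 + conv_a, beta0 + n_a - conv_a)
        rate_b = rng.betavariate(alpha0 + conv_b, beta0 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / n_samples

# Hypothetical data: 40 of 1000 visitors convert on A, 60 of 1000 on B.
p_better = prob_b_beats_a(conv_a=40, n_a=1000, conv_b=60, n_b=1000)
print(f"P(B beats A) ~ {p_better:.3f}")
```

Sampling is used here because the probability that one Beta variable exceeds another has no simple closed form in general.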
7. Key Takeaways
- The Beta distribution is a flexible family of distributions on $[0,1]$.
- Controlled by shape parameters $\alpha$ and $\beta$.
- Conjugate prior for the Binomial likelihood.
- Central to Bayesian updating of probabilities (e.g., coin bias, conversion rates).
In short:
The Beta distribution is a probability distribution on $[0,1]$, parameterized by $\alpha, \beta$. It’s the conjugate prior for the binomial likelihood, making it essential in Bayesian inference about probabilities (e.g., coin bias, conversion rates).
