1. Definition

  • The Beta distribution is a continuous probability distribution defined on the interval $[0,1]$.
  • It is controlled by two positive shape parameters: $\alpha > 0$ and $\beta > 0$.
  • Probability density function (PDF):

$f(p \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)} \; p^{\alpha - 1} (1-p)^{\beta - 1}, \quad 0 \leq p \leq 1$

where

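  • $B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}$ is the Beta function, the normalizing constant that makes the density integrate to 1.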
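
As a minimal sketch of the density above, the following uses only Python's standard library. The function names `beta_function` and `beta_pdf` are illustrative choices, not from any particular package, and the endpoint handling is one reasonable convention among several.

```python
import math

def beta_function(alpha: float, beta: float) -> float:
    """B(alpha, beta) computed through log-gamma for numerical stability."""
    return math.exp(math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta))

def beta_pdf(p: float, alpha: float, beta: float) -> float:
    """Density of Beta(alpha, beta) at p; returns 0 outside [0, 1]."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if p < 0.0 or p > 1.0:
        return 0.0
    # The density diverges at an endpoint when the matching parameter is below 1.
    if (p == 0.0 and alpha < 1) or (p == 1.0 and beta < 1):
        return math.inf
    return p ** (alpha - 1) * (1 - p) ** (beta - 1) / beta_function(alpha, beta)

print(beta_pdf(0.5, 2, 2))  # density of Beta(2, 2) at its center
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow when the shape parameters are large.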

2. Key Properties

  • Support: $[0,1]$ → often used to model probabilities or proportions.
  • Mean:

$E[p] = \frac{\alpha}{\alpha + \beta}$

  • Variance:

$\text{Var}(p) = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$

  • Mode (if $\alpha, \beta > 1$):

$\text{Mode} = \frac{\alpha - 1}{\alpha + \beta - 2}$
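
The three formulas above translate directly into code. This is a small sketch; `beta_summary` is a hypothetical helper name, and returning `None` for the mode when the condition $\alpha, \beta > 1$ fails is an assumption made here for simplicity.

```python
def beta_summary(alpha: float, beta: float):
    """Return (mean, variance, mode) of Beta(alpha, beta).

    The mode is reported only when alpha > 1 and beta > 1, matching the
    condition under which the closed-form expression applies.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    mode = (alpha - 1) / (total - 2) if alpha > 1 and beta > 1 else None
    return mean, variance, mode

mean, variance, mode = beta_summary(8, 4)
print(mean, variance, mode)
```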


3. Shapes of Beta Distributions

Depending on $\alpha$ and $\beta$:

  • $\alpha = \beta = 1$: Uniform(0,1).
  • $\alpha > \beta$: Mass concentrated toward 1 (left-skewed; mean above 0.5).
  • $\alpha < \beta$: Mass concentrated toward 0 (right-skewed; mean below 0.5).
  • Both $\alpha, \beta > 1$: Unimodal distribution.
  • Both $\alpha, \beta < 1$: U-shaped (peaks near 0 and 1).
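
One way to check these shapes without plotting is to evaluate the density at a few interior points. The sketch below does that for one parameter pair per case; the specific pairs and the evaluation points 0.1, 0.5, and 0.9 are arbitrary choices made for illustration.

```python
import math

def beta_pdf(p, alpha, beta):
    """Beta(alpha, beta) density at an interior point 0 < p < 1."""
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return math.exp((alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p) - log_norm)

# Parameter pairs chosen for illustration only.
cases = {
    "uniform (1, 1)": (1.0, 1.0),
    "alpha > beta (5, 2)": (5.0, 2.0),
    "alpha < beta (2, 5)": (2.0, 5.0),
    "unimodal (3, 3)": (3.0, 3.0),
    "U-shaped (0.5, 0.5)": (0.5, 0.5),
}

# Density at p = 0.1, 0.5, 0.9 for each case.
shape_table = {
    name: [beta_pdf(p, a, b) for p in (0.1, 0.5, 0.9)]
    for name, (a, b) in cases.items()
}

for name, densities in shape_table.items():
    print(name, [round(d, 3) for d in densities])
```

Comparing the three values in each row shows where the mass sits: larger on the right when $\alpha > \beta$, larger in the middle for the unimodal case, and larger at both ends for the U-shaped case.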

4. Connection to Binomial Likelihood (Conjugacy)

  • The Beta distribution is the conjugate prior for the Binomial likelihood.
  • Prior: $p \sim \text{Beta}(\alpha, \beta)$.
  • Data: $k$ successes in $n$ trials.
  • Posterior:

$p \mid \text{data} \sim \text{Beta}(\alpha + k, \; \beta + n - k)$

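This makes Bayesian updating simple: the Binomial likelihood is proportional to $p^{k}(1-p)^{n-k}$, so multiplying it by the prior just adds $k$ to $\alpha$ and $n - k$ to $\beta$.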
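
The update rule amounts to two additions. A minimal sketch follows, where `update_beta` is a hypothetical helper name. It also illustrates that updating in two batches gives the same posterior as updating once with the pooled counts.

```python
def update_beta(alpha: float, beta: float, successes: int, trials: int):
    """Posterior Beta parameters after observing `successes` out of `trials`."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    failures = trials - successes
    return alpha + successes, beta + failures

# One batch: 7 successes in 10 trials, starting from a Beta(1, 1) prior.
posterior_once = update_beta(1, 1, successes=7, trials=10)

# Two batches covering the same data: 4 of 5, then 3 of 5.
intermediate = update_beta(1, 1, successes=4, trials=5)
posterior_twice = update_beta(*intermediate, successes=3, trials=5)

print(posterior_once, posterior_twice)
```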


5. Examples

(a) Uniform Prior

  • Beta(1,1) → flat prior on [0,1].
  • No preference before data.

(b) Strong Belief Around 0.5

  • Beta(20,20) → tightly concentrated around 0.5 (standard deviation ≈ 0.08).
  • Encodes strong prior belief in fairness (like a fair coin).

(c) After Observing Data

  • Prior: Beta(1,1).
  • Data: 7 heads, 3 tails.
  • Posterior: Beta(1+7, 1+3) = Beta(8,4).
  • Mean = $8 / (8+4) \approx 0.67$, up from the prior mean of 0.5.
    This is the updated belief about $p$ after observing the data.
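
Beyond the mean, the Beta(8,4) posterior from example (c) can be summarized by an interval. The sketch below estimates a 95% equal-tailed credible interval by sampling with the standard library's `random.betavariate`. The sample size, the seed, and the choice of a 95% interval are assumptions made here for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Posterior from example (c): Beta(1 + 7, 1 + 3).
alpha_post, beta_post = 1 + 7, 1 + 3

n_samples = 100_000
samples = sorted(random.betavariate(alpha_post, beta_post) for _ in range(n_samples))

# Monte Carlo estimates of the posterior mean and the 2.5% / 97.5% quantiles.
mc_mean = sum(samples) / n_samples
lower = samples[int(0.025 * n_samples)]
upper = samples[int(0.975 * n_samples)]

print(f"mean ~ {mc_mean:.3f}, 95% interval ~ ({lower:.3f}, {upper:.3f})")
```

The interval is wide because ten tosses carry limited information; more data would narrow it.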

6. Applications

  • Bayesian Inference: posterior distributions of probabilities.
  • A/B Testing: posterior belief about conversion rates.
  • Machine Learning: the Dirichlet distribution, the multivariate generalization of the Beta, serves as a prior over categorical probabilities.
  • Reliability analysis: modeling success/failure rates.
  • Proportion modeling: when outcomes are percentages or rates.
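
To make the A/B testing application concrete, here is a rough sketch that estimates the probability that variant B has a higher conversion rate than variant A. The conversion counts are invented for illustration, the function name `prob_b_beats_a` is hypothetical, and a Beta(1,1) prior on each rate is assumed.

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=50_000, seed=0):
    """Monte Carlo estimate of P(p_B > p_A) under independent Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Sample each conversion rate from its Beta posterior.
        p_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        p_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += p_b > p_a
    return wins / draws

# Invented counts: A converts 40 of 1000 visitors, B converts 60 of 1000.
print(prob_b_beats_a(40, 1000, 60, 1000))
```

Sampling is used here because the probability that one Beta variable exceeds another has no simple closed form in general.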

7. Key Takeaways

  • The Beta distribution is a flexible family of distributions on $[0,1]$.
  • Controlled by shape parameters $\alpha$ and $\beta$.
  • Conjugate prior for the Binomial likelihood.
  • Central in Bayesian updating for probabilities (like coin tosses, conversion rates).

In short:
The Beta distribution is a probability distribution on $[0,1]$, parameterized by $\alpha, \beta$. It’s the conjugate prior for the binomial likelihood, making it essential in Bayesian inference about probabilities (e.g., coin bias, conversion rates).