What is A/B Testing?
- A/B Test = a randomized controlled experiment where you compare two (or more) variants to measure which performs better on a key metric.
- Typical setup:
  - Group A (Control): users see the current version (the status quo).
  - Group B (Treatment): users see the new version (a change, new feature, or campaign).
The goal:
determine whether the observed difference in outcomes is statistically significant and caused by the treatment.
Workflow of an A/B Test
- Define objective & hypothesis
  - Example: “Does showing a discount banner increase conversion rate?”
  - Null hypothesis $H_0$: no difference between A and B.
  - Alternative hypothesis $H_1$: the treatment increases conversion.
- Choose success metric(s)
  - Primary metric: e.g., conversion rate, revenue per user, click-through rate.
  - Secondary metrics: retention, churn, session length.
- Random assignment
  - Randomly split users into Control (A) and Treatment (B).
  - Randomization makes the groups statistically equivalent, so outcome differences can be attributed to the treatment.
- Run the experiment
  - Collect data for a sufficient sample size and duration.
- Statistical testing
  - Compare means/proportions between groups.
  - Use a t-test, z-test, or chi-square test, depending on the metric.
  - Compute a p-value or confidence interval.
- Decision
  - If the treatment effect is significant (and positive): roll out B.
  - If not: keep A (the status quo).
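The random-assignment step is often implemented with deterministic hashing rather than a live coin flip, so a returning user always lands in the same bucket. A minimal sketch (the function name and bucketing scheme are illustrative, not a specific library's API):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministic bucketing: the same user in the same experiment
    always receives the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]
```

Hashing on `(experiment, user_id)` also keeps assignments independent across experiments, so bucketing in one test does not correlate with bucketing in another.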
Example
Suppose:
- Group A (Control): 10,000 users, 800 conversions → 8% conversion rate.
- Group B (Treatment): 10,000 users, 880 conversions → 8.8% conversion rate.
Effect size (absolute):
$8.8\% - 8.0\% = 0.8\ \text{percentage points}$
Relative lift:
$\frac{0.088 - 0.080}{0.080} = 10\%\ \text{uplift}$
If a statistical test shows p < 0.05, we conclude that B significantly outperforms A.
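For a conversion-rate metric like this, the significance check is typically a pooled two-proportion z-test. A self-contained sketch using only the standard library (the helper name is illustrative):

```python
from math import sqrt, erf

def norm_sf(z: float) -> float:
    """Survival function of the standard normal: P(Z > z)."""
    return 0.5 * (1 - erf(z / sqrt(2)))

# Figures from the example above
conv_a, n_a = 800, 10_000
conv_b, n_b = 880, 10_000

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * norm_sf(abs(z))                      # two-sided test
# z ≈ 2.04, p ≈ 0.04 → significant at alpha = 0.05
```

In practice a library routine (e.g. a proportions z-test from a stats package) would be used, but the arithmetic is exactly this.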
Advantages
- Simple, intuitive.
- High internal validity (causal inference).
- Can measure real-world impact directly.
Limitations
- Requires enough traffic and time to reach statistical power.
- May not generalize if the experiment's sample does not represent the target population.
- Risk of peeking (stopping early when results look significant).
- Only compares a few variants at once (multivariate testing needed for more).
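The traffic/power limitation can be made concrete with the standard normal-approximation sample-size formula for two proportions. A rough sketch (z-values hard-coded for two-sided α = 0.05 and 80% power):

```python
from math import ceil

def required_n_per_group(p1: float, p2: float,
                         z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Approximate per-group sample size to detect a shift from p1 to p2
    (normal approximation; defaults: two-sided alpha = 0.05, power = 0.80)."""
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * var / (p2 - p1) ** 2)

n = required_n_per_group(0.08, 0.088)
# Detecting an 8.0% → 8.8% lift needs roughly 19,000 users per group.
```

Under these assumptions, the 10,000-users-per-group example above would be underpowered for a 0.8-point lift, which is why a significant result there should be read with some caution.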
Variants
- A/B/n Test → multiple treatments vs control.
- Multivariate Test (MVT) → multiple factors varied simultaneously.
- Bandit Algorithms → adaptively allocate traffic to better variants.
- Sequential Testing → allows early stopping with corrections.
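The bandit idea in the list above can be sketched with an epsilon-greedy rule: a minimal illustration of adaptive allocation, not a production allocator.

```python
import random

def epsilon_greedy_arm(successes: list, trials: list, eps: float = 0.1) -> int:
    """Choose an arm index: explore uniformly with probability eps (or if
    any arm is untried), otherwise exploit the best observed success rate."""
    if random.random() < eps or 0 in trials:
        return random.randrange(len(trials))
    rates = [s / n for s, n in zip(successes, trials)]
    return rates.index(max(rates))

# With eps=0 and data on every arm, the choice is purely greedy:
# arm 1 (9/10 successes) beats arm 0 (1/10).
```

Unlike a fixed A/B split, this shifts traffic toward the winner during the experiment, trading some statistical cleanliness for lower opportunity cost.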
Summary
An A/B Test is a randomized experiment comparing Control (A) vs Treatment (B) to measure causal impact on a defined metric.
It’s the foundation of online experimentation in product, marketing, and ML-driven decision systems.
