What is A/B Testing?

  • A/B Test = a randomized controlled experiment where you compare two (or more) variants to measure which performs better on a key metric.
  • Typical setup:
    • Group A (Control): Users see the current version (status quo).
    • Group B (Treatment): Users see the new version (change, feature, campaign).

The goal: determine whether the observed difference in outcomes is statistically significant and attributable to the treatment.


Workflow of an A/B Test

  1. Define objective & hypothesis
    • Example: “Does showing a discount banner increase conversion rate?”
    • Null hypothesis $H_0$: No difference between A and B.
    • Alternative hypothesis $H_1$: Treatment increases conversion.
  2. Choose success metric(s)
  3. Random assignment
    • Randomly split users into Control (A) and Treatment (B).
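    • Randomization makes the groups comparable in expectation, so a difference in outcomes can be attributed to the treatment rather than to pre-existing differences between users.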
  4. Run experiment
    • Collect data for a sufficient sample size and duration.
  5. Statistical testing
  6. Decision
    • If treatment effect is significant (and positive): roll out.
    • If not: keep A (status quo).
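
The random-assignment step (step 3) is often implemented with deterministic hashing, so that a returning user always lands in the same group. The following is a minimal sketch under that assumption, not a prescribed implementation; the function name `assign_variant`, the experiment label `discount_banner`, and the `user-<i>` ID format are hypothetical.

```python
import hashlib


def assign_variant(user_id: str,
                   experiment: str = "discount_banner",  # hypothetical experiment label
                   treatment_share: float = 0.5) -> str:
    """Deterministically map a user to 'A' (control) or 'B' (treatment)."""
    # Salting the hash with the experiment name keeps assignments
    # independent across different experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Use the first 32 bits of the digest as a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "B" if bucket < treatment_share else "A"


# Sanity check: the split over many users should be close to 50/50.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"user-{i}")] += 1
print(counts)
```

Because the assignment depends only on the user ID and the experiment label, no lookup table is needed to keep a user's experience consistent across sessions.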

Example

Suppose:

  • Group A (Control): 10,000 users, 800 conversions → 8% conversion rate.
  • Group B (Treatment): 10,000 users, 880 conversions → 8.8% conversion rate.

Effect size:

$8.8\% - 8.0\% = 0.8 \text{ percentage points}$

Relative lift:

$\frac{0.088 - 0.080}{0.080} = 10\% \text{ uplift}$

If a statistical test shows p < 0.05, we reject the null hypothesis and conclude that B significantly outperforms A.
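
One common choice of test for comparing two conversion rates is a pooled two-proportion z-test. The sketch below applies it to the numbers above using only the Python standard library; the function name `two_proportion_z_test` is illustrative, and the notes do not say which test was intended.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both groups share one conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under H0 both groups share a single rate, estimated from the pooled data.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


z, p_value = two_proportion_z_test(800, 10_000, 880, 10_000)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

For these counts the result is roughly z ≈ 2.04 with a two-sided p ≈ 0.041, so the difference clears the 0.05 threshold, though not by a wide margin.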


Advantages

  • Simple, intuitive.
  • High internal validity (causal inference).
  • Can measure real-world impact directly.

Limitations

  • Requires enough traffic and time to reach adequate statistical power.
  • Results may not generalize if the sample is not representative of the target population.
  • Risk of peeking (stopping early when results look significant), which inflates the false-positive rate.
  • Only compares a few variants at once (multivariate testing needed for more).
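
To make the traffic requirement concrete, the sketch below estimates the sample size per group using the standard normal-approximation formula for comparing two proportions. The defaults (two-sided α = 0.05, 80% power) are conventional assumptions rather than values from these notes, and `sample_size_per_group` is an illustrative name.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control: float, p_treatment: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per group to detect p_control vs p_treatment."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance_sum = (p_control * (1 - p_control)
                    + p_treatment * (1 - p_treatment))
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / effect ** 2)


# Baseline 8% conversion, aiming to detect a lift to 8.8%.
n = sample_size_per_group(0.08, 0.088)
print(n)
```

With these inputs the formula gives just under 19,000 users per group, which suggests that an experiment with 10,000 users per group would have less than 80% power to detect a true lift of that size.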

Variants


Summary
An A/B Test is a randomized experiment comparing Control (A) vs Treatment (B) to measure causal impact on a defined metric.
It’s the foundation of online experimentation in product, marketing, and ML-driven decision systems.