Definition

Drift detection is the process of monitoring and identifying changes in data distributions or model behavior over time that can harm model performance.

  • Real-world data often evolves (seasonality, new users, business changes).
  • Drift detection ensures the model stays valid and triggers retraining or recalibration when needed.

Types of Drift

  1. Covariate Drift (Feature Drift)
    • Input feature distributions change.
    • Example: previously 80% desktop users, now 60% mobile.
  2. Label Drift (Prior Probability Shift)
    • Distribution of target labels changes.
    • Example: fraud rate rises from 1% → 3%.
  3. Concept Drift
    • Relationship between features and labels changes.
    • Example: same features no longer predict fraud well because fraud tactics evolved.

How Drift is Detected

  1. Statistical Tests
    • Kolmogorov–Smirnov (KS test), Chi-Square test → compare old vs. new distributions.
    • Cramér’s V, PSI (Population Stability Index).
  2. Distance Measures
    • KL divergence, Jensen–Shannon divergence, Wasserstein distance, MMD (Maximum Mean Discrepancy), Energy distance.
  3. Classifier Two-Sample Tests
    • Train a classifier to distinguish “old vs. new” samples.
    • If it performs well → drift detected.
  4. Monitoring Model Performance (Lagging Indicator)
    • Track loss, AUC, accuracy over time.
    • A sudden drop signals possible drift.

Practical Approaches

  • Set baselines → distributions from training data.
  • Compare rolling windows → current vs. historical data (last 24h vs. last 4 weeks).
  • Alert thresholds → drift > 0.2 PSI → trigger investigation.
  • Integrate with pipelines → retrain or recalibrate automatically.

Example

  • Credit scoring model trained on 2022 data.
  • In 2025, income distribution shifts (more gig economy workers).
  • Drift detection system (using PSI) signals covariate drift.
  • Model monitor flags possible concept drift → retraining triggered.

Why Drift Detection Matters

  • Prevents silent model degradation.
  • Ensures fairness (drift can affect subgroups disproportionately).
  • Supports compliance (financial, medical AI regulations require monitoring).

Summary
Drift detection = monitoring changes in data or model behavior that can degrade performance.

  • Types: covariate drift, label drift, concept drift.
  • Techniques: statistical tests, distance measures, classifier tests, performance monitoring.
  • Goal: ensure models stay accurate, fair, and reliable in production.