Definition

Early stopping is a regularization technique where you stop training before the model overfits.

  • Instead of training for a fixed number of epochs, you monitor validation performance (loss, accuracy, AUC, etc.).
  • If performance stops improving (or starts getting worse), training halts.

Why It’s Needed

  • More epochs → lower training loss, but risk of overfitting.
  • Early stopping aims to stop near the “sweet spot” → good generalization on unseen data.
  • Saves computation (no wasted training epochs).

How It Works

  1. Split data → training + validation sets.
  2. Train model for many epochs.
  3. After each epoch:
    • Compute validation metric.
    • Track “best score.”
  4. If the metric hasn’t improved for N consecutive epochs (the patience), stop training.
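The steps above can be sketched as a plain Python loop. Here `evaluate` is a hypothetical callback that returns the validation loss after a given epoch; the actual weight-update step is elided as a comment:

```python
def early_stopping_train(max_epochs, patience, evaluate):
    """Run up to max_epochs, stopping once the validation loss
    has not improved for `patience` consecutive epochs."""
    best_score = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch in range(1, max_epochs + 1):
        # train_one_epoch(model)  # placeholder: real training step goes here
        val_loss = evaluate(epoch)
        if val_loss < best_score:
            # New best score: remember it and reset the patience counter.
            best_score = val_loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # patience exhausted → halt training
    return best_epoch, best_score
```

With a toy loss curve such as `lambda e: abs(e - 5)` (best at epoch 5) and `patience=2`, the loop stops two epochs after the minimum and reports epoch 5 as the best.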

Parameters

  • Monitor → which metric to watch (validation loss, validation accuracy, AUC).
  • Mode → “min” (for loss) or “max” (for accuracy/AUC).
  • Patience → how many epochs to wait before stopping.
  • Restore Best Weights → option to revert to weights from the best epoch.
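A minimal sketch of how these four knobs could fit together as a stateful checker. This is illustrative only, not any specific library’s API (frameworks such as Keras expose the same parameters through their own callback classes):

```python
class EarlyStopping:
    """Toy early-stopping checker with mode, patience, and
    best-weight restoration (illustrative, not a library API)."""

    def __init__(self, mode="min", patience=3, restore_best_weights=True):
        self.mode = mode                    # "min" for loss, "max" for accuracy/AUC
        self.patience = patience            # epochs to wait without improvement
        self.restore_best_weights = restore_best_weights
        self.best = float("inf") if mode == "min" else float("-inf")
        self.best_weights = None            # snapshot from the best epoch
        self.wait = 0                       # epochs since last improvement

    def step(self, metric, weights):
        """Record one epoch's monitored metric; return True when training should stop."""
        improved = metric < self.best if self.mode == "min" else metric > self.best
        if improved:
            self.best = metric
            if self.restore_best_weights:
                self.best_weights = weights  # keep a copy to revert to later
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience
```

For example, monitoring accuracy in `"max"` mode with `patience=2`, the checker signals a stop two epochs after the peak while retaining the peak-epoch weights in `best_weights`.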

Example

  • Model set to max 100 epochs.
  • Validation loss:
    • Improves until epoch 25.
    • After epoch 25 → validation loss increases (overfitting).
  • With early stopping (patience=3):
    • Training stops at epoch 28, restoring best weights from epoch 25.
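The scenario above can be reproduced with a synthetic validation-loss curve that bottoms out at epoch 25 (the curve is made up purely for illustration):

```python
def val_loss(epoch):
    # Synthetic curve: improves until epoch 25, then worsens (overfitting).
    return abs(epoch - 25) / 100

best, best_epoch = float("inf"), None
wait, patience = 0, 3
stopped_at = None
for epoch in range(1, 101):          # model set to max 100 epochs
    loss = val_loss(epoch)
    if loss < best:
        best, best_epoch, wait = loss, epoch, 0
    else:
        wait += 1
        if wait >= patience:         # 3 epochs without improvement
            stopped_at = epoch
            break

print(best_epoch, stopped_at)        # → 25 28
```

Epochs 26, 27, and 28 each fail to improve on epoch 25’s loss, so training halts at epoch 28 and the best weights come from epoch 25.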

Benefits

  • Prevents overfitting.
  • Reduces training cost/time.
  • Automatically finds a near-optimal stopping point.

Drawbacks

  • Needs a held-out validation set, leaving less data for training.
  • Can stop too early if the metric fluctuates (a larger patience helps avoid this).

Summary

Early stopping = stop training when validation performance stops improving.

  • Prevents overfitting and saves compute.
  • Controlled by patience, monitored metric, and best weight restoration.