Definition
Early stopping is a regularization technique where you stop training before the model overfits.
- Instead of training for a fixed number of epochs, you monitor validation performance (loss, accuracy, AUC, etc.).
- If performance stops improving (or starts getting worse), training halts.
Why It’s Needed
- More epochs → lower training loss, but risk of overfitting.
- Early stopping aims to halt at the “sweet spot” → good generalization on unseen data.
- Saves computation (no wasted training epochs).
How It Works
- Split data → training + validation sets.
- Train model for many epochs.
- After each epoch:
  - Compute the validation metric.
  - Track the “best score” so far.
- If the metric hasn’t improved for N consecutive epochs (the patience), stop training.
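The loop above can be sketched in a few lines of Python. Here `train_one_epoch` and `evaluate` are hypothetical stand-ins for a real training step and a validation pass; this monitors validation loss, so lower is better:

```python
def train_with_early_stopping(train_one_epoch, evaluate, max_epochs, patience):
    best_score = float("inf")        # monitoring validation loss → lower is better
    epochs_without_improvement = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        val_loss = evaluate()
        if val_loss < best_score:
            best_score = val_loss
            epochs_without_improvement = 0   # improvement → reset patience counter
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_score     # no improvement for `patience` epochs → stop
    return max_epochs, best_score

# Usage with a scripted (made-up) validation-loss curve:
losses = iter([0.9, 0.7, 0.6, 0.65, 0.66, 0.67])
stop_epoch, best = train_with_early_stopping(lambda: None, lambda: next(losses),
                                             max_epochs=10, patience=3)
```

With this curve, the loss last improves at epoch 3 (0.6), so the loop stops three non-improving epochs later, at epoch 6.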
Parameters
- Monitor → which metric to watch (validation loss, validation accuracy, AUC).
- Mode → “min” (for loss) or “max” (for accuracy/AUC).
- Patience → how many epochs to wait before stopping.
- Restore Best Weights → option to revert to weights from the best epoch.
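These four parameters can be bundled into a small helper class. This is an illustrative sketch, not a library implementation; the names loosely mirror the conventions of common deep-learning frameworks (e.g. Keras):

```python
import copy

class EarlyStopping:
    def __init__(self, mode="min", patience=3, restore_best_weights=True):
        self.mode = mode                  # "min" for loss, "max" for accuracy/AUC
        self.patience = patience
        self.restore_best_weights = restore_best_weights
        self.best_score = float("inf") if mode == "min" else float("-inf")
        self.best_weights = None
        self.wait = 0                     # epochs since the last improvement

    def step(self, score, weights):
        """Call once per epoch with the monitored metric and current weights.
        Returns True when training should stop."""
        improved = (score < self.best_score if self.mode == "min"
                    else score > self.best_score)
        if improved:
            self.best_score = score
            self.wait = 0
            if self.restore_best_weights:
                self.best_weights = copy.deepcopy(weights)  # snapshot the best epoch
        else:
            self.wait += 1
        return self.wait >= self.patience
```

In `mode="max"` (say, monitoring validation accuracy) the comparison flips, and `best_weights` always holds the snapshot from the best epoch, ready to be restored after stopping.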
Example
- Model set to a max of 100 epochs.
- Validation loss:
  - Improves until epoch 25.
  - After epoch 25 → validation loss increases (overfitting).
- With early stopping (patience=3):
  - Training stops at epoch 28, restoring the best weights from epoch 25.
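The example above can be simulated with a scripted loss curve (the specific values are synthetic, chosen only to reproduce the scenario):

```python
# Validation losses that improve until epoch 25, then rise (overfitting).
val_losses = [1.0 - 0.02 * e for e in range(1, 26)]            # epochs 1..25: decreasing
val_losses += [val_losses[-1] + 0.01 * k for k in (1, 2, 3)]   # epochs 26..28: increasing

patience, best_epoch, best_loss, wait = 3, 0, float("inf"), 0
for epoch, loss in enumerate(val_losses, start=1):
    if loss < best_loss:
        best_loss, best_epoch, wait = loss, epoch, 0   # new best → reset counter
    else:
        wait += 1
        if wait >= patience:
            print(f"stopped at epoch {epoch}, best epoch {best_epoch}")
            break
# → prints "stopped at epoch 28, best epoch 25"
```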
Benefits
- Prevents overfitting.
- Reduces training cost/time.
- Automatically finds near-optimal stopping point.
Drawbacks
- Needs a validation set.
- Can stop too early if metric fluctuates (patience helps avoid this).
Summary
Early stopping = stop training when validation performance stops improving.
- Prevents overfitting and saves compute.
- Controlled by patience, monitored metric, and best weight restoration.
