Definition

Isotonic Regression is a non-parametric regression technique that fits a monotonically increasing (isotonic) function to data.

  • “Isotonic” = order-preserving: fitted values never decrease (monotone non-decreasing).
  • Goal: Find the best-fitting non-decreasing curve that minimizes squared error.

In short: It smooths data into a stepwise non-decreasing function.


Mathematical Formulation

Given data $(x_i, y_i)$ with an ordering in $x$, find fitted values $\hat{y}_i$ such that:

  1. Monotonic constraint: $\hat{y}_1 \leq \hat{y}_2 \leq \dots \leq \hat{y}_n$
  2. Optimization objective: Minimize squared error $\sum_{i=1}^n (y_i - \hat{y}_i)^2$

Solution is typically piecewise constant (“step function”).


How It Works

  • Sort the data by predictor $x$.
  • Fit a non-decreasing sequence to $y$.
  • Use the Pool Adjacent Violators Algorithm (PAVA) to enforce monotonicity efficiently.
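The steps above can be sketched in plain Python. This is a minimal PAVA implementation (assuming the data are already sorted by $x$): whenever a new value breaks the non-decreasing order, adjacent blocks are pooled and replaced by their mean.

```python
def pava(y):
    """Pool Adjacent Violators: non-decreasing fit minimizing squared error."""
    # Each block stores [running sum, count]; its fitted value is sum / count.
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the last block's mean violates monotonicity.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each pooled block back to per-point fitted values.
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

print(pava([0.25, 0.35, 0.55, 0.50]))  # → [0.25, 0.35, 0.525, 0.525]
```

Note that pooling replaces a run of violating values with their (weighted) average, which is exactly what produces the piecewise-constant step function mentioned above.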

Example

Suppose we have probabilities predicted by a model and observed frequencies:

| Predicted Score | Observed Proportion |
|-----------------|---------------------|
| 0.2             | 0.25                |
| 0.4             | 0.35                |
| 0.6             | 0.55                |
| 0.8             | 0.50 (decrease!)    |

Since 0.50 < 0.55 violates monotonicity, isotonic regression pools the two violating values and replaces both with their average:

  • Adjusted: 0.525, 0.525 (the mean of 0.55 and 0.50)

Now the fitted line is monotonic.
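The same adjustment can be reproduced with scikit-learn's `IsotonicRegression`, using the example data above:

```python
from sklearn.isotonic import IsotonicRegression

x = [0.2, 0.4, 0.6, 0.8]       # predicted scores
y = [0.25, 0.35, 0.55, 0.50]   # observed proportions (last one violates monotonicity)

iso = IsotonicRegression()      # increasing=True is the default
y_fit = iso.fit_transform(x, y)
print(y_fit)                    # last two values are pooled to 0.525
```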


Applications

  1. Probability Calibration (most common in ML)
    • Models such as SVMs and random forests often produce overconfident or underconfident probabilities.
    • Isotonic regression learns a monotone map from raw scores to calibrated probabilities.
  2. Dose-Response Analysis
    • In medicine/pharmacology, expected response should increase with dose.
    • Isotonic regression enforces monotonicity.
  3. Economics/Ranking Problems
    • When output should logically be non-decreasing with predictor variables.

Advantages & Limitations

Advantages:

  • Non-parametric (no functional-form assumption).
  • Handles monotonicity naturally.
  • Useful for calibration.

Limitations:

  • Can overfit when the sample size is small.
  • Produces a stepwise (piecewise-constant) function, not a smooth curve.
  • Guarantees only monotonicity, not linearity or smoothness.


In Practice (ML Calibration Example)

  • Train classifier → get raw scores/logits.
  • Use isotonic regression on a held-out validation set to map scores → calibrated probabilities.
  • Final probabilities are better aligned with true frequencies.
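The workflow above can be sketched as follows. The data here are synthetic and purely illustrative (labels drawn so that the true probability is the square of the raw score, i.e. the raw scores are deliberately miscalibrated):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Stand-in for raw classifier scores on a held-out validation set.
rng = np.random.default_rng(0)
raw_scores = rng.uniform(0, 1, 500)
# Miscalibrated by construction: true probability of the positive class = score**2.
labels = (rng.uniform(0, 1, 500) < raw_scores**2).astype(int)

# Fit the calibrator on (score, label) pairs from the validation set.
calibrator = IsotonicRegression(out_of_bounds="clip")
calibrator.fit(raw_scores, labels)

# Map new raw scores to calibrated probabilities.
test_scores = np.array([0.1, 0.5, 0.9])
print(calibrator.predict(test_scores))
```

Because the learned map is monotone, the ranking of the classifier's scores is preserved; only the probability values themselves are adjusted toward the observed frequencies.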

In short:
Isotonic Regression fits a non-decreasing function to data, making it well suited to tasks where the relationship must be monotonic (such as probability calibration).