1. What are Percentiles?

  • A percentile is a quantile expressed in percentage terms.
  • Example:
    • 25th percentile = value below which 25% of data fall.
    • 50th percentile = median.
    • 90th percentile = value below which 90% of data fall.

So predicting percentiles = estimating these cut-points of the outcome distribution, possibly conditional on predictors.
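
As a minimal sketch of the unconditional case (no predictors), the cut-points can be read off a sample with Python's standard library. The helper name `percentile` and the data are illustrative assumptions, not part of the notes above.

```python
import statistics

def percentile(data, p):
    """Return the p-th percentile (p an integer from 1 to 99) of data.

    Hypothetical helper: wraps statistics.quantiles, which with n=100
    returns the 99 cut points that split the data into 100 groups.
    """
    if not 1 <= p <= 99:
        raise ValueError("p must be between 1 and 99")
    cuts = statistics.quantiles(data, n=100, method="inclusive")
    return cuts[p - 1]

# Illustrative data: the integers 0..100.
values = list(range(101))
p25 = percentile(values, 25)
p50 = percentile(values, 50)  # same value as the median
p90 = percentile(values, 90)
```

Different interpolation rules give slightly different cut-points on small samples; `method="inclusive"` is one common choice, not the only one.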


2. Why Predict Percentiles?

  • Beyond the mean: Standard regression predicts the average (mean).
  • Percentiles tell us about distributional behavior:
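    • Lower percentiles = low-outcome scenarios (pessimistic for demand or returns).
    • Higher percentiles = high-outcome scenarios (optimistic for demand, but the bad case for costs or wait times).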
  • Useful in risk assessment, demand forecasting, and fairness analysis.

Examples:

  • Retail: predict 90th percentile demand (to set stock buffers).
  • Finance: predict 5th percentile return (Value-at-Risk).
  • Medicine: predict 95th percentile of wait times (planning for near-worst-case delays).

3. How to Predict Percentiles

A. Quantile Regression

  • Directly models conditional quantiles: $Q_\alpha(y|x) = x^\top \beta_\alpha$
  • Choose different $\alpha$ levels (e.g., 0.1, 0.5, 0.9) to predict multiple percentiles.
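
Quantile regression is usually fitted by minimizing the pinball (quantile) loss, which is not named above but is the standard objective. The sketch below shows the loss and the intercept-only case, where there are no predictors and the fitted quantile is a single number. The function names and data are hypothetical.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss at level alpha, with 0 < alpha < 1."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by alpha, over-prediction by 1 - alpha.
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)

def best_constant_quantile(y, alpha):
    """Pick the observed value that minimizes the pinball loss.

    Intercept-only special case: one number is predicted for every row.
    """
    return min(y, key=lambda q: pinball_loss(y, [q] * len(y), alpha))

y = list(range(1, 12))  # illustrative outcomes 1..11
q10 = best_constant_quantile(y, 0.1)
q50 = best_constant_quantile(y, 0.5)
q90 = best_constant_quantile(y, 0.9)
```

With predictors, the same loss is minimized over the coefficients $\beta_\alpha$ instead of a single constant, once per chosen $\alpha$.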

B. Distributional Forecasting

  • Fit a parametric distribution (e.g., Normal, Lognormal) to the outcome, with its parameters given by the model's predictions.
  • Then compute percentiles from the fitted CDF.
    • Example: If $y \sim N(\mu,\sigma^2)$, the 95th percentile is $\mu + 1.645\sigma$.
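
The Normal example can be checked with `statistics.NormalDist` from Python's standard library. The values of `mu` and `sigma` below are illustrative assumptions, not output from a fitted model.

```python
from statistics import NormalDist

mu, sigma = 100.0, 10.0  # illustrative values, not from a fitted model
fitted = NormalDist(mu=mu, sigma=sigma)

q95 = fitted.inv_cdf(0.95)         # 95th percentile from the inverse CDF
q95_shortcut = mu + 1.645 * sigma  # the rule of thumb quoted above
```

The two numbers agree to about two decimal places, because 1.645 is a rounded value of the standard Normal 95th percentile.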

C. Empirical / Simulation-Based

  • Use bootstrapping, Bayesian posterior samples, or ensembles.
  • Collect predictive samples and compute empirical percentiles.
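
One possible version of the simulation approach is a residual bootstrap, sketched here under a strong assumption: future errors look like past errors. The function name, residuals, and point forecast are all hypothetical.

```python
import random
import statistics

def predictive_percentiles(point_forecast, residuals, levels,
                           n_samples=5000, seed=0):
    """Empirical percentiles of simulated outcomes.

    Assumption: each predictive sample is the point forecast plus one
    past residual drawn with replacement.
    """
    rng = random.Random(seed)
    draws = rng.choices(residuals, k=n_samples)
    samples = [point_forecast + r for r in draws]
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in levels}

# Hypothetical past forecast errors (actual minus forecast).
past_residuals = [-18, -12, -9, -5, -3, 0, 2, 4, 7, 11, 15, 22]
result = predictive_percentiles(100, past_residuals, levels=(10, 50, 90))
```

The same last step (sort the samples, read off percentiles) applies when the samples come from a Bayesian posterior predictive or from ensemble members instead.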

4. Example

Suppose we want to forecast daily demand.

  • Model outputs:
    • 10th percentile = 85 units
    • 50th percentile = 100 units (median forecast)
    • 90th percentile = 120 units

Interpretation:

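  • Median demand ≈ 100: actual demand is equally likely to land above or below it.
  • In a pessimistic scenario (low demand), ≈ 85: demand falls below this only 10% of the time.
  • In a high-demand scenario, ≈ 120: demand exceeds this only 10% of the time.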

5. Relation to Prediction Intervals

  • A prediction interval is defined by two percentiles (its lower and upper bounds).
  • Example: A central 90% prediction interval = [5th percentile, 95th percentile], leaving 5% in each tail.
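
A small sketch of turning predictive samples into an equal-tailed interval and checking how much of the sample it covers. The helper `central_interval` and the sample values are illustrative assumptions.

```python
import statistics

def central_interval(samples, level_pct=90):
    """Equal-tailed prediction interval from predictive samples.

    level_pct must leave a whole-number tail on each side
    (90 -> 5th and 95th percentiles).
    """
    tail, remainder = divmod(100 - level_pct, 2)
    if remainder or not 1 <= tail <= 49:
        raise ValueError("level_pct must be an even number from 2 to 98")
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return cuts[tail - 1], cuts[100 - tail - 1]

samples = list(range(101))  # illustrative predictive samples 0..100
lower, upper = central_interval(samples, 90)
# Share of samples that fall inside the interval (endpoints included).
coverage = sum(lower <= s <= upper for s in samples) / len(samples)
```

In practice the coverage check is more informative on held-out outcomes than on the samples used to build the interval.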

6. Applications

  • Finance: Value-at-Risk = a lower percentile of return distribution.
  • Forecasting: supply chain, electricity load, weather extremes.
  • Medicine: survival analysis (percentiles of survival time, e.g., median survival).
  • Recommender Systems: estimate percentiles of the rating or engagement distribution.

Summary:
Predicting percentiles = estimating conditional quantiles of the outcome (e.g., 10th, 50th, 90th) instead of just the mean. Methods include quantile regression, distributional modeling, and simulation/bootstrapping. Percentile predictions provide richer information, enabling risk management, uncertainty quantification, and scenario planning.