1. What is a Prediction Interval?

  • A prediction interval (PI) gives a range of likely values for a future observation, not just a point estimate.
  • Unlike a confidence interval (which covers the true mean of $y$), a prediction interval covers individual future outcomes.

Think:

  • Confidence interval → “I’m 95% confident the mean weight is between 68–72 kg.”
  • Prediction interval → “I’m 95% confident the next person’s weight will be between 55–85 kg.”

2. Formula (Regression Case)

For linear regression at a given $x_0$, the predicted value is $\hat{y}_0$.

A $100(1-\alpha)\%$ prediction interval is:

$\hat{y}_0 \; \pm \; t_{\alpha/2, \, n-p} \; \cdot \; \hat{\sigma} \sqrt{1 + h_0}$

Where:

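  • $t_{\alpha/2, \, n-p}$ = upper $\alpha/2$ critical value of the $t$ distribution with $n-p$ degrees of freedom ($n$ observations, $p$ estimated coefficients)
  • $\hat{\sigma}$ = residual standard error, $\sqrt{\mathrm{RSS}/(n-p)}$
  • $h_0 = x_0^\top (X^\top X)^{-1} x_0$ = leverage of the new point
  • The extra “1” under the square root (a confidence interval for the mean uses $\sqrt{h_0}$ alone) accounts for the noise in the future observation.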
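
As a minimal sketch of the formula, the snippet below handles the special case of one predictor plus an intercept ($p = 2$), where the leverage reduces to $h_0 = 1/n + (x_0 - \bar{x})^2 / S_{xx}$. The function name `prediction_interval`, the data, and the choice to pass the critical value in as `t_crit` are all assumptions made for illustration. With SciPy available, `scipy.stats.t.ppf` could supply that value instead of a t-table.

```python
import math
from statistics import mean


def prediction_interval(xs, ys, x0, t_crit):
    """Prediction interval for a new observation at x0 (simple linear regression).

    t_crit is the critical value t_{alpha/2, n-p}, supplied by the caller;
    p = 2 here (intercept and slope).
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # Least-squares fit and point prediction at x0.
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    y_hat0 = intercept + slope * x0

    # Residual standard error: sqrt(RSS / (n - p)) with p = 2.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sigma_hat = math.sqrt(rss / (n - 2))

    # Leverage of the new point: closed form of x0^T (X^T X)^{-1} x0
    # for one predictor plus an intercept.
    h0 = 1 / n + (x0 - x_bar) ** 2 / sxx

    half_width = t_crit * sigma_hat * math.sqrt(1 + h0)
    return y_hat0 - half_width, y_hat0 + half_width, h0


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
T_CRIT_95_DF8 = 2.306  # t_{0.025, 8} from a standard t-table

lo_mid, hi_mid, h_mid = prediction_interval(xs, ys, 5.5, T_CRIT_95_DF8)
lo_far, hi_far, h_far = prediction_interval(xs, ys, 15.0, T_CRIT_95_DF8)
print(f"x0=5.5:  [{lo_mid:.2f}, {hi_mid:.2f}], leverage {h_mid:.3f}")
print(f"x0=15.0: [{lo_far:.2f}, {hi_far:.2f}], leverage {h_far:.3f}")
```

The interval at $x_0 = 15$ comes out wider than the one at $x_0 = 5.5$: the further the new point lies from $\bar{x}$, the larger its leverage and the wider the interval.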

3. In Forecasting

  • Time series forecasting often outputs prediction intervals instead of just a single trajectory.
  • Example:
    • Forecasted demand next week = 500 units.
    • 95% PI = [420, 580].
  • This interval quantifies uncertainty due to model error, randomness, and future shocks.

4. Using Quantiles

Another approach is via quantile regression / quantile forecasting:

  • Lower bound = predicted quantile at level $\tau = 0.05$.
  • Upper bound = predicted quantile at level $\tau = 0.95$.

Example:

  • Predicted 0.05 quantile = 420
  • Predicted 0.95 quantile = 580
  • Then 90% PI = [420, 580].
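
Quantile models are commonly fitted by minimizing the pinball (quantile) loss, which penalizes misses on the two sides of the prediction asymmetrically. The sketch below shows that loss and how two quantile predictions pair into an interval. The helper names `pinball_loss` and `interval_from_quantiles` are hypothetical, and the inputs are the example numbers above rather than real model output.

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss for one prediction at quantile level tau."""
    diff = y_true - y_pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def interval_from_quantiles(q_low, q_high, tau_low, tau_high):
    """Pair two quantile predictions into an interval and its nominal coverage."""
    if q_low > q_high:
        raise ValueError("lower quantile prediction exceeds upper quantile prediction")
    return (q_low, q_high), tau_high - tau_low


interval, coverage = interval_from_quantiles(420, 580, 0.05, 0.95)
print(interval, round(coverage, 2))

# For the 0.95 quantile, an actual value above the prediction costs far more
# than one the same distance below it.
print(pinball_loss(600, 580, 0.95), pinball_loss(560, 580, 0.95))
```

The asymmetry is what pushes a fitted 0.95-quantile prediction high enough that roughly 95% of outcomes fall below it.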

5. Example

Suppose you build a regression model to predict house prices:

  • Predicted price = $300,000
  • 95% prediction interval = [$250,000, $360,000]

Interpretation:

“If the model’s assumptions hold and we sample another house with the same features, there’s about a 95% chance its price will fall in this interval.”
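
One way to see what the 95% means is to simulate it. The sketch below assumes the simplest possible model (intercept only, so $h_0 = 1/n$) and normally distributed prices. The mean of 300,000, the standard deviation of 40,000, and the function name `simulate_coverage` are invented for illustration. Each trial builds an interval from a fresh sample and checks whether one further draw lands inside it.

```python
import math
import random
from statistics import mean, stdev


def simulate_coverage(n=10, trials=5000, t_crit=2.262, seed=0):
    """Fraction of trials in which a new draw falls inside the prediction interval.

    Each trial draws n observations from a normal distribution, builds
    mean +/- t_crit * s * sqrt(1 + 1/n), and tests one additional draw.
    t_crit = 2.262 is t_{0.025, 9}, matching the default n = 10.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(300_000, 40_000) for _ in range(n)]
        centre = mean(sample)
        half_width = t_crit * stdev(sample) * math.sqrt(1 + 1 / n)
        new_obs = rng.gauss(300_000, 40_000)
        if centre - half_width <= new_obs <= centre + half_width:
            hits += 1
    return hits / trials


coverage = simulate_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

The printed fraction should land close to 0.95. The guarantee is about the procedure over repeated samples, and it relies on the distributional assumptions holding.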


6. Key Distinction

  • Confidence interval: Uncertainty about the mean prediction (narrower).
  • Prediction interval: Uncertainty about individual outcomes (wider, because it also includes the variance of the noise around the mean).
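
In the notation of the regression formula in section 2, the two intervals at $x_0$ differ only in the term under the square root:

$\text{CI for the mean response:} \quad \hat{y}_0 \; \pm \; t_{\alpha/2, \, n-p} \; \cdot \; \hat{\sigma} \sqrt{h_0}$

$\text{PI for a new observation:} \quad \hat{y}_0 \; \pm \; t_{\alpha/2, \, n-p} \; \cdot \; \hat{\sigma} \sqrt{1 + h_0}$

As the sample grows, $h_0$ typically shrinks toward 0, so the confidence interval narrows toward a point, while the prediction interval's half-width levels off near $t_{\alpha/2, \, n-p} \cdot \hat{\sigma}$. More data does not remove the noise in an individual outcome.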

7. Applications

  • Finance: Forecasting stock returns with risk bands.
  • Retail: Demand planning (the gap between the upper bound and the point forecast can guide safety stock).
  • Healthcare: Patient recovery time ranges.
  • Weather: Forecast temperature ranges (e.g., 68–74°F).

Summary:

  • A prediction interval gives a range for future observations, wider than confidence intervals because it includes outcome variability.
  • Constructed from regression formulas or quantile forecasting.
  • Widely used to quantify uncertainty in predictions.