1. What is a Prediction Interval?
- A prediction interval (PI) gives a range of likely values for a future observation, not just a point estimate.
- Unlike a confidence interval, which covers the true mean of $y$ (at a given $x$, in the regression case), a prediction interval covers an individual future outcome.
Think:
- Confidence interval → “I’m 95% confident the mean weight is between 68–72 kg.”
- Prediction interval → “I’m 95% confident the next person’s weight will be between 55–85 kg.”
2. Formula (Regression Case)
For linear regression at a given $x_0$, the predicted value is $\hat{y}_0 = x_0^\top \hat{\beta}$.
A $100(1-\alpha)\%$ prediction interval is:
$\hat{y}_0 \; \pm \; t_{\alpha/2, \, n-p} \; \cdot \; \hat{\sigma} \sqrt{1 + h_0}$
Where:
- $\hat{\sigma}$ = residual standard error, $\sqrt{\text{RSS}/(n-p)}$, where $n$ is the number of observations and $p$ the number of estimated coefficients
- $h_0 = x_0^\top (X^\top X)^{-1} x_0$ = leverage of the new point
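- The extra "1" under the square root accounts for the noise in the future observation itself: the confidence interval for the mean uses $\sqrt{h_0}$, while the prediction interval uses $\sqrt{1 + h_0}$.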
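To make the formula concrete, here is a minimal, dependency-free sketch for the simple case of one predictor plus an intercept ($p = 2$). The function name, the data, and the choice to pass the critical value in as `t_crit` are assumptions for illustration. The value 2.306 is the tabulated $t_{0.025,\,8}$; in practice it could come from something like `scipy.stats.t.ppf(1 - alpha / 2, n - p)`.

```python
import math

def simple_regression_intervals(x, y, x0, t_crit):
    """Fit y = a + b*x by least squares and return the intervals at x0.

    t_crit is the critical value t_{alpha/2, n-p}, supplied by the caller.
    """
    n, p = len(x), 2  # p = 2 coefficients: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    # Ordinary least squares estimates.
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard error: sqrt(RSS / (n - p)).
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - p))

    # Leverage of the new point. With one predictor plus an intercept,
    # x0' (X'X)^{-1} x0 reduces to this closed form.
    h0 = 1 / n + (x0 - x_bar) ** 2 / sxx

    y0_hat = intercept + slope * x0
    ci_half = t_crit * sigma_hat * math.sqrt(h0)      # mean response
    pi_half = t_crit * sigma_hat * math.sqrt(1 + h0)  # new observation
    return {
        "y0_hat": y0_hat,
        "h0": h0,
        "ci": (y0_hat - ci_half, y0_hat + ci_half),
        "pi": (y0_hat - pi_half, y0_hat + pi_half),
    }

# Hypothetical data: 10 observations, so n - p = 8 degrees of freedom.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
result = simple_regression_intervals(x, y, x0=5.5, t_crit=2.306)
print(result)
```

Both intervals are centred on the same $\hat{y}_0$. Only the term under the square root differs, so the prediction interval is always the wider of the two.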
3. In Forecasting
- Time series forecasting often outputs prediction intervals instead of just a single trajectory.
- Example: forecasted demand next week = 500 units, with a 95% PI of [420, 580].
- This interval quantifies uncertainty due to model error, randomness, and future shocks.
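One simple way such an interval can be produced is sketched below, under strong assumptions that are not part of the example above: past one-step forecast errors are roughly normal with zero mean, and their spread is a fair guide to the future. The function name and the error values are hypothetical.

```python
from statistics import NormalDist, stdev

def normal_prediction_interval(point_forecast, past_errors, level=0.95):
    """Point forecast +/- z * sd(past errors), assuming normal, zero-mean errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    sigma = stdev(past_errors)                 # sample standard deviation
    return point_forecast - z * sigma, point_forecast + z * sigma

# Hypothetical errors (actual minus forecast) from earlier weeks.
past_errors = [-35, 20, 48, -52, 10, -41, 33, 60, -28, -15]

lo95, hi95 = normal_prediction_interval(500, past_errors, level=0.95)
lo99, hi99 = normal_prediction_interval(500, past_errors, level=0.99)
print(f"95% PI: [{lo95:.0f}, {hi95:.0f}]")
print(f"99% PI: [{lo99:.0f}, {hi99:.0f}]")
```

This sketch ignores uncertainty in the model's parameters and covers only a one-step-ahead forecast, so it is best read as a lower bound on the real uncertainty.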
4. Using Quantiles
Another approach is via quantile regression / quantile forecasting:
- Lower quantile (level $\tau = 0.05$) = lower bound.
- Upper quantile (level $\tau = 0.95$) = upper bound.
Example:
- Predicted 0.05 quantile = 420
- Predicted 0.95 quantile = 580
- Then the 90% PI is [420, 580], since $0.95 - 0.05 = 0.90$.
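Quantile regression fits each bound by minimising the pinball (quantile) loss at the chosen level. The sketch below illustrates the idea in the simplest possible setting, a constant prediction with no features, where the minimiser is an empirical quantile of the data. The function names and the demand values are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball loss at quantile level tau (0 < tau < 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)

def best_constant_quantile(y, tau):
    """Brute force: the observed value that minimises the pinball loss."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda q: pinball_loss(y, [q] * len(y), tau))

# Hypothetical weekly demand: 30 distinct values between 420 and 579.
demand = [420 + (i * 37) % 160 for i in range(30)]

lower = best_constant_quantile(demand, 0.05)
upper = best_constant_quantile(demand, 0.95)
coverage = sum(lower <= d <= upper for d in demand) / len(demand)
print(f"90% PI: [{lower}, {upper}], in-sample coverage = {coverage:.3f}")
```

A real quantile model would minimise the same loss over a function of the features rather than a constant. The coverage printed here is in-sample, so it says little about how the interval performs on new data.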
5. Example
Suppose you build a regression model to predict house prices:
- Predicted price = $300,000
- 95% prediction interval = [$250,000, $360,000]
Interpretation:
“If we sample another house with the same features, there’s a 95% chance its price will fall in this interval, assuming the model’s assumptions hold.”
6. Key Distinction
- Confidence interval: uncertainty about the mean prediction (narrower).
- Prediction interval: uncertainty about individual outcomes (wider, because it adds the residual variance of a single observation to the uncertainty in the mean).
7. Applications
- Finance: Forecasting stock returns with risk bands.
- Retail: Demand planning (the gap between the upper bound and the point forecast can guide safety stock).
- Healthcare: Patient recovery time ranges.
- Weather: Forecast temperature ranges (e.g., 68–74°F).
Summary:
- A prediction interval gives a range for future observations, wider than confidence intervals because it includes outcome variability.
- Constructed from regression formulas or quantile forecasting.
- Widely used to quantify uncertainty in predictions.
