1) What “correlation over time” means for a time series
When you observe a sequence over time—such as monthly sales, daily temperatures, or yearly lake levels—you often want to know whether today’s value is related to past values.
Two closely related tools help quantify this:
- Autocorrelation (ACF): how strongly the series is correlated with itself after shifting by $h$ time steps (called lag $h$).
- Partial autocorrelation (PACF): how strongly the series at time $t$ is related to the series at time $t+h$ after removing the influence of the intermediate lags $1, \dots, h-1$.
In practice, you do not know the true (population) ACF/PACF of the data-generating mechanism, so you compute sample versions from the finite dataset you have.
2) Sample ACF: what it is and how it is computed
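Assume you observe data $x_1, x_2, \dots, x_n$.
Step A: compute the sample mean
$$\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t$$
Step B: compute the sample autocovariance at lag $h$
For a lag $h$ with $-n < h < n$ (positive or negative, but usually we focus on $h \ge 0$):
$$\hat{\gamma}(h) = \frac{1}{n} \sum_{t=1}^{n-|h|} (x_{t+|h|} - \bar{x})(x_t - \bar{x})$$
- This measures how values $|h|$ steps apart “move together,” centered around the mean.
- The sum ends at $n - |h|$ so that $x_{t+|h|}$ stays within the observed data range.
Step C: convert autocovariance to autocorrelation
$$\hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)}$$
- $\hat{\gamma}(0)$ is the sample variance (up to the choice of $1/n$ rather than $1/(n-1)$ scaling), so dividing by it produces a dimensionless correlation between $x_t$ and $x_{t+h}$ (approximately, in typical cases).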
Interpretation:
- If the sample autocorrelation $\hat{\rho}(h)$ is large and positive, values $h$ steps apart tend to be similar.
- If it is large and negative, high values tend to be followed $h$ steps later by low values (and vice versa).
- If it is near 0, the series shows little linear dependence at that lag.
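Steps A to C can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `sample_acf`, the toy series, and the $1/n$ scaling of the autocovariance are assumptions made for the example.

```python
def sample_acf(x, max_lag):
    """Return the sample autocorrelations at lags 0, 1, ..., max_lag."""
    n = len(x)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(x)")
    mean = sum(x) / n  # Step A: sample mean

    def gamma_hat(h):
        # Step B: sample autocovariance at lag h (1/n scaling is assumed here).
        # The sum stops at n - h so that x[t + h] stays inside the data.
        return sum((x[t + h] - mean) * (x[t] - mean) for t in range(n - h)) / n

    g0 = gamma_hat(0)
    if g0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Step C: divide by the lag-0 autocovariance to get a correlation.
    return [gamma_hat(h) / g0 for h in range(max_lag + 1)]


# Toy data, made up for illustration: a series that alternates high/low.
toy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf = sample_acf(toy, 2)
```

For this alternating toy series the lag-1 value is negative (neighbors disagree) and the lag-2 value is positive (values two steps apart agree), which matches the interpretation above.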
3) Sample PACF: what it is conceptually
Autocorrelation at lag $h$ can be “indirect.”
Example: If $x_t$ depends strongly on $x_{t-1}$, and $x_{t-1}$ depends on $x_{t-2}$, then $x_t$ and $x_{t-2}$ may look correlated even if there is no direct relationship beyond the chain through lag 1.
Partial autocorrelation at lag $h$ aims to measure the direct relationship between $x_t$ and $x_{t+h}$ after accounting for lags 1 through $h-1$.
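To make the chain example concrete, here is a short worked calculation under an assumed model that the text above does not specify: a causal AR(1) process $x_t = \phi x_{t-1} + w_t$ with $|\phi| < 1$ and white noise $w_t$. The symbols $\phi$, $w_t$, $\rho(h)$ (true autocorrelation at lag $h$), and $\phi_{22}$ (true partial autocorrelation at lag 2) are notation introduced only for this illustration.

$$\rho(1) = \phi, \qquad \rho(2) = \phi^2, \qquad \phi_{22} = \frac{\rho(2) - \rho(1)^2}{1 - \rho(1)^2} = \frac{\phi^2 - \phi^2}{1 - \phi^2} = 0$$

So the lag-2 autocorrelation is nonzero purely because of the chain through lag 1, while the lag-2 partial autocorrelation is exactly zero.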
4) Sample PACF: how it is computed
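To compute the sample PACF at lag $h$, you fit a linear regression-style relationship:
- Predict $x_{t+h}$ using the previous $h$ values (all mean-centered):
$$\hat{x}_{t+h} = \hat{\phi}_{h1} x_{t+h-1} + \hat{\phi}_{h2} x_{t+h-2} + \cdots + \hat{\phi}_{hh} x_t$$
- The sample partial autocorrelation at lag $h$ is $\hat{\phi}_{hh}$, meaning the coefficient on the farthest lag when you allow all intermediate lags into the predictor set.
The computation uses a system of equations built from sample autocovariances
The coefficients $\hat{\boldsymbol{\phi}}_h = (\hat{\phi}_{h1}, \dots, \hat{\phi}_{hh})^\top$ solve a Toeplitz linear system (one with a symmetric matrix whose entries depend only on lag differences). In compact matrix form:
$$\hat{\Gamma}_h \hat{\boldsymbol{\phi}}_h = \hat{\boldsymbol{\gamma}}_h$$
Where:
- $\hat{\Gamma}_h$ is the $h \times h$ matrix with entries $\hat{\gamma}(i - j)$ for $i, j = 1, \dots, h$
- $\hat{\boldsymbol{\gamma}}_h$ is the corresponding $h \times 1$ column vector with entries $\hat{\gamma}(1), \dots, \hat{\gamma}(h)$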
Practical meaning:
PACF is not computed by a simple single formula at each lag; it is derived from solving a linear algebra problem that “removes” the effects of intermediate lags.
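The linear-algebra view can be sketched as follows: build the Toeplitz matrix of sample autocovariances, solve the system, and keep the last coefficient. This is only an illustrative sketch. The helper names (`sample_autocov`, `solve_linear`, `sample_pacf`), the $1/n$ autocovariance scaling, and the toy data are assumptions, and the hand-written Gaussian elimination stands in for whatever solver a real library would use.

```python
def sample_autocov(x, h):
    """Sample autocovariance at lag h >= 0 (1/n scaling is assumed)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t + h] - mean) * (x[t] - mean) for t in range(n - h)) / n


def solve_linear(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * m
    for i in reversed(range(m)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, m))
        z[i] = (aug[i][m] - s) / aug[i][i]
    return z


def sample_pacf(x, h):
    """Sample PACF at lag h >= 1: last coefficient of the lag-h system."""
    g = [sample_autocov(x, k) for k in range(h + 1)]
    # Toeplitz matrix: entry (i, j) depends only on the lag difference |i - j|.
    big_gamma = [[g[abs(i - j)] for j in range(h)] for i in range(h)]
    rhs = g[1:h + 1]
    return solve_linear(big_gamma, rhs)[-1]


# Toy data, made up for illustration.
toy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
pacf_1 = sample_pacf(toy, 1)
pacf_2 = sample_pacf(toy, 2)
```

At lag 1 there are no intermediate lags to remove, so `pacf_1` coincides with the lag-1 sample autocorrelation. At lag 2 the value is much smaller in magnitude than the lag-2 autocorrelation, because most of that correlation is explained by lag 1.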
5) Why these sample quantities are treated as random
Even if the underlying process (the “true mechanism”) is fixed, your observed dataset is only one realization—one random outcome—from that mechanism.
So $\hat{\gamma}(h)$, $\hat{\rho}(h)$, and $\hat{\phi}_{hh}$ vary from sample to sample. If you simulated the same process many times, you would get many slightly different sample ACF/PACF curves.
6) Large-sample behavior: what happens when the dataset is long
Assume the data were generated by a standard time-series model such as an ARMA(p, q) process (a broad family that includes AR and MA models).
If:
- the length $n$ is very large, and
- the lag $h$ you care about is much smaller than $n$ (written $h \ll n$),
then:
- Sample autocovariance and autocorrelation stabilize
- $\hat{\gamma}(h)$ tends to be close to the true autocovariance $\gamma(h)$
- $\hat{\rho}(h)$ tends to be close to the true autocorrelation $\rho(h)$
- Sample PACF stabilizes for pure AR models
If the true process is a causal AR(p) process, then $\hat{\phi}_{hh}$ tends to be close to the true PACF value $\phi_{hh}$.
- Approximate normality and shrinking variability
A more refined statement is that the vector $(\hat{\rho}(1), \dots, \hat{\rho}(h))$ is approximately multivariate normal around $(\rho(1), \dots, \rho(h))$, and its variance decreases roughly like $1/n$.
So as $n$ grows, the sample ACF/PACF plots become more stable and less noisy.
Intuition: longer datasets give more repeated evidence of how the process behaves, so the estimated correlations become more reliable.
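A small simulation can illustrate this shrinking variability. The sketch below is an assumption-laden toy, not part of the original material: it simulates an AR(1) process with coefficient 0.5 and standard normal noise, repeats the experiment 200 times for a short and a long series, and compares the spread of the lag-1 sample autocorrelation. The function names, the coefficient, the burn-in length, and the seed are all illustrative choices.

```python
import random
import statistics


def lag1_sample_acf(x):
    """Lag-1 sample autocorrelation (1/n scaling cancels in the ratio)."""
    n = len(x)
    mean = sum(x) / n
    g0 = sum((v - mean) ** 2 for v in x)
    g1 = sum((x[t + 1] - mean) * (x[t] - mean) for t in range(n - 1))
    return g1 / g0


def simulate_ar1(n, phi, rng, burn_in=200):
    """Simulate x_t = phi * x_{t-1} + w_t with standard normal w_t."""
    x_prev = 0.0
    out = []
    for t in range(n + burn_in):
        x_prev = phi * x_prev + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard early values so the start at 0 is forgotten
            out.append(x_prev)
    return out


def spread_of_estimates(n, phi=0.5, reps=200, seed=0):
    """Mean and variance of the lag-1 sample ACF across repeated simulations."""
    rng = random.Random(seed)
    estimates = [lag1_sample_acf(simulate_ar1(n, phi, rng)) for _ in range(reps)]
    return statistics.mean(estimates), statistics.variance(estimates)


mean_small, var_small = spread_of_estimates(n=100)
mean_large, var_large = spread_of_estimates(n=1600)
```

With the series 16 times longer, the variance across replications should come out roughly 16 times smaller, and the average estimate should sit close to the true lag-1 autocorrelation of 0.5 for this assumed model.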
7) How ACF and PACF help choose a model form (AR vs MA vs ARMA)
A common practical use of ACF and PACF plots is preliminary order selection—forming an initial guess for what kind of ARMA-type model might be reasonable.
The classic heuristics are:
A) MA(q) signature (moving average)
- ACF cuts off after lag $q$ (drops to near 0 beyond $q$)
- PACF tails off gradually
So if the sample ACF becomes negligible after lag 2, an MA(2) is a plausible candidate.
B) AR(p) signature (autoregressive)
- PACF cuts off after lag $p$
- ACF tails off gradually
So if the sample PACF becomes negligible after lag 3 while the ACF decays slowly, an AR(3) is a plausible candidate.
C) Mixed ARMA signature
- Both ACF and PACF tail off
- That suggests a mixed ARMA(p, q), but choosing $p$ and $q$ from plots alone is harder and typically needs more systematic selection methods later.
Important caution: These are heuristics, not guarantees—real data is messy, and finite samples can make “cutoffs” look imperfect.
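One way to make “cuts off” operational is to compare each sample value with a threshold. The sketch below uses the common rule of thumb $\pm 1.96/\sqrt{n}$, which is an assumption added here rather than something stated above; it is roughly a 95% band under a white-noise assumption. The function name `last_significant_lag` and the numbers in `toy_pacf` are hypothetical and made up for illustration.

```python
import math


def last_significant_lag(values, n, z=1.96):
    """Largest lag h >= 1 with |values[h]| above z / sqrt(n), or 0 if none.

    values[h] is a sample ACF or PACF value at lag h; values[0] is ignored.
    n is the length of the series the values were computed from.
    """
    bound = z / math.sqrt(n)  # rule-of-thumb band (assumed, see lead-in)
    significant = [h for h in range(1, len(values)) if abs(values[h]) > bound]
    return max(significant, default=0)


# Made-up PACF values for illustration only (not from any real dataset).
n = 100
toy_pacf = [1.0, 0.62, -0.35, 0.28, 0.05, -0.08, 0.03]
suggested_p = last_significant_lag(toy_pacf, n)
```

Applied to a sample PACF, the result is a candidate AR order; applied to a sample ACF, it is a candidate MA order. It inherits every caveat of the heuristics: a single value that barely crosses the band at a high lag can be noise, so the output is a starting guess rather than a decision.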
8) Why real data is different from simulated data
Simulated AR/MA/ARMA examples are “clean” because they match the theoretical assumptions exactly.
Real datasets:
- may include trends, seasonality, structural breaks, outliers, and changing variance,
- and are rarely generated by a perfect ARMA model.
Still, ARMA models are often useful approximations within a practical tolerance. The guiding mindset is: a model can be imperfect and still be valuable for understanding and forecasting.
9) Concrete example idea: LakeHuron
A real dataset (annual lake levels) is used to illustrate that:
- the ACF/PACF may suggest something like an AR(2),
- but other nearby choices (like AR(1)) might be competitive and simpler.
This highlights a common modeling tradeoff:
- fit versus complexity
A simpler model may be preferred if it performs nearly as well.
10) Practical takeaway
- Sample ACF summarizes “overall” correlation patterns across lags.
- Sample PACF isolates “direct” lag relationships after accounting for intermediates.
- With a sufficiently long dataset, these estimates become more stable and closer to the underlying truth (when an ARMA-style approximation is reasonable).
- The shapes of ACF/PACF plots provide a useful initial signal for whether an AR, MA, or mixed ARMA model is a reasonable starting point, and for guessing the order in the pure AR or pure MA cases.
