Time series analysis often relies on the idea of stationarity, which describes how the statistical behavior of data changes—or does not change—over time. There are two commonly used notions of stationarity: strong (strict) stationarity and weak (wide-sense) stationarity. Understanding the difference between them helps clarify how time series models work and why some models are easier to use than others.


1. Strong (Strict) Stationarity

What strong stationarity means

Consider a stochastic process $(X_t)_{t \in \mathbb{Z}}$, which is a sequence of random variables indexed by time.
This process is called strongly stationary if its entire probabilistic structure is unchanged by shifting time.

More precisely:

  • For any collection of time points $t_1, t_2, \dots, t_n$,
  • and for any time shift $\tau$,
  • the joint distribution of
    $(X_{t_1}, X_{t_2}, \dots, X_{t_n})$
    is exactly the same as the joint distribution of
    $(X_{t_1+\tau}, X_{t_2+\tau}, \dots, X_{t_n+\tau})$.

In simple terms:

The probabilistic behavior of the process looks exactly the same, no matter when you observe it.

Consequences of strong stationarity

If a process is strongly stationary, then automatically:

  • All $X_t$ have the same distribution
  • The mean and variance (when they exist) do not depend on time
  • Any dependence structure between time points depends only on how far apart they are, not on their absolute positions in time

Why this definition is demanding

Strong stationarity requires knowledge of all joint distributions of all orders. In real applications, this level of information is rarely available, which makes strong stationarity difficult to verify or use directly.


2. Covariance and Dependence Over Time

Before introducing weak stationarity, it helps to review how dependence between random variables is measured.

Covariance and correlation

For two random variables $X$ and $Y$:

  • Covariance measures linear dependence: $\text{Cov}(X,Y) = \mathbb{E}[(X - \mu_X)(Y - \mu_Y)]$
  • Variance is a special case: $\text{Var}(X) = \text{Cov}(X,X)$
  • Correlation is a normalized version of covariance: $\text{corr}(X,Y) = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y}$, and it always lies between $-1$ and $1$.

If two variables are independent, both covariance and correlation are zero. The converse does not hold: covariance only captures linear dependence, so uncorrelated variables can still be dependent.
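These definitions can be checked numerically. The sketch below (using NumPy, with made-up variables) constructs a pair with a known linear relationship, $Y = 2X + \varepsilon$, so that $\text{Cov}(X,Y) = 2\,\text{Var}(X)$, and contrasts it with an independent pair:

```python
import numpy as np

rng = np.random.default_rng(42)

# Two linearly related variables: Y = 2X + noise
x = rng.normal(0, 1, 10_000)
y = 2 * x + rng.normal(0, 1, 10_000)

cov_xy = np.cov(x, y)[0, 1]        # off-diagonal entry of the 2x2 covariance matrix
corr_xy = np.corrcoef(x, y)[0, 1]  # normalized version, always in [-1, 1]

print(cov_xy)   # close to 2, since Cov(X, 2X + eps) = 2 Var(X) = 2
print(corr_xy)  # close to 2 / sqrt(5)

# Independent variables: covariance and correlation are near zero
z = rng.normal(0, 1, 10_000)
corr_xz = np.corrcoef(x, z)[0, 1]
print(corr_xz)  # close to 0
```

The sample values are only estimates, so they land near the theoretical quantities rather than exactly on them.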


3. The Covariance Function of a Process

To describe how a time series depends on itself across time, we use the covariance function.

For a process $(X_t)$:

$$\gamma_X(s,t) = \text{Cov}(X_s, X_t)$$

This function tells us how strongly values at time $s$ and time $t$ are linearly related.

Examples

White noise

  • Mean = 0
  • Variance = constant
  • No correlation between different time points

Covariance structure:

  • $\gamma(0) = \sigma^2$
  • $\gamma(h) = 0$ for $h \neq 0$

White noise represents pure randomness with no memory.
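A quick simulation makes this covariance structure concrete. This is a minimal sketch (variable names are illustrative) that draws Gaussian white noise and estimates $\gamma(0)$ and $\gamma(h)$ for a few lags:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
w = rng.normal(0, sigma, 50_000)  # Gaussian white noise

def sample_acvf(x, h):
    """Sample autocovariance at lag h, dividing by n."""
    n = len(x)
    d = x - x.mean()
    return (d[:n - h] * d[h:]).sum() / n

gamma0 = sample_acvf(w, 0)  # estimate of gamma(0) = sigma^2
gamma1 = sample_acvf(w, 1)  # estimate of gamma(1)
gamma5 = sample_acvf(w, 5)  # estimate of gamma(5)

print(gamma0)  # close to 1
print(gamma1)  # close to 0: no memory
print(gamma5)  # close to 0
```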

Random walk

A random walk is formed by summing white noise terms over time.

Key properties:

  • Mean stays at zero
  • Variance grows with time
  • Covariance depends on the time index itself

This dependence on absolute time means a random walk is not stationary.
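The growing variance is easy to see by simulating many independent walks and measuring the spread across paths at two different times (a sketch with made-up path counts; for a random walk built from unit-variance noise, theory gives $\text{Var}(X_t) = t\sigma^2$):

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 5_000, 200

# Each row is one random walk: a cumulative sum of white noise
steps = rng.normal(0, 1, (n_paths, n_steps))
walks = steps.cumsum(axis=1)

# Variance across paths grows roughly linearly in time
var_t50 = walks[:, 49].var()    # variance at time 50
var_t200 = walks[:, 199].var()  # variance at time 200

print(var_t50)   # close to 50
print(var_t200)  # close to 200
```

Because the variance depends on $t$, no single number $\gamma(0)$ can describe the process, which is exactly why the random walk fails stationarity.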


4. Weak (Wide-Sense) Stationarity

Motivation

Strong stationarity is often too strict. A weaker and much more practical concept focuses only on means and covariances, rather than full distributions.

Definition of weak stationarity

A process $(X_t)$ is weakly stationary if:

  1. Constant mean: $\mathbb{E}[X_t] = \mu$ for all $t$
  2. Time-invariant covariance structure: $\text{Cov}(X_t, X_{t+h}) = \gamma(h)$ depends only on the lag $h$, not on the specific time $t$.

This means:

The average level is stable over time, and the way observations relate across time depends only on how far apart they are.
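Both conditions can be checked informally on simulated data: split the series in two and compare the mean and the lag-1 covariance across the halves. A sketch (white noise is used because we know it satisfies both conditions):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(0, 1, 100_000)  # white noise: weakly stationary

# Condition 1: the mean should not drift over time
mean_first = x[:50_000].mean()
mean_second = x[50_000:].mean()

# Condition 2: the lag-1 covariance should look the same in both halves
def lag1_cov(segment):
    d = segment - segment.mean()
    return (d[:-1] * d[1:]).mean()

cov_first = lag1_cov(x[:50_000])
cov_second = lag1_cov(x[50_000:])

print(mean_first, mean_second)  # both near 0
print(cov_first, cov_second)    # both near 0: no lag-1 dependence
```

For a random walk, the same check would fail: the two halves would show very different spreads.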

Relationship to strong stationarity

  • Strong stationarity ⇒ weak stationarity (if moments exist)
  • Weak stationarity does not require full distributional invariance

Because it is much easier to verify and still very useful, weak stationarity is the standard assumption in most practical time series modeling.


5. Autocovariance and Autocorrelation

For a stationary process, the covariance function simplifies.

Autocovariance function (ACVF)

$$\gamma(h) = \text{Cov}(X_t, X_{t+h})$$

Key properties:

  • $\gamma(0)$ equals the variance
  • $\gamma(h) = \gamma(-h)$

Autocorrelation function (ACF)

$$\rho(h) = \frac{\gamma(h)}{\gamma(0)}$$

Properties:

  • $\rho(0) = 1$
  • Values lie between −1 and 1

The ACF shows how strongly a time series is correlated with its past values at different lags.


6. Examples of Stationary and Non-Stationary Processes

White noise

  • Weakly stationary
  • Zero autocorrelation at all non-zero lags

Random walk

  • Not weakly stationary
  • Variance and covariance grow over time

7. Moving Average Process: MA(1)

A simple but important stationary model is the MA(1) process:

$$X_t = W_t + \theta W_{t-1}$$

where:

  • $W_t$ is white noise
  • $\theta$ is a constant

Key results:

  • Mean is constant
  • Variance is constant
  • Covariance is non-zero only at lags 0 and ±1

This makes MA(1) weakly stationary and easy to analyze.
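For MA(1), the theoretical ACF can be written down exactly: $\rho(1) = \theta / (1 + \theta^2)$ and $\rho(h) = 0$ for $h \ge 2$. The sketch below simulates an MA(1) with $\theta = 0.5$ (so $\rho(1) = 0.4$) and compares the sample ACF against these values:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n = 0.5, 200_000

w = rng.normal(0, 1, n + 1)
x = w[1:] + theta * w[:-1]  # MA(1): X_t = W_t + theta * W_{t-1}

def sample_acf(series, h):
    """Sample autocorrelation at lag h >= 1."""
    d = series - series.mean()
    return (d[:-h] * d[h:]).sum() / (d * d).sum()

acf1 = sample_acf(x, 1)
acf2 = sample_acf(x, 2)

print(acf1)  # close to theta / (1 + theta^2) = 0.4
print(acf2)  # close to 0: the MA(1) memory cuts off after lag 1
```

The sharp cutoff after lag 1 is the signature by which MA(1) processes are recognized in ACF plots.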


8. Sample Versions: Working with Real Data

In practice, we do not observe the underlying process, only a finite dataset:

$$x_1, x_2, \dots, x_n$$

From this data, we compute:

  • Sample mean
  • Sample autocovariance
  • Sample autocorrelation

These sample quantities estimate their theoretical counterparts and are meaningful primarily when the data plausibly come from a stationary process.
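The three sample quantities can be implemented in a few lines. This sketch uses the conventional estimator $\hat{\gamma}(h) = \frac{1}{n}\sum_{t=1}^{n-h}(x_t - \bar{x})(x_{t+h} - \bar{x})$ (dividing by $n$ rather than $n - h$) and a tiny made-up dataset for illustration:

```python
import numpy as np

def sample_mean(x):
    return x.mean()

def sample_acvf(x, h):
    """gamma_hat(h): sum of lag-h cross products, divided by n."""
    n = len(x)
    d = x - x.mean()
    return (d[:n - h] * d[h:]).sum() / n

def sample_acf(x, h):
    return sample_acvf(x, h) / sample_acvf(x, 0)

# Tiny illustrative dataset
x = np.array([2.0, 4.0, 3.0, 5.0, 4.0, 6.0])
print(sample_mean(x))    # 4.0
print(sample_acf(x, 0))  # exactly 1 by construction
print(sample_acf(x, 1))  # -0.1 for this data
```

Dividing by $n$ keeps the estimated autocovariance sequence well behaved (positive semidefinite), which is why it is preferred even though it is slightly biased.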


9. Correlograms (ACF Plots)

A correlogram is a plot of the sample autocorrelation function.

Why it is useful:

  • Helps identify whether observations are correlated
  • Provides clues about suitable models
  • Indicates whether data resemble white noise or structured dependence

Interpreting the confidence bands

Typical ACF plots include horizontal dashed lines around zero. These indicate approximate bounds for random variation when no true correlation exists.

If most points lie within these bounds:

  • The series is consistent with having little or no autocorrelation

If clear patterns appear (e.g., significant spikes at certain lags):

  • The data likely contain temporal dependence
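The dashed bounds are usually drawn at $\pm 1.96/\sqrt{n}$, the approximate 95% limits for the sample ACF of a series with no true autocorrelation. A sketch that computes the bounds and counts how many lags fall outside them for simulated white noise:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
x = rng.normal(0, 1, n)  # white noise: no real autocorrelation

def sample_acf(series, h):
    d = series - series.mean()
    return (d[:len(d) - h] * d[h:]).sum() / (d * d).sum()

# Approximate 95% bounds under the hypothesis of no autocorrelation
bound = 1.96 / np.sqrt(n)  # 0.098 for n = 400

acf_values = [sample_acf(x, h) for h in range(1, 21)]
outside = sum(abs(r) > bound for r in acf_values)

print(bound)
print(outside)  # roughly 1 of 20 lags expected to exceed the bounds by chance
```

Note that about 5% of lags will poke outside the bounds purely by chance, so an isolated small exceedance is not by itself evidence of real structure.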

10. Practical Takeaway

  • Strong stationarity is a mathematically complete but rarely practical concept
  • Weak stationarity focuses on stable mean and covariance and is widely used
  • Autocovariance and autocorrelation summarize temporal dependence
  • Sample ACF plots are essential tools for understanding real time series data

In short:

Weak stationarity provides a realistic balance between mathematical structure and practical usability, making it the foundation of most time series analysis.