1) What an ARMA model is trying to do
A time series is a sequence of observations over time (sales per month, temperature per day, etc.).
An ARMA model is a mathematical way to describe how today’s value relates to past values and past random shocks, so that we can:
- explain the “memory” or dependence in the data, and
- forecast future values in a principled way.
ARMA is short for:
- AR = AutoRegressive: today depends on past values of the series itself
- MA = Moving Average: today depends on past random shocks (“innovations”)
2) The ARMA(p, q) definition, in simple terms
An ARMA(p, q) process is a stationary time series $\{X_t\}$ that satisfies:
$\phi(B)X_t = \theta(B)Z_t, \qquad \{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$
Key objects
(a) White noise
Think of $Z_t$ as the “new surprise” at time $t$:
- mean 0: $E[Z_t] = 0$
- constant variance: $\mathrm{Var}(Z_t) = \sigma^2$
- no correlation across time: $\mathrm{Cov}(Z_t, Z_s) = 0$ for $s \neq t$
In applied terms: $Z_t$ is the unpredictable part.
(b) The backshift operator
$B$ is a compact way to denote “one step back in time”:
$BX_t = X_{t-1}, \qquad B^k X_t = X_{t-k}$
(c) The AR polynomial
$\phi(z) = 1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p$
When you replace $z$ with $B$, this becomes an operator:
$\phi(B)X_t = X_t - \phi_1 X_{t-1} - \cdots - \phi_p X_{t-p}$
So the AR part says: “today’s value minus a weighted sum of the last $p$ values…”
(d) The MA polynomial
Similarly,
$\theta(z) = 1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q$
and as an operator:
$\theta(B)Z_t = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$
So the MA part says: “…equals today’s new shock plus a weighted sum of the last $q$ shocks.”
Expanded form (what it really means)
$X_t - \phi_1 X_{t-1} - \cdots - \phi_p X_{t-p} = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$
Interpretation:
- The left side captures “predictable structure” based on past $X$’s.
- The right side captures “random disturbances” and how they may persist for a short time.
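As a concrete illustration, here is a minimal simulation sketch of the expanded form for an ARMA(1,1); the coefficients $\phi_1 = 0.5$ and $\theta_1 = 0.4$ are illustrative choices, not values from the text:

```python
import numpy as np

# Simulate an ARMA(1,1) path directly from the expanded recursion:
# X_t = phi_1 * X_{t-1} + Z_t + theta_1 * Z_{t-1}
rng = np.random.default_rng(0)
phi1, theta1 = 0.5, 0.4          # illustrative, causal/invertible choices
n = 500

Z = rng.normal(0.0, 1.0, size=n)  # white noise: mean 0, constant variance
X = np.zeros(n)
for t in range(1, n):
    # predictable AR carry-over + today's shock + lingering piece of yesterday's shock
    X[t] = phi1 * X[t - 1] + Z[t] + theta1 * Z[t - 1]
```

Each new value is yesterday's value scaled down (the AR side) plus today's surprise and a fraction of yesterday's surprise (the MA side), which is exactly the expanded equation with $p = q = 1$.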
3) Why “no common factors / no common roots” matters
The definition requires that $\phi(z)$ and $\theta(z)$ have no common roots (equivalently, no common factor).
Why this is necessary
If they share a factor, then part of the model is redundant and can be cancelled out, meaning:
- you might write it like ARMA(p+r, q+r)
- but after simplification it is really a smaller ARMA(p, q)
This is about identifying the true orders $p$ and $q$.
Without this condition, many different-looking equations could describe the same process.
Example idea (conceptual)
If both sides contain a factor like $(1 - cB)$, you can divide both sides by $(1 - cB)$ and get a simpler equivalent model.
4) Worked example: why an apparent ARMA(2,2) becomes ARMA(1,1)
You are given:
$X_t - 0.4X_{t-1} - 0.45X_{t-2} = Z_t + Z_{t-1} + 0.25Z_{t-2}$
This corresponds to:
- AR polynomial: $\phi(z) = 1 - 0.4z - 0.45z^2$
- MA polynomial: $\theta(z) = 1 + z + 0.25z^2$
Both can be factored:
$\phi(z) = (1 + 0.5z)(1 - 0.9z), \qquad \theta(z) = (1 + 0.5z)^2$
They share the common factor $(1 + 0.5z)$, so cancel it:
$(1 - 0.9B)X_t = (1 + 0.5B)Z_t$
That is ARMA(1,1).
Meaning
Even though the original equation included terms up to lag 2 on both sides, one “chunk” was duplicated on both sides. After removing that duplication, the true memory lengths are 1 and 1.
5) Factoring polynomials in practice (why software is used)
To check redundancy and to test causality/invertibility, you need the roots of $\phi(z)$ and $\theta(z)$.
High-order polynomials are tedious to factor by hand, so in practice you compute roots numerically (e.g., with a function like polyroot in R).
If the same root appears in both $\phi(z)$ and $\theta(z)$, you have a common factor.
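A sketch of that root check in Python, using numpy's `polyroots` (which, like R's polyroot, takes coefficients in ascending order, constant term first); the example coefficients are the ARMA(2,2) polynomials from section 4:

```python
import numpy as np
from numpy.polynomial import polynomial as P

def common_roots(phi_coefs, theta_coefs, tol=1e-6):
    """Return the roots shared by phi(z) and theta(z).

    Coefficients are given in ascending order (constant term first),
    the same convention as R's polyroot.
    """
    ar_roots = P.polyroots(phi_coefs)
    ma_roots = P.polyroots(theta_coefs)
    return [r for r in ar_roots if np.min(np.abs(ma_roots - r)) < tol]

# phi(z) = 1 - 0.4z - 0.45z^2, theta(z) = 1 + z + 0.25z^2
shared = common_roots([1.0, -0.4, -0.45], [1.0, 1.0, 0.25])
```

For these coefficients the shared root is $z = -2$, i.e. the common factor $(1 + 0.5z)$.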
6) How the sign of parameters affects “shape” of the series (MA(1) and AR(1))
MA(1): $X_t = Z_t + \theta Z_{t-1}$
- If $\theta > 0$: consecutive values tend to move in the same direction more often (looks smoother).
- If $\theta < 0$: consecutive values tend to alternate direction (looks “jumpy” or zig-zag).
This connects to the lag-1 autocorrelation:
$\rho(1) = \dfrac{\theta}{1 + \theta^2}$
So the sign of $\theta$ directly controls whether the lag-1 correlation is positive or negative.
AR(1): $X_t = \phi X_{t-1} + Z_t$
- If $\phi > 0$ and close to 1: strong persistence; values drift slowly (high momentum).
- If $\phi < 0$: alternation; values tend to flip sign/direction (more jagged).
Lag-$h$ autocorrelation behaves like:
$\rho(h) = \phi^{h}$
so it decays geometrically, and alternates in sign when $\phi < 0$.
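These sign effects show up directly in simulation; a small sketch with illustrative values $\theta = \pm 0.8$ and $\phi = 0.9$:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
Z = rng.normal(size=n)

# MA(1): the sign of theta sets the sign of the lag-1 correlation.
theta = 0.8
ma_pos = Z[1:] + theta * Z[:-1]   # theta > 0: smoother-looking path
ma_neg = Z[1:] - theta * Z[:-1]   # theta < 0: zig-zag path

# AR(1) with phi close to 1: persistent, slowly drifting path.
phi = 0.9
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = phi * ar[t - 1] + Z[t]

def lag1_corr(x):
    """Sample correlation between consecutive values."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]
```

Here `lag1_corr(ma_pos)` comes out positive, `lag1_corr(ma_neg)` negative, and `lag1_corr(ar)` close to $\phi = 0.9$, matching the formulas above.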
7) Causality and invertibility for ARMA(p, q)
These two properties sound abstract but are extremely practical.
7.1 Causality (for forecasting usefulness)
An ARMA process is causal if it can be written as:
$X_t = \sum_{j=0}^{\infty} \psi_j Z_{t-j}, \qquad \sum_{j=0}^{\infty} |\psi_j| < \infty$
Meaning: $X_t$ can be expressed using only current and past shocks, with no future information.
This is what you want for real forecasting: you can generate today’s value using past surprises.
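The $\psi_j$ weights can be computed with the standard recursion $\psi_0 = 1$, $\psi_j = \theta_j + \sum_k \phi_k \psi_{j-k}$ (with $\theta_j = 0$ for $j > q$); a sketch, using illustrative ARMA(1,1) coefficients:

```python
def psi_weights(phi, theta, n):
    """First n weights of the causal expansion X_t = sum_j psi_j Z_{t-j}.

    phi = [phi_1, ..., phi_p], theta = [theta_1, ..., theta_q].
    Assumes the model is causal, so the weights decay and are summable.
    """
    psi = [1.0]
    for j in range(1, n):
        w = theta[j - 1] if j <= len(theta) else 0.0   # theta_j, zero past lag q
        for k, ph in enumerate(phi, start=1):
            if j - k >= 0:
                w += ph * psi[j - k]                    # + phi_k * psi_{j-k}
        psi.append(w)
    return psi

# ARMA(1,1) with phi = 0.5, theta = 0.4: psi_j = 0.9 * 0.5**(j-1) for j >= 1
w = psi_weights([0.5], [0.4], 8)
```

For a causal model these weights shrink geometrically, so today's value depends mostly on recent shocks.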
7.2 Invertibility (for interpreting shocks)
It is invertible if you can rewrite the shocks in terms of observed $X$’s:
$Z_t = \sum_{j=0}^{\infty} \pi_j X_{t-j}, \qquad \sum_{j=0}^{\infty} |\pi_j| < \infty$
Meaning: from the observed series, you can recover the underlying “innovation shocks” in a stable way.
Invertibility is important because many different MA representations can generate the same autocorrelation structure. Invertibility picks a standard, well-behaved representation.
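For an invertible MA(1) ($|\theta| < 1$), the shocks can be recovered recursively via $Z_t = X_t - \theta Z_{t-1}$, and the effect of a wrong starting guess dies out like $\theta^t$; a small sketch with an illustrative $\theta = 0.5$:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 0.5                        # |theta| < 1: invertible
n = 300

# Simulate an MA(1): X_t = Z_t + theta * Z_{t-1}
Z = rng.normal(size=n)
X = np.empty(n)
X[0] = Z[0]
X[1:] = Z[1:] + theta * Z[:-1]

# Recover the shocks from the observed series alone,
# starting from a deliberately wrong initial guess.
Z_hat = np.empty(n)
Z_hat[0] = 0.0                     # not the true Z[0]
for t in range(1, n):
    Z_hat[t] = X[t] - theta * Z_hat[t - 1]

# The start-up error decays like theta**t, so later shocks are recovered.
```

If $|\theta| > 1$ instead, the same recursion amplifies the start-up error rather than damping it, which is exactly why non-invertible representations are avoided.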
8) The “unit circle” test (the practical criterion)
This is the main operational result:
- Causal if and only if all roots of $\phi(z)$ lie outside the unit circle: $|z| > 1$ for every root $z$ of $\phi$.
- Invertible if and only if all roots of $\theta(z)$ lie outside the unit circle: $|z| > 1$ for every root $z$ of $\theta$.
Why “outside the unit circle”?
Because when the roots lie outside the unit circle $|z| = 1$, certain infinite series expansions converge, which makes:
- the causal expansion (in past shocks) stable, and
- the invertible expansion (recovering shocks from data) stable.
If a root is inside the unit circle, the corresponding infinite expansion “blows up” or becomes unstable.
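The test is mechanical once the roots are available; a sketch of a helper (coefficients in ascending order, constant term first):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def outside_unit_circle(coefs, tol=1e-10):
    """True if every root of the polynomial lies strictly outside |z| = 1.

    coefs are in ascending order, e.g. phi(z) = 1 - 0.9z -> [1.0, -0.9].
    """
    return bool(np.all(np.abs(P.polyroots(coefs)) > 1.0 + tol))

causal = outside_unit_circle([1.0, -0.9])      # AR(1), phi = 0.9: root 1/0.9 > 1
not_causal = outside_unit_circle([1.0, -1.2])  # AR(1), phi = 1.2: root 1/1.2 < 1
```

A root exactly on the unit circle (e.g. a random walk, $\phi = 1$) fails the test as well: such a series is not stationary.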
9) Example: determine (p, q), causality, invertibility
Given:
$X_t - 1.5X_{t-1} + X_{t-2} - 0.25X_{t-3} = Z_t + 1.5Z_{t-1} - Z_{t-2}$
Candidate polynomials:
- $\phi(z) = 1 - 1.5z + z^2 - 0.25z^3$ (degree 3)
- $\theta(z) = 1 + 1.5z - z^2$ (degree 2)
The computed roots show one common root, $z = 2$, appears in both, so there is redundancy: both polynomials contain the factor $(1 - 0.5z)$. After removing the common factor, the true degrees become:
- AR degree 2: $\phi(z) = 1 - z + 0.5z^2$
- MA degree 1: $\theta(z) = 1 + 2z$
So the process is ARMA(2,1).
Causality check
Remaining AR roots: $z = 1 + i$ and $z = 1 - i$, with $|z| = \sqrt{2} > 1$.
Therefore causal.
Invertibility check
Remaining MA root: $z = -0.5$, with $|z| = 0.5 < 1$.
Therefore not invertible.
Simplified final equation
After cancellation, the model can be written without redundancy as:
$(1 - B + 0.5B^2)X_t = (1 + 2B)Z_t$
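The whole example can be checked numerically; this sketch takes the candidate polynomials to be $\phi(z) = 1 - 1.5z + z^2 - 0.25z^3$ and $\theta(z) = 1 + 1.5z - z^2$:

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Ascending coefficients, constant term first.
ar_roots = P.polyroots([1.0, -1.5, 1.0, -0.25])   # roots of phi(z)
ma_roots = P.polyroots([1.0, 1.5, -1.0])          # roots of theta(z)

# The shared root z = 2 corresponds to the redundant factor (1 - 0.5z).
shared = [r for r in ar_roots if np.min(np.abs(ma_roots - r)) < 1e-6]
remaining_ar = [r for r in ar_roots if np.min(np.abs(ma_roots - r)) >= 1e-6]
remaining_ma = [r for r in ma_roots if np.min(np.abs(ar_roots - r)) >= 1e-6]

causal = all(abs(r) > 1 for r in remaining_ar)        # |1 +/- i| = sqrt(2) > 1
invertible = all(abs(r) > 1 for r in remaining_ma)    # |-0.5| < 1, so False
```

The remaining AR roots sit outside the unit circle (causal) while the remaining MA root sits inside it (not invertible), confirming the conclusion above.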
10) Practical intuition: what ARMA(p, q) “means” in one sentence
An ARMA(p, q) model says:
- Today’s value is partly explained by a weighted combination of the last $p$ values (AR),
- plus a weighted combination of the last $q$ surprises (MA),
- and the model is set up so this structure is stable over time (stationary),
- and, ideally, usable in the real world (causal) and uniquely interpretable (invertible).
