1) What is a “linear process”?
A linear process is a way to build a time-ordered random sequence by taking white noise (pure random shocks) and mixing many time-shifted copies of it with fixed weights.
The general form is:
$$X_t = \sum_{j=-\infty}^{\infty} \psi_j\, w_{t-j}$$
- $w_t$ is white noise: random values with mean 0, constant variance, and no correlation across time.
- $\psi_j$ are numbers (weights) that determine how strongly each shock influences $X_t$.
- The sum uses shocks from many time positions relative to $t$.
Why “linear”?
Because $X_t$ is a linear combination (weighted sum) of the noise terms $w_{t-j}$.
Why do we need a condition on $\psi_j$?
To make this infinite sum mathematically meaningful and stable, we require:
$$\sum_{j=-\infty}^{\infty} |\psi_j| < \infty$$
This “absolute summability” condition ensures:
- The infinite sum behaves nicely (does not depend on how you group/reorder terms),
- $X_t$ has finite variance (it does not “blow up”).
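As a sanity check, here is a minimal Python sketch (assuming illustrative weights $\psi_j = 0.5^j$, truncated at $J = 50$ terms, and Gaussian shocks) that builds a causal linear process and compares its sample variance with the theoretical value $\sigma_w^2 \sum_j \psi_j^2$:

```python
import random

# Truncated causal linear process X_t = sum_{j=0}^{J} psi_j * w_{t-j},
# with illustrative geometrically decaying weights psi_j = 0.5**j.
random.seed(0)
J = 50
psi = [0.5 ** j for j in range(J + 1)]

# Absolute summability: sum |psi_j| is finite (here < 2).
total_weight = sum(abs(p) for p in psi)

# White noise: i.i.d. mean-0, variance-1 Gaussian shocks.
n = 10000
w = [random.gauss(0.0, 1.0) for _ in range(n)]

# Build X_t only for t >= J so every needed past shock exists.
x = [sum(psi[j] * w[t - j] for j in range(J + 1)) for t in range(J, n)]

# Finite variance: Var(X_t) = sigma_w^2 * sum psi_j^2 = 1/(1 - 0.25) = 4/3.
var_theory = sum(p * p for p in psi)
mean_x = sum(x) / len(x)
var_sample = sum((v - mean_x) ** 2 for v in x) / len(x)
print(round(var_theory, 3), round(var_sample, 3))
```

Because the weights decay geometrically, the truncation at $J = 50$ changes nothing visible; the sample variance sits close to $4/3$.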
2) Why can the formula involve “past, present, and future”?
Look at the term $\psi_j\, w_{t-j}$:
- If $j > 0$, then $t - j < t$: this uses past shocks (normal and natural).
- If $j = 0$, it uses the current shock $w_t$.
- If $j < 0$, then $t - j > t$: this uses future shocks relative to time $t$.
Using future information to define $X_t$ is usually not physically realistic for forecasting or real-time systems. So people often focus on causal models.
3) Causality: “Depends only on present and past”
A linear process is called causal if it uses only $j \ge 0$, meaning:
$$X_t = \sum_{j=0}^{\infty} \psi_j\, w_{t-j}$$
This is also called an MA($\infty$) form (moving average of infinite order).
Important: despite the name “moving average,” the weights $\psi_j$ do not have to:
- sum to 1,
- be nonnegative.
Those restrictions are used in smoothing averages for data visualization, not in general stochastic modeling.
4) A quick practical intuition for linear processes
Think of $w_t$ as “new random news” or “shocks” arriving at time $t$.
Then a linear process says:
- Today’s value is the result of today’s shock plus echoes of earlier shocks (and possibly, in non-causal forms, future shocks).
- The weights $\psi_j$ control how long shocks persist and how large their influence is.
If the weights decay quickly, older shocks matter less.
5) Numerical series: why absolute convergence matters
For ordinary sums like $\sum_{j=0}^{\infty} a_j$, “converges” means the partial sums approach a finite value.
“Converges absolutely” means $\sum_{j=0}^{\infty} |a_j|$ converges.
Absolute convergence is important because it guarantees:
- convergence,
- and that rearranging terms doesn’t change the sum.
For two-sided infinite sums $\sum_{j=-\infty}^{\infty} a_j$, absolute convergence makes the sum unambiguous.
That same stability idea is used when summing $\psi_j\, w_{t-j}$ in linear processes.
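The rearrangement point can be seen numerically. The alternating harmonic series $1 - \tfrac12 + \tfrac13 - \cdots$ converges to $\ln 2$ but not absolutely, so reordering its terms changes the limit. A sketch, using the classic one-positive-then-two-negatives rearrangement (which converges to $\tfrac12 \ln 2$ instead):

```python
import math

# Conditionally convergent series: 1 - 1/2 + 1/3 - 1/4 + ... = ln 2.
N = 200000
s_standard = sum((-1) ** (k + 1) / k for k in range(1, N + 1))

# Rearranged version: one positive term followed by two negative terms,
# 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...; each group of three equals
# (1/2)(1/(2m-1) - 1/(2m)), so the rearranged sum is (1/2) ln 2.
s_rearranged = 0.0
for m in range(1, N + 1):
    s_rearranged += 1 / (2 * m - 1) - 1 / (4 * m - 2) - 1 / (4 * m)

print(round(s_standard, 4), round(s_rearranged, 4), round(math.log(2), 4))
```

An absolutely convergent series (like $\sum_j |\psi_j| < \infty$) cannot be shifted this way, which is exactly why the condition makes linear processes well defined.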
6) The geometric series: the key tool behind inverses
A classic fact is:
$$\sum_{j=0}^{\infty} \phi^j = \frac{1}{1 - \phi} \quad \text{for } |\phi| < 1.$$
This is the mathematical reason you can “invert” certain operators like $(1 - \phi B)$ when $|\phi| < 1$.
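A quick numerical check of the geometric-series identity (with the illustrative value $\phi = 0.6$):

```python
# Partial sums of the geometric series approach 1/(1 - phi) when |phi| < 1.
phi = 0.6  # illustrative; any |phi| < 1 works
J = 60
partial = sum(phi ** j for j in range(J + 1))
closed_form = 1 / (1 - phi)
print(partial, closed_form)
```

With 61 terms, the remainder $\phi^{61}/(1-\phi)$ is already far below floating-point noise.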
7) The backshift operator $B$: a compact way to talk about time shifts
Define the backshift operator $B$ by:
$$B X_t = X_{t-1}$$
So:
- $B^k X_t = X_{t-k}$
- $B^{-1} X_t = X_{t+1}$ (a forward shift)
Differencing in this language
The first difference is:
$$\nabla X_t = (1 - B) X_t = X_t - X_{t-1}$$
Seasonal (lag-$s$) differencing is:
$$(1 - B^s) X_t = X_t - X_{t-s}$$
So $B$ lets you write time-series operations as algebra.
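Both differencing operations are easy to sketch in code. The helper below is hypothetical (not tied to any library) and applies $(1 - B^{\text{lag}})$ to a plain Python list:

```python
def difference(x, lag=1):
    """Apply (1 - B**lag) to a series: returns x[t] - x[t-lag] for valid t."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

x = [3, 5, 4, 7, 6, 9, 8, 11]
print(difference(x))         # first differences, (1 - B) x
print(difference(x, lag=4))  # lag-4 (seasonal) differences, (1 - B^4) x
```

Note the output is shorter than the input: each application of $(1 - B^{\text{lag}})$ consumes `lag` initial observations.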
8) Linear filters: “operators that transform a series”
A linear filter is an operator:
$$\psi(B) = \sum_{j=-\infty}^{\infty} \psi_j B^j$$
Applying it to a time series $Y_t$ produces:
$$\psi(B)\, Y_t = \sum_{j=-\infty}^{\infty} \psi_j\, Y_{t-j}$$
This is exactly the same pattern as a linear process, except:
- a linear process applies the filter to white noise $w_t$,
- a general filter can apply to any series $Y_t$.
Example: smoothing (two-sided moving average)
A simple smoothing filter replaces each value by the average of itself and nearby values, for example:
$$\tilde{Y}_t = \tfrac{1}{3}\left(Y_{t-1} + Y_t + Y_{t+1}\right)$$
This uses both past and future points of the observed data. That is fine for smoothing a recorded dataset, even though it would be unsuitable for real-time forecasting.
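A sketch of such a two-sided smoother, assuming the minimal three-point average $\tfrac13(Y_{t-1} + Y_t + Y_{t+1})$ (an illustrative choice of weights):

```python
def smooth3(y):
    """Two-sided moving average: replace y[t] by (y[t-1] + y[t] + y[t+1]) / 3.

    Uses both a past and a future point, so the result drops the endpoints.
    """
    return [(y[t - 1] + y[t] + y[t + 1]) / 3 for t in range(1, len(y) - 1)]

y = [1, 4, 1, 4, 1, 4, 1]  # hypothetical noisy observations
print(smooth3(y))
```

The jagged input is pulled toward its local level, which is the point of smoothing; the filter weights here are $\psi_{-1} = \psi_0 = \psi_1 = \tfrac13$.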
9) Inverse filters: undoing a transformation
Consider:
$$Y_t = (1 - \phi B)\, X_t = X_t - \phi X_{t-1}$$
Question: can we recover $X_t$ from $Y_t$?
If $|\phi| < 1$: causal inverse exists
Using the geometric series idea, the inverse is:
$$(1 - \phi B)^{-1} = \sum_{j=0}^{\infty} \phi^j B^j$$
So:
$$X_t = \sum_{j=0}^{\infty} \phi^j\, Y_{t-j}$$
This uses only present and past values (causal).
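A small numerical sketch of this inversion, with hypothetical series values and $\phi = 0.5$. On a finite record the geometric sum truncates at $j = t$, so the recovery error at position $t$ is of size $\phi^{t+1}$ and fades quickly:

```python
# Forward filter and its causal inverse, assuming |phi| < 1.
phi = 0.5
x = [1.0, -2.0, 0.5, 3.0, -1.0, 2.0, 0.0, 1.5]  # hypothetical input series

# Apply Y_t = (1 - phi*B) X_t = X_t - phi*X_{t-1} (defined for t >= 1).
y = [x[t] - phi * x[t - 1] for t in range(1, len(x))]

# Invert with the geometric series: X_t = sum_{j>=0} phi**j * Y_{t-j}.
# The truncated sum recovers x[t+1] up to an error of phi**(t+1) * x[0].
recovered = [sum(phi ** j * y[t - j] for j in range(t + 1))
             for t in range(len(y))]

print(recovered[-1], x[-1])  # last value: accurate to about phi**7
```

The telescoping makes the error explicit: each `recovered[t]` equals `x[t+1] - phi**(t+1) * x[0]`.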
If $|\phi| > 1$: inverse exists but is non-causal
You can still invert, but the inverse involves negative powers of $B$, meaning it uses future values.
If $|\phi| = 1$: no inverse
This is the “boundary case” where the geometric-series approach fails, and the operator cannot be undone in a stable way.
10) A central result: linear processes are weakly stationary
If:
- $w_t$ is mean-zero white noise with variance $\sigma_w^2$,
- $\sum_{j=-\infty}^{\infty} |\psi_j| < \infty$,
then the linear process:
$$X_t = \sum_{j=-\infty}^{\infty} \psi_j\, w_{t-j}$$
is weakly stationary, meaning:
- constant mean over time,
- autocovariance depends only on the lag $h$, not on $t$.
Autocovariance formula
The autocovariance function is:
$$\gamma(h) = \sigma_w^2 \sum_{j=-\infty}^{\infty} \psi_j\, \psi_{j+h}$$
Interpretation:
- Correlation at lag $h$ comes from overlap between the weight sequence and a shifted copy of itself.
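The formula $\gamma(h) = \sigma_w^2 \sum_j \psi_j \psi_{j+h}$ can be checked numerically for geometric weights $\psi_j = \phi^j$ (an illustrative choice), where it reduces to the closed form $\sigma_w^2\, \phi^h / (1 - \phi^2)$:

```python
# Autocovariance gamma(h) = sigma_w^2 * sum_j psi_j * psi_{j+h},
# truncated at J terms, for the illustrative weights psi_j = phi**j.
phi, sigma2 = 0.7, 1.0
J = 200
psi = [phi ** j for j in range(J + 1)]

def gamma(h):
    """Truncated version of sigma_w^2 * sum_j psi_j * psi_{j+h}."""
    return sigma2 * sum(psi[j] * psi[j + h] for j in range(J + 1 - h))

def closed(h):
    """Closed form for geometric weights: sigma^2 * phi**h / (1 - phi**2)."""
    return sigma2 * phi ** h / (1 - phi ** 2)

print(gamma(0), closed(0))
print(gamma(3), closed(3))
```

The "overlap" picture is literal in the code: `gamma(h)` multiplies the weight sequence against a copy of itself shifted by `h`.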
11) MA($q$) processes: finite moving averages
An MA($q$) process is:
$$X_t = w_t + \theta_1 w_{t-1} + \cdots + \theta_q w_{t-q}$$
This is causal because it uses only present/past shocks.
Key feature:
- Its autocovariance is zero beyond lag $q$.
Formally:
- If $|h| > q$, then $\gamma(h) = 0$.
So MA models create short memory (dependence only up to a fixed lag).
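For MA(1), the weight sequence is just $\psi_0 = 1$, $\psi_1 = \theta$, and the general autocovariance formula makes the cutoff explicit. A sketch with the illustrative value $\theta = 0.8$:

```python
# MA(1) weights: psi_0 = 1, psi_1 = theta (theta = 0.8 is illustrative).
theta, sigma2 = 0.8, 1.0
psi = [1.0, theta]
q = len(psi) - 1

def gamma(h):
    """gamma(h) = sigma_w^2 * sum_j psi_j * psi_{j+h}; zero once h > q."""
    if h > q:
        return 0.0  # no overlap between psi and its shifted copy
    return sigma2 * sum(psi[j] * psi[j + h] for j in range(q - h + 1))

# gamma(0) = sigma^2*(1 + theta^2), gamma(1) = sigma^2*theta, gamma(h>=2) = 0.
print(gamma(0), gamma(1), gamma(2), gamma(5))
```

The "short memory" claim is visible directly: once the shift $h$ exceeds $q$, the weight sequences no longer overlap and the autocovariance is exactly zero.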
12) AR(1) as a linear process via inversion
An AR(1) model satisfies:
$$X_t = \phi X_{t-1} + w_t$$
Equivalently:
$$(1 - \phi B)\, X_t = w_t$$
If $|\phi| < 1$: causal AR(1) exists and becomes MA($\infty$)
Invert the operator:
$$X_t = (1 - \phi B)^{-1} w_t = \sum_{j=0}^{\infty} \phi^j\, w_{t-j}$$
So AR(1) can be viewed as a linear process with weights $\psi_j = \phi^j$ for $j \ge 0$.
Autocovariance and autocorrelation
For $h \ge 0$:
$$\gamma(h) = \frac{\sigma_w^2\, \phi^h}{1 - \phi^2}, \qquad \rho(h) = \frac{\gamma(h)}{\gamma(0)} = \phi^h$$
Meaning:
- correlation decays geometrically with lag.
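A simulation sketch (illustrative $\phi = 0.6$, Gaussian shocks) showing the sample autocorrelation of an AR(1) tracking the theoretical $\rho(h) = \phi^h$:

```python
import random

# Simulate X_t = phi*X_{t-1} + w_t and compare sample ACF with phi**h.
random.seed(1)
phi, n, burn = 0.6, 30000, 500

x = [0.0]
for _ in range(n + burn):
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))
x = x[burn:]  # discard burn-in so the arbitrary start value is forgotten

mean = sum(x) / len(x)
den = sum((v - mean) ** 2 for v in x)

def rho(h):
    """Sample autocorrelation at lag h."""
    return sum((x[t] - mean) * (x[t + h] - mean)
               for t in range(len(x) - h)) / den

print(round(rho(1), 3), round(rho(2), 3))  # compare with phi, phi**2
```

With 30,000 observations the sampling noise in the ACF is on the order of $1/\sqrt{n}$, so the geometric decay is clearly visible.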
If $|\phi| > 1$: stationary solution is non-causal
A stationary solution exists, but it depends on future noise terms, which is generally undesirable for forecasting.
If $|\phi| = 1$: no stationary AR(1)
When $\phi = 1$, you get a random walk, which is not stationary.
13) ARMA(1,1): combining AR and MA
An ARMA(1,1) satisfies:
$$X_t = \phi X_{t-1} + w_t + \theta w_{t-1}$$
Operator form:
$$(1 - \phi B)\, X_t = (1 + \theta B)\, w_t$$
If $|\phi| < 1$, the model is causal and can be written as:
$$X_t = (1 - \phi B)^{-1} (1 + \theta B)\, w_t = w_t + (\phi + \theta) \sum_{j=1}^{\infty} \phi^{j-1}\, w_{t-j}$$
This yields an MA($\infty$) representation where the effect of a shock decays over time like $\phi^j$, but with a modified first step due to $\theta$.
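The MA($\infty$) weights of a causal ARMA(1,1) can be computed two ways and cross-checked: from the closed form $\psi_0 = 1$, $\psi_j = (\phi + \theta)\phi^{j-1}$ for $j \ge 1$, and by matching coefficients of $B^j$ in $(1 - \phi B)\,\psi(B) = 1 + \theta B$. A sketch with illustrative $\phi = 0.5$, $\theta = 0.4$:

```python
phi, theta = 0.5, 0.4
J = 10

# Closed form from expanding (1 + theta*B) / (1 - phi*B):
# psi_0 = 1, psi_j = (phi + theta) * phi**(j-1) for j >= 1.
psi_closed = [1.0] + [(phi + theta) * phi ** (j - 1) for j in range(1, J + 1)]

# Coefficient matching in (1 - phi*B) * psi(B) = 1 + theta*B gives the
# recursion: psi_0 = 1, psi_1 = phi + theta, psi_j = phi * psi_{j-1} (j >= 2).
psi_rec = [1.0, phi + theta]
for _ in range(2, J + 1):
    psi_rec.append(phi * psi_rec[-1])

print(psi_closed[:4])
print(psi_rec[:4])
```

After the modified first step $\psi_1 = \phi + \theta$, the weights decay purely geometrically at rate $\phi$, exactly as in an AR(1).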
Invertibility (separate concept)
- “Causal” means $X_t$ can be written using past $w_t$’s.
- “Invertible” means $w_t$ can be written using past $X_t$’s.
For ARMA(1,1), invertibility holds when $|\theta| < 1$.
Invertibility matters because it makes the model identifiable and estimation more stable in practice.
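A sketch of invertibility in action for MA(1): generate a series from known shocks (hypothetical values), then recover the shocks via the inverse expansion $w_t = \sum_{j \ge 0} (-\theta)^j X_{t-j}$, which is valid when $|\theta| < 1$:

```python
# Recovering shocks from an invertible MA(1), assuming |theta| < 1.
theta = 0.4
w = [0.3, -1.0, 0.8, 0.1, -0.5, 1.2, -0.2, 0.6]  # hypothetical shocks

# Generate the observed MA(1) series X_t = w_t + theta*w_{t-1} (t >= 1).
x = [w[t] + theta * w[t - 1] for t in range(1, len(w))]

# Invert (1 + theta*B): w_t = sum_{j>=0} (-theta)**j * X_{t-j}.
# On a finite record the sum truncates, leaving an error of size
# theta**(t+1) * w[0] that shrinks as t grows.
w_hat = [sum((-theta) ** j * x[t - j] for j in range(t + 1))
         for t in range(len(x))]

print(w_hat[-1], w[-1])  # latest shock, recovered from observed data alone
```

This is why invertibility helps estimation: the unobserved shocks (needed for likelihoods and residual diagnostics) can be reconstructed stably from the observed series.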
Summary (plain-English takeaway)
- A linear process builds a time series by adding up many time-shifted “random shocks” with weights.
- The backshift operator $B$ is a clean algebraic way to describe time shifts and filters.
- Linear filters transform a series via weighted combinations of shifted values.
- Some filters have stable inverses, strongly tied to the geometric series.
- Under a mild condition on the weights ($\sum_j |\psi_j| < \infty$), linear processes are weakly stationary.
- MA, AR, and ARMA models can all be understood inside this “linear process + filter” framework.
- Causality and invertibility tell you whether a model depends only on past information and whether you can recover shocks from observed data.
