A Linear Congruential Random Number Generator (LCG) is one of the oldest and simplest algorithms for generating a sequence of pseudo-random numbers. It is “pseudo-random” because the sequence is entirely deterministic once the starting value (the seed) is fixed.
1) Core idea and recurrence
An LCG generates an integer sequence $X_0, X_1, X_2, \dots$ using the recurrence
$X_{n+1} = (aX_n + c) \bmod m$
Where:
- $m$ = modulus (a positive integer, typically large)
- $a$ = multiplier ($0 \le a < m$)
- $c$ = increment ($0 \le c < m$)
- $X_0$ = seed / initial state ($0 \le X_0 < m$)
- “$\bmod m$” means “take the remainder after dividing by $m$”
The output is often converted to a real number in $[0,1)$ by
$U_n = \frac{X_n}{m}$
So you can treat $U_n$ as a pseudo-random uniform sample on $[0,1)$.
2) Two common types: mixed vs multiplicative
Mixed LCG (full form)
$X_{n+1} = (aX_n + c) \bmod m$ with $c \ne 0$
This is the general case.
Multiplicative LCG
$X_{n+1} = (aX_n) \bmod m$ with $c = 0$
This is simpler, but the period behavior differs (and $X_n = 0$ becomes an absorbing state if it ever occurs).
3) State space, determinism, and period
Because $X_n$ always stays in ${0,1,\dots,m-1}$, there are only $m$ possible states. Therefore, the sequence must eventually repeat, forming a cycle.
- The period is the length of the cycle before values repeat.
- The maximum possible period is $m$ (for mixed LCG) or at most $m-1$ (common for multiplicative LCG under certain moduli).
In practice, the quality of an LCG is heavily determined by whether it attains a long period and whether the generated points behave “random enough” for the intended application.
4) Full-period conditions (Hull–Dobell theorem)
For the mixed LCG $X_{n+1} = (aX_n + c)\bmod m$, the generator has full period $m$ (i.e., it cycles through all residues $0,1,\dots,m-1$ exactly once) if and only if all of the following hold:
- $\gcd(c,m) = 1$
(the increment is coprime with the modulus) - For every prime factor $p$ of $m$, we have $(a-1) \equiv 0 \pmod p$
equivalently: $p \mid (a-1)$ for each prime divisor $p$ of $m$ - If $4 \mid m$, then $4 \mid (a-1)$
(needed when $m$ is divisible by $4$)
These conditions are extremely important: they tell you how to choose $(a,c,m)$ so that the period is maximal.
5) Parameter meanings and intuition
Modulus $m$
- Controls the size of the state space.
- Larger $m$ allows longer periods and finer resolution in $U_n = X_n/m$.
- Many implementations use $m=2^{32}$ or $m=2^{48}$ because modulo arithmetic is cheap on binary computers, but those choices can have statistical pitfalls depending on how output bits are used.
Multiplier $a$
- Controls how the state “mixes.”
- Even with full period, poor choices of $a$ can create strong patterns (especially in higher dimensions).
Increment $c$
- In mixed LCG, $c$ helps avoid certain degeneracies and can help achieve full period $m$ (via $\gcd(c,m)=1$).
- However, “full period” does not automatically mean “high statistical quality.”
Seed $X_0$
- Determines which point in the cycle you start from.
- Two runs with the same $(a,c,m)$ and same $X_0$ produce identical sequences (reproducibility).
6) Why LCGs can look random in 1D but fail in higher dimensions
A famous weakness of LCGs is their lattice structure in multiple dimensions.
If you take consecutive outputs and form points like:
$(U_n, U_{n+1})$ in 2D, or $(U_n, U_{n+1}, U_{n+2})$ in 3D, etc.
those points do not fill space uniformly. Instead, they lie on a relatively small number of hyperplanes. This is a structural artifact of the linear modular recurrence.
This weakness is why LCGs can be unacceptable for:
- serious Monte Carlo simulations (especially in high dimension),
- cryptography,
- applications requiring strong independence properties.
Quality evaluation methods include the spectral test, which measures how many hyperplanes the points fall on and how “coarse” that lattice is.
7) Bit-level issues (especially when $m$ is a power of 2)
When $m = 2^k$, the recurrence behaves in a particularly structured way:
- The least significant bits can have very short cycles.
- For example, the lowest bit may alternate in a predictable pattern depending on $a$ and $c$.
Practical implication:
- If you generate integers via $X_n$ and then do something like “take $X_n \bmod 10$” or use only low bits, you can get visibly non-random behavior.
- Many historically problematic generators were “good enough” only if you used the high bits.
8) A classic “good” example (multiplicative LCG)
One well-known multiplicative LCG is the Park–Miller generator:
- $m = 2^{31}-1 = 2147483647$ (a prime, a Mersenne prime)
- $a = 16807$
- $c = 0$
Recurrence: $X_{n+1} = (16807 X_n)\bmod (2^{31}-1)$
It has period $m-1$ for any nonzero seed (since $m$ is prime and $a$ is chosen as a primitive element modulo $m$ in that setting).
This generator is historically important and still useful for teaching and for some low-stakes simulations, but modern standards often prefer stronger PRNGs.
9) Implementation considerations
Overflow
Computing $aX_n + c$ can overflow standard integer types if you do it naively.
Common approaches:
- Use a larger integer type (e.g., 64-bit to compute a 32-bit modulus).
- Use specialized modular multiplication methods when $m$ is large (e.g., Schrage method for Park–Miller to avoid overflow).
Reproducibility
Fix $X_0$ (seed) explicitly to reproduce results.
“Skipping ahead”
Because the recurrence is linear modulo $m$, you can jump ahead by $t$ steps using modular exponentiation / linear recurrence algebra. For multiplicative LCG ($c=0$):
$X_{n+t} \equiv a^t X_n \pmod m$
For mixed LCG ($c\ne 0$), there is an affine form; you can still jump ahead efficiently using exponentiation on an augmented matrix or closed-form sums modulo $m$.
10) Strengths and weaknesses
Strengths
- Extremely simple and fast.
- Minimal memory (just store $X_n$).
- Easy to reproduce sequences.
- Easy to analyze mathematically (period, structure).
Weaknesses
- Not cryptographically secure (state can be recovered from outputs in many settings).
- Can have strong correlations and lattice artifacts in multiple dimensions.
- Quality depends heavily on parameter choice.
- Low bits can be especially weak when $m=2^k$.
11) When an LCG is acceptable vs not acceptable
Often acceptable
- Teaching PRNG concepts (period, seeding, modular arithmetic).
- Simple simulations where quality demands are low.
- Deterministic test data generation.
Often not acceptable
- Cryptography / security tokens / passwords.
- High-dimensional Monte Carlo integration.
- Financial risk simulation where subtle correlations matter.
- Any application requiring strong randomness guarantees.
Modern alternatives commonly used in practice include Mersenne Twister, PCG, xoshiro/xoroshiro families, Philox/Threefry (counter-based), etc., depending on the environment and requirements.
