1. Definition
- Demographic Parity (also called Statistical Parity) is a fairness criterion requiring that a model’s decisions be statistically independent of sensitive attributes (such as gender, race, or age).
Formally, for a binary decision (positive outcome = 1):
$P(\hat{Y} = 1 \mid A = a) = P(\hat{Y} = 1 \mid A = b) \quad \forall \, a,b$
- $\hat{Y}$ = model prediction
- $A$ = protected attribute (e.g., gender, race)
The probability of getting a “positive” outcome should be the same across all demographic groups.
2. Example
Suppose a hiring model predicts whether to interview candidates.
- Group A (men): 60% predicted as interview-worthy.
- Group B (women): 40% predicted as interview-worthy.
If demographic parity were satisfied, both groups would have approximately the same rate (say, 50%).
Since 60% ≠ 40%, demographic parity is violated.
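The check above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `selection_rates` and the toy data (10 candidates per group) are assumptions chosen only to reproduce the 60% and 40% figures.

```python
from collections import defaultdict

def selection_rates(y_pred, groups):
    """Return the fraction of positive predictions (1) within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical data: 10 men (6 predicted positive), 10 women (4 predicted positive).
y_pred = [1] * 6 + [0] * 4 + [1] * 4 + [0] * 6
groups = ["men"] * 10 + ["women"] * 10

rates = selection_rates(y_pred, groups)
print(rates)  # {'men': 0.6, 'women': 0.4}

# Demographic parity holds only if every group has the same rate.
parity_holds = len(set(rates.values())) == 1
print(parity_holds)  # False
```

In practice, exact equality is rarely achieved on finite samples, so a tolerance on the gap between rates is usually more useful than a strict equality test.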
3. Demographic Parity Difference / Ratio
Two metrics quantify how far a model is from demographic parity:
- Difference:
$DPD = P(\hat{Y}=1 \mid A=a) - P(\hat{Y}=1 \mid A=b)$
- Ratio:
$DPR = \frac{P(\hat{Y}=1 \mid A=a)}{P(\hat{Y}=1 \mid A=b)}$
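The ratio is often compared against the “80% Rule” (four-fifths rule) used in US employment law: if the group with the lower rate is selected at less than 0.8 times the rate of the group with the higher rate, there may be adverse impact.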
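As a rough sketch, both metrics can be computed for the hiring example's rates (60% and 40%). The function names `parity_metrics` and `passes_four_fifths` are hypothetical. One assumption to note: the ratio here is oriented as lower rate over higher rate so that it always lies in [0, 1], which is how the four-fifths comparison is normally applied; the formula above does not fix which group goes in the numerator.

```python
def parity_metrics(rate_a, rate_b):
    """Return (difference, ratio) for two groups' positive-prediction rates."""
    # Signed difference: P(Y_hat=1 | A=a) - P(Y_hat=1 | A=b).
    difference = rate_a - rate_b
    # Assumption: lower rate / higher rate, so the ratio falls in [0, 1].
    # If both rates are 0 the groups are treated identically, so use 1.0.
    highest = max(rate_a, rate_b)
    ratio = min(rate_a, rate_b) / highest if highest > 0 else 1.0
    return difference, ratio

def passes_four_fifths(ratio, threshold=0.8):
    """True if the ratio meets the four-fifths (80%) threshold."""
    return ratio >= threshold

dpd, dpr = parity_metrics(0.6, 0.4)
print(f"DPD = {dpd:.2f}")        # DPD = 0.20
print(f"DPR = {dpr:.2f}")        # DPR = 0.67
print(passes_four_fifths(dpr))   # False
```

A ratio of about 0.67 falls below 0.8, so the hiring example would be flagged for possible adverse impact under this rule of thumb.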
4. Why It’s Important
- Targets equal positive-outcome rates across demographic groups.
- Widely used in compliance (e.g., employment law, lending).
- Simple to calculate: it needs only the predictions and group membership, not the true labels.
5. Limitations
- May ignore differences in actual qualification/label distribution.
- Example: If one group historically has higher loan default rates, enforcing equal approval rates could mean approving more applicants who are likely to default, increasing the lender's risk.
- Can conflict with other fairness notions (e.g., Equal Opportunity, Equalized Odds).
- Does not consider true labels ($Y$); it only looks at predicted outcomes ($\hat{Y}$).
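The conflict with Equal Opportunity (equal true positive rates across groups) can be illustrated with a small made-up dataset. Everything here is hypothetical: the labels, the predictions, and the helper names `selection_rate` and `true_positive_rate`. The sketch shows only that equal selection rates do not imply equal true positive rates when the groups' label distributions differ.

```python
def selection_rate(y_pred):
    """Fraction of individuals predicted positive."""
    return sum(y_pred) / len(y_pred)

def true_positive_rate(y_true, y_pred):
    """Fraction of truly positive individuals who are predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(preds_for_positives) / len(preds_for_positives)

# Hypothetical group a: 4 of 10 are truly positive; the model selects 3 of them.
y_true_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

# Hypothetical group b: 2 of 10 are truly positive; the model selects both plus one negative.
y_true_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

# Demographic parity holds: both groups are selected at the same rate.
print(selection_rate(y_pred_a), selection_rate(y_pred_b))  # 0.3 0.3

# Equal Opportunity does not: true positive rates differ.
print(true_positive_rate(y_true_a, y_pred_a))  # 0.75
print(true_positive_rate(y_true_b, y_pred_b))  # 1.0
```

Because demographic parity never looks at `y_true_a` or `y_true_b`, it cannot detect that qualified members of group a are selected less often than qualified members of group b.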
Summary:
Demographic Parity = predictions are independent of sensitive attributes.
Every demographic group should have the same positive outcome rate.
It’s simple and intuitive, but can conflict with accuracy or other fairness definitions.
