NDCG (Normalized Discounted Cumulative Gain)

1. Definition

NDCG measures the quality of ranking by considering both:

Relevance of retrieved items (are they useful?)
Position of items in the ranking (are useful items near the top?).

$NDCG@k = \frac{DCG@k}{IDCG@k}$

where:

$DCG@k$ = Discounted Cumulative Gain at cutoff $k$
$IDCG@k$ = Ideal DCG (best possible ranking at cutoff $k$)

2. DCG (Discounted Cumulative Gain)

$DCG@k = \sum_{i=1}^k \frac{2^{rel_i} – 1}{\log_2(i+1)}$

$rel_i$ = relevance score of item at position $i$
The denominator $\log_2(i+1)$ discounts items lower in the ranking (less useful if a relevant item is far down).

This reflects diminishing utility: it’s better to have a relevant item at rank 1 than at rank 10.

3. IDCG (Ideal DCG)

Sort items by true relevance in descending order.
Compute DCG on this ideal ranking.
Ensures NDCG is always in $[0, 1]$.

4. Example

Suppose a search engine returns 5 docs with true relevance scores:

Rank	Relevance (rel)
1	3
2	2
3	3
4	0
5	1

DCG@5:

$DCG@5 = \frac{2^3-1}{\log_2(2)} + \frac{2^2-1}{\log_2(3)} + \frac{2^3-1}{\log_2(4)} + \frac{2^0-1}{\log_2(5)} + \frac{2^1-1}{\log_2(6)}$

$= \frac{7}{1} + \frac{3}{1.585} + \frac{7}{2} + 0 + \frac{1}{2.585} \approx 7 + 1.89 + 3.5 + 0 + 0.39 = 12.78$

IDCG@5: Arrange ideal order (3,3,2,1,0) and compute → about 13.65.
NDCG@5:

$NDCG@5 = \frac{12.78}{13.65} \approx 0.94$

5. Interpretation

Range: 0 → 1
1.0 = perfect ranking (all relevant items in ideal order).
Higher NDCG = more useful ranking.
Advantage over MAP: can handle graded relevance (e.g., 0 = not relevant, 1 = somewhat relevant, 2 = highly relevant).

6. Use Cases

Search Engines: Evaluating quality of ranked search results.
Recommender Systems: Ranking recommended products by user preference.
Question Answering / Dialogue: Ranking candidate answers by relevance.

7. Python Example

import numpy as np
from sklearn.metrics import ndcg_score

# True relevance scores for 1 query (higher = more relevant)
y_true = np.array([[3, 2, 3, 0, 1]])

# Predicted scores by model
y_scores = np.array([[0.9, 0.8, 0.3, 0.2, 0.1]])

# Compute NDCG@5
ndcg = ndcg_score(y_true, y_scores, k=5)
print("NDCG@5:", ndcg)

import numpy as np
from sklearn.metrics import ndcg_score

# True relevance scores for 1 query (higher = more relevant)
y_true = np.array([[3, 2, 3, 0, 1]])

# Predicted scores by model
y_scores = np.array([[0.9, 0.8, 0.3, 0.2, 0.1]])

# Compute NDCG@5
ndcg = ndcg_score(y_true, y_scores, k=5)
print("NDCG@5:", ndcg)

Output:

NDCG@5: 0.94

NDCG@5: 0.94

Summary

NDCG = DCG / IDCG → measures ranking quality.
Rewards placing highly relevant items near the top.
Range [0, 1], higher = better.
Handles graded relevance (beyond binary relevant/not relevant).

Your Gateway to Data Mastery

Learn, explore, and innovate with data science.

NDCG (Normalized Discounted Cumulative Gain)

1. Definition

2. DCG (Discounted Cumulative Gain)

3. IDCG (Ideal DCG)

4. Example

5. Interpretation

6. Use Cases

7. Python Example

Like this:

Related

Leave a ReplyCancel reply

1. Definition

2. DCG (Discounted Cumulative Gain)

3. IDCG (Ideal DCG)

4. Example

5. Interpretation

6. Use Cases

7. Python Example

Share this:

Like this:

Related

Leave a ReplyCancel reply

Discover more from Your Gateway to Data Mastery