1. Definition
MAP evaluates the quality of ranked retrieval results (e.g., search engines, recommendation systems, information retrieval).
- Average Precision (AP): For a single query, compute the precision each time a relevant item is retrieved, then average across all relevant items.
- MAP: The mean of AP values across multiple queries.
Formally:
$MAP = \frac{1}{Q} \sum_{q=1}^Q AP(q)$
where:
- $Q$ = total number of queries
- $AP(q)$ = average precision for query $q$
2. How AP is Calculated (for one query)
$AP = \frac{1}{R} \sum_{k=1}^N P(k) \cdot rel(k)$
- $N$ = total retrieved items
- $R$ = number of relevant items for this query
- $P(k)$ = precision at cutoff $k$
- $rel(k)$ = 1 if item at rank $k$ is relevant, else 0
In words: AP averages the precision at each rank where a relevant item appears, dividing by the total number of relevant items R (so relevant items that are never retrieved contribute zero).
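As a sketch, the formula above can be implemented directly from a ranked binary relevance list (the function name and sample input are illustrative):

```python
def average_precision(rels, R=None):
    """AP for one query from a ranked binary relevance list.

    rels[k-1] = 1 if the item at rank k is relevant, else 0.
    R = total number of relevant items for the query
        (defaults to the number of relevant items retrieved).
    """
    R = R if R is not None else sum(rels)
    if R == 0:
        return 0.0
    hits, total = 0, 0.0
    for k, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / k  # P(k) at a relevant rank
    return total / R

# Relevant items at ranks 1 and 3:
print(average_precision([1, 0, 1, 0]))  # (1/1 + 2/3) / 2 ≈ 0.833
```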
3. Example
A query returns 5 documents; the relevant set is {d2, d4}.
| Rank | Doc | Relevant? | Precision@k |
|---|---|---|---|
| 1 | d1 | 0 | — |
| 2 | d2 | 1 | 1/2 = 0.5 |
| 3 | d3 | 0 | — |
| 4 | d4 | 1 | 2/4 = 0.5 |
| 5 | d5 | 0 | — |
- Relevant docs = 2 (R=2)
- AP = (0.5 + 0.5) / 2 = 0.5
- If we had multiple queries, MAP = mean of their AP values.
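The worked example above can be checked in a few lines (document IDs match the table):

```python
ranked = ["d1", "d2", "d3", "d4", "d5"]  # retrieval order
relevant = {"d2", "d4"}

hits, precisions = 0, []
for k, doc in enumerate(ranked, start=1):
    if doc in relevant:
        hits += 1
        precisions.append(hits / k)  # precision at this relevant rank

ap = sum(precisions) / len(relevant)
print(ap)  # (0.5 + 0.5) / 2 = 0.5
```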
4. Interpretation
- Range: 0 → 1
- 1.0 = perfect ranking (every relevant item ranked above every non-relevant item).
- Higher MAP = better retrieval/ranking model.
5. Use Cases
- Search Engines: Ranking relevant documents for queries.
- Recommender Systems: Ranking items a user is likely to engage with.
- Information Retrieval Competitions (TREC, Kaggle): MAP is a standard leaderboard metric.
6. Python Example
```python
from sklearn.metrics import average_precision_score
import numpy as np

# True labels: 1 = relevant, 0 = not relevant
y_true = np.array([0, 1, 0, 1, 0])
# Predicted scores (higher = more relevant)
y_scores = np.array([0.2, 0.8, 0.3, 0.7, 0.1])

ap = average_precision_score(y_true, y_scores)
print("Average Precision (AP):", ap)
```
Output:
```
Average Precision (AP): 1.0
```
(The two relevant documents receive the two highest scores, so every relevant item is retrieved with perfect precision.)
- If multiple queries exist, you average across them to get MAP.
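A minimal sketch of that averaging step, scoring each query separately and taking the mean (the per-query data here is made up):

```python
import numpy as np
from sklearn.metrics import average_precision_score

# One (y_true, y_scores) pair per query -- illustrative data
queries = [
    (np.array([0, 1, 0, 1, 0]), np.array([0.2, 0.8, 0.3, 0.7, 0.1])),
    (np.array([1, 0, 0, 1]),    np.array([0.9, 0.6, 0.4, 0.2])),
]

aps = [average_precision_score(y_true, y_scores) for y_true, y_scores in queries]
map_score = float(np.mean(aps))
print("MAP:", map_score)  # mean of per-query APs (1.0 and 0.75 here)
```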
Summary
- AP: Mean of precision values at ranks where relevant items appear.
- MAP: Mean of AP across queries.
- Range: 0–1, higher = better.
- Used for ranking, retrieval, recommendation, and search evaluation.
