1. Definition
Hit Rate at K (HR@K) measures whether the top K recommended items contain at least one of the items that the user actually interacted with (e.g., purchased, clicked, watched).
For a single user:
$HR@K = \begin{cases} 1 & \text{if at least one relevant item is in the top K recommendations} \\ 0 & \text{otherwise} \end{cases}$
For multiple users:
$HR@K = \frac{1}{N} \sum_{i=1}^N HR_i@K$
where $N$ = number of users.
2. Intuition
- It checks “Did the recommender system hit the target at least once within the top K results?”
- Binary per user: either a hit (1) or a miss (0).
3. Example
Suppose a recommender system shows Top-3 recommendations for a user:
- Predicted = [Movie A, Movie B, Movie C]
- Actual watched = Movie C
Since Movie C is in the top 3, HR@3 = 1.
If Actual watched = Movie D (not in top 3), then HR@3 = 0.
For multiple users, average across them.
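The movie example above can be reproduced in a few lines. This is a minimal sketch: the function name `hit_at_k` and the string labels for the movies are illustrative assumptions, not part of any library.

```python
def hit_at_k(recommended, relevant, k):
    """Return 1 if any of the first k recommended items is relevant, else 0."""
    relevant = set(relevant)  # set membership checks are O(1) on average
    return int(any(item in relevant for item in recommended[:k]))

# Hypothetical ranked recommendations for one user
predicted = ["Movie A", "Movie B", "Movie C"]

print(hit_at_k(predicted, ["Movie C"], k=3))  # Movie C is in the top 3 -> 1
print(hit_at_k(predicted, ["Movie D"], k=3))  # Movie D is not in the top 3 -> 0
```

Note that the cutoff matters: with `k=2` the first call would return 0, because Movie C sits at rank 3.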
4. Interpretation
- Range: 0 to 1.
- 0 = no user has a relevant item in their top K.
- 1 = perfect (every user has at least one relevant item in their top K).
- A higher Hit Rate means the system surfaces at least one relevant recommendation for a larger share of users.
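When reading these values it helps to remember that HR@K can only stay the same or rise as K grows, because a larger cutoff keeps every item from the smaller one. The sketch below illustrates this on made-up data; `hit_rate_curve` and the item IDs are hypothetical names chosen for the illustration.

```python
def hit_rate_curve(y_true_list, y_pred_list, ks):
    """Return a dict mapping each cutoff k to HR@k averaged over users."""
    curve = {}
    for k in ks:
        hits = []
        for y_true, y_pred in zip(y_true_list, y_pred_list):
            relevant = set(y_true)
            # 1 if any of the first k predictions is relevant, else 0
            hits.append(int(any(item in relevant for item in y_pred[:k])))
        curve[k] = sum(hits) / len(hits)
    return curve

# Hypothetical data for 3 users (one relevant item each)
y_true_list = [[3], [2], [5]]
y_pred_list = [[1, 3, 4], [2, 6, 7], [1, 4, 5]]

curve = hit_rate_curve(y_true_list, y_pred_list, ks=[1, 2, 3])
print(curve)  # HR@1 = 1/3, HR@2 = 2/3, HR@3 = 1
```

Because of this, HR values are only comparable between systems when they are reported at the same K.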
5. Use Cases
- Recommender Systems (Netflix, Amazon):
  - HR@10 → checks if the true item is somewhere in the top 10 recommendations.
- Information Retrieval:
  - Similar to Recall@K, but more lenient since it doesn’t care how many relevant items are retrieved, just whether at least one is present.
6. Comparison with Other Metrics
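- Recall@K: Fraction of the user’s relevant items that appear in the top K (counts every relevant item, not just one).
- Precision@K: Fraction of the top K recommended items that are relevant.
- Hit Rate@K: Only checks whether at least one relevant item is present. When each user has exactly one relevant item, HR@K and Recall@K coincide.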
Example:
- Top-5 results contain 3 relevant items → Recall@5 = 3 / (total number of relevant items for that user).
- Precision@5 = 3/5.
- Hit Rate@5 = 1.
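The comparison above can be checked numerically for a single user. The following is a rough sketch, assuming the recommendation list has no duplicates and contains at least K items; `metrics_at_k` and the item IDs are hypothetical, and the total of 4 relevant items is an assumption made only to get a concrete recall value.

```python
def metrics_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k, hit@k) for a single user."""
    relevant = set(relevant)
    top_k = recommended[:k]
    # Number of top-k items that are relevant
    n_hits = sum(1 for item in top_k if item in relevant)
    precision = n_hits / k
    recall = n_hits / len(relevant) if relevant else 0.0
    hit = int(n_hits > 0)
    return precision, recall, hit

# Hypothetical data: 3 of the top 5 are relevant, out of 4 relevant items in total
recommended = [10, 11, 12, 13, 14]
relevant = [10, 12, 14, 99]

precision, recall, hit = metrics_at_k(recommended, relevant, k=5)
print(precision, recall, hit)  # 0.6 0.75 1
```

Precision and recall change with the number of relevant items found, while the hit value stays at 1 as soon as a single relevant item appears.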
7. Python Example
import numpy as np

def hit_rate_at_k(y_true, y_pred, k=5):
    """
    y_true: list of true relevant items (e.g., items the user actually interacted with)
    y_pred: list of predicted ranked items
    k: cutoff
    """
    top_k = y_pred[:k]
    # 1 if any of the top-k predictions is a relevant item, else 0
    return int(any(item in y_true for item in top_k))

# Example for 3 users
y_true_list = [[3], [2], [5]]  # actual items
y_pred_list = [[1, 3, 4], [2, 6, 7], [1, 4, 8]]  # predicted top-K lists
hits = [hit_rate_at_k(y_true, y_pred, k=3)
        for y_true, y_pred in zip(y_true_list, y_pred_list)]
hit_rate = np.mean(hits)  # average of the per-user hits
print(f"Hit Rate@3: {hit_rate:.2f}")  # rounded to two decimals
Output:
Hit Rate@3: 0.67
→ The top 3 contained a relevant item for 2 out of 3 users (2/3 ≈ 0.67).
Summary
- Hit Rate@K = did the top K recommendations include at least one relevant item?
- Range = [0,1], higher is better.
- Simple and intuitive but does not measure how many relevant items were recommended.
