1. Binary AUROC Recap
- AUROC (Area Under ROC Curve) = probability that the model ranks a randomly chosen positive sample higher than a randomly chosen negative sample.
- Well-defined for binary classification.
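The ranking interpretation above can be checked directly by counting positive/negative pairs. The following is a minimal sketch, not a reference implementation: the helper name `pairwise_auroc` is made up for illustration, ties are assumed to count as half a win, and the O(P x N) pair matrix is only practical for small inputs.

```python
import numpy as np

def pairwise_auroc(y_true, y_score):
    """AUROC as P(score of a random positive > score of a random negative).

    Hypothetical helper for illustration; ties are counted as 0.5.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]    # scores of positive samples
    neg = y_score[~y_true]   # scores of negative samples
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auroc(labels, scores))  # 0.75 (3 of the 4 pairs are ranked correctly)
```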
2. Multiclass Extension
When we have K classes, AUROC needs to be extended. Two common strategies are:
- Macro AUROC: compute a one-vs-rest (OvR) AUROC for each class, then take the unweighted mean over the K classes.
- Micro AUROC: pool the one-vs-rest predictions of all classes into a single set, then compute one AUROC over that set.
3. Definition of Micro AUROC
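- Treat every (sample, class) pair as one binary decision (one-vs-rest): the label is 1 if the class is the sample's true class and 0 otherwise, and the score is the model's predicted score for that class.
- Pool these decisions across all classes, then count TP, FP, TN, FN at each score threshold over the whole pool.
- Finally, calculate AUROC on the pooled decisions as if they came from a single binary task.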
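Formally:
$AUROC_{micro} = AUROC\left(\bigcup_{i=1}^{K} \text{positives for class } i, \; \bigcup_{i=1}^{K} \text{negatives for class } i \right)$
Here the positives for class $i$ are the class-$i$ scores of samples whose true label is $i$, the negatives are the class-$i$ scores of all other samples, and the unions pool these scores (keeping duplicates) rather than adding them.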
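The pooling steps above can be sketched with NumPy by flattening a one-hot label matrix and the matching score matrix, then scoring the result as a single binary problem. This is an illustrative sketch under stated assumptions: the names `binary_auroc`, `micro_auroc`, `Y` and `S` are hypothetical, labels are assumed to be one-hot with shape (N, K), and ties count as half a win.

```python
import numpy as np

def binary_auroc(y_true, y_score):
    """Pairwise-ranking AUROC; ties count 0.5 (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def micro_auroc(y_onehot, y_score):
    """Micro AUROC: flatten all (sample, class) decisions, then score once."""
    y_onehot = np.asarray(y_onehot)
    y_score = np.asarray(y_score, dtype=float)
    if y_onehot.shape != y_score.shape:
        raise ValueError("labels and scores must have the same (N, K) shape")
    return binary_auroc(y_onehot.ravel(), y_score.ravel())

# Toy data: 3 samples, 3 classes, one sample per class.
Y = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])
S = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.5, 0.3]])
micro = micro_auroc(Y, S)
print(round(micro, 4))  # 0.8333 (15 of 18 pooled pairs, counting ties as 0.5)
```

Because all scores share one pool, a score for one class is compared directly with scores for other classes, which the per-class OvR computation never does.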
4. Key Characteristics
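- Micro AUROC measures how well the scores rank correct (sample, class) pairs above incorrect ones over the whole pool, so every individual decision carries equal weight.
- Large (majority) classes dominate, because they contribute most of the pooled positives.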
- If your dataset is imbalanced, Micro AUROC may look high even if rare classes perform poorly.
- Complements Macro AUROC, which treats all classes equally.
5. Example
Suppose we have 3 classes (A, B, C):
- AUROC(A vs rest) = 0.90
- AUROC(B vs rest) = 0.70
- AUROC(C vs rest) = 0.60
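- Macro AUROC = (0.90 + 0.70 + 0.60) / 3 ≈ 0.733
- Micro AUROC is computed by pooling all predictions, so it cannot be derived from the three per-class values alone. If class A has many more samples, Micro AUROC will typically be pulled toward 0.90 (dominated by the majority class), although the exact value also depends on how scores compare across classes.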
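To see the majority-class effect on concrete numbers, here is a small sketch that computes both averages on a made-up imbalanced dataset (three samples of class A, one each of B and C). The data and the helper `binary_auroc` are hypothetical and chosen only so the arithmetic can be checked by hand; they do not reproduce the 0.90 / 0.70 / 0.60 figures above.

```python
import numpy as np

def binary_auroc(y_true, y_score):
    """Pairwise-ranking AUROC; ties count 0.5 (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Made-up data: classes A=0, B=1, C=2; class A is the majority.
labels = np.array([0, 0, 0, 1, 2])
S = np.array([[0.8, 0.1, 0.1],
              [0.7, 0.2, 0.1],
              [0.6, 0.3, 0.1],
              [0.5, 0.3, 0.2],
              [0.5, 0.4, 0.1]])
Y = np.eye(3, dtype=int)[labels]  # one-hot labels, shape (5, 3)

# Macro: one-vs-rest AUROC per class, then an unweighted mean.
per_class = [binary_auroc(Y[:, k], S[:, k]) for k in range(3)]
macro = float(np.mean(per_class))

# Micro: pool every (sample, class) decision and score once.
micro = binary_auroc(Y.ravel(), S.ravel())

print(per_class)                         # [1.0, 0.625, 0.375]
print(round(macro, 3), round(micro, 3))  # 0.667 0.77
```

In this toy case the micro value (0.77) sits above the macro value (about 0.667) because three of the five pooled positives come from class A, which is ranked perfectly.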
6. Interpretation
- Micro AUROC: best when you want to evaluate the overall ability of the classifier across all samples.
- Macro AUROC: best when you want fairness across classes, especially minority ones.
Summary
- Micro AUROC = AUROC computed after pooling predictions across all classes.
- Dominated by majority classes.
- Useful for overall performance measurement, but can hide minority class weaknesses.
- Always report both Macro and Micro AUROC for a balanced evaluation.
