1. Binary AUROC Recap

  • AUROC (Area Under the ROC Curve) = probability that the model scores a randomly chosen positive sample higher than a randomly chosen negative sample, with ties counted as one half.
  • Well-defined for binary classification.
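
The ranking definition above can be checked directly by comparing every positive score with every negative score. The following is a minimal Python sketch under stated assumptions, not a reference implementation: the function name `binary_auroc` and the toy data are illustrative, and ties are assumed to count as one half.

```python
def binary_auroc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties contribute 0.5. Runs in O(P * N) time, which is fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two positives, two negatives.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]

# Three of the four positive/negative pairs are ordered correctly.
print(binary_auroc(labels, scores))  # 0.75
```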

2. Multiclass Extension

When we have K classes, AUROC needs to be extended. Two common strategies are:

  • Macro AUROC: compute a one-vs-rest (OvR) AUROC for each class, then take the unweighted mean over the K classes.
  • Micro AUROC: pool the one-vs-rest (score, label) pairs of all classes, then compute a single AUROC over the pool.
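
As a rough illustration of the macro strategy, the sketch below binarizes the labels one class at a time and averages the resulting per-class AUROCs. The helper names (`binary_auroc`, `macro_auroc`) and the four-sample dataset are assumptions made for this example only.

```python
def binary_auroc(labels, scores):
    """Pairwise-ranking AUROC; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(y_true, score_rows, n_classes):
    """Unweighted mean of the one-vs-rest AUROCs; also returns the per-class values."""
    per_class = []
    for k in range(n_classes):
        labels_k = [1 if y == k else 0 for y in y_true]   # class k vs. rest
        scores_k = [row[k] for row in score_rows]         # score column for class k
        per_class.append(binary_auroc(labels_k, scores_k))
    return sum(per_class) / n_classes, per_class


# Hypothetical data: 4 samples, 3 classes, one row of class scores per sample.
y_true = [0, 1, 2, 0]
score_rows = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.5, 0.1, 0.4],
    [0.4, 0.4, 0.2],
]

macro, per_class = macro_auroc(y_true, score_rows, n_classes=3)
print(per_class)        # [0.75, 1.0, 1.0]
print(round(macro, 4))  # 0.9167
```

Every class contributes one term to the mean regardless of how many samples it has, which is why the macro average treats classes equally.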

3. Definition of Micro AUROC

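  • Binarize the labels one-vs-rest: each of the N samples contributes K (score, label) pairs, one per class, with label 1 for the true class and 0 for every other class.
  • Pool all N × K pairs. Sweeping a single threshold over the pooled scores yields TP, FP, TN, FN counted globally across all classes.
  • Finally, calculate AUROC on the pool as if it were a single binary task.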

Formally:

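$AUROC_{micro} = AUROC\left(\{(y_{n,i},\; s_{n,i}) : n = 1,\dots,N,\; i = 1,\dots,K\}\right)$

where $s_{n,i}$ is the score the model assigns to class $i$ for sample $n$, and $y_{n,i} = 1$ if sample $n$ belongs to class $i$ and $0$ otherwise.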
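
The pooling step can be sketched in a few lines of Python. This is a minimal illustration, assuming single-label multiclass data with one row of class scores per sample; `flatten_ovr`, `micro_auroc`, and the toy data are hypothetical names and values, not part of any library.

```python
def binary_auroc(labels, scores):
    """Pairwise-ranking AUROC; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def flatten_ovr(y_true, score_rows, n_classes):
    """Turn N samples x K classes into N*K binary (label, score) pairs."""
    flat_labels, flat_scores = [], []
    for y, row in zip(y_true, score_rows):
        for k in range(n_classes):
            flat_labels.append(1 if y == k else 0)  # 1 only for the true class
            flat_scores.append(row[k])
    return flat_labels, flat_scores


def micro_auroc(y_true, score_rows, n_classes):
    """AUROC of the pooled one-vs-rest pairs, treated as one binary task."""
    flat_labels, flat_scores = flatten_ovr(y_true, score_rows, n_classes)
    return binary_auroc(flat_labels, flat_scores)


# Hypothetical data: 4 samples, 3 classes.
y_true = [0, 1, 2, 0]
score_rows = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.5, 0.1, 0.4],
    [0.4, 0.4, 0.2],
]

flat_labels, flat_scores = flatten_ovr(y_true, score_rows, 3)
print(len(flat_labels), sum(flat_labels))  # 12 4  (12 pooled pairs, 4 of them positive)
print(micro_auroc(y_true, score_rows, 3))  # 0.890625
```

Note that the pool compares scores across classes (for example, the class-1 score of one sample against the class-0 score of another), which per-class AUROCs never do.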


4. Key Characteristics

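  • Micro AUROC measures how well the scores separate correct (sample, class) pairs from incorrect ones across the whole pool, so every sample carries the same weight.
  • Large (majority) classes dominate, because they contribute most of the positive pairs in the pool.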
  • If your dataset is imbalanced, Micro AUROC may look high even if rare classes perform poorly.
  • Complements Macro AUROC, which treats all classes equally.

5. Example

Suppose we have 3 classes (A, B, C):

  • AUROC(A vs rest) = 0.90
  • AUROC(B vs rest) = 0.70
  • AUROC(C vs rest) = 0.60
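  • Macro AUROC = (0.90 + 0.70 + 0.60) / 3 ≈ 0.733
  • Micro AUROC is computed by pooling all predictions: if class A has many more samples, Micro AUROC tends to be pulled toward class A's performance (dominated by majority class). It is not a weighted average of the per-class values, though: pooling also compares scores across classes, so the exact value depends on how the scores of different classes line up.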
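
To make the majority-class effect concrete, here is a small Python sketch on a separate, made-up dataset (its per-class values differ from the 0.90 / 0.70 / 0.60 figures above, which cannot be reproduced without the underlying scores). Class 0 has four well-ranked samples, while classes 1 and 2 have one poorly ranked sample each. All names and numbers are illustrative assumptions.

```python
def binary_auroc(labels, scores):
    """Pairwise-ranking AUROC; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ovr_auroc(y_true, score_rows, k):
    """One-vs-rest AUROC for class k."""
    return binary_auroc([1 if y == k else 0 for y in y_true],
                        [row[k] for row in score_rows])


def macro_auroc(y_true, score_rows, n_classes):
    """Unweighted mean of the per-class one-vs-rest AUROCs."""
    return sum(ovr_auroc(y_true, score_rows, k) for k in range(n_classes)) / n_classes


def micro_auroc(y_true, score_rows, n_classes):
    """AUROC over all pooled (sample, class) pairs."""
    labels = [1 if y == k else 0 for y in y_true for k in range(n_classes)]
    scores = [row[k] for row in score_rows for k in range(n_classes)]
    return binary_auroc(labels, scores)


# Hypothetical imbalanced data: class 0 is the majority, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 1, 2]
score_rows = [
    [0.8, 0.1, 0.1],
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.7, 0.1, 0.2],
    [0.6, 0.1, 0.3],  # true class 1, but class 1 gets the lowest score
    [0.6, 0.3, 0.1],  # true class 2, but class 2 gets the lowest score
]

per_class = [ovr_auroc(y_true, score_rows, k) for k in range(3)]
macro = macro_auroc(y_true, score_rows, 3)
micro = micro_auroc(y_true, score_rows, 3)

print(per_class)        # [1.0, 0.3, 0.3]
print(round(macro, 3))  # 0.533
print(micro)            # 0.75
```

In this toy case the micro value (0.75) sits well above the macro value (about 0.533) because four of the six pooled positives come from the well-ranked majority class. It also differs from the prevalence-weighted mean of the per-class values, (4 × 1.0 + 0.3 + 0.3) / 6 ≈ 0.767, which illustrates that pooling is not the same as weighted averaging.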

6. Interpretation

  • Micro AUROC: best when you want to evaluate the overall ability of the classifier across all samples.
  • Macro AUROC: best when you want fairness across classes, especially minority ones.

Summary

  • Micro AUROC = AUROC computed after pooling predictions across all classes.
  • Dominated by majority classes.
  • Useful for overall performance measurement, but hides minority class weaknesses.
  • Report both Macro and Micro AUROC for a balanced evaluation, especially when the classes are imbalanced.