1) Definition
- Multiclass classification = predicting one label out of 3 or more possible classes for each input.
- Each sample belongs to exactly one class (unlike multilabel classification, where samples can have multiple labels).
Examples:
- Handwritten digit recognition (0–9).
- Animal classification (cat, dog, horse).
2) Formal Setup
- Input space: $X \subseteq \mathbb{R}^d$, with each input $x \in X$.
- Label space: $Y = \{1, 2, \dots, K\}$, where $K > 2$.
- Goal: learn a function $f: X \to Y$.
3) Common Approaches
a) One-vs-Rest (OvR)
- Train $K$ binary classifiers, one per class (“is it class $i$ or not?”).
- Predict the class whose classifier reports the highest confidence score.
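The OvR recipe above can be sketched in a few lines. This is only an illustration: the binary learner `fit_binary` is a made-up toy scorer (distance to the mean of the positives vs. the mean of the negatives), and the data points and function names are assumptions, not part of any library. Any binary classifier that returns a confidence score could take its place.

```python
def fit_binary(points, targets):
    """Toy binary scorer (hypothetical): compares distances to the
    mean of the positive samples and the mean of the negative samples."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    pos = mean([p for p, t in zip(points, targets) if t == 1])
    neg = mean([p for p, t in zip(points, targets) if t == 0])

    def score(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, neg))
        return d_neg - d_pos  # larger means more confident "is this class"

    return score


def fit_ovr(points, labels):
    """Train one binary scorer per class by relabeling: class k vs. the rest."""
    classes = sorted(set(labels))
    return {
        k: fit_binary(points, [1 if y == k else 0 for y in labels])
        for k in classes
    }


def predict_ovr(scorers, x):
    """Pick the class whose scorer is most confident."""
    return max(scorers, key=lambda k: scorers[k](x))


# Made-up 2-D data for three classes.
points = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.8], [0.0, 5.0], [0.1, 5.2]]
labels = ["cat", "cat", "dog", "dog", "horse", "horse"]

scorers = fit_ovr(points, labels)
prediction = predict_ovr(scorers, [4.9, 5.1])
print(prediction)
```

Note that scores from independently trained binary models are not necessarily on a comparable scale, which is one reason OvR predictions can be unreliable without calibration.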
b) One-vs-One (OvO)
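- Train one binary classifier for every pair of classes, $K(K-1)/2$ in total.
- Predict by majority vote across the pairwise classifiers.
- Commonly used with SVMs (e.g., LIBSVM and scikit-learn's SVC handle multiclass problems one-vs-one).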
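A minimal sketch of the pairwise-voting idea, under stated assumptions: each pairwise "classifier" here is just a nearest-centroid rule between the two classes, chosen for brevity, and the data and function names are invented for the example. A real setup would typically use an SVM or another binary learner per pair.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_ovo(points, labels):
    """One model per unordered class pair; each sees only its two classes."""
    classes = sorted(set(labels))
    models = {}
    for a, b in combinations(classes, 2):
        ca = centroid([p for p, y in zip(points, labels) if y == a])
        cb = centroid([p for p, y in zip(points, labels) if y == b])
        models[(a, b)] = (ca, cb)
    return models


def predict_ovo(models, x):
    """Each pairwise model casts one vote; the most-voted class wins."""
    votes = Counter()
    for (a, b), (ca, cb) in models.items():
        winner = a if sq_dist(x, ca) <= sq_dist(x, cb) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Made-up 2-D data for three classes.
points = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.8], [0.0, 5.0], [0.1, 5.2]]
labels = ["cat", "cat", "dog", "dog", "horse", "horse"]

models = fit_ovo(points, labels)
prediction = predict_ovo(models, [0.0, 4.5])
print(len(models), prediction)
```

Votes can tie (for example, each of three classes winning one pairing). This sketch does not handle ties deliberately; practical implementations usually break them with the classifiers' confidence scores.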
c) Softmax classifiers (direct multiclass)
- A single model outputs a probability distribution over all $K$ classes.
- Examples: multinomial (softmax) logistic regression, neural networks with a softmax output layer.
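To make "probability distribution over classes" concrete, here is a small sketch of the softmax function, which maps $K$ raw scores (logits) to $K$ positive values that sum to 1. The logits below are arbitrary illustrative numbers; subtracting the maximum before exponentiating is a standard trick to avoid overflow and does not change the result.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # shift by the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # illustrative scores for K = 3 classes
probs = softmax(logits)
print(probs)
```

Larger logits receive larger probabilities, so the ordering of the classes is preserved.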
4) Evaluation Metrics
- Accuracy: proportion of correct predictions.
- Precision, Recall, F1: computed per class, then combined via macro, micro, or weighted averaging.
- ROC-AUC: extended via macro/micro averaging or one-vs-rest curves.
- Confusion matrix: $K \times K$ table of true vs. predicted class counts; shows which classes are confused with which.
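The following sketch shows how a confusion matrix and macro/micro-averaged F1 can be computed by hand. The labels are made-up toy data, and the helper names are assumptions for this example. In practice a library routine would normally be used instead.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def f1_scores(matrix):
    """Return (per-class F1 list, macro F1, micro F1)."""
    k = len(matrix)
    per_class = []
    tp_sum = fp_sum = fn_sum = 0
    for i in range(k):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(k)) - tp  # predicted i, truly other
        fn = sum(matrix[i]) - tp                       # truly i, predicted other
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
        tp_sum += tp
        fp_sum += fp
        fn_sum += fn
    macro = sum(per_class) / k  # every class counts equally
    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / micro_denom if micro_denom else 0.0  # every sample counts equally
    return per_class, macro, micro


# Made-up labels for three classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2]

matrix = confusion_matrix(y_true, y_pred, classes=[0, 1, 2])
per_class, macro, micro = f1_scores(matrix)
print(matrix, per_class, macro, micro)
```

When every sample has exactly one label, micro-averaged F1 equals plain accuracy (here 6 of 8 correct), while macro F1 is pulled down by the weakest class.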
5) Example
Digit classification (0–9):
- Model outputs probability vector: [0.01, 0.03, …, 0.92 (class 7), …, 0.01].
- Prediction = class 7.
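As a sketch, the prediction step is just an argmax over the probability vector. The vector below is hypothetical: only the 0.01, 0.03, 0.92, and final 0.01 entries come from the example above, and the remaining values are invented so that the entries sum to 1.

```python
# Hypothetical 10-class output; the 0.005 entries are made up for illustration.
probs = [0.01, 0.03, 0.005, 0.005, 0.005, 0.005, 0.005, 0.92, 0.005, 0.01]

# Index of the largest probability is the predicted class.
predicted_class = max(range(len(probs)), key=lambda k: probs[k])
confidence = probs[predicted_class]
print(predicted_class, confidence)
```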
6) Challenges
- Imbalanced classes: some classes much rarer than others → accuracy misleading.
- Overlapping classes: harder to separate if features aren’t distinctive.
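- Evaluation: a single aggregate score hides per-class behavior; macro averaging weights every class equally, micro averaging weights every sample equally.
- Scalability: OvR needs $K$ classifiers and OvO needs $K(K-1)/2$, which becomes costly when $K$ is large (e.g., $K = 1000$ gives 1,000 OvR models or 499,500 OvO models).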
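The imbalance problem can be seen with a small, made-up example: a degenerate model that always predicts the majority class still scores high accuracy, while macro-averaged recall exposes that two of the three classes are never found. The class proportions here are assumptions chosen for illustration.

```python
# Made-up imbalanced dataset: 90% neutral, 5% positive, 5% negative.
y_true = ["neutral"] * 90 + ["positive"] * 5 + ["negative"] * 5
y_pred = ["neutral"] * 100  # degenerate model: always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(cls):
    """Fraction of samples truly in `cls` that were predicted as `cls`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    support = sum(1 for t in y_true if t == cls)
    return hits / support


classes = sorted(set(y_true))
macro_recall = sum(recall(c) for c in classes) / len(classes)
print(accuracy, macro_recall)
```

Here accuracy is 0.9 even though the model has learned nothing about the rare classes; macro recall is 1/3, the same as a three-way guess.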
7) Applications
- Digit recognition (MNIST).
- ImageNet object classification (1000 classes).
- Document categorization (topics).
- Sentiment classification (positive/neutral/negative).
- Medical diagnosis (disease type prediction).
Summary
- Multiclass classification = predict exactly one class out of >2.
- Approaches: OvR, OvO, softmax.
- Metrics: accuracy, precision/recall/F1 (macro/micro/weighted), AUC variants.
- Applications span vision, NLP, and healthcare.
