1. Definition

  • Multi-label classification = each instance (sample) can be assigned multiple labels simultaneously.
  • Unlike multi-class classification (where exactly one label is chosen among many), here labels are not mutually exclusive.

$f: X \;\; \rightarrow \;\; \{0,1\}^K$

  • For $K$ classes, each class has a binary decision (0 or 1) whether the instance belongs to it.
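The mapping $f: X \rightarrow \{0,1\}^K$ can be made concrete with a small sketch. The label vocabulary and helper name below are illustrative, not from any library:

```python
# Illustrative label vocabulary; K = 4 classes.
CLASSES = ["Action", "Comedy", "Romance", "Horror"]

def encode_labels(labels):
    """Encode a set of label names as a K-dimensional binary vector."""
    return [1 if c in labels else 0 for c in CLASSES]

print(encode_labels({"Action", "Romance"}))  # -> [1, 0, 1, 0]
```

Each position in the output vector is an independent yes/no decision for one class, which is exactly the $\{0,1\}^K$ target space.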

2. Examples

  • Movies: A film can belong to multiple genres → Action + Comedy + Romance.
  • News articles: An article can be tagged as Politics + Economy + International.
  • Medical diagnosis: A patient may have multiple conditions (e.g., Diabetes + Hypertension).
  • Image recognition: A picture may contain dog + car + tree.

3. How It Differs from Multi-class

| Feature | Multi-class | Multi-label |
|---|---|---|
| Labels per sample | Exactly 1 | One or more |
| Class exclusivity | Mutually exclusive | Not mutually exclusive |
| Typical output | Softmax (probabilities sum to 1) | Sigmoid (independent probability per class) |
| Example | Animal = {Cat, Dog, Horse} (one choice) | Tags = {Cat, Dog, Horse} (any combination) |

4. Modeling

  • Output layer:
    • Multi-class → Softmax activation (chooses one)
    • Multi-label → Sigmoid activation (thresholded independently for each label)
  • Loss function:
    • Multi-class → Categorical cross-entropy
    • Multi-label → Binary cross-entropy (per class, summed/averaged)
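A minimal sketch of the multi-label output path (sigmoid, independent 0.5 threshold, per-class binary cross-entropy averaged over the K labels). This is plain Python for clarity; real frameworks compute the same loss from raw logits in a more numerically stable way:

```python
import math

def sigmoid(z):
    """Independent probability for one class (no softmax normalization)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy per class, averaged over the K labels."""
    losses = [
        -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
        for t, p in zip(y_true, y_prob)
    ]
    return sum(losses) / len(losses)

logits = [2.0, -1.0, 0.5]                      # raw scores for K = 3 classes
probs = [sigmoid(z) for z in logits]
preds = [1 if p >= 0.5 else 0 for p in probs]  # thresholded independently
loss = binary_cross_entropy([1, 0, 1], probs)
```

Note that, unlike categorical cross-entropy, the loss treats each label as its own binary problem, so a sample can contribute error on several labels at once.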

5. Evaluation Metrics

Since predictions are multiple binary decisions per sample, standard metrics differ from multi-class:

  • Per-label Precision, Recall, F1 (then averaged: micro, macro, weighted)
  • Hamming loss (fraction of misclassified labels)
  • Subset accuracy (strict: all labels correct = 1, else 0)
  • Jaccard similarity (intersection over union of predicted vs true labels)
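The multi-label metrics above can be sketched on binary label vectors (a minimal illustration, not a library API):

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label positions that disagree."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def subset_accuracy(y_true, y_pred):
    """Strict match: 1 only if every label is correct, else 0."""
    return 1.0 if y_true == y_pred else 0.0

def jaccard(y_true, y_pred):
    """Intersection over union of the positive labels."""
    inter = sum(t and p for t, p in zip(y_true, y_pred))
    union = sum(t or p for t, p in zip(y_true, y_pred))
    return inter / union if union else 1.0

y_true = [1, 1, 0]  # e.g. {Cat, Dog}
y_pred = [1, 0, 1]  # e.g. {Cat, Horse}
# hamming_loss -> 2/3, subset_accuracy -> 0.0, jaccard -> 1/3
```

Subset accuracy is the harshest of the three: one wrong label zeroes out the whole sample, whereas Hamming loss and Jaccard give partial credit.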

6. Example

Suppose true labels for an image: {Cat, Dog}
Model prediction: {Cat, Horse}

  • Precision = 1 / (1+1) = 0.5 (only Cat correct, Horse is FP)
  • Recall = 1 / (1+1) = 0.5 (missed Dog, so 1 FN)
  • Jaccard = |{Cat, Dog} ∩ {Cat, Horse}| / |{Cat, Dog} ∪ {Cat, Horse}| = 1/3 ≈ 0.33
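The arithmetic above can be checked directly with Python's set operations:

```python
true_labels = {"Cat", "Dog"}
pred_labels = {"Cat", "Horse"}

tp = len(true_labels & pred_labels)  # {Cat} -> 1 true positive
fp = len(pred_labels - true_labels)  # {Horse} -> 1 false positive
fn = len(true_labels - pred_labels)  # {Dog} -> 1 false negative

precision = tp / (tp + fp)  # 1 / 2 = 0.5
recall = tp / (tp + fn)     # 1 / 2 = 0.5
iou = len(true_labels & pred_labels) / len(true_labels | pred_labels)  # 1/3
```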

Summary

  • Multi-label classification allows assigning multiple labels to each sample.
  • Labels are not mutually exclusive; each label gets its own independent binary decision.
  • Requires sigmoid outputs and metrics like Hamming loss, Jaccard, and micro/macro F1.