1) Definition

  • One-vs-Rest (OvR) = a strategy to extend binary classifiers to multiclass problems.
  • For $K$ classes, train $K$ separate binary classifiers:
    • Classifier $i$: “Is it class $i$ or not?”

At prediction time:

  • Each classifier outputs a score/probability.
  • Pick the class with the highest score.
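
The train-then-argmax procedure above can be sketched by hand. This is a minimal illustration, not a reference implementation: it assumes scikit-learn's LogisticRegression as the binary learner and the iris dataset, and names such as `models`, `scores`, and `train_accuracy` are made up for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# Train one binary model per class: label 1 for "class k", 0 for "rest".
models = []
for k in classes:
    binary_target = (y == k).astype(int)
    model = LogisticRegression(max_iter=1000)
    model.fit(X, binary_target)
    models.append(model)

# Column k holds model k's probability that a sample belongs to class k.
scores = np.column_stack([m.predict_proba(X)[:, 1] for m in models])

# Predict the class whose model gives the highest score.
y_pred = classes[np.argmax(scores, axis=1)]
train_accuracy = float(np.mean(y_pred == y))
print(f"{len(models)} binary models, train accuracy {train_accuracy:.2f}")
```

Note that the rows of `scores` need not sum to 1, because each model is fit independently of the others.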

2) Example

Suppose you want to classify fruits: {apple, banana, orange}.

  • OvR trains 3 classifiers:
    • Apple vs (banana + orange)
    • Banana vs (apple + orange)
    • Orange vs (apple + banana)

At prediction time, if an input looks like an orange:

  • Apple model → 0.1
  • Banana model → 0.2
  • Orange model → 0.9
    → Final prediction = orange.
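
The decision step in this example is just an argmax over the per-class scores. A tiny sketch, using the illustrative scores from above (the `scores` dictionary is hypothetical, not the output of a trained model):

```python
# Scores from the three "class vs rest" models for one input.
scores = {"apple": 0.1, "banana": 0.2, "orange": 0.9}

# Pick the class whose model gave the highest score.
prediction = max(scores, key=scores.get)
print(prediction)  # orange
```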

3) Advantages

  • Simple and widely used.
  • Works with any binary classifier (SVM, logistic regression, etc.).
  • Training cost scales linearly with the number of classes ($K$ models).
  • Easy to interpret: each model answers one yes/no question about a single class.
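
To illustrate the "any binary classifier" and "$K$ models" points, here is a rough sketch that wraps three different scikit-learn estimators in OneVsRestClassifier. The choice of estimators, their hyperparameters, and the iris dataset are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # K = 3 classes

# Any estimator with fit plus decision_function or predict_proba can be the base.
base_estimators = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "linear SVM": LinearSVC(max_iter=10000),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

n_models = {}
for name, base in base_estimators.items():
    clf = OneVsRestClassifier(base).fit(X, y)
    # estimators_ holds one fitted binary model per class.
    n_models[name] = len(clf.estimators_)
    print(f"{name}: {n_models[name]} binary models, "
          f"train accuracy {clf.score(X, y):.2f}")
```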


4) Disadvantages

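  • Can give ambiguous results: several classifiers may claim the same input, or none may.
  • Scores from independently trained classifiers are only comparable if they are calibrated, so the highest raw score is not always the best choice.
  • May struggle with imbalanced data: each binary problem pits one positive class against all the other classes combined.
  • Can be less efficient than direct multiclass methods (like softmax), which fit a single joint model instead of $K$ separate ones.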
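
The imbalance point can be made concrete: even when the original classes are perfectly balanced, each "class vs rest" problem has only about a $1/K$ share of positives. A small sketch, assuming a hypothetical balanced label vector (the helper `positive_fraction_per_class` is made up for this example):

```python
import numpy as np

def positive_fraction_per_class(y):
    """Fraction of positive labels in each "class vs rest" problem."""
    classes, counts = np.unique(y, return_counts=True)
    return dict(zip(classes.tolist(), (counts / counts.sum()).tolist()))

# Hypothetical balanced dataset: 10 classes, 100 samples each.
y = np.repeat(np.arange(10), 100)
fractions = positive_fraction_per_class(y)
print(fractions[0])  # 0.1
```

So with 10 balanced classes, every binary model sees roughly 10% positives and 90% negatives. Where the base estimator supports it, a class-weighting option (for example `class_weight="balanced"` in several scikit-learn classifiers) is one common way to compensate.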


5) Use Cases

  • Logistic regression with OvR = common baseline for multiclass problems.
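  • Support Vector Machines (SVMs) are often extended to multiclass with OvR (or one-vs-one, OvO).
  • Scikit-learn uses OvR by default for some classifiers (e.g. LinearSVC, SGDClassifier), but not all: SVC uses OvO internally, and LogisticRegression with its default solver fits a multinomial (softmax) model.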
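
Since OvO comes up as the usual alternative, a quick comparison of how many binary models each strategy trains may help: OvR trains $K$, OvO trains $K(K-1)/2$. The sketch below assumes a synthetic 5-class dataset from `make_classification`; the dataset parameters are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic dataset with K = 5 classes (parameters chosen for illustration).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5,
    n_classes=5, random_state=0,
)

base = LogisticRegression(max_iter=1000)
ovr = OneVsRestClassifier(base).fit(X, y)
ovo = OneVsOneClassifier(base).fit(X, y)

n_ovr = len(ovr.estimators_)  # one model per class: K
n_ovo = len(ovo.estimators_)  # one model per pair of classes: K * (K - 1) / 2
print(n_ovr, n_ovo)  # 5 10
```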

6) Python Example

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)

# Wrap logistic regression in OvR
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, y)

print(clf.predict(X[:5]))
print(clf.decision_function(X[:5]))  # per-class scores

7) Summary

  • OvR = train one binary classifier per class: “class vs rest.”
  • At inference: pick the class with the highest score.
  • Pros: simple, flexible, works with any binary classifier.
  • Cons: scores need calibration to be comparable; may misbehave on imbalanced data.