1. Definition
Log Loss measures the performance of a classification model where the output is a probability between 0 and 1.
It penalizes confidently wrong predictions far more heavily than mildly wrong ones.
$\text{Log Loss} = -\frac{1}{N} \sum_{i=1}^N \Big[ y_i \cdot \log(\hat{p}_i) + (1 - y_i) \cdot \log(1 - \hat{p}_i) \Big]$
where:
- $N$ = number of samples
- $y_i$ = true label (0 or 1)
- $\hat{p}_i$ = predicted probability for the positive class
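The formula translates almost directly into NumPy. A minimal sketch (the helper name `manual_log_loss` and the `eps` clipping constant are illustrative choices, not a standard API):

```python
import numpy as np

def manual_log_loss(y_true, p_hat, eps=1e-15):
    """Log loss per the formula above.

    Probabilities are clipped away from 0 and 1 so the
    logarithm stays finite (a common implementation detail).
    """
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_hat, dtype=float), eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

print(round(manual_log_loss([1, 0, 1], [0.9, 0.2, 0.1]), 3))  # 0.877
```

The clipping step matters in practice: a predicted probability of exactly 0 or 1 on the wrong class would otherwise make the loss infinite.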
2. Interpretation
- Range: $[0, \infty)$
  - 0 = perfect predictions (probabilities exactly match outcomes)
  - Higher values = worse predictions
- Strong penalty for being confident and wrong:
  - Predicting 0.99 when actual = 0 → heavy penalty
  - Predicting 0.6 when actual = 0 → smaller penalty
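A two-line check with the standard library: when the actual label is 0, the per-sample loss reduces to $-\log(1 - \hat{p})$.

```python
import math

# Actual label is 0, so the per-sample loss is -log(1 - p_hat).
for p_hat in (0.6, 0.99):
    print(f"predicted {p_hat} when actual = 0 -> loss {-math.log(1 - p_hat):.3f}")
# predicted 0.6 when actual = 0 -> loss 0.916
# predicted 0.99 when actual = 0 -> loss 4.605
```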
3. Example
Suppose we have 3 predictions:
| True Label | Predicted Prob ($\hat{p}$) | Contribution to Log Loss |
|---|---|---|
| 1 | 0.9 | $-\log(0.9) = 0.105$ |
| 0 | 0.2 | $-\log(0.8) = 0.223$ |
| 1 | 0.1 | $-\log(0.1) = 2.303$ |
Average Log Loss = (0.105 + 0.223 + 2.303) / 3 = 0.877
Notice how the last prediction (confident but wrong) dominates the loss.
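The three contributions in the table can be reproduced in a few lines (a sketch using the same values):

```python
import math

# (true label, predicted probability) pairs from the table
samples = [(1, 0.9), (0, 0.2), (1, 0.1)]

# Per-sample loss: -log(p) if y = 1, else -log(1 - p)
contributions = [-math.log(p if y == 1 else 1 - p) for y, p in samples]

for (y, p), c in zip(samples, contributions):
    print(f"y={y}, p_hat={p}: {c:.3f}")
print("Average Log Loss:", round(sum(contributions) / len(contributions), 3))
# y=1, p_hat=0.9: 0.105
# y=0, p_hat=0.2: 0.223
# y=1, p_hat=0.1: 2.303
# Average Log Loss: 0.877
```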
4. Why It’s Useful
- Captures Confidence: Unlike accuracy, log loss punishes “confident but wrong” predictions.
- Probabilistic Evaluation: Essential for tasks where calibrated probabilities matter (fraud, medical diagnosis, weather).
- Model Training: Many algorithms (e.g., logistic regression, neural networks) optimize log loss directly.
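The "captures confidence" point can be made concrete with a small sketch (the two models and their probabilities are invented for illustration): both models misclassify the same single sample at a 0.5 threshold, so their accuracy is identical, but only one is confidently wrong.

```python
from sklearn.metrics import accuracy_score, log_loss

y_true = [1, 1, 0, 0, 1]
model_a = [0.8, 0.8, 0.2, 0.2, 0.4]   # misses the last sample, mildly
model_b = [0.8, 0.8, 0.2, 0.2, 0.01]  # misses the last sample, confidently

for name, probs in [("A (hedged)", model_a), ("B (overconfident)", model_b)]:
    hard = [int(p >= 0.5) for p in probs]  # threshold at 0.5 for accuracy
    print(name, "accuracy:", accuracy_score(y_true, hard),
          "log loss:", round(log_loss(y_true, probs), 3))
```

Both models score 0.8 accuracy, but model B's single overconfident mistake roughly triples its log loss (≈1.100 vs. ≈0.362).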
5. Comparison with Brier Score
- Brier Score: Mean squared error of predicted probabilities. Its penalty is quadratic and bounded by 1, so an error like 0.9 vs. 1.0 and a larger one like 0.6 vs. 1.0 are treated relatively evenly.
- Log Loss: Unbounded penalty that grows rapidly as the predicted probability of the true class approaches 0, making it far harsher on overconfident wrong predictions.
Example: Predicting 0.01 when the true label is 1 →
- Brier error = $(0.01 - 1)^2 = 0.9801$
- Log Loss error = $-\log(0.01) = 4.605$ (much harsher penalty)
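Both numbers can be verified with the standard library (a minimal check of the single-sample case above):

```python
import math

y_true, p_hat = 1, 0.01  # confident prediction for the wrong class

brier = (p_hat - y_true) ** 2  # squared error: bounded by 1
logloss = -math.log(p_hat)     # unbounded as p_hat -> 0

print(f"Brier error: {brier:.4f}")       # 0.9801
print(f"Log Loss error: {logloss:.3f}")  # 4.605
```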
6. Example in Python
```python
from sklearn.metrics import log_loss

y_true = [1, 0, 1]
y_prob = [0.9, 0.2, 0.1]  # predicted probability of the positive class

loss = log_loss(y_true, y_prob)
print("Log Loss:", round(loss, 3))
```
Output:
```
Log Loss: 0.877
```
7. Summary
- Log Loss = average negative log likelihood of the true class.
- Range = [0, ∞), lower is better.
- Strongly penalizes overconfident wrong predictions.
- Common in classification tasks and directly optimized by many ML algorithms.
