Loss Functions

Definition

A loss function measures how well (or poorly) a machine learning model’s predictions match the true target values.

Input: predicted value ($\hat{y}$) and true value ($y$)
Output: a single number (loss)
Goal: minimize this number during training

Think of it as the “error signal” that guides model learning.

Properties of a Good Loss Function

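Differentiable, or at least differentiable almost everywhere (so we can use gradient descent; MAE and hinge loss have kinks but still work in practice)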
Sensitive to errors (larger errors → larger loss)
Aligned with task (classification vs regression)
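
To make the differentiability point concrete, here is a minimal sketch of gradient descent on MSE for a one-parameter model $\hat{y} = w x$. The function name `fit_slope`, the learning rate, the step count, and the toy data are all assumptions chosen for illustration, not part of the original text.

```python
def fit_slope(xs, ys, lr=0.01, steps=500):
    """Fit y ≈ w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # d/dw of (1/n) * sum((w*x - y)^2) is (2/n) * sum((w*x - y) * x)
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad  # step against the gradient to reduce the loss
    return w


# Toy data generated by y = 2x, so the fitted slope should approach 2.
w = fit_slope([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(round(w, 4))  # 2.0
```

Because the loss has a well-defined derivative with respect to `w`, each step knows which direction reduces the error.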

Common Loss Functions

1. Regression (continuous targets)

Mean Squared Error (MSE):
- $L = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2$
- Penalizes large errors more (squared).
Mean Absolute Error (MAE):
- $L = \frac{1}{n}\sum_{i=1}^n |y_i - \hat{y}_i|$
- More robust to outliers.
Huber Loss:
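- $L_\delta(r) = \begin{cases} \frac{1}{2} r^2 & \text{if } |r| \le \delta \\ \delta \left(|r| - \frac{1}{2}\delta\right) & \text{otherwise} \end{cases}$, where $r = y - \hat{y}$
- Combines MSE and MAE (quadratic for errors up to a threshold $\delta$, linear for larger ones).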
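
The three regression losses can be sketched in plain Python as follows. This is an illustrative implementation, not a library API: the function names, the default `delta=1.0`, and the sample data are assumptions. The data includes one large miss to show how differently each loss reacts to it.

```python
def mse(y_true, y_pred):
    """Mean squared error over paired sequences of numbers."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error over paired sequences of numbers."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic when |error| <= delta, linear beyond it."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        r = abs(y - p)
        if r <= delta:
            total += 0.5 * r ** 2
        else:
            total += delta * (r - 0.5 * delta)
    return total / len(y_true)


# Illustrative data: three exact predictions and one error of size 10.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 14.0]

print(mse(y_true, y_pred))    # 25.0
print(mae(y_true, y_pred))    # 2.5
print(huber(y_true, y_pred))  # 2.375
```

The single large error dominates MSE, while MAE and Huber grow only linearly with it.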

2. Classification (categorical targets)

Binary Cross-Entropy (Log Loss):
- $L = -\frac{1}{n}\sum_{i=1}^n \big[ y_i \log(\hat{p}_i) + (1-y_i)\log(1-\hat{p}_i) \big]$
- Used in binary classification (logistic regression, neural nets).
Categorical Cross-Entropy:
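- $L = -\sum_{k=1}^K y_k \log(\hat{p}_k)$ for a single sample, where $y$ is the one-hot label over $K$ classes and $\hat{p}_k$ is the predicted probability of class $k$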
- Used with softmax for multi-class classification.
Hinge Loss:
- $L = \max(0, 1 - y \cdot \hat{y})$, with labels $y \in \{-1, +1\}$ and $\hat{y}$ the raw model score
- Used in Support Vector Machines (SVMs).
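
A minimal sketch of the three classification losses in plain Python. The function names and the `eps` clipping constant are assumptions added for illustration; clipping is a common guard against taking the log of zero and is not part of the formulas themselves.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean log loss; y_true holds 0/1 labels, p_pred holds P(y = 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


def categorical_cross_entropy(y_onehot, p_pred, eps=1e-12):
    """Loss for one sample; y_onehot and p_pred are length-K sequences."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_onehot, p_pred))


def hinge(y, score):
    """Hinge loss for one sample; y is -1 or +1, score is the raw model output."""
    return max(0.0, 1.0 - y * score)


print(round(binary_cross_entropy([1, 0], [0.9, 0.1]), 3))                  # 0.105
print(round(categorical_cross_entropy([0, 1, 0], [0.2, 0.7, 0.1]), 3))     # 0.357
print(hinge(1, 2.5))             # 0.0 (correct side, outside the margin)
print(round(hinge(-1, 0.3), 3))  # 1.3 (wrong side of the boundary)
```

Note that cross-entropy expects probabilities, whereas hinge loss works on unnormalized scores and labels coded as -1 and +1.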

3. Ranking / Structured Tasks

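Contrastive Loss: Used in Siamese networks; pulls embeddings of similar pairs together and pushes dissimilar pairs apart.
Triplet Loss: Encourages an anchor to be closer to a positive sample than to a negative sample, typically by at least a margin.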
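
Both losses come in several variants. The sketch below shows one common formulation of each, using Euclidean distance; the function names, the `margin=1.0` default, and the toy embeddings are assumptions for illustration.

```python
import math


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """One common pairwise form: d^2 for similar pairs,
    max(0, margin - d)^2 for dissimilar pairs."""
    d = math.dist(emb_a, emb_b)  # Euclidean distance (Python 3.8+)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True))   # 25.0
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False))  # 0.0
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))      # 0.0
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [1.0, 0.0]))      # 1.0
```

In the last call the positive and negative are equally far from the anchor, so the loss equals the margin; it drops to zero only once the negative is at least one margin farther away than the positive.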

Loss vs Cost vs Objective

Loss function: Error for one sample.
Cost function: Average loss over all samples.
Objective function: The function being optimized (usually cost + regularization).
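
The distinction can be sketched in code. Here the per-sample loss is squared error and the regularizer is an L2 penalty on the weights; both choices, along with the names `squared_loss`, `cost`, `objective`, and `lam`, are assumptions for illustration.

```python
def squared_loss(y, y_hat):
    """Loss: error for a single sample."""
    return (y - y_hat) ** 2


def cost(y_true, y_pred):
    """Cost: average of the per-sample losses."""
    return sum(squared_loss(y, p) for y, p in zip(y_true, y_pred)) / len(y_true)


def objective(y_true, y_pred, weights, lam=0.1):
    """Objective: cost plus an L2 regularization term on the weights."""
    l2_penalty = lam * sum(w ** 2 for w in weights)
    return cost(y_true, y_pred) + l2_penalty


y_true, y_pred = [1.0, 2.0], [1.5, 2.0]
weights = [1.0, -2.0]  # hypothetical model parameters

print(squared_loss(1.0, 1.5))                           # 0.25
print(cost(y_true, y_pred))                             # 0.125
print(objective(y_true, y_pred, weights, lam=0.5))      # 2.625
```

The optimizer minimizes the objective, so a model with larger weights can be penalized even when its cost is identical.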

Example

Suppose true label $y = 1$, model predicts $\hat{p} = 0.9$.

Binary cross-entropy loss (using the natural logarithm):

$L = -[1 \cdot \log(0.9) + (0) \cdot \log(0.1)] = -\log(0.9) \approx 0.105$

Good prediction → small loss.

If $\hat{p} = 0.1$, $L = -\log(0.1) \approx 2.30$

Bad prediction → large loss.
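
The two numbers above can be checked with a few lines of Python. The helper name `bce_single` is an assumption for illustration; `math.log` is the natural logarithm.

```python
import math


def bce_single(y, p):
    """Binary cross-entropy for one sample with label y and predicted P(y = 1) = p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


print(round(bce_single(1, 0.9), 3))  # 0.105  (confident and correct)
print(round(bce_single(1, 0.1), 3))  # 2.303  (confident and wrong)
```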

Applications

Regression tasks: MSE, MAE, Huber
Classification tasks: Cross-entropy, Hinge
Neural embeddings: Contrastive, Triplet
Generative models: Adversarial losses, KL divergence

In short:
A loss function measures prediction error.

Regression → MSE, MAE
Classification → Cross-entropy, Hinge
Representation learning → Contrastive, Triplet

Your Gateway to Data Mastery

Learn, explore, and innovate with data science.
