1. Introduction
Logistic regression is a fundamental algorithm used for binary classification problems, where the output label can only take two values: 0 or 1.
The goal of logistic regression is not just to predict a class, but to estimate:
The probability that $y = 1$ given input $x$, written $\hat{y} = P(y = 1 \mid x)$
For example:
- Input: an image
- Output: probability that the image is a cat
2. Problem Setup
We are given:
- Input vector: $x \in \mathbb{R}^{n_x}$
- Parameters:
  - $w \in \mathbb{R}^{n_x}$ (weights)
  - $b \in \mathbb{R}$ (bias)

We want to compute:

$$\hat{y} = P(y = 1 \mid x)$$
3. Why a Linear Model Is Not Enough
A natural first idea is:

$$\hat{y} = w^T x + b$$
This is linear regression, but it has a serious problem:
- Output can be < 0 or > 1
- Not valid as a probability
Example:
- $\hat{y} > 1$ → impossible as a probability
- $\hat{y} < 0$ → meaningless as a probability
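This failure mode is easy to demonstrate. Here is a minimal sketch in plain Python (the weights, bias, and inputs are made-up values for illustration, not from these notes) showing that a bare linear score can land outside $[0, 1]$:

```python
# Made-up parameters, for illustration only.
w = [2.0, -1.0]
b = 0.5

def linear_score(x):
    """Plain linear model w·x + b, with no squashing applied."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

print(linear_score([1.0, 0.2]))   # 2.3  -> greater than 1: impossible as a probability
print(linear_score([-1.0, 0.0]))  # -1.5 -> less than 0: meaningless as a probability
```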
4. Solution: Sigmoid Function
To fix this, logistic regression uses the sigmoid function.
4.1 Definition
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

Where:

$$z = w^T x + b$$

So the final model becomes:

$$\hat{y} = \sigma(w^T x + b)$$
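As a quick sketch (pure Python, standard library only; the example parameters are made up), the sigmoid and the resulting model can be written as:

```python
import math

def sigmoid(z):
    """Sigmoid: 1 / (1 + e^(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Logistic regression model: y_hat = sigmoid(w·x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# With made-up parameters, the output is now always a valid probability.
print(predict_proba([2.0, -1.0], 0.5, [1.0, 0.2]))  # sigmoid(2.3) ≈ 0.909
```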
5. Intuition of Sigmoid
The sigmoid function converts any real number into a value between 0 and 1.
Case 1: $z$ is a large positive number
- $\sigma(z) \approx 1$
- Model is confident that $y = 1$

Case 2: $z$ is a large negative number
- $\sigma(z) \approx 0$
- Model is confident that $y = 0$

Case 3: $z \approx 0$
- $\sigma(z) \approx 0.5$
- Model is uncertain
Core Intuition:
z = “score” → sigmoid → “probability”
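The three cases can be checked numerically in a small sketch (the specific scores 6, −6, and 0 are just illustrative values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(6.0))    # ≈ 0.9975 -> large positive score: confident y = 1
print(sigmoid(-6.0))   # ≈ 0.0025 -> large negative score: confident y = 0
print(sigmoid(0.0))    # 0.5      -> zero score: maximally uncertain
```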
6. Interpretation of z
$$z = w^T x + b$$

This is a linear combination of the input features.
You can think of it as:
- a weighted sum of features
- a decision score
7. Model Summary
The full logistic regression pipeline:
Step 1: Compute the score

$$z = w^T x + b$$

Step 2: Convert the score to a probability

$$\hat{y} = \sigma(z)$$

Final meaning:

$\hat{y}$ = probability that the input belongs to class 1
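The two steps above can be sketched end to end in plain Python (the parameter values are hypothetical, chosen only for illustration):

```python
import math

# Hypothetical learned parameters for a 3-feature input.
w = [0.4, -0.2, 0.1]
b = -0.05

def forward(x):
    """Full logistic regression forward pass for one input vector x."""
    # Step 1: compute the score z = w·x + b
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Step 2: convert the score to a probability
    return 1.0 / (1.0 + math.exp(-z))

print(forward([1.0, 2.0, 0.5]))  # 0.5: this input's score works out to z = 0
```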
8. Alternative Notation (Important but Optional)
Some courses combine $w$ and $b$ into one vector:
- Add a constant feature $x_0 = 1$
- Define the parameter vector $\theta = [\theta_0, \theta_1, \ldots, \theta_{n_x}]$ with $\theta_0 = b$
Then:

$$\hat{y} = \sigma(\theta^T x)$$
But in deep learning:
- we usually keep $w$ and $b$ separate
- this is easier for implementation
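That the two notations compute the same thing can be verified with a small sketch (the parameter and input values are made up for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Separate-parameter form: y_hat = sigmoid(w·x + b)
w = [0.7, -0.3]   # hypothetical weights
b = 0.2           # hypothetical bias
x = [1.5, 2.0]

y_separate = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Combined form: prepend x0 = 1 to the input and fold b into theta as theta_0
theta = [b] + w        # theta = [b, w1, w2]
x_aug = [1.0] + x      # x'    = [1, x1, x2]

y_combined = sigmoid(sum(ti * xi for ti, xi in zip(theta, x_aug)))

print(abs(y_separate - y_combined) < 1e-12)  # True: the two forms agree
```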
9. Learning Objective
The goal of training is:
Find $w$ and $b$ such that $\hat{y}$ is close to the true label $y$
This will be done using:
- cost function (next step)
- gradient descent
10. Big Picture Connection
Now connect everything:
Input
- image → feature vector $x$
Model
- logistic regression
Computation
- $\hat{y} = \sigma(w^T x + b)$

Output
- probability of class 1 (cat vs. not cat)
11. Key Insights
- Logistic regression outputs a probability, not just a class
- A linear function alone is not enough
- The sigmoid converts the score $z$ into a probability
- $z$ is the "confidence score"
- $w$ and $b$ are learned from data
Final One-Line Summary
Logistic regression computes a linear score from input features and transforms it into a probability using the sigmoid function to perform binary classification.
