1. Introduction

Logistic regression is a fundamental algorithm used for binary classification problems, where the output label $y$ can take only two values: 0 or 1.

The goal of logistic regression is not just to predict a class, but to estimate:

The probability that $y = 1$ given input $x$

For example:

  • Input: an image
  • Output: probability that the image is a cat

2. Problem Setup

We are given:

  • Input vector: $x \in \mathbb{R}^{n_x}$
  • Parameters:
    • $w \in \mathbb{R}^{n_x}$ (weights)
    • $b \in \mathbb{R}$ (bias)

We want to compute:

$\hat{y} = P(y = 1 \mid x)$


3. Why a Linear Model Is Not Enough

A natural first idea is:

$\hat{y} = w^T x + b$

This is linear regression, but it has a serious problem:

  • Output can be < 0 or > 1
  • Not valid as a probability

Example:

  • $\hat{y} = 3.2$ → impossible
  • $\hat{y} = -1.5$ → meaningless (see the sketch below)
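A minimal NumPy sketch of the problem, with made-up weights and inputs chosen only to reproduce the numbers above:

```python
import numpy as np

# Hypothetical weights, bias, and inputs, chosen only to illustrate the problem.
w = np.array([2.0, -1.0])
b = 0.5

x_big = np.array([1.6, 0.5])    # pushes the score above 1
x_neg = np.array([-1.0, 0.0])   # pushes the score below 0

print(w @ x_big + b)   # 3.2  -> not a valid probability
print(w @ x_neg + b)   # -1.5 -> not a valid probability
```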

4. Solution: Sigmoid Function

To fix this, logistic regression uses the sigmoid function.

4.1 Definition

$\sigma(z) = \frac{1}{1 + e^{-z}}$

Where:

$z = w^T x + b$

So the final model becomes:

$\hat{y} = \sigma(w^T x + b)$
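As a quick sketch, the sigmoid is one line in NumPy (the function name here is my own):

```python
import numpy as np

def sigmoid(z):
    """sigma(z) = 1 / (1 + exp(-z)); maps any real z (or array) into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))
```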


5. Intuition of Sigmoid

The sigmoid function converts any real number into a value between 0 and 1.

Case 1: $z \gg 0$

  • $e^{-z} \approx 0$
  • $\hat{y} \approx 1$

Model is confident that $y = 1$

Case 2: $z \ll 0$

  • $e^{-z}$ is very large
  • $\hat{y} \approx 0$

Model is confident that $y = 0$

Case 3: $z = 0$

  • $\hat{y} = \sigma(0) = \frac{1}{1 + 1} = 0.5$

Model is uncertain

Core Intuition:

z = “score” → sigmoid → “probability”
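The three cases are easy to check numerically (the values of z here are picked arbitrarily):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(10.0))   # ~0.99995 -> confident that y = 1
print(sigmoid(-10.0))  # ~0.00005 -> confident that y = 0
print(sigmoid(0.0))    # 0.5      -> uncertain
```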


6. Interpretation of z

$z = w^T x + b$

This is:

a linear combination of input features

You can think of it as:

  • a weighted sum of features
  • a decision score (see the numeric sketch below)
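A tiny sketch of that weighted sum, with hypothetical features and weights:

```python
import numpy as np

x = np.array([0.5, 1.0, -2.0])   # input features (made up)
w = np.array([1.0, 2.0, 0.5])    # weights (in practice, learned)
b = 0.1                          # bias

z = w @ x + b                    # same as w[0]*x[0] + w[1]*x[1] + w[2]*x[2] + b
print(z)                         # 1.6
```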

7. Model Summary

The full logistic regression pipeline:

Step 1: Compute score

$z = w^T x + b$

Step 2: Convert to probability

$\hat{y} = \sigma(z)$

Final meaning:

$\hat{y}$ = probability that the input belongs to class 1
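The two steps translate directly into code. This sketch (the function name is my own) wraps them into one forward pass:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(x, w, b):
    """Step 1: z = w^T x + b. Step 2: y_hat = sigma(z)."""
    z = w @ x + b
    return sigmoid(z)
```

To turn the probability into a hard class label, a common convention (not required by anything above) is to predict class 1 when $\hat{y} \ge 0.5$.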


8. Alternative Notation (Important but Optional)

Some courses combine $w$ and $b$ into one vector:

  • Add $x_0 = 1$
  • Define a parameter vector $\theta$

Then:

$\hat{y} = \sigma(\theta^T x)$

But in deep learning:

  • we usually keep $w$ and $b$ separate
  • this makes implementation easier (see the sketch below)
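A small sketch, with made-up numbers, confirming that the two notations give the same score:

```python
import numpy as np

w = np.array([0.3, -0.2])
b = 0.1
x = np.array([1.5, 2.0])

# Combined notation: prepend x_0 = 1 and fold b into theta.
theta = np.concatenate(([b], w))     # theta = [b, w_1, w_2]
x_aug = np.concatenate(([1.0], x))   # x_aug = [1, x_1, x_2]

print(w @ x + b)      # 0.15
print(theta @ x_aug)  # 0.15 -- identical score
```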

9. Learning Objective

The goal of training is:

Find $w$ and $b$ such that $\hat{y}$ is close to the true label $y$

This will be done using:

  • cost function (next step)
  • gradient descent

10. Big Picture Connection

Now connect everything (an end-to-end sketch follows this outline):

Input

  • image → vector $x$

Model

  • logistic regression

Computation

  • $z = w^T x + b$
  • $\hat{y} = \sigma(z)$

Output

  • probability of class (cat vs not cat)
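An end-to-end sketch under stated assumptions: a 64×64 RGB image (the size is my choice, not fixed by this section) and randomly initialized parameters standing in for learned ones:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

image = rng.random((64, 64, 3))      # stand-in for a real image
x = image.reshape(-1)                # flatten: n_x = 64 * 64 * 3 = 12288

w = rng.normal(scale=0.01, size=x.shape[0])  # would be learned in practice
b = 0.0

z = w @ x + b
y_hat = sigmoid(z)
print(y_hat)  # estimated probability of "cat"
```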

11. Key Insights

  • Logistic regression outputs a probability, not just a class
  • A linear function alone is not enough
  • Sigmoid converts the score → probability
  • $z$ is the “confidence score”
  • $w, b$ are learned from data


Final One-Line Summary

Logistic regression computes a linear score from input features and transforms it into a probability using the sigmoid function to perform binary classification.