Logistic Regression

Definition

Logistic Regression is a classification model, not a regression model (despite the name).
It predicts the probability that an observation belongs to a particular class (usually the positive class).
Most common use: binary classification (e.g., spam vs not spam, churn vs retain).

Core Idea

Instead of predicting $y$ directly, logistic regression predicts the log-odds of the positive class as a linear function of features:

$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$

Solving for $p$:

$p = P(y=1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}$

This is the sigmoid function.

Why Logistic Regression?

Linear regression can predict values outside [0,1] (invalid as probabilities).
Logistic regression “squashes” predictions into [0,1].
Decision rule: predict class 1 if p≥0.5p \geq 0.5p≥0.5, else class 0.

Training (How Parameters are Estimated)

Uses Maximum Likelihood Estimation (MLE), not least squares.
Likelihood: maximize the probability of observing the actual labels given the model’s parameters.
Optimization: usually solved with gradient descent or variants.

Example

Suppose we want to predict if a student passes an exam (yes/no) based on hours studied.

Model:

$p = \frac{1}{1+e^{-(\beta_0 + \beta_1 \cdot \text{hours})}}$

If $\beta_1 > 0$, more study hours → higher probability of passing.
If predicted $p = 0.8$, we interpret: “This student has an 80% chance of passing.”

Extensions

Multinomial Logistic Regression – for more than two classes.
Regularized Logistic Regression – Ridge (L2), Lasso (L1) to prevent overfitting.
Ordinal Logistic Regression – when classes have an order (e.g., low, medium, high).

Applications

Healthcare: Disease diagnosis (disease vs no disease).
Finance: Credit scoring (default vs no default).
Marketing: Churn prediction, ad click prediction.
NLP: Sentiment classification (positive vs negative).

Advantages & Limitations

Advantages:

Simple and interpretable.
Outputs probabilities, not just class labels.
Works well with small/medium datasets.

Limitations:

Assumes a linear relationship in log-odds.
Not good for complex nonlinear patterns (trees or neural nets are better).
Sensitive to multicollinearity in features.

In short:
Logistic Regression = a simple, interpretable classification model that predicts probabilities using the sigmoid function applied to a linear model of features.

Your Gateway to Data Mastery

Learn, explore, and innovate with data science.

Logistic Regression

Definition

Core Idea

Why Logistic Regression?

Training (How Parameters are Estimated)

Example

Extensions

Applications

Advantages & Limitations

Like this:

Related

Leave a ReplyCancel reply

Definition

Core Idea

Why Logistic Regression?

Training (How Parameters are Estimated)

Example

Extensions

Applications

Advantages & Limitations

Share this:

Like this:

Related

Leave a ReplyCancel reply

Discover more from Your Gateway to Data Mastery