Linear Models

1) Definition

Linear models compute predictions as a weighted sum (linear combination) of the input features, plus a bias term.
General form:

$\hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_d x_d + b$

where

$x_i$: input features
$w_i$: weights/coefficients
$b$: bias/intercept

“Linear” means linear in the parameters (weights), not necessarily in the raw features (you can transform inputs).
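
As a small illustration of the formula above, the NumPy sketch below computes $\hat{y}$ for one input, then repeats the computation after adding a squared feature. The weights, bias, input values, and the `expand` helper are made-up choices for the example, not part of any library.

```python
import numpy as np

# Hypothetical parameters for a model with d = 3 features
w = np.array([0.5, -1.0, 2.0])
b = 0.25
x = np.array([2.0, 1.0, 0.5])

# y_hat = w_1*x_1 + w_2*x_2 + w_3*x_3 + b
y_hat = float(w @ x + b)

def expand(x):
    """Append x_1^2 as an extra feature (an illustrative transform)."""
    return np.append(x, x[0] ** 2)

# One extra weight for the new feature; the model is still a weighted sum
w_expanded = np.array([0.5, -1.0, 2.0, 0.1])
y_hat_expanded = float(w_expanded @ expand(x) + b)

print(y_hat, y_hat_expanded)
```

The second prediction depends non-linearly on $x_1$, yet the model remains linear in its weights, which is the sense of "linear" used here.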

2) Types of Linear Models

a) Regression

Linear Regression: predicts continuous values.
- $y = Xw + b + \epsilon$
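
One common way to estimate $w$ and $b$ is ordinary least squares. The sketch below generates synthetic data from known parameters and recovers them with `np.linalg.lstsq`. The values of `true_w`, `true_b`, and the noise level are assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = Xw + b + eps, with made-up ground-truth parameters
n, d = 200, 3
true_w = np.array([1.5, -2.0, 0.5])
true_b = 4.0
X = rng.normal(size=(n, d))
eps = rng.normal(scale=0.1, size=n)
y = X @ true_w + true_b + eps

# Append a column of ones so the intercept is estimated with the weights
X_aug = np.hstack([X, np.ones((n, 1))])
theta, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
w_hat, b_hat = theta[:-1], theta[-1]

print("estimated w:", w_hat)
print("estimated b:", b_hat)
```

With this little noise, the estimates should land close to the parameters used to generate the data.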

b) Classification

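Logistic Regression: models the probability of the positive class by passing a linear score through the sigmoid function.
- $P(y=1|x) = \sigma(w^T x + b)$
Linear SVM: finds the linear decision boundary that maximizes the margin between the classes.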
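
To make the two classifiers concrete, the sketch below applies both decision rules to the same linear score $w^T x + b$. The weights and inputs are illustrative values, not fitted parameters, and the $\{-1, +1\}$ label convention for the SVM is an assumption of this example.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and two example inputs
w = np.array([2.0, -1.0])
b = -0.5
X = np.array([[1.0, 0.5],    # score = 2.0 - 0.5 - 0.5 = 1.0
              [0.0, 1.0]])   # score = 0.0 - 1.0 - 0.5 = -1.5

scores = X @ w + b

# Logistic regression: probability of class 1, thresholded at 0.5
proba = sigmoid(scores)
logistic_labels = (proba >= 0.5).astype(int)

# Linear SVM: the predicted class is the sign of the score
svm_labels = np.where(scores >= 0, 1, -1)

print(scores, proba, logistic_labels, svm_labels)
```

Both rules split the inputs at the same hyperplane, where the score is zero. The models differ in how the weights are trained and in whether the output is a probability.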

c) Regularized Linear Models

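Ridge Regression (L2): penalizes the squared magnitude of the weights, shrinking them toward zero without eliminating them.
Lasso Regression (L1): penalizes the absolute values of the weights, which drives some of them to exactly zero (sparsity, implicit feature selection).
Elastic Net: combines the L1 and L2 penalties.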
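
The difference between the L2 and L1 penalties shows up when most features are irrelevant. The sketch below uses synthetic data in which only two of ten features matter; the `alpha` values and data-generating coefficients are arbitrary choices for the demo, not tuned recommendations.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Synthetic data: only the first 2 of 10 features influence the target
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# Count the weights that are exactly zero in each fitted model
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))

print("Ridge zero weights:", n_zero_ridge)
print("Lasso zero weights:", n_zero_lasso)
```

Ridge typically leaves every weight small but non-zero, while Lasso zeroes out most of the irrelevant ones and keeps the two informative features.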

3) Why Linear Models Are Important

Simplicity: easy to train, fast to run.
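Interpretability: each coefficient indicates the direction and strength of a feature's influence on the prediction (comparable across features only when they are on similar scales).
Baseline models: a good starting point before trying more complex models such as deep networks.
Robustness: with regularization, they are less prone to overfitting, especially on small or high-dimensional datasets.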

4) Example in Python (Logistic Regression)

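from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the binary breast-cancer dataset (569 samples, 30 numeric features)
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% for testing; fix the seed so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_iter is raised from the default of 100 because the unscaled
# features slow the solver's convergence
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Evaluate on the held-out split and inspect the learned weights
y_pred = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Coefficients:", clf.coef_)  # shape (1, 30): one weight per feature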

5) Limitations

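Can only capture linear decision boundaries in the feature space they are given.
Performance suffers if the relationship between features and target is highly non-linear.
Sensitive to outliers, particularly with squared-error loss, unless robust variants (e.g., Huber regression) are used.
Need feature engineering (polynomial terms, interactions) or kernel methods to capture non-linear structure.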
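
The first and last points can be seen on XOR-like data, where the label depends on the sign of a product of two features. The sketch below is a synthetic demonstration: the data, the choice of `C=100.0`, and the use of training accuracy as the measure are assumptions made to keep the example short.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# XOR-like data: the label depends on the sign of x1 * x2,
# so no straight line in the (x1, x2) plane separates the classes
X = rng.uniform(-1, 1, size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# Large C means weak regularization, so the comparison reflects the features
raw_model = LogisticRegression(C=100.0, max_iter=1000).fit(X, y)
acc_raw = raw_model.score(X, y)

# Add the interaction x1 * x2 as a third feature; the model stays linear
X_inter = np.column_stack([X, X[:, 0] * X[:, 1]])
inter_model = LogisticRegression(C=100.0, max_iter=1000).fit(X_inter, y)
acc_inter = inter_model.score(X_inter, y)

print(f"raw features:     {acc_raw:.2f}")
print(f"with interaction: {acc_inter:.2f}")
```

On the raw features the classifier should do little better than guessing, while the added interaction term makes the classes separable by a linear boundary in the expanded feature space.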

6) Extensions

Polynomial regression: add polynomial features, still linear in parameters.
Kernel methods: implicitly map to high-dimensional space (e.g., kernel SVM).
Generalized Linear Models (GLMs): extend linear models to different output distributions (Poisson regression, etc.).
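
As a sketch of the first extension, the example below fits a quadratic curve with an ordinary linear regression by expanding the input into polynomial features. The data-generating coefficients and noise level are made up for the demo.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Synthetic quadratic data: y = 1 + 2x - 3x^2 + noise (made-up coefficients)
x = rng.uniform(-2, 2, size=(200, 1))
y = 1.0 + 2.0 * x[:, 0] - 3.0 * x[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

# A straight-line fit cannot follow the curvature
linear = LinearRegression().fit(x, y)

# Expanding x into [x, x^2] lets the same linear estimator fit the curve
poly = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(x, y)

r2_linear = linear.score(x, y)
r2_poly = poly.score(x, y)
coefs = poly[-1].coef_            # weights for x and x^2
intercept = poly[-1].intercept_

print(f"R^2 linear: {r2_linear:.3f}, R^2 polynomial: {r2_poly:.3f}")
print("fitted coefficients:", coefs, "intercept:", intercept)
```

The estimator in the second model is still `LinearRegression`; only the features changed, which is why polynomial regression counts as linear in the parameters.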

Summary

Linear models predict via weighted sums of features.
Types: regression (linear), classification (logistic, SVM), regularized (ridge, lasso, elastic net).
Pros: simple, interpretable, efficient.
Cons: limited expressiveness for non-linear data.
