Support Vector Machines (SVMs)

Definition

A Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification (mainly) and regression.

Core idea: find the separating hyperplane that maximizes the margin between the classes.
The support vectors (the training points closest to the decision boundary) define that hyperplane.

Intuition

Suppose you have two classes of points in 2D. Many lines can separate them.
SVM chooses the one that maximizes the margin (distance between boundary and nearest points of each class).
This leads to better generalization.
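To make the comparison concrete, here is a minimal sketch that scores a few candidate lines by their distance to the nearest point. The four data points, the three candidate lines, and the helper name `min_distance` are all made up for illustration; a real SVM solves an optimization problem rather than checking a fixed list of lines.

```python
import math

# Toy 2D data (made up for illustration): two points per class.
points = [(2.0, 0.0), (3.0, 1.0), (0.0, 0.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

def min_distance(line, points, labels):
    """Distance from the line a*x1 + b*x2 + c = 0 to its nearest point.

    Returns None if the line does not put every point on the side
    matching its label (that is, if it does not separate the classes).
    """
    a, b, c = line
    length = math.sqrt(a * a + b * b)
    distances = []
    for (x1, x2), label in zip(points, labels):
        value = a * x1 + b * x2 + c
        if value * label <= 0:
            return None
        distances.append(abs(value) / length)
    return min(distances)

# Three hypothetical candidate lines; all of them separate the data.
candidates = {
    "x1 = 1": (1.0, 0.0, -1.0),
    "x1 = 1.5": (1.0, 0.0, -1.5),
    "x1 + x2 = 1.5": (1.0, 1.0, -1.5),
}

scores = {name: min_distance(line, points, labels)
          for name, line in candidates.items()}
best = max(scores, key=scores.get)
print(best)  # x1 = 1
```

All three lines classify the toy points correctly, but `x1 = 1` keeps the largest distance to its nearest point, which is the criterion an SVM optimizes.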

Mathematics

Decision Function (linear case):

$f(x) = w \cdot x + b$

$w$: weight vector (normal to the hyperplane)
$b$: bias (offset)

Classification Rule:

$\hat{y} = \text{sign}(w \cdot x + b)$
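The decision function and classification rule translate almost directly into code. The sketch below uses a hand-picked, hypothetical hyperplane ($w = (1, 1)$, $b = -3$) and assumes that points exactly on the boundary are assigned to the positive class; that tie-breaking choice is a convention of this example only.

```python
def decision_function(w, b, x):
    """Return f(x) = w . x + b for a single point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Return +1 if f(x) >= 0, otherwise -1 (ties go to +1 by assumption)."""
    return 1 if decision_function(w, b, x) >= 0 else -1

# Hypothetical hyperplane x1 + x2 - 3 = 0, chosen for illustration.
w = [1.0, 1.0]
b = -3.0

print(decision_function(w, b, [3.0, 2.0]))  # 2.0
print(predict(w, b, [3.0, 2.0]))            # 1
print(predict(w, b, [0.0, 1.0]))            # -1
```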

Margin:

$\text{Margin} = \frac{2}{\|w\|}$

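This expression assumes labels $y_i \in \{-1, +1\}$ and that $w$ and $b$ are scaled so the closest points satisfy $y_i (w \cdot x_i + b) = 1$.
SVM maximizes this margin (equivalently, minimizes $\|w\|$) while correctly classifying the training data, or with as few margin violations as possible.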
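As a small worked check of the margin formula, the sketch below takes a toy dataset and a hand-picked hyperplane (both assumptions of this example), verifies that every point satisfies $y_i (w \cdot x_i + b) \ge 1$ with labels $y_i \in \{-1, +1\}$, and then evaluates $2 / \|w\|$.

```python
import math

def norm(w):
    """Euclidean length of the weight vector."""
    return math.sqrt(sum(wi * wi for wi in w))

def functional_margin(w, b, x, y):
    """y * (w . x + b): at least 1 for points on or outside the margin."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy data and hyperplane x1 = 1, both made up for illustration.
X = [[2.0, 0.0], [3.0, 1.0], [0.0, 0.0], [-1.0, 1.0]]
y = [1, 1, -1, -1]
w, b = [1.0, 0.0], -1.0

margins = [functional_margin(w, b, xi, yi) for xi, yi in zip(X, y)]
print(margins)                        # [1.0, 2.0, 1.0, 2.0]
print(all(m >= 1 for m in margins))   # True
print(2 / norm(w))                    # 2.0
```

The two points with value 1.0 lie exactly on the margin, one on each side, and the distance between them across the boundary is the margin of 2.0.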

Support Vectors

The data points closest to the hyperplane.
They determine the position and orientation of the decision boundary.
The remaining points do not affect the solution: removing them and retraining yields the same hyperplane.
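The sketch below illustrates this on a toy dataset. It picks out the points that sit exactly on the margin and then rebuilds the decision values from those points alone, using the standard expansion $w = \sum_i \alpha_i y_i x_i$ over support vectors. The data, the hyperplane, and the coefficients $\alpha_i = 0.5$ are hand-derived assumptions for this example, not the output of a solver.

```python
# Toy data and hyperplane (both made up for illustration).
X = [[2.0, 0.0], [3.0, 1.0], [0.0, 0.0], [-1.0, 1.0]]
y = [1, 1, -1, -1]
w, b = [1.0, 0.0], -1.0

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Support vectors sit exactly on the margin: y_i * (w . x_i + b) == 1.
support_idx = [i for i, (xi, yi) in enumerate(zip(X, y))
               if abs(yi * (dot(w, xi) + b) - 1.0) < 1e-9]
print(support_idx)  # [0, 2]

# Hand-derived coefficients for this toy problem (an assumption of
# this sketch): alpha = 0.5 for each of the two support vectors.
alpha = {0: 0.5, 2: 0.5}

def decision_from_support_vectors(x):
    """f(x) = sum_i alpha_i * y_i * (x_i . x) + b, support vectors only."""
    return sum(alpha[i] * y[i] * dot(X[i], x) for i in support_idx) + b

# Both columns agree: the non-support points are never needed.
for x in X:
    print(dot(w, x) + b, decision_from_support_vectors(x))
```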

Kernel Trick

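When data is not linearly separable, SVM can use a kernel function, which computes inner products in a higher-dimensional feature space without explicitly mapping the data there. In that space the classes may become linearly separable.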
Common kernels:
- Linear
- Polynomial
- Radial Basis Function (RBF / Gaussian)
- Sigmoid
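For reference, the four kernels can be written as plain functions of two input vectors. This is a minimal sketch in their commonly used forms; the parameter names `gamma`, `coef0`, and `degree` and their default values are assumptions that follow a widespread naming convention, not something fixed by the definitions.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return dot(x, z)

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """K(x, z) = (gamma * x . z + coef0) ** degree"""
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """K(x, z) = exp(-gamma * ||x - z||^2)"""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    """K(x, z) = tanh(gamma * x . z + coef0)"""
    return math.tanh(gamma * dot(x, z) + coef0)

x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z))                          # 11.0
print(polynomial_kernel(x, z, degree=2, coef0=0))   # 121.0
print(rbf_kernel(x, x))                             # 1.0
```

The RBF kernel returns 1.0 for identical points and decays toward 0 as they move apart; `gamma` controls how quickly.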

Example: In 2D, points inside a circle versus points outside it are not linearly separable. Adding the squared radius $x_1^2 + x_2^2$ as an extra dimension makes them separable by a plane.
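The circle example can be sketched with an explicit feature map. The points, the helper name `lift`, and the separating plane below are all picked by hand for illustration. A kernel SVM would not build the extra feature explicitly; it would work with kernel values instead.

```python
# Made-up data: class +1 lies near the origin, class -1 surrounds it.
inner = [(0.0, 0.0), (0.5, 0.0), (0.0, -0.5), (-0.5, 0.5)]
outer = [(2.0, 0.0), (0.0, 2.0), (-2.0, 0.0), (0.0, -2.0), (1.5, 1.5)]

def lift(point):
    """Map (x1, x2) to (x1, x2, x1^2 + x2^2), adding the squared radius."""
    x1, x2 = point
    return (x1, x2, x1 * x1 + x2 * x2)

# In the lifted space the plane  -r2 + 2 = 0  separates the classes.
# These weights and offset are chosen by hand for this sketch.
w, b = (0.0, 0.0, -1.0), 2.0

def predict(point):
    features = lift(point)
    score = sum(wi * fi for wi, fi in zip(w, features)) + b
    return 1 if score >= 0 else -1

print([predict(p) for p in inner])  # [1, 1, 1, 1]
print([predict(p) for p in outer])  # [-1, -1, -1, -1, -1]
```

No straight line in the original 2D plane can do this, because the inner point (0, 0) lies midway between the outer points (2, 0) and (-2, 0).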

Types of SVMs

Hard-Margin SVM
- Assumes perfect separation (no misclassification).
- Works only if data is clean and separable.
Soft-Margin SVM
- Allows some misclassification (controlled by penalty parameter $C$).
- Balances margin maximization with classification errors.
Support Vector Regression (SVR)
- Extension of SVM for regression tasks.
- Fits a function whose predictions stay within an $\epsilon$-tube of the true values where possible; errors smaller than $\epsilon$ are not penalized.
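The role of $C$ and of the $\epsilon$-tube can be seen in the loss functions themselves. The sketch below assumes the usual soft-margin objective, $\frac{1}{2}\|w\|^2 + C \sum_i \max(0, 1 - y_i (w \cdot x_i + b))$, and the usual $\epsilon$-insensitive loss for SVR. The data, the hyperplane, and the function names are made up for illustration.

```python
def hinge_loss(w, b, x, y):
    """max(0, 1 - y * (w . x + b)): zero on the correct side of the margin."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * (sum of hinge losses)."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    total_hinge = sum(hinge_loss(w, b, xi, yi) for xi, yi in zip(X, y))
    return regularizer + C * total_hinge

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """SVR loss: errors inside the epsilon-tube cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Toy data; the last point is labeled +1 but lies on the wrong side.
X = [[2.0, 0.0], [3.0, 1.0], [0.0, 0.0], [-1.0, 1.0], [0.5, 0.0]]
y = [1, 1, -1, -1, 1]
w, b = [1.0, 0.0], -1.0

print(soft_margin_objective(w, b, X, y, C=1.0))    # 2.0
print(soft_margin_objective(w, b, X, y, C=10.0))   # 15.5
print(epsilon_insensitive_loss(3.0, 3.25, epsilon=0.5))  # 0.0
print(epsilon_insensitive_loss(3.0, 4.0, epsilon=0.5))   # 0.5
```

With the same hyperplane, raising $C$ from 1 to 10 makes the single misclassified point dominate the objective, which is why a large $C$ pushes the solution toward fewer violations and a small $C$ toward a wider margin.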

Advantages

Works well with high-dimensional data.
Effective when classes are separable or nearly separable.
Kernel trick allows flexible decision boundaries.
The trained model depends only on the support vectors, so it can be compact in memory.

Disadvantages

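Training can be slow on very large datasets: kernel SVM solvers typically scale between quadratically and cubically with the number of training samples.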
Choice of kernel and parameters is critical.
Less interpretable than linear or logistic regression, especially with nonlinear kernels.

Applications

Text classification (spam detection, sentiment analysis)
Image recognition (face detection, handwriting recognition)
Bioinformatics (protein classification, cancer detection)
Finance (fraud detection, credit scoring)

In short:
SVMs classify data by finding the maximum-margin hyperplane that separates classes, using only the critical points (support vectors). With the kernel trick, they handle nonlinear boundaries very effectively.

Your Gateway to Data Mastery

Learn, explore, and innovate with data science.
