1. Introduction

In neural network programming, it is important to understand how data is represented and how computations are organized. Instead of processing training examples one by one in explicit loops, neural network implementations process many examples at once in a vectorized way, which makes computation far more efficient.

Another key idea is that neural networks are trained using two main steps:

  • Forward propagation (compute predictions)
  • Backward propagation (compute gradients and update parameters)

To introduce these concepts, we start with logistic regression, which is the simplest form of a neural network for binary classification.


2. What is Binary Classification?

Binary classification is a task where the output can only take two values:

  • y = 1: positive class (e.g., cat)
  • y = 0: negative class (e.g., not a cat)

The goal is:

Given an input x, predict whether y = 0 or y = 1


3. How Images Are Represented

Computers do not understand images directly. Instead, an image is represented as numerical values.

For a color image:

  • It is stored as three matrices:
    • Red channel
    • Green channel
    • Blue channel

If the image size is 64 × 64, then:

  • Each channel has 64 × 64 values
  • Total values = 64 × 64 × 3 = 12,288 (see the sketch below)
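
To make the numbers concrete, here is a minimal NumPy sketch. The image itself is an assumption (random pixel values standing in for a real photo), and the (height, width, channel) axis order is just one common convention:

```python
import numpy as np

# Hypothetical 64x64 color image: random pixels stand in for a real photo.
# Assumed axis order: (height, width, channel).
image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

red = image[:, :, 0]    # 64x64 matrix of red intensities
green = image[:, :, 1]  # 64x64 matrix of green intensities
blue = image[:, :, 2]   # 64x64 matrix of blue intensities

print(red.shape)   # (64, 64)
print(image.size)  # 12288 = 64 * 64 * 3
```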

4. Converting Image to Feature Vector

To use this data in machine learning, we convert the image into a feature vector x.

This is done by:

  • taking all pixel values
  • stacking them into one long vector

So:

x \in \mathbb{R}^{n_x}, \quad n_x = 12,288

Key idea:

An image becomes a long list of numbers
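
Continuing the sketch above (same assumption of a random stand-in image), the stacking is a single reshape:

```python
import numpy as np

# Random 64x64x3 array standing in for a real image (assumption).
image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Stack all pixel values into one long column vector x.
x = image.reshape(-1, 1)

print(x.shape)  # (12288, 1), i.e. n_x = 12288
```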


5. Training Example and Dataset

A single training example is written as:

(x, y)

  • x: input feature vector
  • y: label (0 or 1)

If we have m examples, the dataset is:

(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})

Here:

  • m = number of training examples

6. Matrix Representation (Very Important)

Instead of handling each example separately, we organize data into matrices.

6.1 Input Matrix X

We stack all input vectors as columns:

X = [x^{(1)} \; x^{(2)} \; \dots \; x^{(m)}]

So:

  • shape of X: (n_x, m)

Meaning:

  • rows = features
  • columns = training examples

6.2 Output Matrix Y

Similarly, labels are stored as:

Y = [y^{(1)} \; y^{(2)} \; \dots \; y^{(m)}]

  • shape of Y: (1, m)
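
A small sketch of how X and Y might be assembled in NumPy. The dataset here is hypothetical (random features and labels); only the shapes matter:

```python
import numpy as np

n_x, m = 12288, 4  # assumed sizes for illustration

# Hypothetical dataset: m feature vectors of shape (n_x, 1) with 0/1 labels.
examples = [(np.random.rand(n_x, 1), np.random.randint(0, 2)) for _ in range(m)]

# Stack the inputs as columns and the labels as a single row.
X = np.hstack([x for x, _ in examples])   # shape (n_x, m)
Y = np.array([[y for _, y in examples]])  # shape (1, m)

print(X.shape, Y.shape)  # (12288, 4) (1, 4)
```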

7. Why This Representation Matters

This matrix form allows us to:

  • process all examples at once
  • avoid slow loops
  • use efficient linear algebra operations

This is critical for deep learning performance.
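
A toy benchmark illustrating the point, with arbitrary assumed sizes (exact timings depend on the machine and NumPy build):

```python
import numpy as np
import time

n_x, m = 1000, 10_000  # assumed sizes for the comparison
X = np.random.rand(n_x, m)
w = np.random.rand(n_x, 1)

# Loop version: one dot product per training example.
t0 = time.time()
z_loop = np.array([[w[:, 0] @ X[:, i] for i in range(m)]])
t_loop = time.time() - t0

# Vectorized version: one matrix product over all examples at once.
t0 = time.time()
z_vec = w.T @ X
t_vec = time.time() - t0

print(np.allclose(z_loop, z_vec))                        # True: identical results
print(f"loop: {t_loop:.4f}s, vectorized: {t_vec:.4f}s")  # vectorized is much faster
```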


8. Logistic Regression as a Neural Network

Logistic regression is the simplest neural network:

  • input: x
  • output: probability ŷ

It learns:

\hat{y} = \sigma(w^T x + b)

Where:

  • w: weights (a vector with one entry per feature)
  • b: bias (a scalar)
  • σ: the sigmoid function, σ(z) = 1 / (1 + e^(-z)), which squashes any real number into (0, 1) so the output can be read as a probability
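
As a minimal sketch of this model (the zero-initialized parameters and random stand-in input are assumptions, chosen only to make the code run):

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^(-z)): maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

n_x = 12288                 # input size from the image example
x = np.random.rand(n_x, 1)  # stand-in input vector (assumption)
w = np.zeros((n_x, 1))      # weights, zero-initialized for simplicity
b = 0.0                     # bias

y_hat = sigmoid(w.T @ x + b)  # shape (1, 1)
print(y_hat.item())           # 0.5 with zero weights: no evidence either way
```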

9. Forward and Backward Propagation

9.1 Forward Propagation

  • compute the prediction ŷ

9.2 Backward Propagation

  • compute gradients of the loss with respect to w and b
  • update the parameters via gradient descent

This is the foundation of all neural networks.
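
Here is one forward/backward step for logistic regression, written over the whole matrix X at once. The gradient formulas (dZ = ŷ − y, and so on) are the standard ones for the cross-entropy loss and are used here without derivation; the data and learning rate are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_x, m, lr = 4, 8, 0.1                    # tiny assumed sizes and learning rate
X = np.random.rand(n_x, m)                # stand-in inputs
Y = np.random.randint(0, 2, size=(1, m))  # stand-in labels
w, b = np.zeros((n_x, 1)), 0.0

# Forward propagation: predictions for all m examples at once.
A = sigmoid(w.T @ X + b)   # shape (1, m)

# Backward propagation: gradients of the cross-entropy loss.
dZ = A - Y                 # shape (1, m)
dw = (X @ dZ.T) / m        # shape (n_x, 1)
db = dZ.mean()

# Gradient-descent update.
w -= lr * dw
b -= lr * db
```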


10. Big Picture Connection

Now connect everything you’ve learned so far:

Step 1: Input

  • image → vector x

Step 2: Model

  • neural network / logistic regression

Step 3: Output

  • prediction ŷ

Step 4: Learning

  • compare with the true label y
  • update model using backprop
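
Putting the four steps together in one sketch (random stand-in images and labels, a fixed learning rate of 0.1, and the standard cross-entropy gradients are all assumptions made for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: input. Random "images" and labels stand in for a real dataset.
m, side = 16, 8
images = np.random.rand(m, side, side, 3)
X = images.reshape(m, -1).T               # each column is one flattened image
Y = np.random.randint(0, 2, size=(1, m))  # made-up 0/1 labels

# Step 2: model. Logistic regression parameters.
w, b = np.zeros((X.shape[0], 1)), 0.0

for step in range(100):
    A = sigmoid(w.T @ X + b)    # Step 3: output (predictions for all examples)
    dZ = A - Y                  # Step 4: compare with the true labels...
    w -= 0.1 * (X @ dZ.T) / m   # ...and update the parameters
    b -= 0.1 * dZ.mean()
```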

11. Key Insights

1. Data must be converted into numerical vectors

2. Training set size is m

3. Use matrix X instead of loops

4. Logistic regression = simplest neural network

5. Forward + backward propagation = learning process


Final One-Line Summary

Binary classification in deep learning transforms raw data (like images) into vectors, organizes them into matrices, and uses models like logistic regression to learn mappings from inputs to outputs efficiently.