1. Introduction
Deep learning refers to training neural networks, often very large ones. Although the term “neural network” may sound complex, the core idea is actually quite intuitive. A neural network is simply a function that maps an input x to an output y, but it does so in a flexible and powerful way.
2. Starting with a Simple Example: Housing Price Prediction
To understand neural networks, it is helpful to start with a simple example.
Suppose we want to predict the price of a house based only on its size. We are given a dataset with:
- input x: size of the house
- output y: price of the house
If you are familiar with linear regression, you might try to fit a straight line to this data. However, there is one issue: a straight line can predict a negative price for a very small house, and house prices can never be negative.
To fix this, instead of using a pure linear function, we can modify the function so that:
- the output is zero when the input is small
- the output increases linearly when the input becomes larger
This type of function can be seen as the simplest form of a neural network.
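The modified function described above can be written in one line. The following is a minimal sketch in Python; the name `predict_price` and the values of `weight` and `bias` are hypothetical, chosen only for illustration and not taken from any real dataset.

```python
def predict_price(size, weight=0.5, bias=-50.0):
    """Return a non-negative price estimate for a house of the given size.

    weight and bias are hypothetical values chosen for illustration only.
    """
    linear = weight * size + bias   # ordinary straight-line prediction
    return max(0.0, linear)         # clip negative predictions to zero


for size in (50, 100, 200, 400):
    print(size, predict_price(size))
```

With these made-up parameters, the two smallest houses get a price of zero, and the price grows along a straight line after that.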
3. A Single Neuron
This simple model can be represented as a single neuron.
- Input: x (house size)
- Output: y (predicted price)
The neuron takes the input, applies a mathematical function, and produces the output. In this case, the function is similar to:
“Take a linear function, but never allow the output to go below zero.”
This function is called ReLU (Rectified Linear Unit).
4. ReLU Function (Intuition)
ReLU works as follows:
- If input < 0 → output = 0
- If input ≥ 0 → output = input, so it increases linearly
The key idea is that ReLU introduces non-linearity, which allows neural networks to model more complex relationships.
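In symbols, ReLU(z) = max(0, z). As a small illustration of what "non-linearity" means here, the sketch below checks one property that a purely linear function f(z) = c * z always has, namely f(a + b) = f(a) + f(b), and shows that ReLU does not have it. The specific numbers are arbitrary.

```python
def relu(z):
    """Rectified Linear Unit: pass positive values through, clip the rest to zero."""
    return max(0.0, z)


# A purely linear function f(z) = c * z always satisfies f(a + b) == f(a) + f(b).
# ReLU breaks this rule when a and b have opposite signs.
a, b = 3.0, -5.0
sum_then_relu = relu(a + b)        # relu(-2.0)
relu_then_sum = relu(a) + relu(b)  # relu(3.0) + relu(-5.0)
print(sum_then_relu, relu_then_sum)
```

The two printed values differ, which is exactly the kind of behavior a straight line cannot produce.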
5. From One Neuron to Many Neurons
A real neural network is not just one neuron. Instead, it is built by stacking many neurons together, similar to building something with Lego blocks.
Each neuron performs a small computation, and when combined, they can model very complex functions.
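To make the Lego analogy concrete, here is a minimal sketch that combines two ReLU neurons into one function. The helper names `neuron` and `tiny_network`, and all weights and biases, are hypothetical and picked only to show the resulting shape.

```python
def relu(z):
    return max(0.0, z)


def neuron(x, weight, bias):
    """One neuron: a linear function of x followed by ReLU."""
    return relu(weight * x + bias)


def tiny_network(x):
    """Two neurons side by side, added together by an output step.

    All weights and biases are hypothetical, picked only to show the shape.
    """
    h1 = neuron(x, weight=1.0, bias=-1.0)   # switches on once x > 1
    h2 = neuron(x, weight=2.0, bias=-6.0)   # switches on once x > 3
    return h1 + h2


for x in (0.0, 2.0, 4.0):
    print(x, tiny_network(x))
```

A single neuron produces a curve with one bend. The combination bends twice, at x = 1 and x = 3, and adding more neurons adds more bends, which is how larger networks can approximate more complicated curves.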
6. Adding More Features
Now, consider a more realistic scenario where house price depends on multiple factors:
- size
- number of bedrooms
- zip code
- neighborhood wealth
Instead of using only one input, we now have multiple inputs, one for each of these features.
The neural network uses these inputs to learn intermediate concepts such as:
- family size
- walkability
- school quality
These intermediate values are not given explicitly in the data; the network learns them automatically during training.
7. Hidden Units and Layers
In a neural network:
- The input layer receives the features (x)
- The hidden layer contains neurons that compute intermediate representations
- The output layer produces the final prediction (y)
Each neuron in the hidden layer is called a hidden unit.
An important property is that each hidden unit can use all input features. A layer with this property is called a fully connected (dense) layer.
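The layer structure above can be sketched as a small forward pass. This is an illustrative sketch only: the helper `dense`, the feature values, and every weight and bias are hypothetical and hand-picked, whereas a real network would learn its weights from data. The zip code is assumed to be already encoded as a number.

```python
def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """Fully connected layer: every unit takes a weighted sum of ALL inputs."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs


# Hypothetical, already-scaled features: size, bedrooms, zip code, wealth.
x = [1.0, 2.0, 0.5, 1.5]

# One row of weights per hidden unit, one column per input feature.
hidden_weights = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
hidden_biases = [0.0, 0.0, -3.0]

hidden = dense(x, hidden_weights, hidden_biases, relu)   # three hidden units
y = dense(hidden, [[1.0, 1.0, 1.0]], [0.0], relu)[0]     # single output unit
print(hidden, y)
```

Each row in `hidden_weights` has one entry per input feature, so every hidden unit is connected to every input. Some entries are set to zero here only to keep the arithmetic easy to follow.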
8. Automatic Feature Learning
One of the most powerful aspects of neural networks is that we do not need to manually define intermediate features like:
- “family size”
- “school quality”
Instead, the neural network learns these automatically from data.
This is a key difference from traditional machine learning, where feature engineering is often required.
9. Training the Neural Network
To train a neural network, we only need:
- input data
- output labels
We do NOT need to define:
- intermediate features
- internal structure of representations
The network learns all of this by itself during training.
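The text does not name a training algorithm, so the sketch below assumes one common choice: plain gradient descent on the mean squared error. It trains a single ReLU neuron on a toy dataset using only inputs and output labels. The data, starting parameters, learning rate, and number of steps are all hypothetical.

```python
def relu(z):
    return max(0.0, z)


# Toy training set: inputs x with known outputs y (here y = 2 * x).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w, b = 0.5, 0.0          # hypothetical starting parameters
learning_rate = 0.05     # hypothetical step size

for step in range(2000):
    grad_w = 0.0
    grad_b = 0.0
    for x, y in zip(xs, ys):
        z = w * x + b
        error = relu(z) - y
        slope = 1.0 if z > 0 else 0.0   # derivative of ReLU at z
        grad_w += 2 * error * slope * x / len(xs)
        grad_b += 2 * error * slope / len(xs)
    # Move each parameter a small step against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))
```

Nothing in the loop says what the neuron should compute internally. It only compares predictions with labels and nudges `w` and `b` to shrink the error, and on this toy data the weight ends up close to 2 and the bias close to 0.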
10. Why Neural Networks are Powerful
Neural networks are powerful because they can:
- model complex, non-linear relationships
- automatically learn useful features
- scale with more data
Given enough training data, they can often learn highly accurate mappings from input x to output y.
11. Key Takeaways
- A neural network is a function that maps an input x to an output y
- A single neuron is a simple function (often using ReLU)
- Large networks are built by stacking many neurons
- Hidden layers learn intermediate representations automatically
- Neural networks are especially effective in supervised learning, where each training input comes with a known output
