The concept of mean normalization and feature scaling is often overlooked. Mean normalization calculates and subtracts the mean for every feature in a machine learning model and is a way to implement feature scaling. Feature scaling brings all the features of a machine learning model within a range. By the end of this article, you will have a clear understanding of these two concepts.

## Mean Normalization Definition

Mean normalization is a process used in machine learning that calculates and subtracts the mean for every feature. It’s used to implement feature scaling, which brings all features in a model to a similar range.

To understand these two concepts, we’ll cover the basics:

- What is feature scaling?
- What is mean normalization?
- When do we use these concepts?
- Why do we need these techniques?

Let’s go over each of these questions one-by-one.

## What Is Feature Scaling?

Feature scaling is the process of bringing all of the features of a machine learning problem to a similar scale or range. The definition is as follows

Feature scaling is a method used to normalize the range of independent variables or features of data.

Feature scaling can have a significant effect on a machine learning model’s training efficiency and can improve the time taken to train a model.

## What Is Mean Normalization?

Mean normalization is a way to implement feature scaling. Mean normalization calculates and subtracts the mean for every feature. A common practice is also to divide this value by the range or the standard deviation.

When the same process is done, and the standard deviation is used as the denominator, then this process is called standardization.

## When Do You Use Mean Normalization and Feature Scaling?

Generally, feature scaling is used when the features don’t have the same range of values. To explain this, let’s look at an example of housing prices. In this problem, there might be many features to consider, but we’ll focus on two of them for simplicity.

Now, the range of `x1`

can be from two-to-five and the range of `x2`

can be from 2,500-to-5,000. When we look at the ranges, we can see that there is a huge difference. This difference can slow down the learning of a model.

## Why Is Mean Normalization Important?

Now, to the most important question: Why do we need these concepts and techniques? It has been partially answered in the previous section. To discuss in detail we need to understand a data visualization graph called Contours.

Contour plots are a way to show a three-dimensional surface on a two-dimensional plane.

In an unnormalized contour chart, the graph is skewed and takes on an oval shape. A normalized contour takes up the shape of a circle and is evenly spaced.

When we apply gradient descent in both situations, the gradient descent converges to the minimum faster if the input is normalized. Whereas, if the input isn’t normalized, gradient descent takes up a lot of steps before it converges to a minimum, which can slow down the learning process of the model.

To summarize, gradient descent converges to a minimum faster, which is directly related to the learning of the model if the inputs are normalized. Feature scaling is advised if the range of the features vastly differ.

## Frequently Asked Questions

### What is mean normalization?

Mean normalization is a process used to implement feature scaling and involves calculating and subtracting the mean of every feature. This helps to bring data within a normalized range of probabilities and speed up learning.

### What is the mean normalization equation?

The mean normalization equation is `x' = x - μ / max(x) - min(y)`

.