Normalise Data

What is data normalisation, and how important is it for backtesting?

How do you normalise data?

Hey Kee,



Normalisation is a technique for scaling the features in our dataset to a fixed range.
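For instance, under the common min-max approach, each value x of a feature is rescaled as x' = (x - min(x)) / (max(x) - min(x)), which maps the feature into the range 0 to 1.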



For a better understanding, assume you are working with two features for an asset, namely RSI and volatility.

The RSI values usually range between 0 and 100, whereas the volatility values typically lie in a much narrower range of roughly 0 to 1. In other words, the two features are on very different scales.



Many machine learning algorithms, when trained on such a dataset, tend to be biased towards the features whose values have a greater magnitude.



Thus, in our case, the machine learning algorithm might become very sensitive to changes in RSI while giving little weightage to changes in volatility.
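One way to put the two features on the same footing is min-max normalisation. Here is a minimal sketch in Python, assuming a small pandas DataFrame with hypothetical 'rsi' and 'volatility' columns:

```python
import pandas as pd

# Hypothetical feature data on two very different scales
df = pd.DataFrame({
    "rsi": [28.0, 55.0, 71.0, 90.0],          # roughly 0 to 100
    "volatility": [0.12, 0.35, 0.08, 0.50],   # roughly 0 to 1
})

# Min-max normalisation: x' = (x - min) / (max - min), applied per column
normalised = (df - df.min()) / (df.max() - df.min())
print(normalised)  # both columns now lie in the range 0 to 1
```

After this step, neither feature dominates purely because of its magnitude.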



To avoid such a scenario, we use scaling techniques such as normalisation or standardisation, so that our ML algorithms give comparable weightage and importance to each of the features in the dataset.
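As a quick illustration of the difference between the two, here is a sketch using scikit-learn's MinMaxScaler and StandardScaler on the same hypothetical frame; treat it as one common approach rather than the only one:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Same hypothetical RSI/volatility frame as above
df = pd.DataFrame({
    "rsi": [28.0, 55.0, 71.0, 90.0],
    "volatility": [0.12, 0.35, 0.08, 0.50],
})

# Normalisation: rescales each feature to a fixed [0, 1] range
normalised = MinMaxScaler().fit_transform(df)

# Standardisation: rescales each feature to zero mean and unit variance
standardised = StandardScaler().fit_transform(df)

print(normalised)
print(standardised)
```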

You can learn more about the difference between them by visiting this page here.



I would also strongly recommend that you go through this article linked here.

It covers, in detail, the intuition behind scaling your feature dataset and why it is needed.



I hope this helps!