Course Name: Backtesting Trading Strategies, Section No: 4, Unit No: 9, Unit type: Exercise
Is it better to remove the rows with NaN or to fill them with previous / later or zero value?
Hello Apostolos,
To answer your question in short, both approaches have trade-offs:
Removal of rows/columns:
It is a quick and convenient solution, especially if the amount of missing data is very small.
A downside here is that you may also delete useful information in the process.
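As a rough illustration of the removal approach, here is a minimal sketch assuming the data sits in a pandas DataFrame. The `close` column and the prices are made up for the example, not taken from the course material. Checking the share of missing rows first shows how much information would be lost by dropping them.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices with two missing values (illustrative data only)
prices = pd.DataFrame(
    {"close": [100.0, np.nan, 102.0, 103.0, np.nan, 105.0]},
    index=pd.date_range("2023-01-02", periods=6, freq="D"),
)

# Fraction of rows that have a missing close price
missing_share = prices["close"].isna().mean()

# Drop every row that contains at least one NaN
cleaned = prices.dropna()

# Number of rows lost by dropping
rows_dropped = len(prices) - len(cleaned)

print(f"Missing share: {missing_share:.1%}, rows dropped: {rows_dropped}")
```

If the missing share is small, dropping is usually harmless; if it is large, the deleted rows may carry information you need.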
Imputing the missing value (or simply, replacing the missing value):
If you can make a reasonably good approximation of what the missing value would be, you can replace it with that estimate.
Now there are many ways to do this and some popular methods include:
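The three options named in the question (previous value, later value, zero) could be sketched in pandas as follows. This is only an illustration under the assumption that the prices are in a pandas Series; the values are invented and it is not necessarily the list of methods the course intends.

```python
import numpy as np
import pandas as pd

# Hypothetical price series with two gaps (illustrative data only)
close = pd.Series([100.0, np.nan, 102.0, np.nan, 104.0])

# Fill with the previous valid value (forward fill)
filled_previous = close.ffill()

# Fill with the next valid value (backward fill)
filled_later = close.bfill()

# Fill with zero
filled_zero = close.fillna(0.0)

print(pd.DataFrame({
    "original": close,
    "previous": filled_previous,
    "later": filled_later,
    "zero": filled_zero,
}))
```

For backtesting in particular, filling with the later value uses information that was not yet available at that time, and a zero price would distort any returns computed from it, so filling with the previous value is often the safer default for price data.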
Thank you, Kevin.