Hi, I did not completely understand the logic behind creating the lags (the so-called features) by simply shifting the original data by 1, 2, 3, … n days.
The algorithms use these lags as features and learn from the data. But to me these lags seem to be created arbitrarily and to have no relation to the original returns.
As far as I understand, the features (or Xs) are supposed to be real data, and the algorithms will identify the patterns among them and highlight the significant features.
Could someone please explain this concept?
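Yes, it is correct that features should be something that adds predictive power to the ML algorithm.
Note that the lags are not random: each lag is the actual return observed 1, 2, 3, … n days earlier, so it is real data that would already be known on the day of the prediction.
To explain the concept of regression and classification ML, the instructor used an approach known as feature extraction to create multiple features from the single existing one.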
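To make the shifting concrete, here is a minimal sketch of how lag features could be built with pandas. The return values, the column names, and the helper `make_lags` are made up for illustration and are not taken from the course material.

```python
import pandas as pd

# Hypothetical daily returns, made up purely for illustration.
returns = pd.Series(
    [0.010, -0.020, 0.015, 0.005, -0.010, 0.020],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
    name="ret",
)


def make_lags(series: pd.Series, n_lags: int) -> pd.DataFrame:
    """Return a frame holding the original series and its first n_lags lags."""
    frame = series.to_frame()
    for k in range(1, n_lags + 1):
        # shift(k) moves values down k rows, so the row for day t
        # holds the return that was observed on day t-k.
        frame[f"lag_{k}"] = series.shift(k)
    # The first n_lags rows have no history yet, so drop them.
    return frame.dropna()


features = make_lags(returns, n_lags=2)
print(features)
```

Each row then pairs a day's return (the target) with the returns of the previous two days (the features), so the model is only ever shown information from the past.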
The idea here was to demonstrate how ML algorithms can be used for backtesting.
It is also worth checking whether the features that were created actually helped in predicting the direction.
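As a rough illustration of such a check, the sketch below compares a very simple rule (predict that today's direction matches the sign of the lagged return) against a baseline that always guesses the more common direction. The function `direction_hit_rate` and the sample returns are hypothetical, and this simple rule stands in for the ML models used in the course; a proper evaluation would use out-of-sample data.

```python
import pandas as pd


def direction_hit_rate(returns: pd.Series, lag: int = 1) -> tuple[float, float]:
    """Hit rate of 'predict the sign of the lagged return' and a majority-class baseline."""
    frame = pd.DataFrame({"ret": returns, "lagged": returns.shift(lag)}).dropna()
    actual_up = frame["ret"] > 0
    predicted_up = frame["lagged"] > 0
    # Fraction of days on which the predicted direction matched the actual one.
    hit_rate = float((actual_up == predicted_up).mean())
    # Baseline: always guess whichever direction is more common in the sample.
    up_share = float(actual_up.mean())
    baseline = max(up_share, 1.0 - up_share)
    return hit_rate, baseline


# Hypothetical returns, made up purely for illustration.
returns = pd.Series([0.010, 0.020, -0.010, -0.020, 0.015, 0.005, -0.010, -0.005])
hit_rate, baseline = direction_hit_rate(returns)
print(f"lag-{1} rule hit rate: {hit_rate:.2f}, baseline: {baseline:.2f}")
```

If the hit rate is no better than the baseline, the lag adds little predictive power on that sample, which is exactly the kind of result worth noticing.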
Which features to create and how to use them (known as feature engineering) will be covered in detail in the machine learning lectures.
Thank you.