This is a two-part question:
1) Why do we have to shift our signal by 1 period when calculating all of the features for our model? For example, in the SVM course, we created the SMA feature as follows:
df['sma'] = df.close.shift(1).rolling(window=n).mean()
Why do we do this? Why would we want to use lagged data for our SMA calculation when we have access to the most recent data?
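To make the comparison concrete, here is a minimal sketch of the two versions I am asking about. The prices are made up and the window length n = 3 is only an assumption for illustration; the column names sma_shifted and sma_unshifted are mine, not from the course.

```python
import pandas as pd

# Made-up closing prices, purely for illustration.
df = pd.DataFrame({"close": [10.0, 11.0, 12.0, 13.0, 14.0]})
n = 3  # hypothetical window length

# Course version: the value on row t only uses closes up to row t-1.
df["sma_shifted"] = df.close.shift(1).rolling(window=n).mean()

# The alternative I am asking about: the value on row t includes the close on row t.
df["sma_unshifted"] = df.close.rolling(window=n).mean()

print(df)
```

As far as I can tell, the shifted column is just the unshifted one delayed by one row, so the only difference is whether the current row's close is included in the average.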
2) In live trading, I get price updates every minute from my broker. How do I use this SVM model if my goal is to predict the classification of the next minute’s return? To be clear, we will know the updated price of the asset 1 minute from now. Why would we want to use an SVM that produces its forecast at the same moment the most recent price arrives? If I extend this to 1-hour or 1-day bars, how would we use any form of supervised learning if the prediction is calculated at the same time the new price is updated?
Thank you for the help.