AI for portfolio management - Doubt on asset returns calculations for backtest

Hi,



in the course "AI for portfolio management: LSTM networks", section 11, notebook "Implementing LSTM For Portfolio Weights Optimisation", you shift the assets' returns backward by one period with ".shift(-1)":

 

# Calculate the daily returns of assets in 'tech'
asset_returns = tech.pct_change().fillna(0).shift(-1)

# Plot the returns
plt.figure(figsize=[15, 7])
plot_returns_optimised(asset_returns.mul(asset_weights), tech, c='b', aclass='', plot_title= 'Strategy Vs Benchmark Returns')


This looks wrong to me because it introduces look-ahead bias. If you are using the close prices of stocks, you would know each daily return only at the end of that trading day. Introducing shift(-1) is like knowing the next day's closing price a day in advance.

An example shows why I am asking for clarification on this point:

Let's suppose that we are on 2022-02-01 and the price of a stock is 100 at the close. The following day (2022-02-02), the same asset closes at 101, for a return of 1%. If I had invested 5% in this asset on 2022-02-01, I would have realized my return of 0.05% only at the close of 2022-02-02. If I apply shift(-1) in the backtest, it is as if I had realized this return on 2022-02-01, when we did not yet know the closing price of 2022-02-02. In other words, I am attributing returns one period earlier than they would have occurred in reality.
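
The example above can be reproduced with a small pandas sketch. The two prices and the 5% weight are the numbers from the example; the variable names are hypothetical and are not taken from the notebook.

```python
import pandas as pd

# Two closing prices from the example above
dates = pd.to_datetime(["2022-02-01", "2022-02-02"])
prices = pd.Series([100.0, 101.0], index=dates)

# Daily returns as in the notebook, before and after shift(-1)
returns = prices.pct_change().fillna(0)
returns_shifted = returns.shift(-1)

weight = 0.05  # 5% of the portfolio in this asset (from the example)

# Without the shift, the 1% return is dated 2022-02-02
contribution_unshifted = returns * weight
# With shift(-1), the same return is dated 2022-02-01
contribution_shifted = returns_shifted * weight

print(contribution_unshifted)
print(contribution_shifted)
```

With the shift, the contribution of roughly 0.05% appears on the 2022-02-01 row, and the 2022-02-02 row becomes NaN because there is no following return to pull back.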

Can you please clarify this?

Thanks in advance!

Luca     

Hi Luca,



Thank you for your query.



There are two points here. First, we calculate the weights on the train data and then apply the same weights to the entire dataset (train + test).



We should look only at the performance on the test data, because the performance on the training data has a look-ahead bias. This bias occurs because the full train dataset is used to calculate the weights. That is acceptable, since we are using the train data to learn what the optimal weights should be.



Second, on the point you raised: the weights are decided using the train dataset and are then locked, that is, kept the same during the entire test period. Therefore, shift(-1) does not introduce look-ahead bias in the test dataset.



We see your point. Although the two approaches are mathematically the same, it is more natural to shift the weights one day forward and multiply them by the returns than to shift the future returns one day backward and multiply them by the weights.
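
A minimal sketch of the two conventions, assuming made-up returns and weights for two hypothetical assets "A" and "B" (none of these numbers or names come from the notebook). It suggests that both approaches produce the same strategy returns and differ only in the date each return is attached to.

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only
dates = pd.date_range("2022-02-01", periods=4, freq="D")
returns = pd.DataFrame(
    {"A": [0.0, 0.01, -0.02, 0.03], "B": [0.0, 0.02, 0.01, -0.01]},
    index=dates,
)
weights = pd.DataFrame(
    {"A": [0.6, 0.5, 0.4, 0.7], "B": [0.4, 0.5, 0.6, 0.3]},
    index=dates,
)

# Weights shifted one day forward: the return is dated on the day it is realized
forward = (weights.shift(1) * returns).sum(axis=1, min_count=1)

# Returns shifted one day backward: the return is dated on the day the weight is set
backward = (weights * returns.shift(-1)).sum(axis=1, min_count=1)

# Same values, attached to dates one day apart
same_values = np.allclose(forward.dropna().to_numpy(), backward.dropna().to_numpy())
print(same_values)
print(forward.dropna().index[0], backward.dropna().index[0])
```

Here `min_count=1` keeps the rows with no data as NaN instead of 0, so the comparison only covers days where a return exists. The first valid date is 2022-02-02 for the forward-shifted weights and 2022-02-01 for the backward-shifted returns, which is the one-day relabelling discussed above.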



Thanks once again for raising the question. We will discuss this internally and make changes accordingly.