Course Name: Quant Investing for Portfolio Managers, Section No: 19, Unit No: 10, Unit type: Notebook
Regarding the calculation of portfolio_returns, the original code is:
portfolio_returns = returns_data * selection_data * weights_data
The signal is calculated from the closing price. If the signal is 1, we buy immediately before the market close. However, the return between today’s close and tomorrow’s close is stored in tomorrow’s row.
Therefore, should the code be corrected as follows?
portfolio_returns = returns_data * selection_data.shift(1) * weights_data.shift(1)
Hi H.E.R. Cheung,
Thanks for the thoughtful question. This is a very important point about signal timing vs. return attribution.
Is the existing code a mistake?
Not exactly. The original line:
portfolio_returns = returns_data * selection_data * weights_data
is kept in the course as a simplified, vectorized formulation to demonstrate how selection and weights translate into portfolio returns. In many educational backtests, this convention is used to keep the focus on portfolio construction mechanics without introducing execution-timing complexity in the first pass.
Why your suggested change is a strong improvement
You are absolutely right that in a realistic implementation, if the signal is computed using today’s close, the position can be entered at that close at the earliest, so it earns the return from close(t) → close(t+1), which is stored in row t+1 of returns_data.
In that case, shifting the selection and weights by one day is the more execution-aware and conservative approach:
portfolio_returns = returns_data * selection_data.shift(1) * weights_data.shift(1)
(You can also add .fillna(0) after shifting, so that the NaNs the shift introduces in the first row are treated as holding no position.)
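To make the timing difference concrete, here is a minimal sketch comparing the two formulations on toy data. The numbers, dates, and the two asset names are invented for illustration and are not taken from the course notebook; the sketch also assumes that each row of returns_data holds the close(t-1) → close(t) return and that the three DataFrames share the same index and columns.

```python
import pandas as pd

# Hypothetical toy data: three days, two assets (values invented for illustration).
dates = pd.date_range("2024-01-01", periods=3, freq="D")

# Assumption: row t holds the return from close(t-1) to close(t).
returns_data = pd.DataFrame(
    {"A": [0.01, 0.02, -0.01], "B": [0.00, 0.03, 0.01]}, index=dates
)
# Row t holds the 0/1 signal computed from the close on day t.
selection_data = pd.DataFrame({"A": [1, 0, 1], "B": [0, 1, 1]}, index=dates)
# Constant weight of 0.5 per asset, purely for simplicity.
weights_data = pd.DataFrame(0.5, index=dates, columns=["A", "B"])

# Same-day convention: day t's signal is multiplied by day t's return,
# a return that was already realized when the signal was computed.
same_day = returns_data * selection_data * weights_data

# Lagged convention: day t's return is earned by the position chosen at close(t-1).
# fillna(0) treats the first row, which has no prior signal, as holding nothing.
lagged = (
    returns_data
    * selection_data.shift(1).fillna(0)
    * weights_data.shift(1).fillna(0)
)

# Sum across assets to get one portfolio return per day.
same_day_total = same_day.sum(axis=1)
lagged_total = lagged.sum(axis=1)

print(pd.DataFrame({"same_day": same_day_total, "lagged": lagged_total}))
```

In the lagged version the first day earns nothing, because no signal existed before it, and every later day's return is attributed to the signal from the previous close.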
In a nutshell:
- The existing code is a teaching simplification for clarity.
- Your approach is a better practice for avoiding look-ahead bias and aligning with real-world execution assumptions.
We appreciate you pointing this out. It’s a valuable enhancement and we shall consider updating the notebook in the future to clarify the timing assumption explicitly.