Course Name: Trading Alphas: Mining, Optimisation, and System Design, Section No: 15, Unit No: 6, Unit type: Notebook
The paper
https://arxiv.org/pdf/1601.00991
states alpha 57 to be:
Alpha#57: (0 - (1 * ((close - vwap) / decay_linear(rank(ts_argmax(close, 30)), 2))))
The leading 0 is superfluous, so the expression simplifies slightly to:
Alpha#57: -1 * ((close - vwap) / decay_linear(rank(ts_argmax(close, 30)), 2))
The paper defines the functions in the denominator as follows:
rank(x) = cross-sectional rank
ts_argmax(x, d) = which day ts_max(x, d) occurred on
decay_linear(x, d) = weighted moving average over the past d days with linearly decaying weights d, d – 1, …, 1 (rescaled to sum up to 1)
Now, the denominator is the difficult bit to code up.
The implementation in the Jupyter notebook suggests:
denom = data.Close.rolling(30).apply(np.argmax).rank(axis=1).rolling(
2).apply(lambda x: np.average(x, weights=np.linspace(0, 1, 2)))
But .rank(axis=1) returns ranks in 1…N.
In the 101 Alphas paper, rank(x) is, I believe, meant as a cross-sectional rank scaled to (0, 1] (percentile style). In pandas that is:
.rank(axis=1, pct=True, method='average')
If we do not use pct=True, the denominator scale changes with universe size, which changes the alpha magnitude mechanically.
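To illustrate the scale difference, here is a small sketch using made-up values (the tickers A to D and the numbers are hypothetical, not taken from the notebook). It compares the default ordinal rank with the percentile rank on a single date:

```python
import pandas as pd

# Hypothetical toy data: one date, ts_argmax-style values for four tickers.
row = pd.DataFrame([[3.0, 10.0, 7.0, 29.0]], columns=["A", "B", "C", "D"])

# Default rank: ordinal ranks 1..N, so the scale grows with the universe size N.
raw_rank = row.rank(axis=1)

# Percentile rank: ordinal ranks divided by N, so values always lie in (0, 1].
pct_rank = row.rank(axis=1, pct=True, method="average")

raw_values = raw_rank.iloc[0].tolist()
pct_values = pct_rank.iloc[0].tolist()
print(raw_values)
print(pct_values)
```

With four tickers the two versions differ by a factor of 4. With 500 tickers they would differ by a factor of 500, and because the rank sits in the denominator, the alpha would shrink by the same factor.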
Also, the paper's definition of decay_linear(x, 2) uses weights 2, 1 (today gets 2, yesterday gets 1), rescaled to sum to 1, i.e. 2/3 for today and 1/3 for yesterday. The notebook instead uses:
np.linspace(0, 1, 2) # [0, 1]
That means:
• yesterday weight 0
• today weight 1
So the code as written applies no 2-day decay at all; it simply takes today’s value, which is wrong.
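A quick numerical check of the two weightings, as a sketch. The helper name decay_linear_weights and the two sample values are my own, chosen only for illustration. The window is ordered oldest to newest, which is how pandas passes it to a rolling apply:

```python
import numpy as np

def decay_linear_weights(d):
    """Weights 1, 2, ..., d (oldest to newest), rescaled to sum to 1."""
    w = np.arange(1, d + 1, dtype=float)
    return w / w.sum()

# Hypothetical 2-day window of ranked values: [yesterday, today].
window = np.array([0.2, 0.8])

paper_weights = decay_linear_weights(2)    # 1/3 for yesterday, 2/3 for today
notebook_weights = np.linspace(0, 1, 2)    # 0 for yesterday, 1 for today

paper_value = np.average(window, weights=paper_weights)
notebook_value = np.average(window, weights=notebook_weights)
print(paper_value, notebook_value)
```

Under the paper's weights the result blends both days. Under the notebook's weights it equals today's value exactly.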
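Moreover, np.argmax in the notebook's ts_argmax step returns positions 0…window-1 instead of 1…window. This is probably harmless here, because adding a constant does not change the cross-sectional rank, but I still wish to flag it.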
Could someone please verify these observations and implement the correct interpretation of the paper in numpy and, ideally, pandas?
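As a starting point for discussion, here is a minimal sketch of how I read the paper's definitions, not a verified reference implementation. It assumes close and vwap are DataFrames with dates as rows and tickers as columns. The helper names, the toy data, and the convention that ts_argmax counts 1 for the oldest day up to the window length for the most recent day are all my own assumptions. The paper only says "which day ts_max(x, d) occurred on".

```python
import numpy as np
import pandas as pd

def ts_argmax(df, window):
    """Position of the rolling max, 1 (oldest) to window (newest); assumed convention.
    Ties resolve to the earliest occurrence, as np.argmax does."""
    return df.rolling(window).apply(lambda x: np.argmax(x) + 1, raw=True)

def decay_linear(df, window):
    """Weighted moving average; today gets weight `window`, the oldest day gets 1,
    with the weights rescaled to sum to 1."""
    w = np.arange(1, window + 1, dtype=float)
    w /= w.sum()
    return df.rolling(window).apply(lambda x: np.dot(x, w), raw=True)

def alpha57(close, vwap, argmax_window=30, decay_window=2):
    """-(close - vwap) / decay_linear(rank(ts_argmax(close, 30)), 2)."""
    ranked = ts_argmax(close, argmax_window).rank(
        axis=1, pct=True, method="average"  # cross-sectional percentile rank
    )
    denom = decay_linear(ranked, decay_window)
    return -(close - vwap) / denom

# Hypothetical toy data with short windows so the result can be checked by hand.
close = pd.DataFrame({"A": [1.0, 2.0, 3.0, 2.0, 1.0],
                      "B": [5.0, 4.0, 3.0, 4.0, 5.0]})
vwap = close - 1.0  # makes close - vwap equal to 1 everywhere

result = alpha57(close, vwap, argmax_window=3, decay_window=2)
print(result)
```

The first rows are NaN until both rolling windows are full. With the real data the call would be alpha57(close, vwap) with the default windows of 30 and 2.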