Hi, I'm finding that the RL model doesn't produce the same returns when I run the example solutions in my local Jupyter notebook. Maybe somebody can tell me if I'm missing something…
I have run the backtests on 'Apply RL on Mixed Pattern Wave' and 'RL_Model_on_Price_Data', and both gave worse returns with the default hyperparameters. I've set test_mode = false; have I missed something?
Also, regarding backtesting speed for RL: does anyone have experience running these models on Azure? Were you able to achieve a significant increase in training speed at a reasonable cost for a problem of this size?
Thank you
Hi Joseph,
The difference in returns is due to a random action generator in the code. There is a parameter EPSILON in the rl_config dictionary, which is the exploration/exploitation threshold. On each step, a random number between 0 and 1 is generated; if that number is less than or equal to EPSILON, the model explores, meaning one of the actions (buy, sell, hold) is selected at random. Otherwise it exploits its learned policy. Because this randomness is involved in every run of the code, the strategy returns can vary between runs.
Hope this helps!
Thanks,
Akshay