How to correctly create a strategy out of this

Hi everyone!



I have applied many things from this course, plus others from the many readings I have done, and I would love some advice on how to proceed correctly.



This is the output of my LSTM neural network. My target output is the close price with shift(-1).



df_train["Target"] = df_train.Close.shift(-1)



So what I am trying to predict is the close price one candle ahead, right? Why am I doing this? Because more than the numeric value, I want the direction of the price. So, these are the charts I have come up with:



Training:

[Chart: training set]

Validation:

[Chart: validation set]




Things to take into consideration:

  • I did not log-transform the data or use fractional differentiation (I didn't know about these back then); I just normalized the data with a MinMaxScaler.
  • The MinMaxScaler has given me some trouble when I get prices that are outside the original scaler range. I still don't know how to deal with that.
  • The last layer of the NN is a dense layer with ReLU as its activation function.
  • I'm working with tick bars (1000-tick bars). Is 1000 too small? What is a common number? (I didn't know how to make other types of bars back then.)
  • It looks very, very accurate, but if you zoom in it's not. The direction is pretty sweet, though.



So, at first glance, I thought that creating a strategy with these outcomes would be quite simple, but it's not! First I thought of making a trade if the pct change from the previous predicted point to the current predicted value is greater than X. My backtesting says that this strategy is a free pass straight to the underworld.



So, how do I correctly manage this? How do I correctly create a strategy? And how often should I retrain the model? In the neural network course it's said that I should retrain the model every weekend, but with how much data? Currently I'm retraining with the last 3 months of data, so I can keep the most recent data and forget the past (I don't think the past is relevant because I'm working with these tick bars).



I really appreciate your advice!

Hi Mario,

Thank you for posting your query on the community.

So what I am trying to predict is the close price one candle ahead, right? Why am I doing this? Because more than the numeric value, I want the direction of the price.
 

Since you are interested in the direction of the price, you can reframe your target variable as a classification problem. It would be interesting to compute the classification report on the validation set.
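As a minimal sketch of what such a target could look like, assuming a DataFrame with a Close column as in the original post: the label is 1 when the next bar closes higher than the current one and 0 otherwise. The toy prices, the `Direction` column name and the `y_pred` array are made up for illustration, and scikit-learn's `classification_report` is one possible way to produce the report.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report

# Toy close prices; in practice this would be the tick-bar DataFrame.
df = pd.DataFrame({"Close": [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]})

# Next-bar close, the same quantity used as the regression target.
next_close = df["Close"].shift(-1)

# Direction label: 1 if the next close is above the current close, else 0.
df["Direction"] = (next_close > df["Close"]).astype(int)

# The last row has no next bar, so its label is meaningless; drop it.
df = df.iloc[:-1]

# Hypothetical classifier output, for illustration only.
y_pred = np.array([1, 0, 1, 1, 1])

# Precision, recall and F1 per class (up = 1, down = 0).
report = classification_report(df["Direction"], y_pred, zero_division=0)
print(report)
```

With a real model, `y_pred` would come from thresholding the network's predicted probability on the validation set.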
 

- I'm working with tick bars (1000-tick bars). Is 1000 too small? What is a common number? (I didn't know how to make other types of bars back then.)

1000 should work well. A more systematic approach would be to split the data into three parts: train, test and validation. You can pass data of varying lengths to your model, settle on the length with the highest accuracy on the test dataset, and then validate that choice on the validation dataset. Keep in mind that this might result in overfitting.
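A three-way split for time-ordered bars could be sketched as below. The function name `chronological_split` and the 60/20/20 proportions are assumptions for illustration; the important part is that rows stay in time order, so the test and validation parts always come after the training part.

```python
import pandas as pd

def chronological_split(df, train_frac=0.6, test_frac=0.2):
    """Split rows, in time order, into train, test and validation parts."""
    n = len(df)
    train_end = round(n * train_frac)
    test_end = train_end + round(n * test_frac)
    # No shuffling: later bars never leak into earlier parts.
    return df.iloc[:train_end], df.iloc[train_end:test_end], df.iloc[test_end:]

# Toy data standing in for the tick bars.
bars = pd.DataFrame({"Close": range(100)})
train, test, validation = chronological_split(bars)
print(len(train), len(test), len(validation))
```

Each candidate setup would be trained on `train`, compared on `test`, and only the chosen one checked once on `validation`.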

So, how do I correctly manage this? How do I correctly create a strategy? And how often should I retrain the model? In the neural network course it's said that I should retrain the model every weekend, but with how much data? Currently I'm retraining with the last 3 months of data, so I can keep the most recent data and forget the past (I don't think the past is relevant because I'm working with these tick bars).
 

If you are live trading this model, you should retrain it when the accuracy in live trading drops below the accuracy seen on the validation dataset minus 1 standard deviation. Alternatively, or in addition, you can decide on a fixed interval after which you retrain the model.

The amount of data used to retrain the model should be the same as the amount you originally trained with, and the training process should be the same too. You can think of this as shifting the data window forward, which helps incorporate the most recent data into your model.
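The retraining rule and the shifting window could be sketched roughly as follows. This assumes the validation accuracy was measured over several chunks, so that a mean and a standard deviation exist; the function names and all numbers are hypothetical.

```python
import numpy as np
import pandas as pd

def should_retrain(live_accuracy, validation_accuracies):
    """True when live accuracy falls below validation mean minus 1 std."""
    mean = np.mean(validation_accuracies)
    std = np.std(validation_accuracies)
    return bool(live_accuracy < mean - std)

def latest_window(df, window_size):
    """Return the most recent window_size rows (the shifted training window)."""
    return df.iloc[-window_size:]

# Made-up accuracies from three validation chunks.
validation_accuracies = [0.56, 0.58, 0.60]
print(should_retrain(0.55, validation_accuracies))
print(should_retrain(0.57, validation_accuracies))

# Retraining always uses the same amount of data, taken from the end.
bars = pd.DataFrame({"Close": range(10)})
window = latest_window(bars, 4)
```

A fixed retraining interval can be combined with this check by retraining when either condition is met.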

I hope this helps.

Hi Ishan and thank you for your reply!

Since you are interested in the direction of the price, you can reframe your target variable as a classification problem. It would be interesting to compute the classification report on the validation set.

How do I change that? As I said, the current target of the NN is the next candle close, so how do I change the target to be the direction? And since I'm using LSTM layers for this, should I replace them with just dense layers?

 

One way of doing that would be using softmax in the last layer.



The other way, which is easier and which I would recommend, is to act directly on the price predicted by your LSTM model. For example, if the predicted price is higher than the current price, BUY, and if it is lower than the current price, SELL. This is a simpler way to use the LSTM.



You can enhance this strategy as below:

  1. If predicted price > current price * (1+x%) --> buy
  2. If predicted price < current price * (1-x%) --> sell
  3. Else no position
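The three rules above can be sketched as a small signal function. The name `threshold_signal`, the encoding (1 = buy, -1 = sell, 0 = no position) and the sample prices are assumptions for illustration; `x` is expressed as a fraction, so 0.001 means 0.1%.

```python
import numpy as np

def threshold_signal(predicted, current, x=0.001):
    """Return 1 (buy), -1 (sell) or 0 (no position) for each bar."""
    predicted = np.asarray(predicted, dtype=float)
    current = np.asarray(current, dtype=float)
    signal = np.zeros(len(current), dtype=int)
    # Rule 1: predicted price above the upper band.
    signal[predicted > current * (1 + x)] = 1
    # Rule 2: predicted price below the lower band.
    signal[predicted < current * (1 - x)] = -1
    # Rule 3: everything else stays at 0 (no position).
    return signal

current = [100.0, 100.0, 100.0]
predicted = [100.5, 99.5, 100.05]
signals = threshold_signal(predicted, current, x=0.001)
print(signals)
```

The value of `x` would need to be tuned in a backtest, ideally so that it at least covers transaction costs.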



    Thanks

Thank you Ishan for your reply, I will test it and let you know what the results are.



Why would using softmax give me a good classification approach?

You can enhance this strategy as below:
1. If predicted price > current price * (1+x%) --> buy
2. If predicted price < current price * (1-x%) --> sell
3. Else no position


Btw, do you think that this approach will work? If you look at the first image I posted (the training set), you can see that at the start of the period the predicted price is greater than the real price, yet the price went down. That's why I was looking at the direction of the predicted price rather than its value. What I was actually doing was calculating the predicted pct_change(), but that was not a good idea. I even tried the predicted diff(). My idea is that if this predicts the direction of the price so well, that should be more than enough, but it seems it isn't. :(

Once again, thank you for your reply!

Mario, thanks for posting a detailed question.



The objective of designing the LSTM was to predict the next tick bar price. If the predicted price is very far from the actual price, then the model may need to be redesigned, since the original objective is not met. Ideally, the prediction for the next step should be close to the actual value, because you are predicting only one tick bar ahead.



For the other application, you can run a correlation analysis between the percentage change of the actual price and the percentage change of the predicted price. Only if they are highly correlated, in either the positive or the negative direction, is it recommended to use the model for predicting direction.
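A minimal sketch of that correlation check with pandas follows. The two series are made-up numbers, and both are assumed to be aligned so that each position holds the actual and the predicted close for the same bar.

```python
import pandas as pd

# Illustrative values only; each position refers to the same bar.
actual = pd.Series([100.0, 101.0, 100.5, 102.0, 101.5, 103.0])
predicted = pd.Series([100.2, 100.8, 100.9, 101.7, 101.9, 102.6])

# Bar-to-bar percentage changes; the first element is NaN by construction.
actual_change = actual.pct_change()
predicted_change = predicted.pct_change()

# Pearson correlation; pandas skips the NaN pair automatically.
correlation = actual_change.corr(predicted_change)
print(correlation)
```

A value close to +1 or -1 would support using the model for direction, while a value near 0 would suggest the predicted moves carry little directional information.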



Note: Softmax outputs class probabilities, so it may not give you the best results.



I hope this helps.



Thanks,

Ishan