Hello everyone! I'm a little confused about the triple barrier horizon topic that is covered in section 15 of the feature engineering course.
Let's suppose I have a machine learning algo that I want to use to predict the next move or action I will take in the market. How do I actually do the labeling in the triple barrier horizon method? I mean… am I trying to predict the labeling outcome? And this should be applied to a regression problem, right? Because the path of the price is needed and I also want to see the cumulative pct change, right?
Hello Mario,
So, the triple barrier method is basically for creating labels corresponding to price data. Labels are needed because supervised learning trains on example/label pairs. This is for a classification problem, not a regression problem. The labels, in this case, will be signals: Buy (1), Sell (-1) or Hold (0).
Now imagine a continuous window of price data. The first N bars/candles will be your input window. The remaining M bars/candles will be the window you will use to figure out a label. An ML model, say an LSTM, takes the first N bars as input and is trained to predict the label we generate from the remaining M bars.
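To make the N/M split concrete, here is a minimal sketch of how the windows could be cut from a series of closing prices. The helper name `split_windows` and the toy prices are assumptions for illustration only; they are not from the course material.

```python
import numpy as np

def split_windows(closes, n_input, m_horizon):
    """Slide over the series and return (inputs, horizons), one row per sample.

    Hypothetical helper: each sample is N input bars immediately
    followed by the M horizon bars that will be used for labeling.
    """
    closes = np.asarray(closes, dtype=float)
    n_samples = len(closes) - n_input - m_horizon + 1
    if n_samples <= 0:
        raise ValueError("not enough bars for one input window plus one horizon")
    inputs = np.stack([closes[i:i + n_input] for i in range(n_samples)])
    horizons = np.stack(
        [closes[i + n_input:i + n_input + m_horizon] for i in range(n_samples)]
    )
    return inputs, horizons

closes = np.arange(100.0, 110.0)  # 10 toy closing prices: 100, 101, ..., 109
inputs, horizons = split_windows(closes, n_input=4, m_horizon=3)
print(inputs.shape, horizons.shape)  # (4, 4) (4, 3)
```

The model only ever sees `inputs`; `horizons` is used solely to compute the label for each sample.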
Now, how do we generate a label using the remaining M bars? Triple barrier method is one method. How does it do that?
It uses three barriers to figure out the label: two horizontal (an upper and a lower price level) and one vertical (the end of the M bars).
If the price path in the remaining M bars touches the upper horizontal barrier first, it is assumed that, given the previous N bars (the ones for input), the price will go up in the next M. Hence we label it as 1 or buy.
Similarly, if it touches the lower horizontal barrier first, it is assumed that, given the first N bars, the price will fall in the following M bars. Hence, we label it as -1 or sell.
If it touches neither horizontal barrier by the end of the M bars (that is, it reaches the vertical barrier first), it is assumed that, given the first N bars, the price will neither rise nor fall meaningfully over the following M bars. Hence, we label it as 0.
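The three rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the barriers are placed at a fixed percentage above and below an entry price (taken here as the last close of the input window), and only closing prices are checked. The function name and the 2% default widths are hypothetical, not something the course prescribes.

```python
def triple_barrier_label(entry_price, horizon_prices, up_pct=0.02, down_pct=0.02):
    """Return 1, -1 or 0 depending on which barrier the path touches first.

    Assumption: barriers sit at fixed percentages around entry_price.
    """
    upper = entry_price * (1 + up_pct)    # upper horizontal barrier
    lower = entry_price * (1 - down_pct)  # lower horizontal barrier
    for price in horizon_prices:          # walk the M horizon bars in order
        if price >= upper:
            return 1                      # upper barrier touched first: buy
        if price <= lower:
            return -1                     # lower barrier touched first: sell
    return 0                              # vertical barrier reached: hold

buy_label = triple_barrier_label(100.0, [100.5, 101.0, 102.5, 99.0])
sell_label = triple_barrier_label(100.0, [99.5, 97.5, 103.0])
hold_label = triple_barrier_label(100.0, [100.5, 99.5, 101.0])
print(buy_label, sell_label, hold_label)  # 1 -1 0
```

Note the second example: the path later rises above the upper barrier, but the lower barrier was touched first, so the label is -1.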
Do get back if further clarification is needed.
Hello Akshay, thank you for your nice explanation.
So, if I got it right, my targets are these -1, 0 and 1 labels. So, what I would want is for my ML algo to give me -1, 0 or 1 as output, right? I'm assuming that I won't get exactly those numbers, so I would need to define some range. For example, between 0.5 and 1 I should consider it as a 1, right?
Or do I need to define the labels depending on the path that the price takes? I mean, I do a triple barrier horizon for my inputs and label them depending on which barrier they touched first, and that would be the output?
The thing that confuses me is whether the label is what I am actually trying to predict or the label is a result of the output prediction.
Let's say I use an LSTM and I define a 5-bar horizon. Then, I give the last 20 bars as inputs, but I would get only 1 output that corresponds to the prediction of the 5th candle. How do I calculate the path there, if I'm having only 1 number? That's because I would do candles.Close.shift(-5) as my desired target, but I think I'm missing something here.
Please help me understand what I'm missing.
Hello Mario,
Will answer your questions sequentially.
"So, if I got it right, my targets are these -1, 0 and 1 labels. So, what I would want is for my ML algo to give me -1, 0 or 1 as output, right?" - That's right.
"I'm assuming that I won't get exactly those numbers, so I would need to define some range. For example, between 0.5 and 1 I should consider it as a 1, right?" - That is right. It's a simple pandas conditional operation.
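As a rough sketch of that conditional operation: the 0.5 cutoff comes from the question above, while the symmetric -0.5 cutoff for the sell side and the sample model outputs are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw model outputs, one per sample
raw = pd.Series([0.83, 0.12, -0.64, 0.50, -0.49])

conditions = [
    (raw >= 0.5).to_numpy(),   # close enough to 1: treat as buy
    (raw <= -0.5).to_numpy(),  # assumed symmetric cutoff: treat as sell
]
labels = pd.Series(np.select(conditions, [1, -1], default=0), index=raw.index)
print(labels.tolist())  # [1, 0, -1, 1, 0]
```

If the model instead outputs one probability per class, picking the most probable class would play the same role as these cutoffs.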
"Or do I need to define the labels depending on the path that the price takes? I mean, I do a triple barrier horizon for my inputs and label them depending on which barrier they touched first, and that would be the output?" - No. We don't touch the input window at all. There is a separate label window, or horizon window, which gives us the label.
"The thing that confuses me is whether the label is what I am actually trying to predict or the label is a result of the output prediction." - The label is what you are trying to predict. These labels are for training. If in the future you get input data similar to what was in the training data, your model should predict the label you found using the triple barrier method in the training data.
"Let's say I use an LSTM and I define a 5-bar horizon. Then, I give the last 20 bars as inputs, but I would get only 1 output that corresponds to the prediction of the 5th candle." - That is right, you will get only one label for the 5-bar horizon.
"How do I calculate the path there, if I'm having only 1 number?" - So, generally, we look for a flag by iterating over each bar/candle in the horizon. If any of the bars touches the upper or lower horizontal barrier, the label will be 1 or -1, depending on which barrier is touched first. If not, it will be 0.
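To show how this differs from a plain candles.Close.shift(-5) target, here is a minimal sketch that labels every bar by walking its 5-bar horizon. The function name, the 2% barrier widths, the use of each bar's own close as the entry price, and the toy prices are all assumptions for illustration.

```python
import pandas as pd

def label_series(close, horizon=5, up_pct=0.02, down_pct=0.02):
    """Label each bar from the path of the next `horizon` closes.

    Hypothetical helper: barriers are fixed percentages around the
    bar's own close. Bars without a full horizon are left as NaN.
    """
    values = close.to_numpy()
    labels = pd.Series(float("nan"), index=close.index)
    for i in range(len(values) - horizon):
        entry = values[i]
        label = 0                                  # default: no barrier touched
        for price in values[i + 1:i + 1 + horizon]:
            if price >= entry * (1 + up_pct):
                label = 1                          # upper barrier touched first
                break
            if price <= entry * (1 - down_pct):
                label = -1                         # lower barrier touched first
                break
        labels.iloc[i] = label
    return labels

close = pd.Series([100, 101, 103, 102, 99, 98, 100, 101, 100, 100, 100], dtype=float)
labels = label_series(close, horizon=5)

# The fixed-horizon target only compares the endpoint with the start
endpoint_return = close.shift(-5) / close - 1
print(labels.iloc[1], endpoint_return.iloc[1])
```

For the bar at position 1, the endpoint return is under 1% in size, yet the label is -1 because the path dipped to the lower barrier on the way. That path information is what a single shifted close cannot capture, and it is still stored as just one label per sample.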