Hi,
I have some questions about the feature engineering course and was hoping someone could clarify them for me. In the feature extraction section, we saw different types of bars. Are those bars supposed to be my features, or should I recalculate my old features (and maybe new ones) with this new type of bars? And how specifically do I extract new features from those bars? What features should I look for? Or should I stick with time-based bars and use the other bars as features? I'm really confused about this point :(.
Also, how can I make imbalance or dollar bars in forex? As far as I know, brokers don't give us the actual volume. For FX, can we only make tick bars? (Which, by the way, also lose volume.)
And what should the process of making features be in this case? Resample into tick/imbalance/dollar/etc. bars, make the resampled series stationary, and then calculate features (e.g. moving average, Fourier transform, wavelets)?
Thank you in advance!
Hello Mario,
I will address your queries one-by-one.
"It is supposed that those bars are my features or I should recalculate my old features (and maybe new ones) with this new type of bars?" - the reason for creating dollar bars is to sample better and get returns as closer to a normal distribution as possible. As you can expect, the prices of dollar bars won't be stationary. Hence, you will have to make them stationary using a method like fractional differentiation. Yes, you will have to recalculate features using these bars.
"And how do I specifically extract new features from those bars? What features should I look for? - You can use features like n bar returns, high minus low. You can use indicators like ATR, Bollinger bands etc
"Or I stick with time based bars and use the other bars as features?" - I would suggest you try out both and observe empirically.
"Also, how can I make imbalance or dollar bars in forex?As far as I know, brokers doesn't give us the actual volume. For FX we can only make tick bars? (which btw, we also lose volume)." - if your data vendor collates volume data you should go for it. Else stick to tick and time bars.
"And how should be the process of making features in this case? Resample bars into tick/imbalance/dollar/etc bars, make resample stationary and calculate features? (ie: moving average, fourier transform, wavelet, etc)." - I would convert them to tick bars. Then make them partially stationary ( fractional differentiation ) and then apply transforms and indicators.
Do connect if further assistance is needed.
Thank you very much for taking your time for answering my questions, Akshay!
Really helped me out :).
When you say "if your data vendor collates volume data": I am extracting my data from an MT4 broker, using ZeroMQ. Real volume is not included, at least for the FX market and, as far as I know, there is no way to get real volume in the FX market. Are there data vendors that give you real volume with each tick in the FX market? I'm asking because I haven't had any experience with a data vendor, just my broker feed, which I store tick by tick in an SQL database. Could you recommend a data vendor that provides volume?
Thank you very much!
You're welcome, Mario!
Since forex trades over the counter (OTC), there is no single consolidated volume figure, so volume data doesn't really mean much. Even the volume figures provided by brokers are generally not useful, as they are broker-specific.
There is, however, one database I know of, provided by CLS, which is an FX settlement service.
Hi Mario,
I'd like to confirm my agreement with Akshay's answer.
Volume information for forex might be useful if you have it for the specific venue where you're trading, but it is almost never made available. We know of one MT4 broker that provides the number of trades but not the dollar volume.
I agree with the suggestion to try out combinations of time-based bars and indicators with information bars as features and vice versa, but of course you need to avoid look-ahead bias.
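One possible way to combine the two without look-ahead is a backward as-of join, sketched below with pandas. It assumes both DataFrames are indexed by the time at which each bar was complete (its close time); the function name `attach_bar_features` and the example column names are hypothetical.

```python
import pandas as pd

def attach_bar_features(time_bars, info_bar_features):
    """For each time bar, attach the latest information-bar features
    that were already complete at that time bar's timestamp."""
    return pd.merge_asof(
        time_bars.sort_index(),
        info_bar_features.sort_index(),
        left_index=True,
        right_index=True,
        direction="backward",  # only match rows stamped at or before the time bar
    )

# Small usage example with made-up numbers
time_bars = pd.DataFrame(
    {"close": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["2024-01-01 00:01", "2024-01-01 00:02", "2024-01-01 00:03"]),
)
info_feats = pd.DataFrame(
    {"feat": [10.0, 20.0]},
    index=pd.to_datetime(["2024-01-01 00:01:30", "2024-01-01 00:02:30"]),
)
merged = attach_bar_features(time_bars, info_feats)
```

In this example the first time bar gets no feature value, because no information bar had completed yet. If the bars were stamped with their open time instead, the join would leak future information, so the close-time assumption is essential.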