Features to be used in machine learning and the VWAP technical indicator

Hi there,



I recently took the course on trading with machine learning using regression and wondered:



Are there any features involving volume that can be used to predict stock prices? In the course, the next day's low and high for GLD are estimated using yD = Open - Low and yU = High - Open. Volume wasn't used in the features in that case.



What if I want to use the same variables, yU and yD, to predict the low and the high given the open, but I also want to incorporate volume in my features? Can the VWAP indicator be used? What about VWAP minus the previous day's VWAP? And how many periods would you recommend for evaluating the VWAP?



Thanks

Hello Ghery, 



Yes, incorporating volume as a feature when predicting stock prices can be valuable. The Volume-Weighted Average Price (VWAP) is one indicator that builds volume into its calculation. As you may already know, VWAP is calculated by multiplying the price of each trade by its volume, summing these values, and dividing the sum by the total volume. It therefore measures the average price weighted by the volume traded.
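As a rough sketch of that calculation on daily bars: the snippet below assumes a pandas DataFrame with hypothetical column names High, Low, Close and Volume, and it uses the typical price (High + Low + Close) / 3 as a stand-in for individual trade prices, since daily data does not contain the trades themselves. The numbers are made up for illustration.

```python
import pandas as pd

def rolling_vwap(df: pd.DataFrame, window: int = 10) -> pd.Series:
    """Approximate VWAP over the last `window` daily bars.

    Assumption: the typical price (High + Low + Close) / 3 stands in
    for trade prices, which daily bars do not provide.
    """
    typical_price = (df["High"] + df["Low"] + df["Close"]) / 3.0
    price_times_volume = typical_price * df["Volume"]
    # Sum of price * volume divided by sum of volume, over the window.
    return (price_times_volume.rolling(window).sum()
            / df["Volume"].rolling(window).sum())

# Tiny made-up example, for illustration only.
bars = pd.DataFrame({
    "High":   [10.0, 11.0, 12.0],
    "Low":    [8.0, 9.0, 10.0],
    "Close":  [9.0, 10.0, 11.0],
    "Volume": [100, 300, 100],
})
vwap = rolling_vwap(bars, window=2)
```

The first value is NaN because a two-bar window is not yet full; the later values lean toward the price of the heavier-volume day.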



Using VWAP as a feature in your prediction model can potentially provide useful information about the average price at which trading has occurred during a given period. By comparing the current VWAP with the previous day's VWAP, you can capture the relative change in average price. This can help you assess whether the current price is trending higher or lower compared to the previous day.
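A minimal sketch of turning that comparison into a feature next to the course's targets: the column names (Open, High, Low), the `vwap` series and the one-day shift are assumptions made here for illustration. The shift is there so that the feature for a given day uses only information available before that day's open.

```python
import pandas as pd

def make_dataset(df: pd.DataFrame, vwap: pd.Series) -> pd.DataFrame:
    """Build the VWAP-change feature and the yU / yD targets."""
    out = pd.DataFrame(index=df.index)
    # Yesterday's VWAP minus the day before's, so nothing from the
    # day being predicted leaks into the feature.
    out["vwap_change"] = vwap.diff().shift(1)
    out["yU"] = df["High"] - df["Open"]   # distance from open up to the high
    out["yD"] = df["Open"] - df["Low"]    # distance from open down to the low
    return out.dropna()

# Made-up prices and VWAP values, for illustration only.
bars = pd.DataFrame({
    "Open": [9.0, 10.0, 11.0, 12.0],
    "High": [10.0, 12.0, 12.0, 14.0],
    "Low":  [8.0, 9.0, 10.0, 11.0],
})
vwap = pd.Series([9.0, 9.5, 10.5, 11.0])
dataset = make_dataset(bars, vwap)
```

The first two rows are dropped because the shifted difference is undefined there.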



As for the number of periods to evaluate the VWAP, common choices range from 10 to 30 periods. Shorter periods capture more recent price action, while longer periods give a smoother, slower-moving average.



In summary, incorporating the VWAP or the relative change in VWAP as features in your prediction model can potentially provide additional insights into stock price movements, especially when combined with other relevant features.



I hope this helps!

 

Hi there:



Well, thanks for the answer. I suspected the same, but unfortunately it did not work out. I computed the VWAP and its difference from the previous day's VWAP, and then computed the correlation of these two features with my target variable, just like the example provided in the course (the target variable here is the difference between the Open and the Low). The correlation turned out to be very poor for the stocks I intended to use: in all cases it was less than 0.4. As far as I know, these correlations need to be as close to 1 as possible, right? There needs to be a strong correlation between the input variables and the target variable, right?



So what do I do now? What can you suggest to incorporate volume as a variable? What about the float (by float I mean the number of shares available for trading)? How can the float be incorporated here? And what about dilution? Dilution is also going to affect prices, right?



Thanks in advance





 

Hello Ghery, 

It's important to note that correlation alone does not determine the effectiveness of a feature in predicting the target variable. While a high correlation is generally favorable, there are cases where variables with relatively low correlation can still contribute valuable information to the prediction model.



There could be several reasons why the correlation between the VWAP difference and the target variable turned out to be poor for the stocks you analyzed. For example, the VWAP-related features, and in fact volume itself, may not have a strong linear relationship with the target variable for the stocks you selected.



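Also, the relationship between the VWAP difference and the target variable might not be strictly linear. It's possible that the relationship is non-linear, in which case a standard (Pearson) correlation coefficient, which measures only linear association, would understate it.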
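To illustrate the point about non-linearity with a deliberately artificial example: suppose a target depends only on the size of a signed feature, not on its direction. This is a made-up case, not a claim about how yD actually relates to the VWAP difference. The linear correlation with the raw feature is then zero, even though the target is fully determined by it.

```python
import numpy as np

# Signed feature: -3, -2, ..., 3 (made-up values).
x = np.arange(-3.0, 4.0)
# Target depends only on the magnitude of x, not its sign.
y = x ** 2

# Linear (Pearson) correlation with the raw, signed feature.
pearson_raw = np.corrcoef(x, y)[0, 1]
# Same target against a transformed feature, the absolute value of x.
pearson_abs = np.corrcoef(np.abs(x), y)[0, 1]
```

Here `pearson_raw` is zero while `pearson_abs` is roughly 0.96, so trying transformed versions of a feature (absolute value, squared value, and so on) before discarding it may be worthwhile.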



Since your objective is to include volume-based features, it's worth exploring additional features that capture different aspects of volume behaviour, such as On-Balance Volume (OBV), rolling volume, the Money Flow Index (MFI), the Accumulation/Distribution Line (A/D Line), and the Volume Ratio. The Volume Ratio is the ratio of the current day's volume to the average volume over a specific period, and it can provide insights into unusual trading activity or significant deviations from the norm.
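Two of these features can be sketched in a few lines of pandas. The function names, the default window and the sample numbers are hypothetical choices for illustration; OBV follows its usual definition (add the day's volume when the close rises, subtract it when the close falls), and the volume ratio here includes the current day in the average.

```python
import numpy as np
import pandas as pd

def on_balance_volume(close: pd.Series, volume: pd.Series) -> pd.Series:
    """Cumulative volume, signed by the direction of the close-to-close move."""
    # +1 on an up day, -1 on a down day, 0 when unchanged or on the first bar.
    direction = np.sign(close.diff()).fillna(0.0)
    return (direction * volume).cumsum()

def volume_ratio(volume: pd.Series, window: int = 20) -> pd.Series:
    """Current volume divided by its rolling mean (current day included)."""
    return volume / volume.rolling(window).mean()

# Made-up closes and volumes, for illustration only.
close = pd.Series([10.0, 11.0, 10.5, 10.5, 12.0])
volume = pd.Series([100.0, 200.0, 150.0, 120.0, 300.0])

obv = on_balance_volume(close, volume)
ratio = volume_ratio(volume, window=2)
```

As with the VWAP difference, these would need to be shifted by one day before being used to predict the next day's yU and yD.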



In addition to the features listed above, the float and dilution, which you mentioned, can indeed be relevant factors in predicting stock prices.



Remember, the effectiveness of these features may vary depending on the specific stock and market conditions. It's essential to test and validate different combinations of features and evaluate their performance in your linear regression model. I hope this clears up your doubts.
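One possible way to run that kind of test is to compare out-of-sample fit with and without a candidate feature, splitting the data chronologically rather than at random. The sketch below uses plain NumPy least squares and synthetic data; the helper names, the 70/30 split and the data-generating coefficients are all assumptions made for illustration, not results from real stocks.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def out_of_sample_r2(X, y, train_frac=0.7):
    """Fit on the earliest rows, score on the latest rows (no shuffling)."""
    split = int(len(X) * train_frac)
    coef = fit_linear(X[:split], y[:split])
    return r2_score(y[split:], predict(coef, X[split:]))

# Synthetic data in which the volume feature really does carry signal.
rng = np.random.default_rng(0)
n = 500
base_feat = rng.normal(size=(n, 1))      # stands in for existing features
volume_feat = rng.normal(size=(n, 1))    # stands in for a volume-based feature
y = (0.5 * base_feat[:, 0] + 0.3 * volume_feat[:, 0]
     + rng.normal(scale=0.5, size=n))

r2_without = out_of_sample_r2(base_feat, y)
r2_with = out_of_sample_r2(np.hstack([base_feat, volume_feat]), y)
```

If a feature helps, `r2_with` should come out higher than `r2_without` on the held-out rows; on real data the gap may be small or absent, which is itself useful to know.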