A question concerning input variables for stocks

Hello there:



I have noticed in the course "Trading with machine learning: regression" that several input variables are used to predict the next day's low and the next day's high for the GLD price.



The input variables used in the course were 3 moving averages, the correlation between the close and its 3 moving averages, the difference between the open and the previous open, and finally the difference between the previous day's close and the open.



Now, if you calculate the Pearson correlation coefficient of those variables with the difference between the open and the close, you'll see that they are not strongly correlated.



Why are these variables used?



And by the way, what variables are good as input variables for stocks?



Thanks a lot…

Hello Ghery,

When selecting features (input variables, as you mentioned) for a machine learning model, it is generally better to choose features that are not strongly correlated with each other.



Features with low correlation among themselves can provide independent and diverse information to the model, leading to improved generalization and performance. This is the reason for selecting features that are not strongly correlated.
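
To make the point concrete, here is a minimal sketch of how the correlation among candidate features could be checked. It uses only the Python standard library and synthetic random-walk prices. The window lengths (5 and 20), the feature names and the price series are assumptions made for illustration; they are not the values used in the course.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def sma(values, window):
    """Simple moving average; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

# Synthetic prices (an assumption; replace with real GLD data).
random.seed(0)
n = 300
close = [100.0]
for _ in range(n - 1):
    close.append(close[-1] + random.gauss(0, 1))
# Each open is placed near the previous day's close.
open_ = [close[0]] + [c + random.gauss(0, 0.3) for c in close[:-1]]

# Hypothetical feature set, loosely following the ones named in the question.
features = {
    "sma_5": sma(close, 5),
    "sma_20": sma(close, 20),
    "open_minus_prev_open": [None] + [open_[i] - open_[i - 1] for i in range(1, n)],
    "open_minus_prev_close": [None] + [open_[i] - close[i - 1] for i in range(1, n)],
}

start = 19  # first index at which every feature is defined
names = list(features)
corr = {
    (a, b): pearson(features[a][start:], features[b][start:])
    for a in names
    for b in names
}

# Print the correlation matrix row by row.
for a in names:
    print(a.ljust(22), " ".join(f"{corr[(a, b)]:+.2f}" for b in names))
```

Pairs whose coefficient is close to +1 or -1 carry largely overlapping information, so one feature of such a pair is a candidate for removal. The same `pearson` helper can also be applied between each feature and the target you want to predict.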



The features to use for stocks depend on what you are planning to predict.

The following are commonly used features for cases like predicting stock prices.

  • Historical price data (open, high, low, close)
  • Technical indicators (moving averages, Relative Strength Index, Bollinger Bands, MACD, etc.)
  • Market indicators (market indices, sector indices, interest rates, exchange rates, etc.)
  • Volume data (average volume, volume ratios, or volume-based indicators like On-Balance Volume or Accumulation/Distribution)
  • News and sentiment analysis (news articles, social media sentiment, or financial reports)
  • Fundamental indicators (earnings per share, price-to-earnings ratio, dividend yield, or financial ratios specific to the industry or sector)
  • Market sentiment (investor sentiment indices, surveys, or options market data)
  • Economic indicators (GDP growth, inflation rates, interest rates, unemployment data, etc.)
It's important to note that the selection of features depends on the specific stock, market conditions, and the availability of data. Feature engineering and domain expertise play a crucial role in identifying the most informative features for accurate predictions.
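
As a small illustration of feature engineering, the sketch below derives two of the volume-based features from the list above: On-Balance Volume and a volume ratio. The function names, the starting value of 0 for OBV, the window length and the sample numbers are assumptions chosen for the example; conventions differ between charting packages.

```python
def on_balance_volume(close, volume):
    """Running total that adds volume on up days and subtracts it on down days.

    Starts at 0 (an assumed convention) and is unchanged when the close is flat.
    """
    obv = [0]
    for i in range(1, len(close)):
        if close[i] > close[i - 1]:
            obv.append(obv[-1] + volume[i])
        elif close[i] < close[i - 1]:
            obv.append(obv[-1] - volume[i])
        else:
            obv.append(obv[-1])
    return obv

def volume_ratio(volume, window):
    """Today's volume divided by the average volume of the previous `window` days."""
    out = [None] * window  # undefined until enough history exists
    for i in range(window, len(volume)):
        avg = sum(volume[i - window:i]) / window
        out.append(volume[i] / avg)
    return out

# Made-up sample data, for illustration only.
close = [10.0, 10.5, 10.2, 10.2, 10.8]
volume = [1000, 1200, 900, 800, 1500]

obv = on_balance_volume(close, volume)
ratio = volume_ratio(volume, 2)

print(obv)  # [0, 1200, 300, 300, 1800]
print(ratio)
```

Whether features like these actually help depends on the stock and on what is being predicted, so they are best evaluated on out-of-sample data before being kept.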

I hope this helps. Please let us know if you have any more queries or require additional details on the above response.
Happy learning!!