Hi, I am fairly new and have attended a few courses already. In all the courses, the data loading step uses only a single symbol, or at most a small set of symbols. When preparing data for backtesting or training machine learning models, shouldn't we use all the symbols in the market, or at least most of those needed for testing? If so, how do I extract the data for all the symbols, with the symbol as a separate column in the dataset?
Hello Vignesh,
Some strategies run on an individual instrument, so we demonstrate them on particular instruments. For strategies that work on a cross-section of stocks, we use a larger universe. For example, you can get data for every stock in an index (here the S&P 500) as follows:
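import pandas as pd
import yfinance as yf
sp_assets = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')[0]
# Yahoo Finance writes share classes with a hyphen (BRK-B); the Wikipedia table uses a dot (BRK.B)
assets = sp_assets.Symbol.str.replace('.', '-', regex=False).tolist()
data = yf.download(assets, start=start, end=end)  # start and end are dates you define, e.g. '2020-01-01'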
You can then filter these based on various criteria and run your strategies on each simultaneously.
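To get the symbol as a separate column, one option is to reshape the downloaded table from wide to long. The sketch below assumes the multi-symbol download returns columns with two levels, field first and ticker second, which may differ between yfinance versions. The helper name to_long_format and the tickers AAA and BBB are made up for illustration, and a small synthetic frame stands in for the real download so the example runs without a network connection.

```python
import numpy as np
import pandas as pd


def to_long_format(wide: pd.DataFrame) -> pd.DataFrame:
    """Reshape (field, ticker) columns into one row per date and symbol.

    Hypothetical helper: assumes the first column level is the price field
    and the second level is the ticker.
    """
    # Name the axes explicitly so the result has predictable column names.
    named = wide.rename_axis(index="Date", columns=["Field", "Symbol"])
    # Move the ticker level from the columns into the row index.
    long = named.stack(level="Symbol")
    long.columns.name = None
    # Turn the (Date, Symbol) index into ordinary columns.
    return long.reset_index()


# Synthetic stand-in for a multi-symbol download (made-up tickers and values).
dates = pd.date_range("2024-01-01", periods=3, freq="D")
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAA", "BBB"]])
wide = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4), index=dates, columns=columns
)

long_data = to_long_format(wide)
print(long_data)
```

With real data you would pass the downloaded frame instead of the synthetic one, for example to_long_format(data). The long layout makes it straightforward to filter or group by symbol, for instance with long_data.groupby("Symbol").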
Thank you, Akshay, for the quick turnaround. I will definitely try that.