Query on data manipulation

import datetime

import pandas as pd

import math

import numpy as np

import os

stockticker='birlamoney'

timestamp1=str(int(datetime.datetime(2011,6,16).timestamp())) 

timestamp2=str(int(datetime.datetime(2021,6,16).timestamp()))





time_interval="1d"

#time_interval="1wk"

#time_interval="1mo"



stock_events='history'

#stock_events='div'

#stock_events='split'



locator = ('https://query1.finance.yahoo.com/v7/finance/download/'
           + stockticker + '.NS?period1=' + timestamp1 + '&period2=' + timestamp2
           + '&interval=' + time_interval + '&events=' + stock_events)



print(locator)

print(timestamp1,timestamp2)

ticker_data= pd.read_csv(locator)

ticker_data.head()

df = pd.DataFrame(ticker_data)

df

df.to_csv(r'C:\Users\Kirti\Desktop\phd_2021\stock_dataset\birla.csv')

price= pd.read_csv(r'C:\Users\Kirti\Desktop\phd_2021\stock_dataset\birla.csv', index_col=0)

price

data_path = 'C:/Users/Kirti/Desktop/phd_2021/stock_dataset'

#close_df=pd.DataFrame()

#for data_path in range(1,10):

for i in os.listdir(data_path):

    print('scripts',i)

    ind_scripts=i

    ind_scripts

multi_stocks = pd.concat(
    map(pd.read_csv, [
        'C:/Users/Kirti/Desktop/phd_2021/stock_dataset/WIPRO.csv',
        'C:/Users/Kirti/Desktop/phd_2021/stock_dataset/birla.csv',
        'C:/Users/Kirti/Desktop/phd_2021/stock_dataset/INFY.csv']),
    ignore_index=True)

print(multi_stocks)



I want to combine all the CSV files together, with each ticker name at the top, and then make separate CSV files for the closing price and the volume.

Hi Keerti,



Before combining the multiple datasets, I would like to mention the yfinance library. You can install it by typing:

!pip install yfinance

in a Jupyter code cell. For details, check this blog.



Now you can use the library for fetching historical data easily. Here is an example for your ticker and time interval:

 

import datetime
import pandas as pd
import yfinance as yf
df_birla=yf.download('BIRLAMONEY.NS',datetime.datetime(2011,6,16),
   datetime.datetime(2021,6,16))
df_birla

To store OHLCV data of multiple stocks, you can use the same method with adjustments. For example, assume you will use two tickers: 'BIRLAMONEY.NS' and 'WIT'. First, you need to create a list for these tickers and then define a dictionary type variable to store all data. After that, you can download the data for all tickers within a for loop. Here is the implementation:

 

tickers=['BIRLAMONEY.NS', 'WIT']
data_dict = {}
for ticker in tickers:
    data_dict[ticker] = yf.download(ticker,start=datetime.datetime(2011,6,16),
      end=datetime.datetime(2021,6,16))

Now you have a dictionary holding the OHLCV data of two stocks. To access the data of one stock, you can type:
data_dict['WIT']
Next, you need to convert data_dict to a data frame. This is the tricky part, because it requires some data manipulation to get two-level headers. If you would like to create a data frame like this:

                      BIRLAMONEY    WIT
date 1   adj close
         close
         high
         low
         open
date 2   adj close
         close
         high
         low
         open
Then the below line would be sufficient for converting the dictionary:
data_df = pd.concat([data_dict[key].stack() for key in data_dict], axis=1)
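To see what stack() and concat do without hitting the network, here is a minimal sketch on made-up numbers. The two small frames are hypothetical stand-ins for what yf.download returns, and the sketch assumes each frame has a single level of column names. Passing keys= is my addition so that the columns carry the ticker symbols instead of 0 and 1.

```python
import pandas as pd

# Hypothetical stand-ins for the frames returned by yf.download
dates = pd.date_range("2021-06-14", periods=2, name="Date")
data_dict = {
    "BIRLAMONEY.NS": pd.DataFrame(
        {"Close": [50.0, 51.0], "Volume": [1000, 1100]}, index=dates
    ),
    "WIT": pd.DataFrame(
        {"Close": [8.0, 8.1], "Volume": [2000, 2100]}, index=dates
    ),
}

# stack() moves the column labels (Close, Volume) into a second index level,
# so each ticker becomes one long Series indexed by (Date, field)
long_df = pd.concat(
    [frame.stack() for frame in data_dict.values()],
    axis=1,
    keys=list(data_dict),  # label the columns with the ticker symbols
)
print(long_df)
```

Each row is one (date, field) pair and each column is one ticker, which matches the first layout.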
However, if you would like to create a data frame like:
            BIRLAMONEY                                 WIT
            adj close  close  high  low   open         adj close  close  high  low   open
date 1      ....       ....   ....  ....  ....         ....       ....   ....  ....  ....
date 2      ....       ....   ....  ....  ....         ....       ....   ....  ....  ....

Then you need to implement further data manipulation by:
data_df = pd.concat([data_dict[key].stack() for key in data_dict], axis=1)
# concat labels the columns 0, 1, ...; rename them to the ticker symbols
for i, ticker in enumerate(tickers):
    data_df.rename(columns={i: ticker}, inplace=True)
data_df.reset_index(inplace=True)
data_df = data_df.set_index(['Date','level_1']).unstack(level=1)
data_df.head()
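As a possible shortcut to the same wide layout, pd.concat also accepts the dictionary itself and uses its keys as the top header level. This is only a sketch on made-up numbers: the frames are hypothetical stand-ins for the yf.download results, and the level names "Ticker" and "Field" are labels I chose for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the frames returned by yf.download
dates = pd.date_range("2021-06-14", periods=2, name="Date")
data_dict = {
    "BIRLAMONEY.NS": pd.DataFrame(
        {"Close": [50.0, 51.0], "Volume": [1000, 1100]}, index=dates
    ),
    "WIT": pd.DataFrame(
        {"Close": [8.0, 8.1], "Volume": [2000, 2100]}, index=dates
    ),
}

# The dictionary keys become the outer column level, the original
# column names (Close, Volume) become the inner level
wide_df = pd.concat(data_dict, axis=1)
wide_df.columns.names = ["Ticker", "Field"]  # illustrative level names
print(wide_df)
```

A single column can then be picked with a tuple, for example wide_df[("WIT", "Close")].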
Now you can save this data to a CSV file. Please keep in mind that to read this kind of data back with its two header rows, you need to pass the header parameter to read_csv, like:
pd.read_csv('../Downloads/trial.csv',header=[0,1])
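Here is a small round-trip sketch of that save and reload step. It uses made-up numbers and writes to an in-memory buffer instead of a real file, so nothing touches your disk. Passing index_col=0 is my assumption, so that the dates come back as the index rather than as an ordinary column.

```python
import io

import pandas as pd

# Made-up wide frame with two header levels: ticker, then field
dates = pd.date_range("2021-06-14", periods=2, name="Date")
columns = pd.MultiIndex.from_product(
    [["BIRLAMONEY.NS", "WIT"], ["Close", "Volume"]]
)
wide_df = pd.DataFrame(
    [[50.0, 1000, 8.0, 2000], [51.0, 1100, 8.1, 2100]],
    index=dates,
    columns=columns,
)

# Write to an in-memory buffer (a stand-in for a file on disk)
buffer = io.StringIO()
wide_df.to_csv(buffer)
buffer.seek(0)

# header=[0, 1] rebuilds the two-level columns; index_col=0 restores the dates
restored = pd.read_csv(buffer, header=[0, 1], index_col=0, parse_dates=True)
print(restored)
```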

To create separate data frames for close and volume, you can filter the last data frame, or alternatively repeat the process above while downloading only these two features. For the second option, all you need to do is add [['Close', 'Volume']] at the end of the yfinance line inside the loop (here start and end stand for the same two dates as before):

data_dict[ticker] = yf.download(ticker,start,end)[['Close','Volume']]
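For the filtering route, one way to split a two-header frame into a close table and a volume table is DataFrame.xs on the inner column level. This is a sketch on made-up numbers, not your real data. The file names close_prices.csv and volumes.csv are hypothetical, and the files go to a temporary directory here so the example cleans up after itself.

```python
import os
import tempfile

import pandas as pd

# Made-up wide frame with two header levels: ticker, then field
dates = pd.date_range("2021-06-14", periods=2, name="Date")
columns = pd.MultiIndex.from_product(
    [["BIRLAMONEY.NS", "WIT"], ["Close", "Volume"]]
)
wide_df = pd.DataFrame(
    [[50.0, 1000, 8.0, 2000], [51.0, 1100, 8.1, 2100]],
    index=dates,
    columns=columns,
)

# Select one field across all tickers; the field level is dropped,
# leaving one column per ticker
close_df = wide_df.xs("Close", axis=1, level=1)
volume_df = wide_df.xs("Volume", axis=1, level=1)

# Hypothetical output file names, written to a temporary directory
with tempfile.TemporaryDirectory() as out_dir:
    close_df.to_csv(os.path.join(out_dir, "close_prices.csv"))
    volume_df.to_csv(os.path.join(out_dir, "volumes.csv"))
    written = sorted(os.listdir(out_dir))

print(close_df)
print(volume_df)
```

Each resulting file has the dates as rows and the ticker names as the single header row, which is the shape asked for in the question.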

Hope this helps.