Dropping rows with zero values or backfilling

Hi,

The GC1 data set from Quandl contains zero values in the Open, High and Low (OHL) columns and wrong values in the Close column from Oct 2017 to Dec 2017, which skews the graph. I'm trying to drop the rows where any column value is zero, but the code is not working as expected. Can you point out where I'm going wrong? Below is the code snippet:



Data = quandl.get("CHRIS/MCX_GC1", start_date="2017-1-1", api_key=api_key)   # get Gold prices from Quandl

Data[(Data['Open'] != 0)]



Sample Data output which is causing issues:

27-10-2017 0 0 0 1462 0 0
30-10-2017 0 0 0 47 0 0

Thanks for your query, Mukesh. Try filtering and plotting like this:

 

Data[Data.High!=0.0].High.plot() 
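One thing to keep in mind: a boolean filter such as Data[Data['Open'] != 0] returns a new DataFrame and leaves Data itself unchanged, so the result has to be assigned back (or used directly, as in the plot call above). Here is a minimal sketch on made-up rows shaped like your sample; the values and the reduced column set are assumptions for illustration, not actual Quandl data:

```python
import pandas as pd

# Made-up rows for illustration only; not real MCX_GC1 data.
Data = pd.DataFrame(
    {
        "Open": [29500.0, 0.0, 0.0, 29650.0],
        "High": [29600.0, 0.0, 0.0, 29700.0],
        "Low": [29450.0, 0.0, 0.0, 29600.0],
        "Close": [29550.0, 1462.0, 47.0, 29680.0],
    },
    index=pd.to_datetime(["2017-10-26", "2017-10-27", "2017-10-30", "2017-10-31"]),
)

Data[Data["Open"] != 0]          # builds a filtered copy, then discards it
rows_before = len(Data)          # Data is unchanged at this point

Data = Data[Data["Open"] != 0]   # assign back to keep only the filtered rows
rows_after = len(Data)
```

In a notebook the first form still displays the filtered table, which can make it look as if Data was changed when it was not.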

Do let me know if further assistance is needed.
 

Hi,



The code is not working. The problem is that although the OHL columns have 0 values, the values in the Close column are non-zero but incorrect. I want to backfill or forward-fill the values in all OHLC columns (including Close) on every row where any of the OHL values is zero. I've also tried the "replace" method, but that is not working either.



Thanks,

Mukesh Kumar Gupta

Set Close to zero wherever Open is 0. You can do the same for rows where the other columns are zero.

Data.loc[Data.Open==0.0, 'Close'] = 0.0


Replace zeros in the DataFrame with NaN, i.e. missing values (this needs "import numpy as np")

Data = Data.replace(0, np.nan)


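Replace NaN values with the previous valid values (forward fill). Data.ffill() is equivalent to fillna(method='ffill'), which is deprecated in recent pandas versions.

Data = Data.ffill()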


Print the top 5 rows

Data.head()
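The steps above can also be combined into one pass. The sketch below is a variant, not exactly the same code: it marks every row where any of Open, High or Low is zero, blanks all four price columns on those rows, and then forward-fills only the price columns. The sample rows and the Volume column are made up for illustration; adjust the column names to match your data.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; not real MCX_GC1 data.
Data = pd.DataFrame(
    {
        "Open": [29500.0, 0.0, 0.0, 29650.0],
        "High": [29600.0, 0.0, 0.0, 29700.0],
        "Low": [29450.0, 0.0, 0.0, 29600.0],
        "Close": [29550.0, 1462.0, 47.0, 29680.0],
        "Volume": [120, 0, 0, 95],  # hypothetical extra column
    },
    index=pd.to_datetime(["2017-10-26", "2017-10-27", "2017-10-30", "2017-10-31"]),
)

price_cols = ["Open", "High", "Low", "Close"]

# True for rows where at least one of Open/High/Low is zero
bad = (Data[["Open", "High", "Low"]] == 0).any(axis=1)

# Blank out all four price columns on the bad rows, including the wrong Close
Data.loc[bad, price_cols] = np.nan

# Carry the last valid prices forward; other columns are left untouched
Data[price_cols] = Data[price_cols].ffill()
```

Restricting the replacement to the price columns avoids turning legitimate zeros in other columns (for example a zero volume) into NaN, which Data.replace(0, np.nan) on the whole DataFrame would do.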



I would recommend printing Data after each step to see what is changing. Also, instead of forward-filling the values, you can consider dropping the affected rows.
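If you prefer dropping, a minimal sketch could look like the following. It keeps only the rows where none of Open, High and Low is zero, so the rows with the wrong Close values go away as well. The sample rows are made up for illustration, and Cleaned is just a name chosen here.

```python
import pandas as pd

# Made-up rows for illustration only; not real MCX_GC1 data.
Data = pd.DataFrame(
    {
        "Open": [29500.0, 0.0, 0.0, 29650.0],
        "High": [29600.0, 0.0, 0.0, 29700.0],
        "Low": [29450.0, 0.0, 0.0, 29600.0],
        "Close": [29550.0, 1462.0, 47.0, 29680.0],
    },
    index=pd.to_datetime(["2017-10-26", "2017-10-27", "2017-10-30", "2017-10-31"]),
)

ohl = ["Open", "High", "Low"]

# True only for rows where every one of Open/High/Low is non-zero
keep = (Data[ohl] != 0).all(axis=1)

# Assign the filtered result to a new name so the original stays available
Cleaned = Data[keep]
print(Cleaned)
```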



I hope this helps.

Yes, this works. However, forward-filling or dropping the rows containing 0 values has only a limited effect on the graph: the plot still smooths over the portion where the values are missing. Even so, it is much better than a graph with an abnormal dip.



Thanks for your help!

Great to hear that.