A problem with a Hurst exponent script using 9 periods

Hello there,

I was writing some Python code to evaluate the Hurst exponent over 9 periods. The code reads an Excel sheet into a pandas DataFrame and then evaluates the Hurst exponent for the column of lows and the column of highs (using only 9 periods of data), but I get this error:


  File "C:\Users\Ghery\anaconda3\lib\site-packages\pandas\core\indexing.py", line 761, in _validate_key_length
    raise IndexingError("Too many indexers")

IndexingError: Too many indexers



The code I used was this:



from scipy import stats
import pandas as pd
import numpy as np


def Hurst9(df):
    # calculate returns and eliminate the first row
    df = df.pct_change()
    df = df.iloc[1:]
    # split the dataframe into 2 dataframes, the first 4 and the last 4 rows
    df1 = df.iloc[:4,:]
    df2 = df.iloc[4:,:]
    # split those dataframes into 2 further dataframes each
    df1_1 = df1.iloc[:2,:]
    df1_2 = df1.iloc[2:,:]
    df2_1 = df2.iloc[:2,:]
    df2_2 = df2.iloc[2:,:]
    # calculate the standard deviation of every dataframe
    stdev = df.std()
    stdev1 = df1.std()
    stdev2 = df2.std()
    stdev1_1 = df1_1.std()
    stdev1_2 = df1_2.std()
    stdev2_1 = df2_1.std()
    stdev2_2 = df2_2.std()
    # subtract the mean from every column
    df = df.add(-df.mean())
    df1 = df1.add(-df1.mean())
    df2 = df2.add(-df2.mean())
    df1_1 = df1_1.add(-df1_1.mean())
    df1_2 = df1_2.add(-df1_2.mean())
    df2_1 = df2_1.add(-df2_1.mean())
    df2_2 = df2_2.add(-df2_2.mean())
    # calculate the cumulative sum of every dataframe
    df = df.cumsum()
    df1 = df1.cumsum()
    df2 = df2.cumsum()
    df1_1 = df1_1.cumsum()
    df1_2 = df1_2.cumsum()
    df2_1 = df2_1.cumsum()
    df2_2 = df2_2.cumsum()
    # calculate the range for each column
    r = df.max() - df.min()
    r1 = df1.max() - df1.min()
    r2 = df2.max() - df2.min()
    r1_1 = df1_1.max() - df1_1.min()
    r1_2 = df1_2.max() - df1_2.min()
    r2_1 = df2_1.max() - df2_1.min()
    r2_2 = df2_2.max() - df2_2.min()
    # calculate the rescaled range
    rs = r / stdev
    rs1 = r1 / stdev1
    rs2 = r2 / stdev2
    rs1_1 = r1_1 / stdev1_1
    rs1_2 = r1_2 / stdev1_2
    rs2_1 = r2_1 / stdev2_1
    rs2_2 = r2_2 / stdev2_2
    # calculate the average rescaled range for each chunk size
    ave_RS = float(rs)
    ave_RS_1 = float(0.5*(rs1 + rs2))
    ave_RS_2 = float(0.25*(rs1_1 + rs1_2 + rs2_1 + rs2_2))
    # take the natural logarithm of each chunk size and each average rescaled range
    x = np.log(np.array([8, 4, 2]))
    y = np.log(np.array([ave_RS, ave_RS_1, ave_RS_2]))
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    return slope


data = pd.read_excel('file.xlsx')

Hurst_low = Hurst9(data['low'])
Hurst_high = Hurst9(data['high'])

print('Hurst low is', Hurst_low)
print('Hurst high is', Hurst_high)


Can anyone tell me how to solve this? What's wrong with my code?


Hi Ghery,



This usually happens when you pass more indexers to .loc or .iloc than the object has dimensions, for example a row indexer and a column indexer on a one-dimensional Series.
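
As a quick illustration (a minimal sketch with made-up values, not taken from your file), the same error can be reproduced on any Series:

import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])   # a single column is one-dimensional

print(s.iloc[:2])      # one indexer: works
print(s.iloc[:2, :])   # two indexers: raises IndexingError: Too many indexers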

For more details, you can also refer to the following thread - 



https://stackoverflow.com/questions/30781037/too-many-indexers-with-dataframe-loc


Regards,

Akshay

Ok, thanks… but I reviewed the link and I'm still not sure how to fix it.

Hi Ghery,



One possible solution is to split the rows using a single indexer, df.iloc[:X]. Since Hurst9 is called with a single column (data['low'] is a one-dimensional Series), iloc only accepts one indexer there.

For example -



df1 = df.iloc[:4]
df2 = df.iloc[4:]
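
The same change applies to the inner splits further down in Hurst9 (this is just a sketch of the affected lines, the rest of your function can stay as it is):

df1_1 = df1.iloc[:2]
df1_2 = df1.iloc[2:]
df2_1 = df2.iloc[:2]
df2_2 = df2.iloc[2:]

The later steps (std, mean, cumsum, max/min and the log-log regression) already work on a Series, and float(rs) is fine because rs is a single number when the input is one column.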



I hope this helps.



Thanks,

Akshay