Problem with a script that computes the Hurst exponent over 9 periods

Ghery_Cardenas · February 21, 2022, 2:48am

Hello there:

I am writing Python code to evaluate the Hurst exponent over 9 periods. The code reads an Excel sheet into a pandas DataFrame and evaluates the Hurst exponent for the column of lows and the column of highs (using only 9 periods of data), but I get this error:

File "C:\Users\Ghery\anaconda3\lib\site-packages\pandas\core\indexing.py", line 761, in _validate_key_length

raise IndexingError("Too many indexers")

IndexingError: Too many indexers

The code I used was this:

import numpy as np
import pandas as pd
from scipy import stats

def Hurst9(df):
    # Calculate returns and drop the first row (it is NaN after pct_change)
    df = df.pct_change()
    df = df.iloc[1:]

    # Split the data into two halves: the first 4 rows and the remaining rows
    df1 = df.iloc[:4,:]
    df2 = df.iloc[4:,:]

    # Split each half into two further chunks of 2 rows
    df1_1 = df1.iloc[:2,:]
    df1_2 = df1.iloc[2:,:]
    df2_1 = df2.iloc[:2,:]
    df2_2 = df2.iloc[2:,:]

    # Calculate the standard deviation of every chunk
    stdev = df.std()
    stdev1 = df1.std()
    stdev2 = df2.std()
    stdev1_1 = df1_1.std()
    stdev1_2 = df1_2.std()
    stdev2_1 = df2_1.std()
    stdev2_2 = df2_2.std()

    # Subtract the mean from every chunk
    df = df.add(-df.mean())
    df1 = df1.add(-df1.mean())
    df2 = df2.add(-df2.mean())
    df1_1 = df1_1.add(-df1_1.mean())
    df1_2 = df1_2.add(-df1_2.mean())
    df2_1 = df2_1.add(-df2_1.mean())
    df2_2 = df2_2.add(-df2_2.mean())

    # Calculate the cumulative sum of every chunk
    df = df.cumsum()
    df1 = df1.cumsum()
    df2 = df2.cumsum()
    df1_1 = df1_1.cumsum()
    df1_2 = df1_2.cumsum()
    df2_1 = df2_1.cumsum()
    df2_2 = df2_2.cumsum()

    # Calculate the range (max minus min) of every chunk
    r = df.max() - df.min()
    r1 = df1.max() - df1.min()
    r2 = df2.max() - df2.min()
    r1_1 = df1_1.max() - df1_1.min()
    r1_2 = df1_2.max() - df1_2.min()
    r2_1 = df2_1.max() - df2_1.min()
    r2_2 = df2_2.max() - df2_2.min()

    # Calculate the rescaled range of every chunk
    rs = r / stdev
    rs1 = r1 / stdev1
    rs2 = r2 / stdev2
    rs1_1 = r1_1 / stdev1_1
    rs1_2 = r1_2 / stdev1_2
    rs2_1 = r2_1 / stdev2_1
    rs2_2 = r2_2 / stdev2_2

    # Calculate the average rescaled range for each chunk size
    ave_RS = float(rs)
    ave_RS_1 = float(0.5 * (rs1 + rs2))
    ave_RS_2 = float(0.25 * (rs1_1 + rs1_2 + rs2_1 + rs2_2))

    # Take the natural logarithm of each chunk size and each average rescaled range
    x = np.log(np.array([8, 4, 2]))
    y = np.log(np.array([ave_RS, ave_RS_1, ave_RS_2]))

    # The slope of the log-log regression is the Hurst exponent estimate
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    return slope

data = pd.read_excel('file.xlsx')

Hurst_low = Hurst9(data['low'])
Hurst_high = Hurst9(data['high'])

print('Hurst low is', Hurst_low)
print('Hurst high is', Hurst_high)

Can anyone tell me how to solve this? What's wrong with my code?

Akshay_Choudhary · February 22, 2022, 10:24am

Hi Ghery,

This error is raised when an indexer receives more keys than the object has dimensions, for example two keys (rows and columns) on a one-dimensional Series.

For more details, you can also refer to the following thread -

https://stackoverflow.com/questions/30781037/too-many-indexers-with-dataframe-loc
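As a rough illustration of how this error can come up with code like the one in the question, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the Excel data; only the column names 'low' and 'high' come from the original post. Selecting a single column gives a Series, which has one axis, so a two-key iloc call fails on it.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet in the question.
data = pd.DataFrame({"low": [1.0, 1.1, 1.2, 1.3, 1.4],
                     "high": [2.0, 2.1, 2.2, 2.3, 2.4]})

series = data["low"]      # single label -> Series (one axis)
frame = data[["low"]]     # list of labels -> DataFrame (two axes)

try:
    series.iloc[:4, :]    # two keys on a one-axis object
    error_name = None
except Exception as exc:  # caught broadly so the sketch does not depend on the import path
    error_name = type(exc).__name__

first_rows = series.iloc[:4]        # one key works on a Series
first_rows_2d = frame.iloc[:4, :]   # two keys work on a DataFrame

print(error_name)
```

So there are two ways out: keep the Series and pass a single key to iloc, or pass a one-column DataFrame and keep both keys.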

Regards,

Akshay

Ghery_Cardenas · February 28, 2022, 11:30pm

OK, thanks, but I reviewed the link and I am still not sure how to fix it.

Akshay_Choudhary · March 2, 2022, 10:40am

Hi Ghery,

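One possible solution is to split the rows using a single indexer, df.iloc[:X]. data['low'] is a Series, so it has only a row axis and the extra ",:" for columns is what triggers the error.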

For example -

df1 = df.iloc[:4]

df2 = df.iloc[4:]
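Putting that together, a compact version of the whole calculation could look roughly like the sketch below. It is not the original script: the names rescaled_range and hurst9 are made up for illustration, the price values are hypothetical, and the tail(9) call is an assumption added so that only the last 9 rows are used. It always passes a single key to iloc, so it works on a Series such as data['low'].

```python
import numpy as np
import pandas as pd
from scipy import stats


def rescaled_range(returns):
    """R/S of one chunk: range of cumulative deviations over the sample std."""
    deviations = returns - returns.mean()
    cumulative = deviations.cumsum()
    return (cumulative.max() - cumulative.min()) / returns.std()


def hurst9(prices):
    """Hurst estimate from the last 9 prices (8 returns) of a Series."""
    # Assumption: keep only the last 9 rows, then drop the NaN first return.
    returns = prices.tail(9).pct_change().iloc[1:]
    sizes = [8, 4, 2]
    avg_rs = []
    for size in sizes:
        # Single-key iloc slices: 1 chunk of 8, 2 chunks of 4, 4 chunks of 2.
        chunks = [returns.iloc[i:i + size] for i in range(0, 8, size)]
        avg_rs.append(np.mean([rescaled_range(c) for c in chunks]))
    # Slope of log(average R/S) against log(chunk size).
    slope, *_ = stats.linregress(np.log(sizes), np.log(avg_rs))
    return slope


# Hypothetical prices standing in for data['low'] from the Excel file.
low = pd.Series([10.0, 10.4, 10.1, 10.9, 10.6, 11.2, 10.8, 11.5, 11.1])
hurst_low = hurst9(low)
print('Hurst low is', hurst_low)
```

One thing to keep in mind: with chunks of only 2 returns the rescaled range is always the same value (about 0.707) whatever the data, and it is undefined if the two returns are equal, so with 9 periods the regression rests on very little information.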

I hope this helps.

Thanks,

Akshay