Fitting the spread to an OU process

Hi,

I have the following questions after watching the mean-reversion strategy lectures:

  1. I see code for calculating theta and half-life in section 4 of the mean reversion strategies lecture.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Get historical data for the instruments
x = pd.read_csv('GDX2.csv', index_col=0)
y = pd.read_csv('GLD2.csv', index_col=0)

# Hedge ratio: regress GLD on GDX over the first 90 rows
model = sm.OLS(y.iloc[:90], x.iloc[:90])
model = model.fit()

# Spread: GLD - hedge ratio * GDX
spread = -model.params.iloc[0] * x['Close'] + y['Close']
spread = spread.iloc[:90]

# Deviation of the spread from its mean, and the next-step change in the spread
spread_x = np.mean(spread) - spread
spread_y = spread.shift(-1) - spread
spread_df = pd.DataFrame({'x': spread_x, 'y': spread_y})
spread_df = spread_df.dropna()

# Theta as the regression beta of the spread change on the deviation from the mean
model_s = sm.OLS(spread_df['y'], spread_df['x'])
model_s = model_s.fit()
theta = model_s.params.iloc[0]


Where in this code is the spread fitted to an OU process? Is it in these lines:

model_s = sm.OLS(spread_df['y'], spread_df['x'])
model_s = model_s.fit()

Or is the fitting step absent from the code above?

I know that fitting the spread to an OU process can also be done via Vector Auto Regression.

You create a function, for instance def VAR(data, p):

and then call it:

OU = VAR(e, 1)

Does anyone know whether this approach is more accurate?

Thanks in advance.

Hello Skinner,

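  1. "Where in this code is the spread fitted to an OU process? Is it in these lines:

    model_s = sm.OLS(spread_df['y'], spread_df['x'])
    model_s = model_s.fit()

    Or is the fitting step absent from the code above?" Right, that is where the fitting happens. Regressing the next-step change in the spread on its deviation from the mean is the discretized OU fit, and the slope of that regression is theta.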

  2. "I know that fitting the spread to an OU process can also be done via Vector Auto Regression.

    You create a function, for instance def VAR(data, p):

    and then call it:

    OU = VAR(e, 1)

    Does anyone know whether this approach is more accurate?"

    That is certainly possible. But which variables do you intend to use in the VAR: the spread and the spread difference? Intuitively, it does seem like the right thing to do, to model the spread using both its past values and the past values of the spread difference. To get objective results, do try coding it. I will be happy to help with implementation issues.
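
As a starting point for that comparison, here is a minimal sketch on simulated data rather than the GLD/GDX files. It assumes a unit time step, and it assumes that your VAR(e, 1) on a single series amounts to an AR(1) with an intercept fitted by least squares, since I have not seen your function. The names simulate_ou, theta_lecture_style and fit_ar1 are made up for illustration. Under the AR(1) form S[t+1] = c + phi * S[t] + noise, theta is roughly 1 - phi and the long-run mean is c / (1 - phi); the half-life uses the common approximation ln(2) / theta.

```python
import numpy as np


def simulate_ou(theta, mu, sigma, s0, n, seed=0):
    """Simulate n points of a discretised OU process with unit time step."""
    rng = np.random.default_rng(seed)
    s = np.empty(n)
    s[0] = s0
    for t in range(n - 1):
        # Pull towards mu at speed theta, plus Gaussian noise
        s[t + 1] = s[t] + theta * (mu - s[t]) + sigma * rng.standard_normal()
    return s


def theta_lecture_style(s):
    """Regress s[t+1] - s[t] on (mean(s) - s[t]) without an intercept."""
    x = s.mean() - s[:-1]   # deviation from the sample mean
    y = np.diff(s)          # next-step change in the spread
    return float(x @ y / (x @ x))


def fit_ar1(s):
    """Least-squares fit of s[t+1] = c + phi * s[t] + eps (one-variable VAR(1))."""
    design = np.column_stack([np.ones(len(s) - 1), s[:-1]])
    (c, phi), *_ = np.linalg.lstsq(design, s[1:], rcond=None)
    return float(c), float(phi)


# Simulated spread with known parameters, so the estimates can be checked
s = simulate_ou(theta=0.1, mu=5.0, sigma=0.2, s0=5.0, n=5000)

theta_a = theta_lecture_style(s)       # regression used in the lecture code
c, phi = fit_ar1(s)                    # AR(1) route
theta_b = 1.0 - phi
mu_b = c / (1.0 - phi)
half_life = np.log(2) / theta_b

print("theta (lecture-style regression):", theta_a)
print("theta (AR(1) with intercept):    ", theta_b)
print("long-run mean from AR(1):        ", mu_b)
print("half-life in steps:              ", half_life)
```

On a single series the two estimates should come out nearly identical, because both are least-squares fits of the same linear relation. The AR(1) form estimates the mean jointly with theta instead of plugging in the sample mean. Any real difference in accuracy would have to come from adding more variables or more lags to the VAR.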