Determining the optimal lag period using unrestricted VAR

Jane_D · June 14, 2022, 4:00pm

I'm trying to determine the optimal lag period using an unrestricted VAR for a multivariate time series dataframe. It worked on test data, but with actual asset data I get the following error and I'm not sure why:

File "TEST.py", line 81, in rebalance
KeyError: 'aic'

line 81:res = mod.select_order(trend="c")

Please see code below.


# Import libraries
import numpy as np
import pandas as pd
import statsmodels.tsa.api as tsa

# Import blueshift libraries
from blueshift.pipeline import Pipeline
from blueshift.api import (
    symbol,
    order_target_percent,
    schedule_function,
    date_rules,
    time_rules,
    attach_pipeline,
    pipeline_output,
    get_datetime
)


# The strategy requires context.lookback number of days.
def make_strategy_pipeline(context):
    """
    A function to make the pipeline.
    """
    pipe = Pipeline()
    return pipe


def initialize(context):
    context.lookback = 50
    attach_pipeline(make_strategy_pipeline(context), name='strategy_pipeline')
    # Rebalance every day
    schedule_function(rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours=0, minutes=1))


def rebalance(context, data):
    # Get the pipeline results
    pipeline_results = pipeline_output('strategy_pipeline')

    # Get the data for the pipeline output
    try:
        data = data.history(
            pipeline_results.index, 'close', context.lookback, '1d')
    except IndexError:
        return

    IC = "aic"  # or "bic", "fpe", "hqic"
    print("data", data)
    mod = tsa.VAR(data)
    res = mod.select_order(trend="c")
    print(f"{IC} selects {res.ics[IC][res.selected_orders[IC]]}")

Jose_Carlos_Gonzales_Tanaka · June 14, 2022, 8:33pm

Hello Jenny:

We're glad to know you're learning how to use blueshift.

Let's make some comments regarding your code:

Line 81 raises KeyError: 'aic' because, when it runs, select_order does not produce any information criteria values, meaning no model is actually estimated inside select_order. This happens because you aren't selecting any assets to run the code on. If you look at your function "make_strategy_pipeline", you will see that no filter is applied, so the pipeline does not return any assets. Please refer to the Blueshift Pipeline documentation to learn how to apply filters, and then come back to your code to select the assets you want.
Another option is to drop the Pipeline function and simply pass a fixed group of assets of your choice. You can redefine the two functions below like this:

def initialize(context):
    context.lookback = 252
    context.secList = [symbol('AMZN'),
                       symbol('AAPL'),
                       symbol('WMT'),
                       symbol('MU'),
                       symbol('BAC'),
                       symbol('KO'),
                       symbol('BA'),
                       symbol('AXP'),
                       symbol('XOM')]
    # Rebalance every day
    schedule_function(rebalance, date_rules.every_day(), time_rules.market_open(hours=0, minutes=1))

def rebalance(context, data):
    # Get the price history for the selected assets
    try:
        data = data.history(context.secList, 'close', context.lookback, '1d') 
    except IndexError:
        return

    IC = "aic" # or "bic", "fpe", "hqic"
    data.dropna(inplace=True)
    print("data",len(data.index))
    mod = tsa.VAR(data)  # VAR is reached through statsmodels.tsa.api, imported as tsa
    res = mod.select_order(trend='c')

In the initialize function, please set "context.lookback" to more than 50 observations. I recommend this because estimating a VAR with a large number of lags might not be possible when there are so few observations. (Recall from a simple univariate regression model that you need more observations than betas; the same rule applies to the VAR.) You can use 252 for context.lookback as a decent number of observations to run your VAR select_order.
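To make the observations-versus-betas rule concrete, here is a rough back-of-the-envelope sketch. It assumes each VAR equation with a constant has 1 + k * p coefficients (k series, p lags) and that p rows are lost to lagging. The helper name max_estimable_lags is made up for illustration; it is not a statsmodels function, and statsmodels applies its own internal limit.

```python
def max_estimable_lags(n_obs, n_series, n_trend=1):
    """Largest lag order p with at least as many usable rows as coefficients.

    Usable rows per equation:  n_obs - p
    Coefficients per equation: n_trend + n_series * p
    Solving n_obs - p >= n_trend + n_series * p for p gives the bound below.
    """
    return (n_obs - n_trend) // (1 + n_series)


# Nine assets, as in the secList above, with 50 versus 252 daily observations.
for n_obs in (50, 252):
    print(n_obs, "observations ->", max_estimable_lags(n_obs, 9), "lags at most")
```

Under these assumptions, 50 observations of nine series leave room for only a handful of lags, and this is a bare minimum: a usable estimate needs far more rows than coefficients.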
Remember that in econometrics you want a parsimonious model rather than a model that describes the data "perfectly". Consequently, don't run select_order with too many lags to consider. You can estimate with maxlags set to 5 or 7.
Your dataframe "data" contains "close" prices, and this is a problem. Please remember that a VAR must be stable, meaning the VAR must be stationary. Estimating a VAR with close prices might give you an unstable VAR. You have to work with stationary time series when you want to estimate an unrestricted VAR.

I hope these comments help you.

In case you have more questions, please let us know,

Thanks and regards,

Jose Carlos

Jane_D · June 15, 2022, 1:15am

For example, consider a portfolio of

GLD and GDX formed with the

hedge ratio -1.63 :

the zScore > entryZscore

? * GLD ? * GDX.

Also if

the zScore < entryZscore

? * GLD ? * GDX.

Conversely, GLD and GDX formed with the

hedge ratio 1.63 :

the zScore > entryZscore

? * GLD ? * GDX.

Also if

the zScore < entryZscore

? * GLD ? * GDX.

Can you answer this please?

Thanks.

Jose_Carlos_Gonzales_Tanaka · June 15, 2022, 1:46am

Hello Jenny,

So I'm guessing you want to know what numbers must be multiplied to each stock when the hedge ratio is 1.63 or -1.63, right?

If that's the case, please remember that the coefficient of the dependent variable is always normalized to one. The independent variable is the one that gets multiplied by the hedge ratio.

So, for the first case:

? * GLD ? * GDX. = 1 * GLD -1.63 * GDX.

For the second case:

? * GLD ? * GDX. = 1 * GLD +1.63 * GDX

Instead of assigning the GDX as the independent variable, you might want to assign it as the dependent variable. The model will look like this:

1 * GDX (+/-)hedge ratio *GLD.

You might ask: So, how should you specify the model? Which stock would you need to have as an independent and dependent variable?

Well, the answer is that the researcher is in charge of choosing which stock is the independent variable and which is the dependent variable. As a trader, you need to choose. Whatever the case, remember that the number of shares of the dependent variable must always be 1; it's the independent variable that takes the hedge ratio as its number of shares.
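As a minimal sketch of how that choice changes the hedge ratio, the snippet below estimates it as an ordinary least squares slope in both directions. The prices are made up and constructed so that GLD is an exact linear function of GDX. The helper ols_slope is hypothetical, written for illustration only, and OLS is just one common way to estimate a hedge ratio.

```python
def ols_slope(y, x):
    """Slope b of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


# Made-up prices, built so that gld = 5 + 1.63 * gdx exactly.
gdx = [30.0, 31.0, 29.5, 32.0, 33.5]
gld = [5.0 + 1.63 * p for p in gdx]

# GLD as dependent variable: GLD weight is 1, GDX carries the hedge ratio.
hedge_gld_on_gdx = ols_slope(gld, gdx)
# GDX as dependent variable: GDX weight is 1, GLD carries the hedge ratio.
hedge_gdx_on_gld = ols_slope(gdx, gld)

print(round(hedge_gld_on_gdx, 4), round(hedge_gdx_on_gld, 4))
```

The two hedge ratios differ, which is why the choice of dependent variable has to be made before any position sizes are computed.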

I hope these comments might help you.

Thanks and regards,

José Carlos

Jane_D · June 15, 2022, 3:50am

entryZscore = 0.25

So if the zScore < - entryZscore or zScore > entryZscore

Meaning if the spread has crossed the lower limit or the spread has crossed the upper limit

and the hedge is -1.63

It doesn't matter

? * GLD ? * GDX. = 1 * GLD -1.63 * GDX.

Both situations

? * GLD ? * GDX. = 1 * GLD -1.63 * GDX.

Is that what you are saying?

Jose_Carlos_Gonzales_Tanaka · June 16, 2022, 12:20pm

Hello Jenny,

The hedge ratio computation is done before you compute the entry scores. The hedge ratio value doesn't depend on the entryZscore.

The entryZscore is the threshold the zScore is compared against to decide whether to open a position on the spread.

I hope this helps,

Thanks and regards,

José Carlos

Jane_D · June 16, 2022, 2:44pm

When I want to enter a trade using the spread's deviation from the mean, what are the weights of the 2 assets found to be 'correlated'? This is what I'm asking.

Therefore:

Consider a portfolio of

GLD and GDX formed with the

hedge ratio -1.63 :

the zScore > entryZscore

? * GLD ? * GDX.

Also if

the zScore < entryZscore

? * GLD ? * GDX.

Conversely, GLD and GDX formed with the

hedge ratio 1.63 :

the zScore > entryZscore

? * GLD ? * GDX.

Also if

the zScore < entryZscore

? * GLD ? * GDX.

Can you answer this please?

This would mean you give 4 answers and each answer would consist of 2 parts the GLD and GDX weight.

Jose_Carlos_Gonzales_Tanaka · June 16, 2022, 6:49pm

Hello Jenny,

I'm sorry, but I don't quite understand what you are trying to ask.

Let me guess what you are trying to say and then answer based on that.

You say:

1) When I want to enter a trade using the spread's deviation from the mean.

Answer: This sentence is well understood.

2) What are the weights of the 2 assets found to be 'correlated'? This is what I'm asking.

Answer: In pairs trading, you don't talk about correlated assets. You talk about cointegrated assets. Cointegration and correlation don't mean the same thing.

3) I believe you are trying to ask what positions you must take when you decide to go long on the spread when hedge ratio is -1.63 and when hedge ratio is +1.63.

a) When the hedge ratio is -1.63, then spread = GLD - hedge_ratio * GDX, so you write the spread like this: spread = GLD - (-1.63) * GDX, which is equal to spread = GLD + 1.63 * GDX.

If you decide to go long on the spread, this implies that you will go long on both stocks: you will buy one share of GLD and 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you sell your 1.63 GDX shares.

b) When the hedge ratio is 1.63, then spread = GLD - hedge_ratio * GDX, so you write the spread like this: spread = GLD - (+1.63) * GDX, which is equal to spread = GLD - 1.63 * GDX.

If you decide to go long on the spread, this implies that you will go long on GLD and short sell GDX: you will buy one share of GLD and short sell 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you buy back your 1.63 GDX shares.

We go long the spread when the zScore falls below -entryZscore and short the spread when it rises above +entryZscore, no matter what value the hedge ratio has. That's why I told you that the hedge ratio computation is a step done before setting the entryZscore. The hedge ratio value doesn't depend on the entryZscore. You just go long or short the spread depending on which threshold the zScore crosses.
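The entry rule can be sketched as follows. This assumes symmetric thresholds at -entryZscore and +entryZscore, as in the "zScore < - entryZscore or zScore > entryZscore" condition quoted earlier in the thread, and a z-score computed from the full spread history passed in. The function name entry_signal and the sample spread values are made up for illustration.

```python
from statistics import mean, pstdev


def entry_signal(spread_history, entry_z):
    """Return 'long', 'short' or 'flat' for the latest spread value."""
    mu = mean(spread_history)
    sigma = pstdev(spread_history)
    z = (spread_history[-1] - mu) / sigma
    if z < -entry_z:
        return "long"   # spread unusually low: buy the spread
    if z > entry_z:
        return "short"  # spread unusually high: sell the spread
    return "flat"


entryZscore = 0.25
print(entry_signal([1.0, 1.2, 0.9, 1.1, 0.4], entryZscore))   # last value far below the mean
print(entry_signal([1.0, 1.2, 0.9, 1.1, 1.6], entryZscore))   # last value far above the mean
print(entry_signal([1.0, 1.2, 0.9, 1.1, 1.05], entryZscore))  # last value at the mean
```

Note that the hedge ratio does not appear anywhere in this function: it only enters earlier, when the spread series itself is built.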

I hope this piece of information helps,

Thanks,

José Carlos

Jane_D · June 16, 2022, 10:33pm

What happens when we go short on the spread (the zScore goes above the entryZscore)?

Jose_Carlos_Gonzales_Tanaka · June 19, 2022, 10:55pm

Hello Jane,

Let me answer your question with our above example.

When hedge ratio is 1.63, then spread = GLD - hedge_ratio * GDX, then you write the spread like this: spread = GLD - (+1.63) * GDX, which is equal to spread = GLD - 1.63 * GDX.

If you decide to go long on the spread, this implies that you will go long on GLD and short sell GDX: you will buy one share of GLD and short sell 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you buy back your 1.63 GDX shares.

When you go short the spread, this implies that you will go short on GLD and buy GDX: you will short sell one share of GLD and buy 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you buy back your GLD share and you sell your 1.63 GDX shares.

Note: I copied again the text on going long the spread so you can compare easily both cases.

Great to help you Jane,

Thanks,

José Carlos

Jane_D · July 5, 2022, 12:33am

Can you use: When hedge ratio is -1.63,

As an example?

Jose_Carlos_Gonzales_Tanaka · July 6, 2022, 12:32am

Hello Jane,

Sure, let's explain:

When hedge ratio is -1.63, then spread = GLD - hedge_ratio * GDX, then you write the spread like this: spread = GLD - (-1.63) * GDX, which is equal to spread = GLD + 1.63 * GDX.

If you decide to go long on the spread, this implies that you will go long on both GLD and GDX: you will buy one share of GLD and buy 1.63 shares of GDX. If you want to close the position, either because the stop-loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you sell your 1.63 GDX shares.

When you go short the spread, this implies that you will short-sell both GLD and GDX: you will short-sell one share of GLD and short-sell 1.63 shares of GDX. If you want to close the position, either because the stop-loss triggers or the exit condition triggers, then you just square off the position, i.e., you buy back your GLD share and you buy back your 1.63 GDX shares.
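The four cases discussed in this thread (hedge ratio +1.63 or -1.63, long or short the spread) can be summarized in a small sketch. It assumes the convention spread = GLD - hedge_ratio * GDX used above; the function name spread_positions is made up for illustration.

```python
def spread_positions(hedge_ratio, direction):
    """Shares of GLD and GDX for spread = GLD - hedge_ratio * GDX.

    direction is +1 to go long the spread and -1 to go short it.
    Positive numbers mean buy, negative numbers mean short-sell.
    """
    return {"GLD": direction * 1.0, "GDX": direction * -hedge_ratio}


for h in (1.63, -1.63):
    for name, direction in (("long", 1), ("short", -1)):
        print(f"hedge ratio {h:+.2f}, {name} spread:", spread_positions(h, direction))
```

Closing any of these positions means trading the opposite quantities, so the net holding in both GLD and GDX returns to zero.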

I hope this helps,

Thanks,

José Carlos