I'm trying to determine the optimal lag period using an unrestricted VAR for a multivariate time series DataFrame. It worked on test data, but with actual asset data I get the following error and I'm not sure why:
File "TEST.py", line 81, in rebalance
KeyError: 'aic'
line 81:res = mod.select_order(trend="c")
Please see code below.
# Import libraries
import numpy as np
import pandas as pd
import statsmodels.tsa.api as tsa

# Import blueshift libraries
from blueshift.pipeline import Pipeline
from blueshift.api import (
    symbol,
    order_target_percent,
    schedule_function,
    date_rules,
    time_rules,
    attach_pipeline,
    pipeline_output,
    get_datetime
)

# The strategy requires context.lookback number of days.
def make_strategy_pipeline(context):
    """
    A function to make the pipeline.
    """
    pipe = Pipeline()
    return pipe

def initialize(context):
    context.lookback = 50
    attach_pipeline(make_strategy_pipeline(context), name='strategy_pipeline')
    # Rebalance every day
    schedule_function(rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours=0, minutes=1))

def rebalance(context, data):
    # Get the pipeline results
    pipeline_results = pipeline_output('strategy_pipeline')
    # Get the data for the pipeline output
    try:
        data = data.history(
            pipeline_results.index, 'close', context.lookback, '1d')
    except IndexError:
        return
    IC = "aic"  # or "bic", "fpe", "hqic"
    print("data", data)
    mod = tsa.VAR(data)
    res = mod.select_order(trend="c")
    print(f"{IC} selects {res.ics[IC][res.selected_orders[IC]]}")
Hello Jenny:
We're glad to know you're learning how to use blueshift.
Let's make some comments regarding your code:
- Line 81 raises KeyError: 'aic' because select_order does not produce any information criteria: no model is actually estimated inside the function. This happens because you aren't selecting any assets to run the code on. If you look at your function "make_strategy_pipeline", you will see that no filter is applied, so the pipeline returns no assets. Please refer to this documentation to learn how to use the Pipeline function, then come back to your code and select the assets you want.
- Another way to get your data is to drop the Pipeline function and directly call a group of assets of your choice. You can redefine the two functions below like this:
def initialize(context):
    context.lookback = 252
    context.secList = [symbol('AMZN'),
                       symbol('AAPL'),
                       symbol('WMT'),
                       symbol('MU'),
                       symbol('BAC'),
                       symbol('KO'),
                       symbol('BA'),
                       symbol('AXP'),
                       symbol('XOM')]
    # Rebalance every day
    schedule_function(rebalance, date_rules.every_day(), time_rules.market_open(hours=0, minutes=1))

def rebalance(context, data):
    # Get the close prices for the selected assets
    try:
        data = data.history(context.secList, 'close', context.lookback, '1d')
    except IndexError:
        return
    IC = "aic"  # or "bic", "fpe", "hqic"
    data.dropna(inplace=True)
    print("data", len(data.index))
    # VAR is reached through the tsa namespace imported at the top of the file
    mod = tsa.VAR(data)
    res = mod.select_order(trend='c')
- In the initialize function, please change "context.lookback" to more than 50 observations. I recommend this because VAR estimation with a large number of lags might not be possible with so few observations (remember from a simple univariate regression model that the number of observations must exceed the number of betas; the same rule applies to a VAR). You can specify 252 for context.lookback as a decent number of observations to run your VAR select_order.
- Remember that in econometrics you want a parsimonious model rather than one that describes the data "perfectly". Consequently, don't run select_order with too many lags to consider. You can estimate with maxlags of 5 or 7.
- Your DataFrame "data" contains "close" prices. This is a problem. Remember that a VAR must be stable, meaning the VAR must be stationary. VAR estimation with close prices might give you an unstable VAR. You have to work with stationary time series when you want to estimate an unrestricted VAR.
I hope these comments help you.
In case you have more questions, please let us know,
Thanks and regards,
José Carlos
For example, consider a portfolio of
GLD and GDX formed with the
hedge ratio -1.63 :
the zScore > entryZscore
? * GLD ? * GDX.
Also if
the zScore < entryZscore
? * GLD ? * GDX.
Conversely, GLD and GDX formed with the
hedge ratio 1.63 :
the zScore > entryZscore
? * GLD ? * GDX.
Also if
the zScore < entryZscore
? * GLD ? * GDX.
Can you answer this please?
Thanks.
Hello Jenny,
So I'm guessing you want to know what number each stock must be multiplied by when the hedge ratio is 1.63 or -1.63, right?
If that's the case, please remember that the dependent variable's coefficient is always normalized to one. The independent variable is the one that will be multiplied by the hedge ratio.
So, for the first case:
? * GLD ? * GDX. = 1 * GLD -1.63 * GDX.
For the second case:
? * GLD ? * GDX. = 1 * GLD +1.63 * GDX
Instead of assigning the GDX as the independent variable, you might want to assign it as the dependent variable. The model will look like this:
1 * GDX (+/-)hedge ratio *GLD.
You might ask: So, how should you specify the model? Which stock would you need to have as an independent and dependent variable?
Well, the answer is that the researcher is in charge of choosing which stock is the independent variable and which is the dependent variable. As a trader, you need to choose. Whatever the case, remember that the number of shares of the dependent variable must always be 1; it is the independent variable that takes the hedge ratio as its number of shares.
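To illustrate that the choice of dependent variable changes the hedge ratio, here is a small sketch that estimates it as an ordinary least squares slope in both directions. The prices are made up for illustration (they are not real GLD/GDX data), and ols_slope is a hypothetical helper, not part of any library.

```python
def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


# Made-up prices for illustration only.
gld = [120.0, 121.5, 119.8, 122.3, 123.1, 121.0]
gdx = [40.0, 41.0, 39.7, 41.6, 42.0, 40.6]

# GLD as dependent variable: GLD gets 1 share, GDX gets the hedge ratio.
hedge_gld_dependent = ols_slope(gdx, gld)
# GDX as dependent variable: GDX gets 1 share, GLD gets the hedge ratio.
hedge_gdx_dependent = ols_slope(gld, gdx)

print(f"GLD dependent: hedge ratio on GDX = {hedge_gld_dependent:.3f}")
print(f"GDX dependent: hedge ratio on GLD = {hedge_gdx_dependent:.3f}")
```

The two ratios are generally not reciprocals of each other, which is why the choice has to be made explicitly.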
I hope these comments might help you.
Thanks and regards,
José Carlos
entryZscore = 0.25
So if the zScore < - entryZscore or zScore > entryZscore
Meaning if the spread has crossed the lower limit or the spread has crossed the upper limit
and the hedge is -1.63
It doesn't matter
? * GLD ? * GDX. = 1 * GLD -1.63 * GDX.
Both situations
? * GLD ? * GDX. = 1 * GLD -1.63 * GDX.
Is that what you are saying?
Hello Jenny,
The hedge ratio computation is done before you compute the entry scores. The hedge ratio value doesn't depend on the entryZscore.
The entryZscore is computed to check whether we should square off the position or not.
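The order of the steps can be sketched as follows: the hedge ratio is fixed first, the spread and its z-score are computed from it, and only then is the z-score compared with entryZscore. This is a sketch under assumptions: the spread is taken as GLD - hedge_ratio * GDX, the z-score uses the sample mean and sample standard deviation of the whole window, and the prices are made up.

```python
from statistics import mean, stdev


def spread_zscore(gld, gdx, hedge_ratio):
    """Z-score of the latest spread value, given an already fixed hedge ratio."""
    # Assumed convention: spread = GLD - hedge_ratio * GDX.
    spread = [g - hedge_ratio * d for g, d in zip(gld, gdx)]
    return (spread[-1] - mean(spread)) / stdev(spread)


# Step 1: the hedge ratio is already known (value taken from the thread).
hedge_ratio = 1.63
# Step 2: the z-score is computed from prices (made-up numbers) and the ratio.
gld = [120.0, 121.5, 119.8, 122.3, 123.1, 121.0]
gdx = [40.0, 41.0, 39.7, 41.6, 42.0, 40.6]
z = spread_zscore(gld, gdx, hedge_ratio)
# Step 3: only now is the threshold consulted; it never feeds back into step 1.
entry_z = 0.25
print(f"zScore = {z:.3f}, beyond entry band: {abs(z) > entry_z}")
```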
I hope this helps,
Thanks and regards,
José Carlos
When I want to enter a trade using the spread's deviation from the mean, what are the weights of the 2 assets found to be 'correlated'? This is what I'm asking.
Therefore:
Consider a portfolio of
GLD and GDX formed with the
hedge ratio -1.63 :
the zScore > entryZscore
? * GLD ? * GDX.
Also if
the zScore < entryZscore
? * GLD ? * GDX.
Conversely, GLD and GDX formed with the
hedge ratio 1.63 :
the zScore > entryZscore
? * GLD ? * GDX.
Also if
the zScore < entryZscore
? * GLD ? * GDX.
Can you answer this please?
This would mean you give 4 answers and each answer would consist of 2 parts the GLD and GDX weight.
Hello Jenny,
I'm sorry, but I don't fully understand what you are trying to ask.
Let me guess what you are trying to say and then answer based on that.
You say:
1) When I want to enter a trade using the spread's deviation from the mean.
Answer: This sentence is well understood.
2) What are the weights of the 2 assets found to be 'correlated'? This is what I'm asking.
Answer: In pairs trading, you don't talk about correlated assets. You talk about cointegrated assets. Cointegration and correlation don't mean the same thing.
3) I believe you are trying to ask what positions you must take when you decide to go long on the spread when the hedge ratio is -1.63 and when the hedge ratio is +1.63.
a) When the hedge ratio is -1.63, spread = GLD - hedge_ratio * GDX, so you write the spread like this: spread = GLD - (-1.63) * GDX, which is equal to spread = GLD + 1.63 * GDX.
If you decide to go long on the spread, this implies that you will go long on both stocks: you will buy one share of GLD and 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you sell your 1.63 GDX shares.
b) When the hedge ratio is 1.63, spread = GLD - hedge_ratio * GDX, so you write the spread like this: spread = GLD - (+1.63) * GDX, which is equal to spread = GLD - 1.63 * GDX.
If you decide to go long on the spread, this implies that you will go long on GLD and short sell GDX: you will buy one share of GLD and short sell 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you buy back your 1.63 GDX shares.
We go long the spread when the zScore falls below the lower entry threshold (zScore < -entryZscore) and short the spread when it rises above the upper one (zScore > entryZscore), no matter what value the hedge ratio has. That's why I told you that the hedge ratio computation is a step done before the entryZscore setting. The hedge ratio value doesn't depend on the entryZscore.
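The two long-spread cases in (3) can be sketched in a few lines. This assumes the convention used above, spread = GLD - hedge_ratio * GDX, with positive quantities meaning buy and negative meaning sell. The function name long_spread_orders is hypothetical, and fractional shares are kept only to mirror the example.

```python
def long_spread_orders(hedge_ratio):
    """Entry and square-off orders (in shares) for going long the spread.

    Positive quantity = buy, negative quantity = sell or short sell.
    """
    # Long spread = +1 * GLD - hedge_ratio * GDX.
    entry = {"GLD": 1.0, "GDX": -hedge_ratio}
    # Squaring off reverses every leg of the entry.
    square_off = {name: -qty for name, qty in entry.items()}
    return entry, square_off


for h in (-1.63, 1.63):
    entry, square_off = long_spread_orders(h)
    print(f"hedge ratio {h:+.2f}: enter {entry}, square off {square_off}")
```

With a negative hedge ratio both entry legs are buys; with a positive one the GDX leg is a short sale.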
I hope this piece of information helps,
Thanks,
José Carlos
What happens when we go short on the spread (the zScore goes above the entryZscore)?
Hello Jane,
Let me answer your question with our above example.
When the hedge ratio is 1.63, spread = GLD - hedge_ratio * GDX, so you write the spread like this: spread = GLD - (+1.63) * GDX, which is equal to spread = GLD - 1.63 * GDX.
If you decide to go long on the spread, this implies that you will go long on GLD and short sell GDX: you will buy one share of GLD and short sell 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you buy back your 1.63 GDX shares.
When you go short the spread, this implies that you will go short on GLD and buy GDX: you will short sell one share of GLD and buy 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you buy back your GLD share and you sell your 1.63 GDX shares.
Note: I repeated the text on going long the spread so you can easily compare both cases.
Great to help you Jane,
Thanks,
José Carlos
Can you use: When hedge ratio is -1.63,
As an example?
Hello Jane,
Sure, let's explain:
When the hedge ratio is -1.63, spread = GLD - hedge_ratio * GDX, so you write the spread like this: spread = GLD - (-1.63) * GDX, which is equal to spread = GLD + 1.63 * GDX.
If you decide to go long on the spread, this implies that you will go long on both GLD and GDX: you will buy one share of GLD and buy 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you sell your GLD share and you sell your 1.63 GDX shares.
When you go short the spread, this implies that you will short sell both GLD and GDX: you will short sell one share of GLD and short sell 1.63 shares of GDX. If you want to close the position, either because the stop loss triggers or the exit condition triggers, then you just square off the position, i.e., you buy back your GLD share and you buy back your 1.63 GDX shares.
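Putting the long and short cases together, the four combinations asked about in this thread can be tabulated with a short sketch. It assumes spread = GLD - hedge_ratio * GDX, long entry when zScore < -entryZscore, and short entry when zScore > entryZscore. The name entry_positions and the sample z-scores are illustrative only.

```python
def entry_positions(hedge_ratio, z_score, entry_z):
    """Shares to hold in each leg; positive = long, negative = short."""
    if z_score < -entry_z:
        side = 1       # long the spread
    elif z_score > entry_z:
        side = -1      # short the spread
    else:
        # Inside the band: no entry.
        return {"GLD": 0.0, "GDX": 0.0}
    # Spread = 1 * GLD - hedge_ratio * GDX, scaled by the side taken.
    return {"GLD": side * 1.0, "GDX": side * -hedge_ratio}


for h in (-1.63, 1.63):
    for z in (-1.0, 1.0):
        print(f"hedge ratio {h:+.2f}, zScore {z:+.1f}: {entry_positions(h, z, entry_z=0.25)}")
```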
I hope this helps,
Thanks,
José Carlos