Hello everybody,
I tried to backtest the ML SVC Classifier strategy, but it generates no signals.
It is originally configured to run on the minute timeframe and rebalance every hour.
I am trying to get the same strategy, or a similar ML classification strategy, to run on daily history (1D) and actually generate signals.
Can anyone please help?
Have a nice weekend
Hi there,
Could you please provide more information about the platform (Blueshift or IBridgePy) you are using to backtest your ML SVC Classifier strategy? This will help us provide a more tailored response to your question.
Thank you.
Hi
I am using the Blueshift backtester.
Kind Regards
Hi,
In order to generate signals, we will have to tweak the Blueshift code and configure it to run on daily data.
1. First, edit the schedule functions. Since we are now working with daily data, we can remove the for loop and place the schedule_function calls directly outside it. The code will look like this:
# Schedule the retrain_model function every week
schedule_function(
    retrain_model,
    date_rule=date_rules.week_start(),
    time_rule=time_rules.market_open()
)

# Schedule the rebalance function to run once every day
schedule_function(
    rebalance,
    date_rule=date_rules.every_day(),
    time_rule=time_rules.market_open()
)

# Schedule the end_of_day_squareoff function every day
schedule_function(
    end_of_day_squareoff,
    date_rule=date_rules.every_day(),
    time_rule=time_rules.market_close(minutes=5)
)
2. In the rebalance function, we need to take care of two things:
a. Change '1m' to '1D' in the try block, since we are now using daily data.
b. Tweak the 'if' condition by replacing context.data_lookback with context.lookback and multiplying it by 1.1. Multiplying by 1.1 creates a buffer, or margin, to accommodate any missing data points in the DataFrame Df. For example, if context.lookback is 100, the function will return early only when the DataFrame has fewer than 110 rows.
Thus, the code will look like this:
def rebalance(context, data):
    """
    A function to rebalance the portfolio. This function is called by the
    schedule_function in the initialize function.
    """
    # Fetch lookback no. of days of data for the security
    try:
        Df = data.history(
            context.security,
            ['open', 'high', 'low', 'close', 'volume'],
            context.data_lookback,
            '1D')
        Df = Df[Df.volume != 0]
        if len(Df) < context.lookback * 1.1:
            return
    except IndexError:
        return
Note: You may also edit the variables context.lookback and context.data_lookback as per your requirements to generate the signals.
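To see concretely how the zero-volume filter and the 1.1 buffer interact, here is a small standalone sketch in plain pandas (no Blueshift; the row counts and prices are made up for illustration):

```python
import pandas as pd
import numpy as np

lookback = 20          # context.lookback: bars needed by the indicators
data_lookback = 100    # context.data_lookback: bars requested from history

# Fake daily history: 100 rows, every 10th day has zero volume (e.g. a halt)
df = pd.DataFrame({
    'close': np.linspace(100, 110, data_lookback),
    'volume': [0 if i % 10 == 0 else 1000 for i in range(data_lookback)],
})

# The strategy drops zero-volume rows before the length check
df = df[df.volume != 0]

print(len(df))                      # 90 rows survive the filter
print(len(df) >= lookback * 1.1)    # True: still comfortably above the buffer
```

Because data_lookback is much larger than lookback, losing a few rows to the filter never trips the guard.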
Hope this helps. Please feel free to reach out in case of any further query.
Thanks.
thanks a lot!
I tried modifying my code, but I think I am still doing it wrong:
# Machine learning libraries
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import TimeSeriesSplit

# Import talib
import talib as ta

# Import Pipeline
from sklearn.pipeline import Pipeline

# Import numpy and pandas
import pandas as pd
import numpy as np

# Import blueshift libraries
from blueshift.api import (
    symbol,
    order_target_percent,
    schedule_function,
    date_rules,
    time_rules,
    get_datetime
)


# The strategy requires context.lookback number of days.
def initialize(context):
    # Define symbols
    context.security = symbol('SPY')

    # The lookback for the indicators
    context.lookback = 20

    # The lookback for historical data
    context.data_lookback = 1000

    # The variable to store the randomised cross-validation search
    context.rcv = None

    # The variable to store the classifier
    context.cls = None

    # Instantiate the StandardScaler
    context.ss1 = StandardScaler()

    # The variable to store the model accuracy
    context.accuracy = 0

    # The parameters for SAR
    context.acceleration = 0.2
    context.maximum = 0.2

    # The variable to store the pipeline steps
    context.steps = [('scaler', StandardScaler()), ('svc', SVC())]

    # The variable to store the Pipeline function
    context.pipeline = Pipeline(context.steps)

    # Test values for the 'C' and 'gamma' parameters of the SVC
    context.c = [10, 100, 1000, 10000]
    context.g = [1e-2, 1e-1, 1e0]

    # Initialize the parameters for the SVC
    context.parameters = {'svc__C': context.c,
                          'svc__gamma': context.g,
                          'svc__kernel': ['rbf']
                          }

    # The train-test split
    context.split_ratio = 0.75

    # The flag variable is used to check whether to retrain the model or not
    context.retrain_flag = True

    # Schedule the retrain_model function every month
    schedule_function(
        retrain_model,
        date_rule=date_rules.month_start(),
        time_rule=time_rules.market_open()
    )

    # Schedule the rebalance function to run once every day
    schedule_function(
        rebalance,
        date_rule=date_rules.every_day(),
        time_rule=time_rules.market_open()
    )

    # Schedule the end_of_day_squareoff function every day
    schedule_function(
        end_of_day_squareoff,
        date_rule=date_rules.every_day(),
        time_rule=time_rules.market_close(minutes=5)
    )


def retrain_model(context, data):
    """
    A function to retrain the classification model. This function is called by
    the schedule_function in the initialize function.
    """
    context.retrain_flag = True


def end_of_day_squareoff(context, data):
    """
    A function to square-off trades at the end of the day. This function is
    called by the schedule_function in the initialize function.
    """
    order_target_percent(context.security, 0)


def rebalance(context, data):
    """
    A function to rebalance the portfolio. This function is called by the
    schedule_function in the initialize function.
    """
    # Fetch lookback no. of days of data for the security
    try:
        Df = data.history(
            context.security,
            ['open', 'high', 'low', 'close', 'volume'],
            context.lookback,
            '1D')
        Df = Df[Df.volume != 0]
        if len(Df) < context.lookback * 1.1:
            return
    except IndexError:
        return

    # Drop the rows with zero volume traded
    Df = Df.drop(Df[Df['volume'] == 0].index)

    # Calculate the RSI
    Df['RSI'] = ta.RSI(
        np.array(Df['close'].shift(1)),
        timeperiod=context.lookback
    )

    # Calculate the SMA
    Df['SMA'] = Df['close'].shift(1).rolling(window=context.lookback).mean()

    # Calculate the correlation
    # Df['Corr'] = Df['close'].shift(1).rolling(
    #     window=context.lookback).corr(Df['SMA'].shift(1))

    # Create a column by name, SAR and assign the SAR calculation to it
    # Df['SAR'] = ta.SAR(
    #     np.array(Df['high'].shift(1)),
    #     np.array(Df['low'].shift(1)),
    #     context.acceleration,
    #     context.maximum
    # )

    # Create a column by name, ADX and assign the ADX calcul
Hi,
You can replace context.lookback in the data.history call with context.data_lookback.
try:
    Df = data.history(
        context.security,
        ['open', 'high', 'low', 'close', 'volume'],
        context.data_lookback,
        '1D')
If you use context.lookback here, then the following "if" condition
if len(Df) < context.lookback*1.1:
would always evaluate to True and hence execute the return statement. The "if" logic is there to make sure that the number of rows in the DataFrame Df is at least 110% of the specified context.lookback, i.e. that the DataFrame has enough data points for the further computations and indicator calculations.
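As a quick numeric illustration of why the guard always fires when the same variable is used in both places (plain Python, values taken from the strategy above):

```python
lookback = 20

# Bug: history fetched with context.lookback rows, so at most 20 come back
rows_fetched = lookback
print(rows_fetched < lookback * 1.1)   # True: rebalance always returns early

# Fix: history fetched with context.data_lookback rows instead
data_lookback = 1000
print(data_lookback < lookback * 1.1)  # False: the computation proceeds
```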
Hope this helps!
Thanks,
Akshay
Thanks Akshay, now I have signals!
The problem is that it gives really poor results and a constantly negative equity curve…
Hi,
You can try optimising the SVC hyperparameters and indicator parameters to improve the results and the equity curve. You can refer to this blog to know more about hyperparameter optimisation.
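For instance, since the strategy already imports GridSearchCV and TimeSeriesSplit, a tuning pass over the existing pipeline could be sketched like this. The features and labels below are synthetic placeholders, not the strategy's actual indicator data:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic stand-in for the strategy's indicator features and up/down labels
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2))                   # e.g. RSI- and SMA-based features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # placeholder target

# Same pipeline and parameter grid as in the strategy's initialize function
pipeline = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
parameters = {
    'svc__C': [10, 100, 1000, 10000],
    'svc__gamma': [1e-2, 1e-1, 1e0],
    'svc__kernel': ['rbf'],
}

# TimeSeriesSplit keeps each train fold strictly before its test fold in time,
# which avoids look-ahead bias when tuning on market data
cv = TimeSeriesSplit(n_splits=5)
search = GridSearchCV(pipeline, parameters, cv=cv, scoring='accuracy')
search.fit(X, y)
print(search.best_params_)
```

The best parameters found by the search can then be fed back into context.pipeline before retraining in retrain_model.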
Thanks,
Akshay