Hello everybody,
I tried to backtest the ML SVC Classifier strategy, but it generates no signals.
It is originally configured to run on the minute timeframe and rebalance every hour.
I am trying to get the same strategy, or a similar ML classification strategy, to run on daily history (1D) and actually generate signals.
Can anyone please help?
Have a nice weekend
Hi there,
Could you please provide more information about the platform (Blueshift or IBridgePy) you are using to backtest your ML SVC Classifier strategy? This will help us provide a more tailored response to your question.
Thank you.
Hi
I am using the Blueshift backtester.
Kind Regards
Hi,
In order to generate signals, we will have to tweak the Blueshift code and configure it to run on daily data.
1. First, edit the schedule functions. Since we are now working with daily data, we can remove the for loop and place the schedule_function calls directly outside it. The code will look like this:
# Schedule the retrain_model function every week
schedule_function(
    retrain_model,
    date_rule=date_rules.week_start(),
    time_rule=time_rules.market_open()
)

# Schedule the rebalance function to run once every day
schedule_function(
    rebalance,
    date_rule=date_rules.every_day(),
    time_rule=time_rules.market_open()
)

# Schedule the end_of_day_squareoff function every day
schedule_function(
    end_of_day_squareoff,
    date_rule=date_rules.every_day(),
    time_rule=time_rules.market_close(minutes=5)
)
2. In the rebalance function, we need to take care of two things:
a. Change '1m' to '1D' in the try block, since we are now using daily data.
b. Tweak the 'if' condition by replacing context.data_lookback with context.lookback and multiplying it by 1.1. Multiplying by 1.1 creates a buffer, or margin, to accommodate any missing data points in the DataFrame Df. For example, if context.lookback is 100, the function will return early only when the DataFrame has fewer than 110 rows.
Thus, the code will look like this:
def rebalance(context, data):
    """
    A function to rebalance the portfolio. This function is called by the
    schedule_function in the initialize function.
    """
    # Fetch lookback no. of days of data for the security
    try:
        Df = data.history(
            context.security,
            ['open', 'high', 'low', 'close', 'volume'],
            context.data_lookback,
            '1D')
        Df = Df[Df.volume != 0]
        if len(Df) < context.lookback * 1.1:
            return
    except IndexError:
        return
Note: You may also edit the variables context.lookback and context.data_lookback as per your requirements to generate the signals.
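To see concretely how the zero-volume filter and the 1.1 buffer interact, here is a small standalone sketch in plain pandas (no Blueshift; the row counts and prices are made up for illustration):

```python
import pandas as pd
import numpy as np

lookback = 20          # context.lookback: bars needed by the indicators
data_lookback = 100    # context.data_lookback: bars requested from history

# Fake daily history: 100 rows, every 10th day has zero volume (e.g. a halt)
df = pd.DataFrame({
    'close': np.linspace(100, 110, data_lookback),
    'volume': [0 if i % 10 == 0 else 1000 for i in range(data_lookback)],
})

# The strategy drops zero-volume rows before the length check
df = df[df.volume != 0]

print(len(df))                      # 90 rows survive the filter
print(len(df) >= lookback * 1.1)    # True: still comfortably above the buffer
```

Because data_lookback is much larger than lookback, losing a few rows to the filter never trips the guard.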
Hope this helps. Please feel free to reach out in case of any further query.
Thanks.
thanks a lot!
I tried modifying my code, but I think I am still doing it wrong:
# Machine learning libraries
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import TimeSeriesSplit

# Import talib
import talib as ta

# Import Pipeline
from sklearn.pipeline import Pipeline

# Import numpy and pandas
import pandas as pd
import numpy as np

# Import blueshift libraries
from blueshift.api import (
    symbol,
    order_target_percent,
    schedule_function,
    date_rules,
    time_rules,
    get_datetime
)


# The strategy requires context.lookback number of days.
def initialize(context):
    # Define symbols
    context.security = symbol('SPY')

    # The lookback for the indicators
    context.lookback = 20

    # The lookback for historical data
    context.data_lookback = 1000

    # The variable to store the randomised cross-validation search
    context.rcv = None

    # The variable to store the classifier
    context.cls = None

    # Instantiate the StandardScaler
    context.ss1 = StandardScaler()

    # The variable to store the model accuracy
    context.accuracy = 0

    # The parameters for SAR
    context.acceleration = 0.2
    context.maximum = 0.2

    # The variable to store the pipeline steps
    context.steps = [('scaler', StandardScaler()), ('svc', SVC())]

    # The variable to store the Pipeline function
    context.pipeline = Pipeline(context.steps)

    # Test values for the 'C' and 'gamma' parameters of the SVC
    context.c = [10, 100, 1000, 10000]
    context.g = [1e-2, 1e-1, 1e0]

    # Initialize the parameters for the SVC
    context.parameters = {'svc__C': context.c,
                          'svc__gamma': context.g,
                          'svc__kernel': ['rbf']
                          }

    # The train-test split
    context.split_ratio = 0.75

    # The flag variable is used to check whether to retrain the model or not
    context.retrain_flag = True

    # Schedule the retrain_model function every month
    schedule_function(
        retrain_model,
        date_rule=date_rules.month_start(),
        time_rule=time_rules.market_open()
    )

    # Schedule the rebalance function to run once every day
    schedule_function(
        rebalance,
        date_rule=date_rules.every_day(),
        time_rule=time_rules.market_open()
    )

    # Schedule the end_of_day_squareoff function every day
    schedule_function(
        end_of_day_squareoff,
        date_rule=date_rules.every_day(),
        time_rule=time_rules.market_close(minutes=5)
    )


def retrain_model(context, data):
    """
    A function to retrain the classification model. This function is called by
    the schedule_function in the initialize function.
    """
    context.retrain_flag = True


def end_of_day_squareoff(context, data):
    """
    A function to square-off trades at the end of the day. This function is
    called by the schedule_function in the initialize function.
    """
    order_target_percent(context.security, 0)


def rebalance(context, data):
    """
    A function to rebalance the portfolio. This function is called by the
    schedule_function in the initialize function.
    """
    # Fetch lookback no. of days of data for the security
    try:
        Df = data.history(
            context.security,
            ['open', 'high', 'low', 'close', 'volume'],
            context.lookback,
            '1D')
        Df = Df[Df.volume != 0]
        if len(Df) < context.lookback * 1.1:
            return
    except IndexError:
        return

    # Drop the rows with zero volume traded
    Df = Df.drop(Df[Df['volume'] == 0].index)

    # Calculate the RSI
    Df['RSI'] = ta.RSI(
        np.array(Df['close'].shift(1)),
        timeperiod=context.lookback
    )

    # Calculate the SMA
    Df['SMA'] = Df['close'].shift(1).rolling(window=context.lookback).mean()

    # Calculate the correlation
    # Df['Corr'] = Df['close'].shift(1).rolling(
    #     window=context.lookback).corr(Df['SMA'].shift(1))

    # Create a column by name, SAR and assign the SAR calculation to it
    # Df['SAR'] = ta.SAR(
    #     np.array(Df['high'].shift(1)),
    #     np.array(Df['low'].shift(1)),
    #     context.acceleration,
    #     context.maximum
    # )

    # Create a column by name, ADX and assign the ADX calcul
Hi,
You can replace context.lookback in the data.history call with context.data_lookback.
try:
    Df = data.history(
        context.security,
        ['open', 'high', 'low', 'close', 'volume'],
        context.data_lookback,
        '1D')
If you use context.lookback here, then the following "if" condition
if len(Df) < context.lookback*1.1:
would always evaluate to True and hence execute the return statement. The "if" logic is there to make sure that the number of rows in the DataFrame Df is at least 110% of the specified context.lookback, i.e. that the DataFrame has enough data points for the further computations and indicator calculations.
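As a quick numeric illustration of why the guard always fires when the same variable is used in both places (plain Python, values taken from the strategy above):

```python
lookback = 20

# Bug: history fetched with context.lookback rows, so at most 20 come back
rows_fetched = lookback
print(rows_fetched < lookback * 1.1)   # True: rebalance always returns early

# Fix: history fetched with context.data_lookback rows instead
data_lookback = 1000
print(data_lookback < lookback * 1.1)  # False: the computation proceeds
```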
Hope this helps!
Thanks,
Akshay
Thanks Akshay, now I have signals!
The problem is that it gives really poor results and a constantly negative equity curve…
Hi,
You can try optimising the SVC hyperparameters and indicator parameters to improve the results and the equity curve. You can refer to this blog to know more about hyperparameter optimisation.
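For instance, since the strategy already imports GridSearchCV and TimeSeriesSplit, a tuning pass over the existing pipeline could be sketched like this. The features and labels below are synthetic placeholders, not the strategy's actual indicator data:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic stand-in for the strategy's indicator features and up/down labels
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2))                   # e.g. RSI- and SMA-based features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # placeholder target

# Same pipeline and parameter grid as in the strategy's initialize function
pipeline = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
parameters = {
    'svc__C': [10, 100, 1000, 10000],
    'svc__gamma': [1e-2, 1e-1, 1e0],
    'svc__kernel': ['rbf'],
}

# TimeSeriesSplit keeps each train fold strictly before its test fold in time,
# which avoids look-ahead bias when tuning on market data
cv = TimeSeriesSplit(n_splits=5)
search = GridSearchCV(pipeline, parameters, cv=cv, scoring='accuracy')
search.fit(X, y)
print(search.best_params_)
```

The best parameters found by the search can then be fed back into context.pipeline before retraining in retrain_model.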
Thanks,
Akshay