BaggingRegressor

Hi, I am trying to use decision trees in trading and I am working with BaggingRegressor. I try to fit the model like this:

bagging_reg.fit(x_train,y_train)

but it throws this error:

 raise ValueError(msg_err.format(type_err, X.dtype))



ValueError: Input contains NaN, infinity or a value too large for dtype('float32').



My full code is below:



import pandas as pd

import numpy as np

#%%

#IMPORT DATA AND DROP UNUSED COLUMNS



data=pd.read_csv("AAPL.csv")

data.info()

data.drop(["Adj_Volume","Adj_Low","Adj_High","Adj_Open","Split","Dividend"],axis=1,inplace=True)



#%% 

#DEFINE PREDICTOR VARIABLES AND A TARGET VARIABLE



#RETURNS



data["ret1"]=data.Adj_Close.pct_change()



data["ret5"]=data.ret1.rolling(5).sum()

data["ret10"]=data.ret1.rolling(10).sum()

data["ret20"]=data.ret1.rolling(20).sum()

data["ret40"]=data.ret1.rolling(40).sum()



#STANDARD DEVIATION



data["std5"]=data.ret1.rolling(5).std()

data["std10"]=data.ret1.rolling(10).std()

data["std20"]=data.ret1.rolling(20).std()

data["std40"]=data.ret1.rolling(40).std()



#The target variable is the future return, so we use the shift() function.

data["retFut1"]=data.ret1.shift(-1)



#Drop nan

data.dropna()

predictor_list=["ret5","ret10","ret20","ret40","std5","std10","std20","std40","ret1","Volume"]

x=data[predictor_list]

y=data.retFut1

#SPLIT DATA

train_lenght=(int(len(data)*0.8))

x_train=x[:train_lenght]

x_test=x[train_lenght:]

y_train=y[:train_lenght]

y_test=y[train_lenght:]



#CREATE REGRESSION MODEL

#Base estimator: each bootstrap subset is fit with its own decision tree regressor

from sklearn.tree import DecisionTreeRegressor



#Import the BaggingRegressor

from sklearn.ensemble import BaggingRegressor



#Set the random seed used when sampling each subset

seed=42



#Create our BaggingRegressor model

bagging_reg=BaggingRegressor(base_estimator=DecisionTreeRegressor(min_samples_leaf=400),

                             n_estimators=10,

                             random_state=seed)

bagging_reg

#Fit

bagging_reg.fit(x_train,y_train)



How can I handle it? Thanks.

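It looks like there are NaN values in the data you are passing to the bagging algorithm. data.dropna() returns a new DataFrame and leaves data unchanged, so the NaN rows are still there when you build x and y. You can change the code as below to remove the NaN values. Thanks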



#Drop nan


data.dropna(inplace=True)
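
To see the difference between the two dropna calls, here is a minimal sketch. The toy prices are made up and only stand in for the Adj_Close column of AAPL.csv; the column names follow the question's code, and the finite-value check at the end is an optional extra, not part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy prices standing in for the Adj_Close column of AAPL.csv
data = pd.DataFrame({"Adj_Close": [100.0, 101.0, 99.0, 102.0, 103.0, 101.5]})
data["ret1"] = data.Adj_Close.pct_change()   # first row becomes NaN
data["retFut1"] = data.ret1.shift(-1)        # last row becomes NaN

# dropna() without inplace=True returns a new DataFrame; data is unchanged
cleaned = data.dropna()
rows_before = len(data)
nan_before = int(data.isna().sum().sum())

# Either reassign the result (data = data.dropna()) or pass inplace=True
data.dropna(inplace=True)
nan_after = int(data.isna().sum().sum())

# Optional sanity check before fit: every remaining value should be finite
all_finite = bool(np.isfinite(data.to_numpy()).all())
print(rows_before, len(cleaned), len(data), nan_before, nan_after, all_finite)
```

If the finite check still fails after dropping NaN, the likely cause is an infinite value, which the error message also mentions.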

Thanks for your response. I used your code and the number of NaN values dropped to zero. Now the fit call, bagging_reg.fit(x_train,y_train), does not throw any error. But after that, when I run the code to visualize the model:

from sklearn import tree

import graphviz

dot_data = tree.export_graphviz(bagging_reg, 

                                out_file=None, 

                                filled=True,   

                                feature_names=predictor_list)  

graphviz.Source(dot_data) 



it throws:    raise NotFittedError(msg % {'name': type(estimator).__name__})



NotFittedError: This BaggingRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.



I really don't understand. Why do I get an error even though I typed the code correctly?

Could you please share your full code with data on quantra@quantinsti.com? Thanks

OK, I will send it.

Did you get my email?

Hello Aytac,

 
Your code looks fine. There is just one mistake at the end.
 
The random_subspace object is an ensemble of decision trees (a forest) built by the BaggingRegressor class.
The sklearn function tree.export_graphviz accepts only a single decision tree, so it cannot print an entire forest. The NotFittedError is misleading here: the model is fitted, but it is not a single tree.
 
So you will have to either drop the idea or print the individual decision trees in the ensemble. They are available through the random_subspace.estimators_ attribute, which is a list of all the fitted decision trees. You can print one of them like this:
 
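# random_subspace is the fitted BaggingRegressor; estimators_[0] is its first tree
dot_data = tree.export_graphviz(random_subspace.estimators_[0],
                                out_file=None,
                                filled=True,
                                feature_names=predictor_list)
graphviz.Source(dot_data)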
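
To export every tree rather than only the first, the same idea can be sketched as a loop. This is a minimal example under stated assumptions, not the code from the thread: the random data, the feature names f0 to f2, and min_samples_leaf=20 are made up for illustration. The base tree is passed positionally because its keyword name differs between scikit-learn versions (base_estimator in older releases, estimator in newer ones).

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor, export_graphviz

# Hypothetical random data standing in for x_train / y_train
rng = np.random.RandomState(42)
X = rng.normal(size=(200, 3))
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=200)
feature_names = ["f0", "f1", "f2"]  # illustrative names only

# Base tree passed positionally to avoid the version-dependent keyword name
model = BaggingRegressor(DecisionTreeRegressor(min_samples_leaf=20),
                         n_estimators=10,
                         random_state=42)
model.fit(X, y)

# estimators_ holds the fitted trees; export each one as its own DOT string
dot_strings = [
    export_graphviz(t, out_file=None, filled=True, feature_names=feature_names)
    for t in model.estimators_
]
print(len(dot_strings))
```

Each DOT string can then be passed to graphviz.Source one at a time, as in the question's code. Note that this assumes the default max_features=1.0, so every tree sees all the features and the full feature name list applies.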