Hi, I am trying to use a decision tree in trading and I am working with BaggingRegressor. I try to fit it like this:
bagging_reg.fit(x_train,y_train)
but it throws this error:
raise ValueError(msg_err.format(type_err, X.dtype))
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
I am posting all my code below:
import pandas as pd
import numpy as np
#%%
#IMPORT DATA AND DROP UNUSED COLUMNS
data=pd.read_csv("AAPL.csv")
data.info()
data.drop(["Adj_Volume","Adj_Low","Adj_High","Adj_Open","Split","Dividend"],axis=1,inplace=True)
#%%
#DEFINE PREDICTOR VARIABLES AND A TARGET VARIABLE
#RETURNS
data["ret1"]=data.Adj_Close.pct_change()
data["ret5"]=data.ret1.rolling(5).sum()
data["ret10"]=data.ret1.rolling(10).sum()
data["ret20"]=data.ret1.rolling(20).sum()
data["ret40"]=data.ret1.rolling(40).sum()
#STANDARD DEVIATION
data["std5"]=data.ret1.rolling(5).std()
data["std10"]=data.ret1.rolling(10).std()
data["std20"]=data.ret1.rolling(20).std()
data["std40"]=data.ret1.rolling(40).std()
#Target variable is going to be the future return, so we use the shift() function.
data["retFut1"]=data.ret1.shift(-1)
#Drop nan
data.dropna()
predictor_list=["ret5","ret10","ret20","ret40","std5","std10","std20","std40","ret1","Volume"]
x=data[predictor_list]
y=data.retFut1
#SPLIT DATA
train_lenght=(int(len(data)*0.8))
x_train=x[:train_lenght]
x_test=x[train_lenght:]
y_train=y[:train_lenght]
y_test=y[train_lenght:]
#CREATE REGRESSION MODEL
#Base estimator: remember, first of all a decision tree regressor is fit on each subset
from sklearn.tree import DecisionTreeRegressor
#Import the BaggingRegressor
from sklearn.ensemble import BaggingRegressor
#Let's set the number of samples for each subset
seed=42
#Let's create our BaggingRegressor model
bagging_reg=BaggingRegressor(base_estimator=DecisionTreeRegressor(min_samples_leaf=400),
n_estimators=10,
random_state=seed)
bagging_reg
#Fit
bagging_reg.fit(x_train,y_train)
How can I handle it? Thanks.
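It looks like there are NaN values in the data you are passing to the Bagging algo. Calling data.dropna() on its own returns a new DataFrame and leaves data unchanged, so the NaN rows are still there when you fit. You can change the code as below to remove the NaN values. Thanks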
#Drop nan
data.dropna(inplace=True)
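To see the difference, here is a minimal sketch. The toy prices are made up and only stand in for the Adj_Close column of AAPL.csv; the variable names nan_before and nan_after are just for illustration.

```python
import pandas as pd

# Hypothetical toy prices standing in for the Adj_Close column of AAPL.csv
data = pd.DataFrame({"Adj_Close": [100.0, 101.0, 99.0, 102.0, 103.0, 101.5]})
data["ret1"] = data.Adj_Close.pct_change()   # first row has no previous price -> NaN
data["retFut1"] = data.ret1.shift(-1)        # last row has no next return -> NaN

data.dropna()                                # returns a new frame; `data` itself is unchanged
nan_before = int(data.isna().sum().sum())

data = data.dropna()                         # same effect as data.dropna(inplace=True)
nan_after = int(data.isna().sum().sum())

print(nan_before, nan_after)
```

Checking data.isna().sum() right before building x and y is a quick way to confirm that nothing missing reaches fit.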
Thanks for your response. I used your code and the NaN count went down to zero. Now when I run the fit code, bagging_reg.fit(x_train,y_train), it doesn't throw anything. But after that, when I run the Visualize The Model code:
from sklearn import tree
import graphviz
dot_data = tree.export_graphviz(bagging_reg,
out_file=None,
filled=True,
feature_names=predictor_list)
graphviz.Source(dot_data)
it throws:
raise NotFittedError(msg % {'name': type(estimator).__name__})
NotFittedError: This BaggingRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
I really don't understand. Why do I get an error even though I typed the code correctly?
Could you please share your full code with data on quantra@quantinsti.com? Thanks
OK, I will send it.
Did you get my email?
Hello Aytac,
out_file=None,
filled=True,
feature_names=predictor_list)
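A likely cause of the NotFittedError is that tree.export_graphviz expects a single fitted decision tree, while bagging_reg is an ensemble of trees. After fitting, the individual trees are available in bagging_reg.estimators_, so one of them can be exported instead. The sketch below assumes that is the issue here; it uses random toy data and a hypothetical three-column feature list rather than the AAPL data, and a smaller min_samples_leaf so the toy tree actually splits.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor, export_graphviz

# Hypothetical toy data in place of x_train / y_train
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=200)
feature_names = ["ret1", "ret5", "std5"]  # illustrative subset of predictor_list

# The base tree is passed positionally so the call does not depend on the
# keyword name (base_estimator in older scikit-learn, estimator in newer).
bagging_reg = BaggingRegressor(DecisionTreeRegressor(min_samples_leaf=20),
                               n_estimators=10,
                               random_state=42)
bagging_reg.fit(X, y)

# Export one fitted tree from the ensemble, not the ensemble itself
first_tree = bagging_reg.estimators_[0]
dot_data = export_graphviz(first_tree,
                           out_file=None,
                           filled=True,
                           feature_names=feature_names)
print(dot_data[:60])
```

The resulting dot_data string can then be passed to graphviz.Source(dot_data) as in the original code. To look at the other trees, loop over bagging_reg.estimators_.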