Machine Learning: Regression Section 3 Unit 22 | reg.fit(X_train, yU_train) returns error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[16], line 2
      1 # We call the fit function of the model and pass the X_train and yU_train datasets
----> 2 reg.fit(X_train, yU_train)

File ~\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\base.py:1474, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
   1467     estimator._validate_params()
   1469 with config_context(
   1470     skip_parameter_validation=(
   1471         prefer_skip_nested_validation or global_skip_validation
   1472     )
   1473 ):
-> 1474     return fit_method(estimator, *args, **kwargs)

File ~\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\model_selection\_search.py:970, in BaseSearchCV.fit(self, X, y, **params)
    964     results = self._format_results(
    965         all_candidate_params, n_splits, all_out, all_more_results
    966     )
    968     return results
--> 970 self._run_search(evaluate_candidates)
    972 # multimetric is determined here because in the case of a callable
    973 # self.scoring the return type is only known after calling
    974 first_test_score = all_out[0]["test_scores"]

File ~\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\model_selection\_search.py:1527, in GridSearchCV._run_search(self, evaluate_candidates)
   1525 def _run_search(self, evaluate_candidates):
   1526     """Search all candidates in param_grid"""
-> 1527     evaluate_candidates(ParameterGrid(self.param_grid))

File ~\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\model_selection\_search.py:947, in BaseSearchCV.fit.<locals>.evaluate_candidates(candidate_params, cv, more_results)
    940 elif len(out) != n_candidates * n_splits:
    941     raise ValueError(
    942         "cv.split and cv.get_n_splits returned "
    943         "inconsistent results. Expected {} "
    944         "splits, got {}".format(n_splits, len(out) // n_candidates)
    945     )
--> 947 _warn_or_raise_about_fit_failures(out, self.error_score)
    949 # For callable self.scoring, the return type is only know after
    950 # calling. If the return type is a dictionary, the error scores
    951 # can now be inserted with the correct key. The type checking
    952 # of out will be done in `_insert_error_scores`.
    953 if callable(self.scoring):

File ~\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\model_selection\_validation.py:536, in _warn_or_raise_about_fit_failures(results, error_score)
    529 if num_failed_fits == num_fits:
    530     all_fits_failed_message = (
    531         f"\nAll the {num_fits} fits failed.\n"
    532         "It is very likely that your model is misconfigured.\n"
    533         "You can try to debug the error by setting error_score='raise'.\n\n"
    534         f"Below are more details about the failures:\n{fit_errors_summary}"
    535     )
--> 536     raise ValueError(all_fits_failed_message)
    538 else:
    539     some_fits_failed_message = (
    540         f"\n{num_failed_fits} fits failed out of a total of {num_fits}.\n"
    541         "The score on these train-test partitions for these parameters"
   (...)
    545         f"Below are more details about the failures:\n{fit_errors_summary}"
    546     )

ValueError: 
All the 10 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
5 fits failed with the following error:
Traceback (most recent call last):
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\model_selection\_validation.py", line 895, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\base.py", line 1474, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\pipeline.py", line 475, in fit
    self._final_estimator.fit(Xt, y, **last_step_params["fit"])
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\base.py", line 1467, in wrapper
    estimator._validate_params()
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\base.py", line 666, in _validate_params
    validate_parameter_constraints(
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\utils\_param_validation.py", line 95, in validate_parameter_constraints
    raise InvalidParameterError(
sklearn.utils._param_validation.InvalidParameterError: The 'fit_intercept' parameter of LinearRegression must be an instance of 'bool' or an instance of 'numpy.bool_'. Got 0 instead.

--------------------------------------------------------------------------------
5 fits failed with the following error:
Traceback (most recent call last):
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\model_selection\_validation.py", line 895, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\base.py", line 1474, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\pipeline.py", line 475, in fit
    self._final_estimator.fit(Xt, y, **last_step_params["fit"])
  File "C:\Users\anilh\anaconda3\envs\mql5Python\Lib\site-packages\sklearn\base.py", line 1467, in wrapper
    e

Hi Anil,



The notebook is working fine on the portal and my local system. Can you please share the pandas, numpy and sklearn library versions that you are using on your system?
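
If it helps, here is a minimal sketch for collecting those versions in one go. It relies only on the standard library's importlib.metadata, so a missing package is reported rather than causing a failure; the helper name installed_version is purely illustrative.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or a note if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


# Start with the interpreter version, then add each library of interest
versions = {"python": sys.version.split()[0]}
for package in ("pandas", "numpy", "scikit-learn"):
    versions[package] = installed_version(package)

for name, version in versions.items():
    print(f"{name:<14}{version}")
```

Run it inside the same conda environment as the notebook, otherwise the reported versions may differ from the ones the notebook actually uses.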

 

Hi Akshay



Yeah, most of the heavily discounted courses will run well if we recreate the same environment as suggested in the course.



But that is not practical :slight_smile: as the software versions have changed a lot since then.



Below are the required details :



Python          3.12.2
Pandas          2.2.2
NumPy           1.26.4
scikit-learn    1.4.2



I am using Anaconda with the mql5Python environment.



Is there any possibility of MQL support? Most of the classes used in the course have no direct equivalents in MQL5.



Thanks for your reply.

Hi Anil,



As such, we do not provide support for MQL. However, you can refer to this article on MT5 if you haven't already. Also, please do let us know if you have any specific queries regarding MQL, and we will try to assist you with the same. 

Hi Akshay



Thanks for the link to the article on MQL5 and Python, though I was already aware of most of the points in it.



HOWEVER, my original Python course issue remains unresolved.



After waiting 3-4 days for a solution, I am sure anyone would lose interest in the course.



I am highly discouraged from registering for the EPAT programme if this is the speed of support from QuantInsti.

Hi Anil,



As you pointed out, library versions change a lot; hence, it is an essential skill for an algorithmic trader to be able to debug the issues that arise from these frequent updates. Let us work through this using the error you have posted. The key line of the error says:

 

sklearn.utils._param_validation.InvalidParameterError: The 'fit_intercept' parameter of LinearRegression must be an instance of 'bool' or an instance of 'numpy.bool_'. Got 0 instead.

Now, let us try to approach this in the most simplistic manner.

What can we infer from the error statement here?

It says that the "fit_intercept" parameter that we passed as a hyperparameter to the model was expected to be 'bool', but it was 0 instead. This means that something unexpected was passed to the model.

Now, there are two questions. First, did we pass something incorrect to the model?

Not necessarily, because this parameter value worked fine in earlier library versions. So what might have happened is that something changed in one of the libraries, which is now causing the error.

The second question is what a "bool" is. A bool is simply a datatype whose value is either True or False. So, what I understand from this is that I need to pass True or False to the model instead of 0/1.
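
A quick check in plain Python illustrates the distinction. This is only a sketch of the idea, not scikit-learn's actual validation code: True and False compare equal to 1 and 0, but the integers 0 and 1 are not instances of bool, which is presumably why a strict type check rejects them.

```python
# bool is a subclass of int, so True and 1 compare as equal...
print(issubclass(bool, int))  # True
print(True == 1)              # True

# ...but the integers 0 and 1 are not themselves instances of bool
candidates = [0, 1, False, True]
is_bool = {repr(value): isinstance(value, bool) for value in candidates}
print(is_bool)  # {'0': False, '1': False, 'False': True, 'True': True}
```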

Now, how do we pass this? Reading the markdown in the notebook, we find the explanation, which states:
"We will use the fit_intercept function which can tell us whether to calculate the intercept for this model or not. This is a boolean function and hence can only return 0 or 1. If our result is 0, it means that the model performs better without the intercept. If the result is 1, it means an intercept needs to be modelled for better results."  

From the above statement, what I infer is that we need to replace:
 
parameters = {'linear__fit_intercept': [0, 1]}
with
 
parameters = {'linear__fit_intercept': [False, True]}
and this can be a tentative solution to the issue we are facing.
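
To see the change in context, here is a minimal, self-contained sketch. The data is synthetic, the scaler step is a hypothetical stand-in for whatever preprocessing the course notebook uses, and cv=5 is an assumption inferred from the "10 fits" in the error (2 candidates x 5 folds). Only the step name 'linear' and the parameter grid come from the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: the notebook builds its own features)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))
yU_train = X_train @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=100)

# The step name 'linear' must match the prefix in 'linear__fit_intercept'
pipeline = Pipeline([
    ("scaler", StandardScaler()),  # hypothetical preprocessing step
    ("linear", LinearRegression()),
])

# Booleans instead of 0/1, as the error message asks for
parameters = {"linear__fit_intercept": [False, True]}

reg = GridSearchCV(pipeline, parameters, cv=5)
reg.fit(X_train, yU_train)
print(reg.best_params_)
```

Because the synthetic target includes a constant offset, the search should prefer fit_intercept=True here; on real data either value may win.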

In case this doesn't solve the issue, I would dive a bit deeper into it by checking the recent changes in the library update or going through StackOverflow, as sometimes there are active threads about some common issues.
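
One concrete way to dig deeper is the hint in the error message itself: setting error_score='raise' makes GridSearchCV stop at the first failing fit and surface the original exception instead of the long summary. The sketch below uses toy data and deliberately keeps the old [0, 1] grid to reproduce the problem. On older scikit-learn versions without the strict type check, the fit may simply succeed and message stays None.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data, only used to trigger the fit
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=50)

pipeline = Pipeline([("linear", LinearRegression())])
old_grid = {"linear__fit_intercept": [0, 1]}  # the grid that caused the error

# error_score='raise' re-raises the first fit failure as-is
search = GridSearchCV(pipeline, old_grid, cv=5, error_score="raise")

try:
    search.fit(X, y)
    message = None  # no failure on this library version
except ValueError as exc:
    message = str(exc)

print(message)
```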

I have just stated the thought process I would follow to debug this kind of issue. I hope you find the approach helpful in debugging any similar issues that you might face.

Another way to avoid this issue is to use the quantra_py environment as much as possible, as we have created it with stable library versions and their dependencies.

Dear Akshay

Thanks for your very well explained solution. I have tried it, and it currently runs without error.

It was a bit delayed though :slight_smile:

I like the way you explained the problem, as it helped me understand how to look for solutions.

About the quantra_py environment: I understand that, when using Anaconda, I should make sure that I am using stable versions only.

I also realised the importance of keeping markdown notes in my files to help with debugging.

Thanks a lot, and I hope to see my first algo working soon in demo and real life, with the help of the Quantra course and support from staff like you.

regards