Underfit Neural Network & Validation score (and loss curve) is not decreasing

Hi,



I made a small tweak to this notebook: I added a plot of the loss curve and the validation score to see whether our model is actually learning, with the code below

 

mlp = MLPClassifier(activation='logistic', hidden_layer_sizes=(5,),
                    random_state=42, solver='sgd', early_stopping=True)
# ... (data preparation and model fitting omitted) ...

plt.plot(mlp.loss_curve_, label='training loss')
plt.plot(mlp.validation_scores_, label='validation score')
plt.legend()
plt.show()


and the result looks like this, from which we can see that the model underfits.

First question: was this sample model intentionally made to underfit? Or correct us if we are doing this the wrong way.



Second question: we tried changing the asset to usdjpy, and the loss vs validation score graph we got is this. We tried tweaking the hyperparameters using GridSearchCV, but the validation score is not going down (i.e. the model is not learning). Can you share with us how we can make sure the validation score does decrease over time (i.e. the model is actually learning)?



Or, if you have a better solution than "ensure the model is learning by looking at the validation vs loss chart and checking that it converges to 0 over time", please do share.



Thanks!





Hello Dwi,


  1. An underfit model can usually be improved by tweaking its parameters. Beyond a certain point, though, performance stops improving; that is when different data frequencies or different features may be worth considering for the model.
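Since you already mentioned GridSearchCV, a minimal sketch of that parameter tweaking is below. It uses synthetic data from `make_classification`, and the grid values are illustrative, not a recommendation for your notebook:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the notebook's data.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Illustrative grid: wider layers, a different activation, and a few
# learning rates are common first things to vary for an underfit MLP.
grid = GridSearchCV(
    MLPClassifier(solver='sgd', early_stopping=True,
                  max_iter=500, random_state=42),
    param_grid={
        'hidden_layer_sizes': [(5,), (20,), (50,)],
        'activation': ['logistic', 'relu'],
        'learning_rate_init': [0.001, 0.01, 0.1],
    },
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

If the best cross-validated score plateaus no matter how wide the grid, that is the signal to revisit the features or the data frequency rather than the hyperparameters.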


  2. Could you please try without setting the random state? It is the seed for any randomisation and is usually set only to replicate results across runs. Another thing to explore is whether the data is sufficient for the model to learn; often, when dealing with lower-frequency data, the model does not get enough samples to learn from.
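One way to see how much the result depends on that seed is to refit with several different random states and compare the validation scores. A small sketch on synthetic data (the seed values are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the notebook's data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

final_scores = []
for seed in (0, 1, 2, 3, 4):
    mlp = MLPClassifier(hidden_layer_sizes=(5,), solver='sgd',
                        early_stopping=True, max_iter=500,
                        random_state=seed)
    mlp.fit(X, y)
    # validation_scores_ is populated because early_stopping=True.
    final_scores.append(max(mlp.validation_scores_))

print(final_scores)
```

If these scores vary a lot from seed to seed, the model is sensitive to initialisation, which often points to too little data or too weak a signal rather than to any single unlucky seed.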



    Also, there is no guaranteed framework to ensure the loss converges to 0. Nor would we want that behaviour, since it would be a strong indication of an overfitted model.
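    A healthier check than "loss reaches 0" is the gap between in-sample accuracy and the held-out validation score. A rough sketch on synthetic data (the in-sample score here includes the internal validation split, so treat the gap as indicative only):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the notebook's data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(50,), solver='sgd',
                    early_stopping=True, max_iter=500, random_state=0)
mlp.fit(X, y)

train_acc = mlp.score(X, y)            # in-sample accuracy
val_acc = max(mlp.validation_scores_)  # best held-out accuracy
print(f"train={train_acc:.3f} val={val_acc:.3f} gap={train_acc - val_acc:.3f}")
```

    A model that is learning shows both scores improving; a large and growing gap between them is the overfitting signal, regardless of how close the loss gets to 0.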



    Hope this helps!



    Thanks,

    Gaurav