RL totally freezes when trying to replicate the very same params

antoni_ansarov_8y30t · February 17, 2023, 4:58am

RL totally freezes VS Code. It gives no indication of doing anything besides the few lines printed at the beginning, and it makes VS Code unresponsive. However, it works with the native trained files, so one can see the results of the pre-trained memory episodes. I understand this is not an easy program, but I feel that I got a really bad deal here. I am running RL locally with 'TEST_MODE': False. It freezes without any indication of anything. Even VS Code cannot be closed, and the kernel cannot be stopped. Total disaster. Guys, please advise, I am desperate. Why can't we see and monitor where we are in the training process?

varun_kumar_pothula · February 17, 2023, 8:02am

Hello Antoni,

A few things to check here:

Reinforcement learning models can be resource-intensive, and if you are running your model on a machine with limited resources, it can cause the system to freeze. Check your system's resource usage (e.g., CPU, memory, and disk usage) to ensure that it's not running out of resources.
Sometimes external dependencies can cause this issue. To make sure it's not an issue with package versions and their dependencies, please use the quantra_py environment. You can follow the steps mentioned in this guide - link
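As a rough way to act on the resource check above, the following sketch collects a few coarse figures using only the Python standard library. The function name `resource_snapshot` is hypothetical and not part of the course code. The standard library has no portable call for memory usage, so this sketch covers CPU count, load average (Unix only) and disk space; the third-party psutil package or the operating system's task manager can fill in memory figures.

```python
import os
import shutil


def resource_snapshot(path="."):
    """Return coarse resource figures for the machine running the training.

    Hypothetical helper for illustration; it is not part of the course code.
    """
    disk = shutil.disk_usage(path)  # total, used and free bytes for the drive holding `path`
    snapshot = {
        "cpu_count": os.cpu_count(),
        "disk_total_gb": disk.total / 1e9,
        "disk_free_gb": disk.free / 1e9,
    }
    # os.getloadavg exists only on Unix-like systems and may fail there too.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_avg_1min"] = os.getloadavg()[0]
        except OSError:
            pass
    return snapshot


if __name__ == "__main__":
    for name, value in resource_snapshot().items():
        print(f"{name}: {value}")
```

Running this once before training and once while the notebook appears frozen gives a first hint of whether the machine is saturated or simply busy.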

I hope this helps!

In addition to the above steps, you can also add print statements at intermediate levels of the code as explained in the response to your previous query here.
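One possible way to structure such print statements is a small counter that reports only every N steps, with elapsed time and throughput, so the notebook output stays short. This is a minimal sketch: `ProgressLogger` and its parameters are hypothetical names, not part of the course code, and where to call `step()` in the training loop is an assumption.

```python
import time


class ProgressLogger:
    """Print a one-line progress report every `every` steps (hypothetical helper)."""

    def __init__(self, every=100, clock=time.monotonic):
        self.every = every
        self.clock = clock          # injectable so the timing can be tested
        self.start = clock()
        self.steps = 0

    def step(self, **info):
        """Count one step; return the printed line, or None if nothing was printed."""
        self.steps += 1
        if self.steps % self.every != 0:
            return None
        elapsed = self.clock() - self.start
        rate = self.steps / elapsed if elapsed > 0 else float("inf")
        line = f"step {self.steps} | {elapsed:.1f}s elapsed | {rate:.1f} steps/s"
        extras = " | ".join(f"{key} {value}" for key, value in info.items())
        if extras:
            line += " | " + extras
        # flush=True pushes the line to the notebook output immediately.
        print(line, flush=True)
        return line
```

For example, creating `progress = ProgressLogger(every=500)` before the loop and calling `progress.step(eps=epsilon)` once per iteration would print one line per 500 iterations, where `epsilon` stands for whatever exploration-rate variable the loop already has.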

Please let us know if you are able to solve the issue after following the above steps.

Thank you

James_Smith_8IrwZ · February 17, 2023, 2:12pm

Anton Ansarov email me! phantomsamurai1.000@gmail.com

Thanks.

antoni_ansarov_8y30t · February 18, 2023, 7:53pm

This is what I get before I stop it manually, because even after 10 hours it does not get anywhere.

The print statements are mine.

Run called now... -------------------- Resetting now... -------------------- Act called now... -------------------- Updating Position now ----------------------- Get Reward called now... -------------------- Assembling State now.....-------------------- Initialized NN now... --------------------
WARNING:absl:`lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.SGD. WARNING:absl:`lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.SGD.
Output exceeds the size limit. Open the full output data in a text editor
New Game initialized now... -------------------- Resetting now... -------------------- Act called now... -------------------- Updating Position now ----------------------- Get Reward called now... -------------------- Assembling State now.....-------------------- Get State called now... -------------------- Assembling State now.....-------------------- Act called now... -------------------- Updating Position now ----------------------- Get Reward called now... -------------------- Get State called now... -------------------- Assembling State now.....-------------------- Creating a new Q-Table now... -------------------- Recency Sampling Loop called now... -------------------- 1/1 [==============================] - 0s 335ms/step 1/1 [==============================] - 0s 49ms/step Update the NN model with a new Q-Table now... -------------------- Act called now... -------------------- Updating Position now ----------------------- Action is 2 or BUY -------- Get Reward called now... -------------------- Get State called now... -------------------- Assembling State now.....-------------------- Creating a new Q-Table now... -------------------- Recency Sampling Loop called now... -------------------- 1/1 [==============================] - 0s 24ms/step 1/1 [==============================] - 0s 24ms/step Update the NN model with a new Q-Table now... -------------------- Act called now... -------------------- Updating Position now ----------------------- Action is 2 or BUY -------- Get Reward called now... -------------------- Get State called now... -------------------- Assembling State now.....-------------------- Creating a new Q-Table now... -------------------- Recency Sampling Loop called now... -------------------- 1/1 [==============================] - 0s 24ms/step 1/1 [==============================] - 0s 23ms/step Update the NN model with a new Q-Table now... -------------------- Act called now... 
-------------------- Updating Position now ----------------------- Action is 1 or SELL -------- Get Reward called now... -------------------- Get State called now... -------------------- Assembling State now.....-------------------- Creating a new Q-Table now... -------------------- Recency Sampling Loop called now... -------------------- 1/1 [==============================] - 0s 34ms/step 1/1 [==============================] - 0s 42ms/step Update the NN model with a new Q-Table now... -------------------- Trade 001 | pos 1 | len 2 | approx cum ret -0.23% | trade ret -0.23% | eps 1.0010 | 2008-01-22 08:50:00 | 5004 New Game initialized now... -------------------- Resetting now... -------------------- Act called now... -------------------- Updating Position now ----------------------- Get Reward called now... -------------------- Assembling State now.....-------------------- Get State called now... -------------------- Assembling State now.....-------------------- 1/1 [==============================] - 0s 50ms/step Act called now... -------------------- Updating Position now ----------------------- Get Reward called now... -------------------- Get State called now... -------------------- Assembling State now.....-------------------- Creating a new Q-Table now... -------------------- Recency Sampling Loop called now... -------------------- 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 38ms/step Update the NN model with a new Q-Table now... -------------------- 1/1 [==============================] - 0s 51ms/step Act called now... -------------------- Updating Position now ----------------------- Get Reward called now... -------------------- Get State called now... -------------------- Assembling State now.....-------------------- Creating a new Q-Table now... -------------------- Recency Sampling Loop called now... ---------------

varun_kumar_pothula · February 21, 2023, 4:02am

Hello Antoni,

A possible reason is that the model is failing to converge to a solution. Can you please share the code file so that I can check if there are any issues in the code?
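To get a rough signal of whether training is improving at all, one option is to track a moving average of the per-episode reward (or per-trade return) and count how long it has gone without a new best. The sketch below assumes the training loop can supply one number per episode; `RewardTracker`, `window` and `patience` are hypothetical names and values, not part of the course code.

```python
from collections import deque


class RewardTracker:
    """Track a moving average of episode rewards (hypothetical helper)."""

    def __init__(self, window=50):
        self.window = window
        self.recent = deque(maxlen=window)   # keeps only the last `window` rewards
        self.best_mean = float("-inf")
        self.episodes_since_best = 0

    def update(self, episode_reward):
        """Record one episode reward and return the current moving average."""
        self.recent.append(episode_reward)
        mean = sum(self.recent) / len(self.recent)
        # Only compare once the window is full, so early noisy means are ignored.
        if len(self.recent) == self.window and mean > self.best_mean:
            self.best_mean = mean
            self.episodes_since_best = 0
        else:
            self.episodes_since_best += 1
        return mean

    def stalled(self, patience=200):
        """True if the moving average has not improved for `patience` episodes."""
        return self.episodes_since_best >= patience
```

A flat or falling moving average over many episodes would be consistent with a model that is not converging, while a slowly rising one would suggest the run is just slow.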

Thanks