I am trying to understand the code in this course and would like to ask for more explanation on some parts.
- In section 12, the act function first returns self.is_over and then self.reward, but in section 17 the code is reward, game_over = env.act(action). Is the order of the returned values correct here?
- The init_net function in section 15 seems to have the same problem: it first returns modelR and then modelQ, but when the function is used in section 17 the order is reversed, as below:
q_network, r_network = init_net(env, rl_config)
Could anyone please help me clarify this?
Hi Chinnawut,
In the course, we explain how each function is built from scratch up until section 17.
Once these concepts are explained, we put all these functions in our utility code file named "quantra_reinforcement_learning.py" located in the "data_modules" folder.
This is the final implementation of the code and concepts you have learnt in the previous sections.
To import the functions from there we write:
from data_modules.quantra_reinforcement_learning import init_net
from data_modules.quantra_reinforcement_learning import ExperienceReplay
…and so on.
Thus, the code in section 17 is executed according to how the functions are implemented in "quantra_reinforcement_learning.py", not the step-by-step versions shown in the earlier sections, so the return order in that file is the one that matters.
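In Python, a function that returns several values returns them as a tuple, and an assignment like reward, game_over = env.act(action) unpacks that tuple positionally, so the names on the left simply receive the values in whatever order the return statement lists them. Here is a minimal, self-contained sketch of that behaviour (the class, values, and names are made up for illustration, not taken from the course file):

class ToyEnv:
    def act(self, action):
        reward = 1.0       # placeholder value
        is_over = False    # placeholder value
        # Whatever order is used here is the order the caller unpacks in:
        return reward, is_over

env = ToyEnv()
reward, game_over = env.act(action=0)  # matches the return order above
print(reward, game_over)               # prints: 1.0 False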
You can open "quantra_reinforcement_learning.py" to follow the program logic. The concepts are the same as what you have learnt in the previous sections.
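If you would like to confirm the return order without opening the file manually, you can also print the source of an imported function from within the notebook using the standard-library inspect module (this assumes the import shown above succeeds in your environment):

import inspect
from data_modules.quantra_reinforcement_learning import init_net

# Prints the exact implementation executed in section 17,
# including its return statement and the order of the returned values.
print(inspect.getsource(init_net))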
Hope this helps!
Thanks!