How do you discover the memory shape from scratch?

Learning_Fielding_C66UV · November 7, 2022, 4:53am

Course Name: Deep Reinforcement Learning in Trading, Section No: 13, Unit No: 12, Unit type: Quiz

I need help understanding where memory[0][0][0].shape[1] comes from.

In other words, if I were given a pickle file and told it was a replay buffer of some sort, how could I interrogate the `memory` variable until I arrived at the correct answer?

Rekhit_Pachanekar · November 8, 2022, 5:44am

Hi,

As you know, deep reinforcement learning is a vast topic, and certain concepts need to be understood before we implement deep reinforcement learning on trading data. For that reason, the course divides the topic into sections, and Section 13 shows how the experience replay buffer is implemented. The buffer is then used in subsequent sections on out-of-sample data (real-world as well as synthetic) to check how effective the deep reinforcement learning model is.

I hope this helps. By the way, may I ask if there is a specific notebook where you are having a query? Do let me know and I will be happy to help.

Thanks.

Learning_Fielding_C66UV · November 15, 2022, 5:48pm

While I appreciate your comment, did you actually read my question? The first line states exactly which course and section I am in. I am in Section 13.

Notebook: Section 13, Unit 9 (the only notebook in this section):

def process(self, modelQ, modelR, batch_size=10):
        # Get the length of the memory filled in the buffer
        len_memory = len(self.memory)
        # Get the number of actions the agent can take
        num_actions = modelQ.output_shape[-1]
        # Get the shape of state
        env_dim = self.memory[0][0][0].shape[1]

While I can read the source code and see how this version was implemented, neither the code nor your response answered my question; otherwise, I would not have posted it.

It needed to be made clear why we needed to reference [0] three times when the coding exercise

Section 13, Unit 12:

The structure of memory is [[SARS],game_over]]

only showed a two-dimensional array and not a three-dimensional array - this is where the confusion came from. And trying to "explore" the structure was not straightforward, as Python isn't my primary programming language.

As I read other literature, I have noticed that there are several ways a replay_buffer can be implemented.

Therefore, once again, assuming I'm new on the job and my boss gives me an unknown replay_buffer file, like the coding exercise in Section 13, Unit 12, which reads

# Read the replay memory from the pickle file
import pickle
path = '../data_modules/'
memory = pickle.load(open(path+'replay_buffer.bz2','rb'))
print(memory[0])

(My question is:) How could I go about interrogating the file structure (i.e. the memory object) to discover that the structure is a pandas dataframe (or a numpy array) wrapped in a standard 3-dimensional array?

I am not asking what a replay buffer is or how to use it in this demo or in real life; that part I get. My question is a simple, general programming question which goes beyond this specific topic. How would anyone go about interrogating any Python structure to figure out its internal structure and determine whether it is a pandas dataframe, a numpy array or a standard Python list?

The Answer

The answer I was looking for was: use "type(obj)" as in:

# Read replay memory from pickle file
import pickle
path = '../data_modules/'
memory = pickle.load(open(path+'replay_buffer.bz2','rb'))
print(type(memory))
print(len(memory))
print(type(memory[0]))
print(len(memory[0]))
print(type(memory[0][0]))
print(len(memory[0][0]))
print(type(memory[0][0][0]))
print(len(memory[0][0][0]))
exit()

Which would have given you

<class 'list'>
600
<class 'list'>
2
<class 'list'>
4
<class 'numpy.ndarray'>
1

Notice that the innermost element is a numpy array, so you can now use its "shape" attribute (an attribute, not a function call) to discover its dimensions, as in:

print(memory[0][0][0].shape)

which would have given you

(1, 138)
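
This also explains where memory[0][0][0].shape[1] comes from: the three [0] indexes step through the three nested lists, and shape[1] picks the second entry of the (1, 138) tuple. A minimal sketch, using a stand-in array of zeros because the real values are not shown in this thread. Only the nesting and the (1, 138) shape are taken from the output above; the placeholder action, reward and game_over values are assumptions.

```python
import numpy as np

# Stand-in for memory[0][0][0]; only the (1, 138) shape comes from the output above.
state = np.zeros((1, 138))

# Rebuild the nesting seen above: list -> list of 2 -> list of 4 -> ndarray.
# The action (0), reward (0.0) and game_over (False) placeholders are assumptions.
memory = [[[state, 0, 0.0, state], False]]

# memory[0]       -> first experience: a list of 2 items
# memory[0][0]    -> its first item: a list of 4 items
# memory[0][0][0] -> the first of those 4 items: the ndarray
env_dim = memory[0][0][0].shape[1]
print(env_dim)  # 138
```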

Ah! Now I understand the structure of the 'memory' object.
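
The same type/len/shape steps can be wrapped in a small recursive helper, so an unknown object can be inspected in one call. This is only a sketch: `describe` is a hypothetical helper name, not part of the course code, and it follows only the first element of each list or tuple, so it assumes every element has the same layout as the first. The `memory` object below is a synthetic stand-in that mimics the sizes printed above, not the real replay buffer.

```python
import numpy as np

def describe(obj, name="obj", max_depth=4, depth=0):
    """Return one line per nesting level with the type and size of obj."""
    indent = "  " * depth
    if isinstance(obj, np.ndarray):
        # numpy arrays report their dimensions through .shape
        return [f"{indent}{name}: ndarray shape={obj.shape} dtype={obj.dtype}"]
    if isinstance(obj, (list, tuple)):
        lines = [f"{indent}{name}: {type(obj).__name__} len={len(obj)}"]
        # Follow only the first element, and stop at max_depth
        if len(obj) > 0 and depth < max_depth:
            lines += describe(obj[0], f"{name}[0]", max_depth, depth + 1)
        return lines
    if isinstance(obj, dict):
        return [f"{indent}{name}: dict keys={list(obj)[:5]}"]
    # Fallback: report the type, plus .shape if present (e.g. a pandas DataFrame)
    shape = getattr(obj, "shape", None)
    suffix = f" shape={shape}" if shape is not None else ""
    return [f"{indent}{name}: {type(obj).__name__}{suffix}"]

# Synthetic stand-in with the sizes seen above (600 entries, 2 items, 4 items).
state = np.zeros((1, 138))
memory = [[[state, 0, 0.0, state], False] for _ in range(600)]

report = describe(memory, "memory")
print("\n".join(report))
# memory: list len=600
#   memory[0]: list len=2
#     memory[0][0]: list len=4
#       memory[0][0][0]: ndarray shape=(1, 138) dtype=float64
```

Using isinstance rather than comparing type(obj) directly means subclasses of list, tuple or ndarray are also recognised.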