Deep Reinforcement Learning in Trading
₹14,300 (75% off the original price of ₹57,199)
Get it for ₹11,440 with the Course Bundle
Apply Deep Reinforcement Learning in Trading
- Learn the applications, effectiveness, need for and challenges of using RL models in trading.
- Describe states, actions, double Q-learning, policy, experience replay, positions and rewards, and learn how to enhance the state, action and reward.
- Explain the various input features used in the construction of a state, and assemble the input features to construct a state.
- Create a Game class, starting with its initialisation, then updating the position, calculating the reward and assembling the state.
- Explain the basics of ANNs and implement Double Deep Q Learning agents using Keras.
- Create and backtest a reinforcement learning model. Analyse returns and risk using different performance measures.
- Learn about the steps to automate your trading and deploy the RL model for paper and live trading. Implement the concepts on real market data through a capstone project.

Skills Required for Deep Reinforcement Learning
Finance and Math Skills
- Sharpe ratio
- Returns & Maximum drawdowns
- Stochastic gradient descent
- Mean squared error
Python
- Pandas, NumPy
- Matplotlib
- Datetime, TA-Lib
- For loops
- TensorFlow, Keras, SGD
Reinforcement Learning
- Double Q-learning
- Artificial Neural Networks
- State, Rewards, Actions
- Experience Replay
- Exploration vs Exploitation

This course is a part of the Learning Track: Artificial Intelligence in Trading Advanced
These courses are specially curated to help you with end-to-end learning of the subject.
Course Features
- Faculty Support on Community
- Interactive Coding Exercises
- Capstone Project Using Real Market Data
- Trade and Learn Together
- Get Certified
Prerequisites
This course requires a basic understanding of financial markets, such as the buying and selling of securities. To implement the strategies covered, basic knowledge of pandas DataFrames, Keras and Matplotlib is required. These skills are covered in the free Quantra courses 'Python for Trading: Basic' and 'Introduction to Machine Learning for Trading'. To gain an in-depth understanding of neural networks, you can enroll in the 'Neural Networks in Trading' course, which is recommended but optional.
Deep Reinforcement Learning in Trading Course
- Introduction: Learn the applications and effectiveness of using RL models in trading. The course draws on 100+ research papers and articles to create an RL model that went through hundreds of iterations on known synthetic patterns to finalise the hyperparameters, Q-learning, experience replay and feedforward network. You will learn to create a full RL framework from scratch and practise it in the capstone project. At the end of the course, you will learn to implement the model in live trading.
- Need for Reinforcement Learning: Delayed gratification means that foregoing a reward in the short term might lead to a greater reward in the long term. Designing a conventional decision-making algorithm to handle delayed gratification is difficult: selecting an action that yields no immediate reward but a possible future reward is problematic for such an algorithm. Learn how reinforcement learning tackles delayed gratification by assigning immediate rewards in the short term to maximise the long-term reward. Topics: Introduction to Reinforcement Learning (1m 52s), Dilemma of Decision Making (2m), Design Algorithm for Promotion (2m), Factors in Trading (2m), Impact of Decision (2m), Decision Problem and Trading (2m), Delayed Gratification (2m 39s), Reward Based on Delayed Gratification (2m), Reward Based on Time (2m), Designing Delayed Gratification Algorithm (2m), Delayed Gratification Using RL (2m), Test on Needs of RL (14m).
- State, Actions and Rewards: In this section, you will learn about states, actions and rewards, the basic building blocks of a reinforcement learning model. A state gives a view of the environment, which can include price data, technical indicators and trend identification. You will learn how a reward function is designed to help maximise rewards; this helps in analysing the environment and choosing the right actions. Topics: States, Actions and Rewards (2m 40s), Definition of State (2m), What Can Be Added in State (2m), Need for Reward (2m), Identifying the Reward (2m), Actions in Reinforcement Learning (2m), Defining Action for Self Driving Car (2m), Limitation of Profit as Reward (2m), Reward Function Design (1m 34s), Poorly Designed Reward Function (2m), Metrics of Reward System (2m), Next Step in Reinforcement Learning (2m).
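As a rough illustration of the reward ideas described above, here is a minimal Python sketch of a percentage-PnL reward. The position convention and numbers are illustrative assumptions, not the course's exact implementation.

```python
# A minimal sketch of a PnL-based reward, assuming a hypothetical position
# convention of +1 (long), -1 (short) and 0 (flat).
def percentage_pnl_reward(position, entry_price, current_price):
    """Return the percentage profit/loss of the open position."""
    if position == 0:
        return 0.0  # no open position, no reward
    return position * (current_price - entry_price) / entry_price * 100

# Example: a long position entered at 100 with the price now at 103
print(percentage_pnl_reward(1, 100, 103))  # 3.0
```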
- Q Learning: In this section, you will be introduced to the Q-table, which is used to store and update the value of taking each action in a given state. Further, you will learn about the role of neural networks in calculating the Q-values. These values are used to select the actions that will maximise your rewards. Topics: Creating the Q Table (2m 55s), Requirement of Q Table (2m), Difference between Q and R Table (2m), Representation of Q Table (2m), Identifying the Bellman Equation (2m), Importance of the Bellman Equation (2m), Updating the Q Table (2m), Solving the Bellman Equation (2m), Finding Q Table Value (2m), Zero Learning Rate Impact (2m), High Learning Rate (2m), Traditional vs Deep RL (2m), Action Based on Q Value (2m), DQN and Experience Replay (2m 27s), I/O of NN (2m), Definition of Experience Replay (2m), Additional Reading (10m).
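For intuition, here is a minimal tabular Q-learning sketch of the Bellman update discussed in this section. The state and action indices, learning rate and discount factor are illustrative assumptions; the course's own model replaces the table with a neural network.

```python
import numpy as np

# Tabular Q-learning sketch with hypothetical integer states and actions.
n_states, n_actions = 5, 3          # e.g. 3 actions: buy, sell, hold
Q = np.zeros((n_states, n_actions))

alpha, gamma = 0.1, 0.9             # learning rate and discount factor

def update_q(state, action, reward, next_state):
    """Bellman update: Q(s,a) <- (1-alpha)*Q(s,a) + alpha*(r + gamma*max_a' Q(s',a'))."""
    target = reward + gamma * np.max(Q[next_state])
    Q[state, action] = (1 - alpha) * Q[state, action] + alpha * target

update_q(state=0, action=1, reward=1.5, next_state=2)
print(Q[0, 1])  # 0.15 after a single update from an all-zero table
```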
- State Construction: In this section, you will learn about the input features used in the construction of a state. You will understand why an input feature should be weakly predictive and stationary. Topics: State Construction (1m 59s), Raw Price Data as Input Feature (2m), Properties of Input Features (2m), Characteristics of Input Features (2m), Returns From Price Series (2m), Technical Indicators in a State (2m), Moving Average as Input Feature (2m), Input Features (3m 13s), Role of Information Coefficient (2m), Avoiding Correlated Inputs (2m), Time Signature as Input Feature (2m), Additional Reading (10m).
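As a small illustration of why transformed features are preferred over raw prices, the sketch below computes percentage and log returns from a synthetic price series. The data and choice of transformation are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Raw prices are non-stationary, so a state is usually built from transformed
# features such as returns. The price series below is synthetic.
prices = pd.Series([100.0, 101.5, 100.8, 102.2, 103.0])

returns = prices.pct_change()        # percentage returns are closer to stationary
log_returns = np.log(prices).diff()  # log returns, an alternative transformation

print(returns.dropna().values)
print(log_returns.dropna().values)
```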
- Policies in Reinforcement Learning: In this section, you will learn how a policy is used by the RL model to choose the method for selecting an action. You will learn about exploration- and exploitation-based policies, and the differences between them. Topics: Policies in Reinforcement Learning (4m 7s), Definition of Policy (2m), Types of Action (2m), Exploration Versus Exploitation (2m), Best Reinforcement Learning Policy (2m), Use of Epsilon Value (2m), Function to Calculate Epsilon (2m), Plotting Epsilon Value Curve (2m), Probability of Random Number (2m), Reduction of Exploration Rate (2m), Additional Reading (10m), Test on States, Q-Learning and Policies (16m).
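The exploration-versus-exploitation trade-off above is often implemented with an epsilon-greedy policy. The sketch below is a minimal, illustrative version with an assumed exponential decay of epsilon, not the course's exact function.

```python
import numpy as np

# Epsilon-greedy: explore (random action) with probability epsilon,
# otherwise exploit (action with the highest predicted Q-value).
def epsilon_greedy_action(q_values, epsilon):
    if np.random.rand() < epsilon:
        return np.random.randint(len(q_values))   # explore
    return int(np.argmax(q_values))                # exploit

# Decaying epsilon: explore a lot early on, exploit more as training progresses.
def decayed_epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.001):
    return max(eps_min, eps_start * np.exp(-decay * step))

print(epsilon_greedy_action(np.array([0.2, 0.7, 0.1]), epsilon=decayed_epsilon(5000)))
```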
- Challenges in Reinforcement Learning: In this section, you will learn about the challenges in designing a reinforcement learning model for the financial markets. Addressing these challenges will help you develop a potent trading system using reinforcement learning. Topics: Difficulties in RL (3m 37s), Difference Between Chaos Types (2m), Importance of Type 2 Chaos (2m), Efficiency of Noise Filters (2m), Effect of Changing Market Regime (2m), Reinforcement Learning Concept (25m 28s).
- Initialise Game Class: Each trade is treated as its own game, with a start, a play period, an end and a final score or reward. This whole process is handled inside the Game class. In this notebook, you will learn how to initialise the Game class. Topics: Introduction to Part II (2m), How to Use Jupyter Notebook? (1m 54s), Working With Pickle File (5m), Initialise Game Class (10m), Read Price Data (5m), Resample Price Data (5m), Resampling Price Bars (2m).
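A minimal sketch of how such a Game class might be initialised is shown below. The attribute names, lookback length and price data are illustrative assumptions rather than the course's exact notebook code.

```python
import pandas as pd

# Illustrative Game class initialisation: the environment wraps a DataFrame
# of price bars and tracks the current position and progress of the game.
class Game:
    def __init__(self, bars, lookback=10):
        self.bars = bars              # resampled OHLC price bars
        self.lookback = lookback      # number of past bars used in the state
        self.position = 0             # +1 long, -1 short, 0 flat
        self.entry_price = None
        self.pnl = 0.0
        self.current_bar = lookback   # start once enough history is available
        self.game_over = False

prices = pd.DataFrame({"close": range(100, 130)})
env = Game(prices, lookback=10)
print(env.position, env.current_bar, env.game_over)
```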
- Positions and Rewards: In this section, you will learn to update the trading positions based on the actions suggested by the neural network. You will also learn about different PnL-based reward systems. Topics: Positions and Rewards (2m 7s), Element of Reinforcement Learning (2m), What Action Do You Take? (2m), Update the Positions (10m), Same Action (2m), No Position and Buy Action (2m), Opposite Action (2m), Reward System (10m), Calculate Percentage PnL (5m), Categorical PnL Reward (5m), Difference Between Two PnL Rewards (2m), Additional Reading (10m).
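To illustrate the position updates described above, here is a minimal sketch assuming actions are encoded as hold/buy/sell and positions as +1/-1/0. The encoding is an assumption, not necessarily the one used in the course.

```python
# Sketch of a position update rule, assuming actions 0 = hold, 1 = buy,
# 2 = sell, and positions +1 (long), -1 (short), 0 (flat). Illustrative only.
def update_position(position, action):
    if action == 0:                       # hold: keep the current position
        return position
    new_position = 1 if action == 1 else -1
    if position == new_position:          # same action as the current position
        return position
    return new_position                   # open a new position or reverse it

print(update_position(0, 1))   # no position + buy action  -> long (+1)
print(update_position(1, 2))   # long position + sell action -> short (-1)
```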
- Input Features: In this section, you will learn in detail about the various input features used in the construction of a state: candlestick bars, technical indicators and time signatures. You will also learn about the importance of stationary input features in creating a state. Topics: Input Features (1m 58s), Why Time Signature? (2m), Granularity of Candlesticks (2m), Which Technical Indicators? (2m), Candlestick Input Features (2m 34s), Why Stationary Features? (2m), Endogenous Features (2m), Exogenous Features (2m).
- Construct and Assemble State: In this section, you will learn to create input features in Python. Once the input features are ready, you can assemble them to construct a state. This state is passed to the neural network as an input; based on it, the network predicts the action: buy, sell or hold. (See the sketch after the Game Class item below.) Topics: Construct and Assemble State (3m 42s), How to Make Data Stationary? (2m), Get Last N Time Bars (5m), Size of State Vector (2m), Output of the Code (2m), Get Last N Timebars (2m), Minute Price Data and Resampling Techniques (10m), Assemble States (10m), Flatten the Array (5m), Normalise Candlesticks (5m), Calculate RSI (5m), Calculate Aroon Oscillator (5m), Datetime (2m), Time of the Day (5m), Day of the Week (5m), Additional Reading (10m), Test on Features and State Construction (14m).
- Game Class: In this section, you will learn to create a full Game class, starting with its initialisation, then updating the position, calculating the reward and assembling the state. Topics: Game Class (10m), act() Method Returns? (2m), Create Game Class Environment (5m).
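To illustrate the 'Construct and Assemble State' steps above, here is a minimal sketch that combines a normalised price window, an RSI value from TA-Lib and a time-of-day signature into one state vector. The column names, window sizes and scaling choices are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import talib

# Assemble a state from the last N bars, an RSI value and a time signature.
def assemble_state(bars, n=10):
    close = bars["close"].values
    window = close[-n:]
    normalised = window / window[0] - 1          # make the price window stationary
    rsi = talib.RSI(close, timeperiod=14)[-1] / 100.0
    minute_of_day = bars.index[-1].hour * 60 + bars.index[-1].minute
    time_signature = minute_of_day / (24 * 60)   # scale time of day to [0, 1)
    return np.concatenate([normalised, [rsi, time_signature]])

idx = pd.date_range("2023-01-02 09:30", periods=60, freq="min")
bars = pd.DataFrame({"close": np.linspace(100, 105, 60)}, index=idx)
print(assemble_state(bars).shape)   # (12,) -> 10 price features + RSI + time
```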
- Experience Replay: In experience replay, a memory buffer stores the current state, the action taken, the reward for that action, the next state, and whether the game is over; together these form one experience. Experience replay is an integral part of reinforcement learning: the neural network is trained on random samples of previous experiences. Random sampling speeds up learning because the sampled experiences are uncorrelated with each other. Topics: Memory and Saving (3m 30s), Advantages of Random Sampling (2m), Structure of Replay Buffer Entries (2m), Length of Replay Buffer (2m), Objective of Experience Replay (2m), Q Value Update (3m 14s), Q Value Update (2m), Which are True for Experience Replay? (2m), Experience Replay Implementation (10m), Truncate Length of Replay Memory (2m), Number of Columns in Target Array (2m), Number of Columns in Input Array (2m), Size of Input Array (2m), Generate Random Numbers (5m), Get SARS Values (5m), Predict Q Values from Current State (5m), Predict Maximum Q Value from Next State (5m), Value of Target Array (2m), Update Target Array (5m), Additional Reading (10m).
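A minimal sketch of an experience replay buffer is shown below. The buffer length, batch size and tuple layout are illustrative assumptions.

```python
import random
from collections import deque

import numpy as np

# Store (state, action, reward, next_state, game_over) tuples and sample
# uncorrelated mini-batches for training.
class ReplayBuffer:
    def __init__(self, max_len=2000):
        self.memory = deque(maxlen=max_len)   # old experiences drop off automatically

    def remember(self, state, action, reward, next_state, game_over):
        self.memory.append((state, action, reward, next_state, game_over))

    def sample(self, batch_size=32):
        return random.sample(self.memory, min(batch_size, len(self.memory)))

buffer = ReplayBuffer()
buffer.remember(np.zeros(12), 1, 0.5, np.ones(12), False)
print(len(buffer.sample(8)))   # 1, since only one experience is stored so far
```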
- Artificial Neural Network Concepts: In this section, you will first learn about the workings of an artificial neural network: how inputs are multiplied by weights and passed through nonlinear functions to generate outputs. You will learn how these weights are adjusted using optimisation algorithms to fit the ground truth. Thereafter, you will learn about overfitting and why financial data is particularly difficult to train on. Topics: Artificial Neural Network (2m 34s), Use of Q-Tables (2m), Input State (2m), Weights of the Neural Network (2m), Gradient Descent and Loss Function (2m 34s), Gradient Descent (2m), Loss Functions (2m), Overfitting (2m 46s), Difficulty in Modelling Financial Data (2m), Additional Reading (10m).
- Artificial Neural Network Implementation: In this section, you will learn about and implement Double Deep Q-Learning agents using Keras. You will also go through the learning agent architecture. Topics: Agent Implementation (4m 11s), Use of Two Q-Tables (2m), Size of Hidden Layer (2m), Loss Function of DDQN Agent (2m), Activation Function (2m), Learning Rate (2m), ANN in Keras (10m), Create Sequential Object (5m), Parameters in Dense Layer (2m), Create Dense Layer (5m), Add Layer to the Neural Network (5m), Model Compile (5m), FAQs: Neural Networks (10m), Additional Reading (10m), Test on Artificial Neural Network (14m).
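As an illustration of a Q-network built in Keras, here is a minimal sketch assuming a 12-element state and 3 actions. The layer sizes, activation functions and optimiser settings are illustrative, not the course's exact DDQN architecture.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

# A small feedforward network that maps a state vector to one Q-value per action.
def build_q_network(state_size=12, n_actions=3):
    model = Sequential()
    model.add(Dense(32, activation="relu", input_shape=(state_size,)))
    model.add(Dense(32, activation="relu"))
    model.add(Dense(n_actions, activation="linear"))   # one Q-value per action
    model.compile(optimizer=SGD(learning_rate=0.01), loss="mse")
    return model

# Double DQN keeps two copies: an online model trained every step and a target
# model whose weights are copied over (and otherwise frozen) only periodically.
model_q = build_q_network()
model_target = build_q_network()
model_target.set_weights(model_q.get_weights())
```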
- Backtesting Logic: The RL model is initialised with the help of the Game class, two deep neural networks and the replay memory. You will go through the intuition of how the different components of the reinforcement learning model come together. Topics: Combining Elements of RL Model (2m 21s), Meaning of an Episode (2m), Exit Process of RL (2m), Going Through Dataset (2m), Sequentially Accessing Dataset (2m), Process of Backtesting (3m 44s), Initialising Components of RL (2m), Way of Selecting Action (2m), Process of Backtesting (2m), Storing in Replay Memory (2m), Updating and Freezing Models (2m), Model R Update Frequency (2m), Backtesting Logic (10m).
- Backtesting Implementation: Apply the knowledge of the previous sections to build a reinforcement learning model and start playing games on the price datasets. Topics: Backtesting Implementation (10m), Calculate Epsilon (5m), Exploitation Vs Exploration (2m), Randomly Generated Action (5m), Additional Reading (10m), Generate Action Through Deep Q Network (5m), Train the Model (5m).
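Here is a minimal sketch of a single backtesting step that ties the earlier sketches together: epsilon-greedy action selection, storing the experience, and training on a replay mini-batch. The environment interface (env.step), network and buffer objects are hypothetical placeholders, not the course's exact code.

```python
import numpy as np

# One step of the backtest loop, assuming a Game-like environment, a ReplayBuffer
# and two Keras Q-networks as sketched earlier on this page.
def backtest_step(env, state, model_q, model_target, buffer, epsilon, gamma=0.9):
    # Choose an action: explore with probability epsilon, otherwise exploit.
    if np.random.rand() < epsilon:
        action = np.random.randint(3)
    else:
        action = int(np.argmax(model_q.predict(state[None, :], verbose=0)))

    next_state, reward, game_over = env.step(action)   # play one step of the game
    buffer.remember(state, action, reward, next_state, game_over)

    # Train the online network on a random mini-batch of past experiences.
    batch = buffer.sample(batch_size=32)
    states = np.array([s for s, a, r, s2, done in batch])
    next_states = np.array([s2 for s, a, r, s2, done in batch])
    targets = model_q.predict(states, verbose=0)
    q_next = model_target.predict(next_states, verbose=0)   # frozen target network
    for i, (s, a, r, s2, done) in enumerate(batch):
        targets[i, a] = r if done else r + gamma * np.max(q_next[i])
    model_q.fit(states, targets, epochs=1, verbose=0)

    return next_state, game_over
```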
- Performance Analysis: Synthetic Data: Before using real price data, it is advisable to test the RL model on a known series. Thus, a synthetic dataset consisting of sine waves and trending price patterns is created as input for the reinforcement learning model. This helps analyse the performance of the model. Topics: Generating Patterns for RL Model (2m 5s), Importance of Testing Known Patterns (2m), Using Random Data Generator (2m), Pure Sine Wave and Trend Series (2m), Synthetic Time Series Patterns (10m), Create a Trending Pattern (5m), Create a Mean Reverting Pattern (5m), Add Noise to a Signal (5m), Performance on Synthetic Data (1m 39s), Different Result on Same Dataset (2m), Varied Performance in Similar Scenario (2m), Selection of Correct Model (2m), Apply RL on Synthetic Mixed Wave Pattern (10m), Configuration Parameters (3m 7s), Stability of a Reinforcement Learning Model (2m), Tuning of a Reinforcement Learning Model (2m).
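A minimal sketch of the kind of synthetic series described above (a sine wave, a trend and a noisy mix) is shown below. The amplitudes, periods and noise levels are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic series for sanity-checking the model on known patterns.
n = 500
t = np.arange(n)

sine_wave = 100 + 5 * np.sin(2 * np.pi * t / 50)                 # mean-reverting pattern
trend = 100 + 0.05 * t                                            # trending pattern
noisy_mix = sine_wave + 0.02 * t + np.random.normal(0, 0.5, n)    # mixed pattern with noise

synthetic = pd.DataFrame({"sine": sine_wave, "trend": trend, "mixed": noisy_mix})
print(synthetic.head())
```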
- Performance Analysis: Real World Price Data: In this section, you will apply the reinforcement learning model to real-world price data and then analyse its performance. Topics: RL Model on Real World Price Data (2m 12s), RL Model on Real World Price Data (10m), Frequently Asked Questions (10m), Reinforcement Learning Implementation (31m 37s), Test on Backtest and Performance Analysis (14m).
- Automated Trading Strategy: In this section, you will first learn about the steps to automate your trading. After a brief overview of the basics, you will learn how to integrate your trading algorithm with the Interactive Brokers API using IBridgePy. Topics: Automated Trading Overview (2m 20s), Steps in Live Trading (2m), Tasks Required for Live Trading (2m), Repeated Actions in Live Trading (2m), Streaming Live Data (2m), Application Programming Interface (2m), Automated Execution of Trades (10m), IBridgePy Course Link (10m), Placing Orders (2m), Getting Historical Price (2m), Scheduling the Strategy (2m).
- Paper and Live Trading: In this section, you will learn about the challenges in live trading and get answers to FAQs on the topic. You will understand the basic program flow of a live trading strategy. A live trading strategy template will also be provided, which you can tweak to deploy your strategies in the live market. Topics: Live Trading FAQs (2m), Live Trading Flow Diagram (10m), Template Documentation (10m), Template Code Files (2m).
- Capstone Project: In this section, you will undertake a capstone project on real-world data. This project will require you to apply and practise the concepts learnt throughout this course. You will also get a model solution at the end of the section. Topics: Capstone Project: Getting Started (10m), Problem Statement (10m), Model Solution Template: Building the RL Model (10m), Frequently Asked Questions (10m), Template Code Files (2m), Model Solution: Combining the Agents (10m), Capstone Model Solution FAQs (2m), Capstone Solution Downloadable (2m).
- Future Enhancements: The RL model can be modified and tweaked to suit the needs of the individual trader. In this section, you will explore the various ways in which the state, action and reward can be enhanced further. Topics: Future Enhancement (3m 5s), Designing Actions for Portfolio (2m), Condition to Stop RL Model (2m), Avoiding Buy and Hold (2m), Best Reward Function (2m), Input Features for Gold Trading (2m), Capital Allocation for Crude Oil (2m), Input Features for Calendar Anomalies (2m), Generating Higher Returns (2m), Dealing With Regime Change (2m), Exploration Rate After Million Trades (2m), Additional Reading (10m).
- Run Codes Locally on Your Machine: Learn to install the Python environment on your local machine. Topics: Uninterrupted Learning Journey with Quantra (2m), Python Installation Overview (2m 18s), Flow Diagram (10m), Install Anaconda on Windows (10m), Install Anaconda on Mac (10m), Know Your Current Environment (2m), Troubleshooting Anaconda Installation Problems (10m), Creating a Python Environment (10m), Changing Environments (2m), Quantra Environment (2m), Troubleshooting Tips For Setting Up Environment (10m), How to Run Files in Downloadable Section? (10m), Troubleshooting For Running Files in Downloadable Section (10m).
- Course Summary: This section includes a course summary and a downloadable zipped folder with all the codes and notebooks for easy access. Topics: Course Summary (2m 44s), Python Codes and Data (2m).
Why Quantra®?
- More in Less Time: Gain more in less time
- Expert Faculty: Get taught by practitioners
- Self-paced: Learn at your own pace
- Data & Strategy Models: Get data & strategy models to practice on your own
FAQs
- When will I have access to the course content, including videos and strategies?
You will gain access to the entire course content including videos and strategies, as soon as you complete the payment and successfully enroll in the course.
- Will I get a certificate at the completion of the course?
Yes, you will be awarded a certification from QuantInsti after successfully completing the online learning units.
- Are there any webinars, live or classroom sessions available in the course?
No, there are no live or classroom sessions in the course. You can ask your queries on the community forum and get responses from fellow learners and faculty members.
- Is there any support available after I purchase the course?
Yes, you can ask your queries related to the course on the community: https://quantra.quantinsti.com/community
- What are the system requirements to do this course?
A high-speed internet connection and a web browser are required for this course. For the best experience, use Chrome.
- What is the admission criteria?
There are no admission criteria. We recommend going through the prerequisites section to understand the skills required and the skills gained, so that you can learn the most from the course.
- Is there a refund available?
We respect your time, and hence, we offer concise but effective short-term courses created under professional guidance. We try to offer the most value within the shortest time. There are a few courses on Quantra which are free of cost. Please check the price of the course before enrolling in it. Once a purchase is made, we offer complete course content. For paid courses, we follow a 'no refund' policy.
- Is the course downloadable?
Some of the course material is downloadable, such as Python notebooks with strategy code. We also guide you on how to run this code on your own system for further practice.
- Can the Python strategies provided in the course be immediately used for trading?
We focus on teaching these quantitative and machine learning techniques and how learners can use them to develop their own strategies. You may or may not be able to use them directly in your own system. Please note that we are not advising or offering any trading/investment services. The strategies are provided for learning and understanding purposes, and we do not take any responsibility for the performance or any profits or losses that result from using these techniques.
- I want to develop my own algorithmic trading strategy. Can I use a Quantra course notebook for the same?
The Quantra environment is a zero-installation solution that lets beginners start coding in Python. While learning, you won't have to download or install anything! However, if you wish to implement the learning on your own system later, you can definitely do that. All the notebooks on the Quantra portal are available for download at the end of each course, and they can be run on your local system just as they run in the portal. You can modify, tweak or rework all such code files as per your needs. We encourage you to implement different concepts learnt from different learning tracks in your trading strategy to make it better suited to the real-world scenario.
- If I plug in the Quantra code to my trading system, am I sure to make money?
No. We provide guidance on how to create strategies using different techniques and indicators, but no strategy is plug and play. A lot of effort is required to backtest any strategy, after which the strategy parameters are fine-tuned and the performance is checked in paper trading before live execution of trades is finally implemented.
- What does "lifetime access" mean?
Lifetime access means that once you enroll in the course, you will have unlimited access to all course materials, including videos, resources, readings, and other learning materials for as long as the course remains available online. There are no time limits or expiration dates on your access, allowing you to learn at your own pace and revisit the content whenever you need it, even after you've completed the course. It's important to note that "lifetime" refers to the lifetime of the course itself—if the platform or course is discontinued for any reason, we will inform you in advance. This will allow you enough time to download or access any course materials you need for future use.
- Does deep reinforcement learning work for trading?
Deep reinforcement learning trading has shown significant promise in the trading world. Combining deep learning and reinforcement learning enables automated trading decisions based on complex market data. This approach involves training neural networks, such as deep Q-networks (DQNs), to learn optimal trading policies, adapting to dynamic market conditions.
The process of deep reinforcement learning trading involves an agent (the trading algorithm) interacting with an environment (the market) to learn the best actions to take in different states. The agent receives rewards (profits or losses) based on its actions, guiding it toward strategies that yield positive outcomes. The goal is to maximise long-term cumulative rewards.
Let's consider an example to illustrate how deep reinforcement learning trading works. Suppose we have a deep reinforcement learning trading model focused on trading a specific stock. The model takes in various inputs, such as historical price data, technical indicators, and news sentiment. It uses these inputs to determine the best course of action: buy, sell, or hold the stock.
The model's decision-making process is guided by the Q-value function, which estimates the expected future rewards for taking a particular action in a given state. The Q-value function is updated through an iterative process known as Q-learning. The formula for updating the Q-value is as follows:
Q(state, action) = (1 - α) * Q(state, action) + α * (reward + γ * max[Q(next_state, next_action)])
Here, "state" and "action" represent the current state and action in deep reinforcement learning trading, "reward" is the immediate reward obtained by taking the action in the state, and γ is the discount factor. The learning rate α determines the extent to which the Q-value is updated.
The deep reinforcement learning trading model explores different actions in various market states, gradually refining its strategies based on the feedback it receives. The model adapts to changing market conditions through continuous learning and optimisation, enhancing its trading performance over time.
It's important to note that while deep reinforcement learning trading can be effective, it is not without challenges. Financial markets are complex and subject to various influences that can impact trading outcomes. The performance of deep reinforcement learning trading models heavily relies on the quality of data, the choice of features, and the design of reward functions.
To effectively trade using deep reinforcement learning, it is essential to have a solid understanding of trading principles and the underlying concepts of deep reinforcement learning. Enrolling in a specialised course on deep reinforcement learning trading can provide you with the necessary knowledge and skills to apply these techniques effectively in your trading strategies.
In conclusion, deep reinforcement learning trading offers exciting possibilities for enhancing trading strategies by integrating deep learning and reinforcement learning techniques. By training neural networks to make informed decisions based on market data, deep reinforcement learning trading can adapt to dynamic market conditions. However, it is important to approach deep reinforcement learning trading with an understanding of its challenges and limitations and to continually refine and adapt your strategies as you gain experience in deep reinforcement learning trading.
- How does deep reinforcement learning work?
Deep reinforcement learning uses the concept of rewards and penalties to learn how the game works and then proceeds to maximise the rewards. Let's take an oversimplified example: say the stock price of ABC company is $100 and moves to $90 over the next four days before climbing to $150. Our logic is to buy the stock today and hold it until it reaches $150. If maximising our investment is the reward, the reinforcement learning model will receive a reward if it chooses to buy, and will be penalised if it chooses to sell/short.
- Where is reinforcement learning used?
Reinforcement learning has been used in a variety of applications, including traffic light control, managing computer and web system resources according to bandwidth, and beating human players in video games. Tesla has also been known to use reinforcement learning in its cars.
In the field of trading, quite a few research papers have reported promising results from using RL in trading strategies. Man AHL uses RL for the order execution of its strategies, and Rosetta Analytics has developed a reinforcement learning framework that went live in 2020. These are just a few examples.
- What are the advantages of reinforcement learning?
The major advantage of reinforcement learning is that it is not as strictly rule based as other machine learning algorithms. This helps the reinforcement learning (RL) model to adapt to changing market conditions.
Another major advantage of RL is its exploration component. Exploration lets the RL model take unconventional decisions during backtesting, which might lead to poor results in the short term but good results in the long run. In this way, the RL model gains "experience" and learns which action should be chosen. This is called training the network.
AlphaZero, a reinforcement learning model built by Google DeepMind, was trained to play chess, among other games. As mentioned in the blog, it searches only tens of thousands of moves, in comparison to a rule-specific chess engine that searches millions of moves. This makes it faster and also less resource-intensive.
- Is reinforcement learning difficult to understand and learn?
As far as machine learning algorithms go, reinforcement learning can be fairly difficult for a newcomer to grasp. To truly understand the workings of the model, you must have a basic understanding of machine learning concepts and frameworks.
Once this prerequisite is cleared, you will find that reinforcement learning is only as difficult as learning a new type of trading strategy.
Nevertheless, the "Deep Reinforcement Learning in Trading" course has been developed with a focus on first principles, where the concepts are broken down to a fundamental level and explained in a lucid and crisp manner.
Thus, you will be able to first understand the concepts through videos and MCQs, and then build the basic building blocks of the reinforcement learning model, which include the state, the action and the Deep Q networks. Lastly, you will backtest this model on different asset classes and adapt it for live trading.
- How long does it take to learn reinforcement learning?
The short answer is: it depends. If you are an advanced practitioner of machine learning, it will take only a day or two to understand the model and build your own trading version of it. On the other end of the spectrum, a novice learner starting from scratch will take about two weeks to a month.