What is deep reinforcement learning?


Posted by Ruth Brooks

Feb 13, 2023

Deep reinforcement learning is a rapidly expanding area of machine learning. It combines reinforcement learning and deep learning to create machines that can solve complex, multi-layered problems and perform sophisticated decision-making tasks. By pairing the two, machines can mimic the workings of the human brain more closely than would be possible through reinforcement learning or deep learning alone.

What is deep learning?

Deep learning enables computers to learn by example using artificial neural network architectures that are modelled after the neural networks in the human brain.

Deep learning algorithms and multi-layered deep neural networks can process huge swathes of data and extract patterns from it. In fact, the more data a deep learning network processes, the better it’s likely to perform. These networks are typically trained using large labelled datasets, and learn as they perform tasks such as classification.
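To make "learning by example" concrete, here is a minimal sketch of a very small neural network trained on four labelled examples. It is far smaller than the deep networks described above, and everything in it (the XOR task, the function name train_xor, the layer size, and the learning rate) is an assumption chosen for illustration rather than anything prescribed by a particular library.

```python
import math
import random

# Four labelled examples of XOR: (inputs, target label). Illustrative data only.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def train_xor(hidden=4, epochs=3000, lr=0.5, seed=0):
    """Train a two-layer network by gradient descent; return (initial_loss, final_loss)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Hidden layer (tanh) followed by a single sigmoid output unit.
        h = [math.tanh(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j]) for j in range(hidden)]
        z = sum(w2[j] * h[j] for j in range(hidden)) + b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def mean_loss():
        # Average cross-entropy between predictions and labels.
        total = 0.0
        for x, t in DATA:
            y = min(max(forward(x)[1], 1e-12), 1.0 - 1e-12)
            total -= t * math.log(y) + (1.0 - t) * math.log(1.0 - y)
        return total / len(DATA)

    initial_loss = mean_loss()
    for _ in range(epochs):
        gw1 = [[0.0, 0.0] for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        for x, t in DATA:
            h, y = forward(x)
            err = y - t  # gradient of the loss at the output (sigmoid + cross-entropy)
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                back = err * w2[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                gw1[j][0] += back * x[0]
                gw1[j][1] += back * x[1]
                gb1[j] += back
        scale = lr / len(DATA)
        b2 -= scale * gb2
        for j in range(hidden):
            w2[j] -= scale * gw2[j]
            b1[j] -= scale * gb1[j]
            w1[j][0] -= scale * gw1[j][0]
            w1[j][1] -= scale * gw1[j][1]
    return initial_loss, mean_loss()


initial_loss, final_loss = train_xor()
print(f"loss before training: {initial_loss:.3f}, after: {final_loss:.3f}")
```

The network is never told the rule behind the labels. It only adjusts its weights to reduce its error on the examples, which is the same principle that larger networks apply to far bigger datasets.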

What is reinforcement learning?

Reinforcement learning, or RL, enables machines to learn to solve problems and make decisions dynamically through trial and error. Trained on real-life or simulated scenarios, the machine (known as the agent) receives rewards or penalties for the actions it performs in increments known as time steps. It chooses these actions from an action space, which is the set of all valid actions, or choices, available to the agent as it interacts with an environment.

If the agent does something correctly, it receives positive feedback, and if it does something incorrectly, it receives negative feedback. With a goal of optimisation and maximising its total rewards, the machine learns as it receives feedback, and progressively improves with each iteration.
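The feedback loop described above can be sketched in a few lines. The example below is a minimal illustration, not a standard implementation: it assumes a made-up environment with three actions, each of which pays a reward of +1 or a penalty of -1 with a hidden probability, and a hypothetical run_bandit function in which the agent keeps a running estimate of how rewarding each action is.

```python
import random


def run_bandit(success_probs=(0.2, 0.5, 0.8), steps=5000, epsilon=0.1, seed=0):
    """Trial-and-error learning over a three-action space; returns (estimates, total_reward)."""
    rng = random.Random(seed)
    n_actions = len(success_probs)   # the action space is {0, 1, 2}
    estimates = [0.0] * n_actions    # the agent's current estimate of each action's reward
    counts = [0] * n_actions
    total_reward = 0.0
    for _ in range(steps):           # each iteration is one time step
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # occasionally try something at random
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # otherwise exploit
        # The environment responds with positive or negative feedback.
        reward = 1.0 if rng.random() < success_probs[action] else -1.0
        counts[action] += 1
        # Move the estimate towards the observed reward (incremental average).
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, total_reward


estimates, total_reward = run_bandit()
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated reward per action:", [round(e, 2) for e in estimates])
print("action the agent now prefers:", best_action)
```

With these assumed probabilities, the highest estimate should end up belonging to action 2, the one that pays off most often. The agent is never told this; it works it out from the feedback it receives.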

What is the difference between deep reinforcement learning and traditional reinforcement learning?

The key difference between reinforcement learning and deep reinforcement learning is the addition, in deep reinforcement learning, of deep learning.

Deep reinforcement learning, also called deep RL, is the combined product of both traditional reinforcement learning and deep learning. It utilises the feedback learning methods of reinforcement learning, as well as the neural networks of deep learning, to create machines that can make decisions and perform tasks autonomously, and apply existing knowledge to new and unstructured datasets.

How does deep reinforcement learning work?

Deep reinforcement learning works through the trial-and-error techniques of reinforcement learning alongside the many – or deep – layers of artificial neural networks used in deep learning. It does this by using algorithms, frameworks, and techniques that are suitable for deep reinforcement learning. Examples of these include:

Model-based deep reinforcement learning algorithms, which typically use a learned or simulated model of the environment, allowing the reinforcement learning agent (or RL agent) to plan and learn as it carries out actions. Actions can also be optimised by combining model learning with model-free methods, as well as through Monte Carlo methods like the cross-entropy method, which offers an adaptive procedure for estimating optimal reference parameters.
Model-free deep reinforcement learning algorithms, which can find an optimal policy despite limited knowledge of the environment, learning directly from the rewards the agent experiences rather than from a model of the environment’s dynamics or reward function. The policy can also be optimised through deterministic policy gradient methods.
Markov decision processes (MDPs), the mathematical framework used to model sequential decision-making in terms of states, actions, and rewards.
Deep Q-learning and deep Q-network (DQN) algorithms, which use a neural network to estimate what’s called the Q-function: the expected return from taking a given action in a given state. Estimating the Q-function with a network in this way is known as function approximation.
Actor-critic learning, which is a technique that enables a machine to simultaneously learn a policy function (the actor, which tells the machine how to make decisions) and a value function (the critic, which estimates how good those decisions are and so helps improve the machine’s training process).
Temporal difference learning, which is a reinforcement learning technique that predicts the total reward expected, updating each prediction towards the reward received plus the prediction made at the next time step.
Experience replay, which can act like the memory of an agent, allowing it to store past experiences and reuse them multiple times during training.
Tree search, which is a search algorithm that looks ahead through possible sequences of future actions, as in Monte Carlo tree search.
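Several of the ideas in this list can be seen working together in a small sketch. The example below assumes a made-up environment (a row of five states in which the agent starts at the left and is rewarded for reaching the right-hand end) and combines a Q-function, temporal difference updates, and experience replay. To stay dependency-free it swaps the neural network of a real DQN for a simple lookup table, so it is tabular Q-learning rather than deep Q-learning; all names and parameter values are illustrative assumptions.

```python
import random
from collections import deque

N_STATES, GOAL = 5, 4   # states 0..4 in a row; reaching state 4 ends the episode
ACTIONS = (0, 1)        # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy environment: returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, batch=8, seed=0):
    """Q-learning with a replay buffer; a table stands in for the neural network."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action] = estimated return
    replay = deque(maxlen=1000)                # the agent's memory of past transitions
    for _ in range(episodes):
        state = 0
        for _ in range(100):                   # cap on time steps per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)   # explore
            else:
                best = max(q[state])           # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            replay.append((state, action, reward, nxt, done))
            # Experience replay: reuse a random sample of stored transitions.
            for s, a, r, s2, d in rng.sample(list(replay), min(batch, len(replay))):
                # Temporal difference target: reward plus discounted next estimate.
                target = r if d else r + gamma * max(q[s2])
                q[s][a] += alpha * (target - q[s][a])
            state = nxt
            if done:
                break
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print("greedy action in states 0-3:", policy)
```

After training, the greedy policy should be to move right in every non-terminal state, giving [1, 1, 1, 1]. In a deep Q-network the table q would be replaced by a neural network that takes the state as input, which is what lets the same update rule scale to environments with far too many states to list.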

What is deep reinforcement learning used for?

Deep reinforcement learning has a wide range of applications and implementations, including state-of-the-art areas of computer science such as robotics, natural language processing, and computer vision. It is also used in industries and fields such as education, finance, healthcare, and transportation.

One of its biggest uses has been in gaming, with deep reinforcement learning performing at superhuman levels in everything from Atari games to online chess, poker, and other multiplayer competitive video games. In fact, deep reinforcement learning contributed to the much-publicised defeat of top professional Go player Lee Sedol by DeepMind’s AlphaGo in 2016.

What are the advantages and disadvantages of deep reinforcement learning?

Deep reinforcement learning is a rapidly evolving area of artificial intelligence (AI), one that is being increasingly embraced and utilised.

It has several advantages. For example:

Deep reinforcement learning can solve challenging, sequential decision-making problems in a way that deep learning, or reinforcement learning, would be unable to do on their own.
Deep RL has achieved human-level – and even superhuman – performance, particularly in gaming applications.
Deep RL can solve more complicated tasks with less prior knowledge.

Deep reinforcement learning is not without its limitations, of course. Its artificial neural networks are modelled after the structure of the human brain, so they have many layers within their architecture, and its algorithms work with high-dimensional state spaces. As a result, it requires huge computing power.
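A rough calculation shows how quickly a high-dimensional state space grows. The sketch below assumes, purely for illustration, that each state is an 84 by 84 greyscale image with 256 possible brightness levels per pixel (a size often used when preprocessing game screens), and counts how many distinct images of that kind exist. The function name is hypothetical.

```python
import math


def log10_num_states(width, height, levels):
    """Return log10 of the number of distinct images of the given size and brightness levels."""
    # There are levels ** (width * height) possible images; work in logarithms
    # so the result stays a manageable number.
    return width * height * math.log10(levels)


image_digits = log10_num_states(84, 84, 256)
print(f"distinct 84x84 greyscale images: about 10^{image_digits:.0f}")
```

The result is a number with roughly 17,000 digits, so listing a value for every possible state is out of the question. This is why deep RL relies on neural networks to generalise across states, and why training those networks is computationally demanding.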

Like deep learning in general, deep reinforcement learning also needs large amounts of training data in order to learn and perform effectively.

However, while there are challenges within deep reinforcement learning, new developments in the field are already working to address them – and its advantages far outweigh its disadvantages.

Deep reinforcement learning resources

There are a number of open source algorithms and other resources available for deep reinforcement learning. These include:

Spinning Up, an educational deep RL resource produced by OpenAI and available on GitHub.
Packt’s Deep Reinforcement Learning with Python, also available on GitHub.
DeepMind’s educational resources on deep reinforcement learning.
Richard Sutton and Andrew Barto’s textbook, Reinforcement Learning: An Introduction, which they have made available for free online.

Learn more about deep reinforcement learning

Explore deep reinforcement learning in greater depth while gaining specialist skills in a fast-growing area of computer science with the flexible MSc Computer Science with Artificial Intelligence at Abertay University. This flexible master’s degree is taught part-time and 100% online, and is suited to mid-level professionals who want to move into computer science, or take the next step in their existing computer science career.

You will develop specialist knowledge in artificial intelligence, including how to use it to improve processes and how businesses and systems can be improved through AI techniques. You’ll also gain a firm foundation in the field of computer science, with core knowledge in networking, databases, and web development.