#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning

Lex Fridman Podcast

1:48:28

Published: Fri Apr 03 2020

Description

David Silver leads the reinforcement learning research group at DeepMind. He was lead researcher on AlphaGo and AlphaZero, co-lead on AlphaStar and MuZero, and has done a lot of important work in reinforcement learning.

Support this podcast by signing up with these sponsors:
- MasterClass: https://masterclass.com/lex
- Cash App - use code "LexPodcast" and download:
- Cash App (App Store): https://apple.co/2sPrUHe
- Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG

This conversation is part of the Artificial Intelligence podcast. If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.

Here's the outline of the episode. On some podcast players you should be able to click the timestamp to jump to that time.

OUTLINE:
00:00 - Introduction
04:09 - First program
11:11 - AlphaGo
21:42 - Rule of the game of Go
25:37 - Reinforcement learning: personal journey
30:15 - What is reinforcement learning?
43:51 - AlphaGo (continued)
53:40 - Supervised learning and self play in AlphaGo
1:06:12 - Lee Sedol retirement from Go play
1:08:57 - Garry Kasparov
1:14:10 - Alpha Zero and self play
1:31:29 - Creativity in AlphaZero
1:35:21 - AlphaZero applications
1:37:59 - Reward functions
1:40:51 - Meaning of life

Chapters

Reinforcement Learning and the Future of Artificial Intelligence with David Silver | MIT | Artificial Intelligence Podcast

In this episode, Lex Fridman talks with David Silver about reinforcement learning, the future of artificial intelligence, and the developments in AlphaGo, AlphaZero, AlphaStar, and MuZero.

00:00 - 04:32 (04:32)

Artificial Intelligence

Summary

In this episode, Lex Fridman talks with David Silver about reinforcement learning, the future of artificial intelligence, and the developments in AlphaGo, AlphaZero, AlphaStar, and MuZero. They also discuss the history of the US dollar and Bitcoin.

Building Handcrafted AI Agents with Reinforcement Learning

The speaker shares their experience building handcrafted AI agents that could perform certain tasks better and faster than humans, such as in Twitch-like scenarios, using reinforcement learning to learn patterns and make moves to increase their chances of winning.

04:32 - 13:04 (08:31)

Summary

The speaker shares their experience building handcrafted AI agents that could perform certain tasks better and faster than humans, such as in Twitch-like scenarios, using reinforcement learning to learn patterns and make moves to increase their chances of winning. They found it satisfying that the system was able to learn and outwit them based on its own trial-and-error experiences.

The Complexity of AI in the Game of Go

In the game of Go, players must combine vast amounts of human knowledge with search-based methods to solve different sub-problems.

13:04 - 26:15 (13:11)

Summary

In the game of Go, players must combine vast amounts of human knowledge with search-based methods to solve different sub-problems. The game poses a unique challenge for AI due to its rules and symmetric nature.

Deep Reinforcement Learning and Neural Networks

Deep reinforcement learning is a family of solution methods that leverage the representation power of neural networks to learn functions for different components of the agent, including the value function, the policy, and the model of the environment.

26:15 - 40:16 (14:01)

Deep Reinforcement Learning

Summary

Deep reinforcement learning is a family of solution methods that leverage the representation power of neural networks to learn functions for different components of the agent, including the value function, the policy, and the model of the environment. Despite the non-linear and bumpy nature of neural networks, deep learning has proven to be a universal toolkit for representing any function and making progress in learning.
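
As a rough illustration of the three components named above, the sketch below uses one small generic network class for each of them: a value function, a policy, and a model of the environment. Everything in it (the `TinyMLP` class, the layer sizes, the example state) is a hypothetical choice made for this illustration, and the networks are untrained, so the outputs only show the shape of each function, not learned behaviour.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer network used as a generic function approximator."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]

    def __call__(self, x):
        # tanh hidden layer followed by a linear output layer
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]


def softmax(logits):
    """Turn raw scores into a probability distribution over actions."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


STATE_DIM, N_ACTIONS = 4, 3  # assumed sizes, chosen only for the example
rng = random.Random(0)

value_net = TinyMLP(STATE_DIM, 8, 1, rng)           # state -> expected return
policy_net = TinyMLP(STATE_DIM, 8, N_ACTIONS, rng)  # state -> action scores
model_net = TinyMLP(STATE_DIM + N_ACTIONS, 8, STATE_DIM + 1, rng)  # (state, action) -> (next state, reward)

state = [0.1, -0.2, 0.3, 0.0]
value = value_net(state)[0]
action_probs = softmax(policy_net(state))
action = max(range(N_ACTIONS), key=lambda a: action_probs[a])

one_hot = [1.0 if a == action else 0.0 for a in range(N_ACTIONS)]
prediction = model_net(state + one_hot)
next_state, reward = prediction[:STATE_DIM], prediction[STATE_DIM]
```

In a real agent each network would be trained from experience; the point here is only that the same representation can stand in for all three roles.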

Monte Carlo Tree Search

Monte Carlo tree search is a form of Monte Carlo search that evaluates each node of a search tree by averaging the outcomes of random playouts from that node onwards, making it possible for a pure deep learning system to reach a human level at the full game of Go.

40:16 - 55:14 (14:57)

Monte Carlo tree search

Summary

Monte Carlo tree search is a form of Monte Carlo search that evaluates each node of a search tree by averaging the outcomes of random playouts from that node onwards, making it possible for a pure deep learning system to reach a human level at the full game of Go.
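
The playout-averaging idea can be sketched on a toy game. The example below is an assumption made for illustration, not anything from the episode: a pile of stones, each player removes one to three, and whoever takes the last stone wins. It scores only the moves available at the root (flat Monte Carlo search) rather than growing a full tree, so it shows the evaluation step of Monte Carlo tree search and nothing more.

```python
import random


def random_playout(stones, to_move, rng):
    """Play uniformly random legal moves to the end; return the winner (0 or 1)."""
    player = to_move
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player  # whoever takes the last stone wins
        player = 1 - player


def evaluate_moves(stones, player, n_playouts, rng):
    """Estimate each move's value as the fraction of random playouts `player` wins."""
    values = {}
    for take in range(1, min(3, stones) + 1):
        remaining = stones - take
        if remaining == 0:
            values[take] = 1.0  # taking the last stone wins immediately
            continue
        wins = sum(
            random_playout(remaining, 1 - player, rng) == player
            for _ in range(n_playouts)
        )
        values[take] = wins / n_playouts
    return values


rng = random.Random(0)
values = evaluate_moves(stones=5, player=0, n_playouts=2000, rng=rng)
best_move = max(values, key=values.get)
```

With five stones, taking one leaves the opponent with four, which is the losing position under perfect play, and the playout averages favour that move as well (about 2/3 against about 1/2 for the alternatives under uniformly random play).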

The Experience of Playing Go Against Lee Sedol

The speaker reflects on the privilege and luck of playing Go against Lee Sedol, as well as the unexpected challenges and brilliance that he encountered during the games.

55:15 - 1:05:58 (10:42)

Lee Sedol

Summary

The speaker reflects on the privilege and luck of playing Go against Lee Sedol, as well as the unexpected challenges and brilliance that he encountered during the games.

AlphaZero learns to play games: Men still needed.

The main idea of AlphaZero is to come up with a single elegant principle by which a system can learn for itself all of the knowledge which it requires to play a game such as Go, without any human intervention.

1:05:58 - 1:21:46 (15:48)

AlphaZero

Summary

The main idea of AlphaZero is to come up with a single elegant principle by which a system can learn for itself all of the knowledge which it requires to play a game such as Go, without any human intervention. The things one figures out through reinforcement learning and deep learning are also applicable to other, real-world problems.

Reinforcement Learning and AlphaZero

The progress of self-improvement in the single-agent case will lead all the way to the optimal possible behavior.

1:21:46 - 1:28:38 (06:52)

Reinforcement Learning

Summary

The progress of self-improvement in the single-agent case will lead all the way to the optimal possible behavior. Through repeated iterations of reinforcement learning, AlphaZero learns to play Go well enough to beat both AlphaGo Zero and AlphaGo, and learns chess as well.
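
One standard way to see repeated self-improvement reach optimal behaviour in a single-agent setting is policy iteration, sketched below. It is a textbook technique shown here for intuition, not AlphaZero's training procedure, and the five-state corridor, its reward, and the discount factor are all assumptions made for the example.

```python
# Hypothetical corridor: states 0..4, reward 1 for stepping into terminal state 4.
N_STATES, GOAL, GAMMA = 5, 4, 0.9
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Deterministic transition; walls clamp the agent inside the corridor."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def evaluate(policy, sweeps=100):
    """Policy evaluation: estimate the return of following `policy` from each state."""
    values = [0.0] * N_STATES  # the terminal state keeps value 0
    for _ in range(sweeps):
        for s in range(GOAL):
            s2, r = step(s, policy[s])
            values[s] = r + GAMMA * values[s2]
    return values


def improve(values):
    """Policy improvement: act greedily with respect to the current value estimates."""
    def q(s, a):
        s2, r = step(s, a)
        return r + GAMMA * values[s2]
    return [max(ACTIONS, key=lambda a: q(s, a)) for s in range(GOAL)]


policy = [-1] * GOAL  # start from a poor policy: always walk away from the goal
while True:
    values = evaluate(policy)
    new_policy = improve(values)
    if new_policy == policy:  # no further improvement is possible
        break
    policy = new_policy
```

Each round evaluates the current policy and then improves on it; in this small example the loop stops once the policy moves right from every state, which is the optimal behaviour for the corridor.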

AlphaZero: self-taught game master

AlphaZero was able to reach superhuman levels of performance in games like Go, Chess, and Shogi given only the rules of each game, without any human game data or strategic input, just through trial and error in self-play.

1:28:38 - 1:38:59 (10:20)

AlphaZero

Summary

AlphaZero was able to reach superhuman levels of performance in games like Go, Chess, and Shogi given only the rules of each game, without any human game data or strategic input, just through trial and error in self-play. Its application is essentially limitless in any digitized domain that can be consumed by a reinforcement learning framework to sense and act in an environment.

The Importance of Sub-Goals in Artificial Intelligence

The ability to set and achieve sub-goals is crucial for the flexibility and intelligence of artificial systems, allowing them to achieve ultimate goals more efficiently.

1:38:59 - 1:48:05 (09:05)

Artificial Intelligence

Summary

The ability to set and achieve sub-goals is crucial for the flexibility and intelligence of artificial systems, allowing them to achieve ultimate goals more efficiently.
