Episode

#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning
1:48:28
Published: Fri Apr 03 2020
Description

David Silver leads the reinforcement learning research group at DeepMind and was lead researcher on AlphaGo and AlphaZero, co-lead on AlphaStar and MuZero, and has contributed a lot of important work in reinforcement learning.

Support this podcast by signing up with these sponsors:
- MasterClass: https://masterclass.com/lex
- Cash App - use code "LexPodcast" and download:
- Cash App (App Store): https://apple.co/2sPrUHe
- Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG

This conversation is part of the Artificial Intelligence podcast. If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.

Here's the outline of the episode. On some podcast players you should be able to click the timestamp to jump to that time.

OUTLINE:
00:00 - Introduction
04:09 - First program
11:11 - AlphaGo
21:42 - Rule of the game of Go
25:37 - Reinforcement learning: personal journey
30:15 - What is reinforcement learning?
43:51 - AlphaGo (continued)
53:40 - Supervised learning and self play in AlphaGo
1:06:12 - Lee Sedol retirement from Go play
1:08:57 - Garry Kasparov
1:14:10 - Alpha Zero and self play
1:31:29 - Creativity in AlphaZero
1:35:21 - AlphaZero applications
1:37:59 - Reward functions
1:40:51 - Meaning of life

Chapters
00:00 - 04:32 (04:32)
Artificial Intelligence
Summary

In this episode, Lex Fridman talks with David Silver about reinforcement learning, the future of artificial intelligence, and the developments in AlphaGo, AlphaZero, AlphaStar, and MuZero. They also discuss the history of the US dollar and Bitcoin.

Episode
#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning
Podcast
Lex Fridman Podcast
04:32 - 13:04 (08:31)
AI
Summary

Silver shares his experience building handcrafted AI agents that could perform certain tasks better and faster than humans, such as in Twitch-like scenarios, and then using reinforcement learning so that a system could learn patterns and make moves that increase its chances of winning. He found it satisfying that the system was able to learn and outwit him based on its own trial-and-error experience.

13:04 - 26:15 (13:11)
AI
Summary

In the game of Go, players must combine vast amounts of human knowledge with search-based methods to solve different sub-problems. The game poses a unique challenge for AI due to its rules and symmetric nature.

26:15 - 40:16 (14:01)
Deep Reinforcement Learning
Summary

Deep reinforcement learning is a family of solution methods that leverage the representation power of neural networks to learn functions for different components of the agent, including the value function, the policy, and the model of the environment. Despite the non-linear and bumpy nature of neural networks, deep learning has proven to be a universal toolkit for representing any function and making progress in learning.
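The three components named above (value function, policy, model) can be made concrete with a toy example. The sketch below is only an illustration and is not taken from the episode: it uses a hypothetical five-state corridor, and hand-written Python functions and a plain list stand in for what a deep reinforcement learning agent would represent with neural networks. The names `model`, `policy`, `evaluate_policy`, and the discount `GAMMA` are all assumptions made for this example.

```python
# Toy corridor (hypothetical): states 0..4, state 4 is terminal.
# Stepping into the terminal state yields reward 1, everything else 0.
N_STATES = 5
GOAL = 4
GAMMA = 0.9  # discount factor, chosen arbitrarily for the illustration


def model(state, action):
    """Model of the environment: predicts (next_state, reward) for an action."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL and state != GOAL else 0.0
    return next_state, reward


def policy(state):
    """Policy: maps a state to an action (+1 = step right, -1 = step left)."""
    return +1  # this fixed policy always walks towards the goal


def evaluate_policy(sweeps=50):
    """Value function: discounted return from each state when following `policy`.

    Computed by repeatedly applying the one-step backup
    V(s) = r + GAMMA * V(s') using the model's predictions.
    """
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(GOAL):  # the terminal state keeps value 0
            next_state, reward = model(s, policy(s))
            values[s] = reward + GAMMA * values[next_state]
    return values


if __name__ == "__main__":
    print(evaluate_policy())
```

In this corridor the value of a state shrinks by a factor of `GAMMA` for every extra step between it and the goal, which is the kind of function a value network would have to approximate in a larger problem.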

40:16 - 55:14 (14:57)
Monte Carlo tree search
Summary

Monte Carlo tree search is a form of Monte Carlo search that evaluates each node of a search tree by the average outcome of random playouts from that node onwards. The chapter also covers how a pure deep learning system was able to reach a human level at the full game of Go.
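The idea of scoring a position by the average of random playouts can be sketched in a few lines. The example below is a simplified illustration and not AlphaGo's method: it does flat Monte Carlo evaluation on a toy Nim game (take 1 to 3 stones, whoever takes the last stone wins) and leaves out the tree growth and selection rule of full Monte Carlo tree search. The function names and the choice of game are assumptions made for this sketch.

```python
import random


def legal_moves(stones):
    """Moves in the toy Nim game: take 1, 2 or 3 stones, never more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]


def random_playout(stones, rng):
    """Play uniformly random moves until no stones remain.

    Returns 1.0 if the player to move at the start takes the last stone
    (and so wins), otherwise 0.0.
    """
    mover_is_root = True  # True while it is the starting player's turn
    root_won = False      # with no stones left, the starting player has lost
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        root_won = mover_is_root  # the last player to move wins
        mover_is_root = not mover_is_root
    return 1.0 if root_won else 0.0


def monte_carlo_value(stones, n_playouts=1000, seed=0):
    """Estimate a position's value as the average result of random playouts."""
    rng = random.Random(seed)
    total = sum(random_playout(stones, rng) for _ in range(n_playouts))
    return total / n_playouts


def best_move(stones, n_playouts=1000, seed=0):
    """Choose the move that leaves the opponent in the lowest-valued position."""
    return min(
        legal_moves(stones),
        key=lambda m: monte_carlo_value(stones - m, n_playouts, seed),
    )


if __name__ == "__main__":
    print(monte_carlo_value(10), best_move(3))
```

A full Monte Carlo tree search would keep these averages at every node of a growing tree and use them to decide which branches to explore further, rather than evaluating each candidate move independently as done here.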

55:15 - 1:05:58 (10:42)
Lee Sedol
Summary

Silver reflects on the privilege and luck of having AlphaGo play Go against Lee Sedol, as well as the unexpected challenges and brilliance encountered during the games.

1:05:58 - 1:21:46 (15:48)
AlphaZero
Summary

The main idea of AlphaZero is to come up with a single elegant principle by which a system can learn for itself all of the knowledge it requires to play a game such as Go, without any human intervention. What such a system figures out through reinforcement learning and deep learning is also applicable to other, real-world problems.

1:21:46 - 1:28:38 (06:52)
Reinforcement Learning
Summary

The progress of self-improvement in the single-agent case will lead all the way to the optimal possible behavior. Through iterations of reinforcement learning, AlphaZero is able to play the game of Go, beating AlphaGo Zero and AlphaGo, and even to play chess.

1:28:38 - 1:38:59 (10:20)
AlphaZero
Summary

AlphaZero was able to reach superhuman levels of performance in games like Go, Chess, and Shogi given only the rules of each game, with no human game data or handcrafted strategy, just through trial and error in self-play. Its potential applications extend to any digitized domain that can be cast in the reinforcement learning framework of sensing and acting in an environment.

1:38:59 - 1:48:05 (09:05)
Artificial Intelligence
Summary

The ability to set and achieve sub-goals is crucial for the flexibility and intelligence of artificial systems, allowing them to achieve ultimate goals more efficiently.
