Chapter

Reinforcement Learning and AlphaZero
In the single-agent case, repeated self-improvement can lead all the way to optimal behavior. Through iterations of reinforcement learning, AlphaZero learned to play Go well enough to beat AlphaGo Zero and AlphaGo, and it learned to play chess as well.
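The idea that repeated self-improvement reaches optimal behavior in the single-agent case can be illustrated with classical policy iteration on a toy problem. The sketch below is not AlphaZero's method (AlphaZero combines neural networks, tree search, and self-play); the chain environment, the constants, and every function name here are hypothetical and chosen only for illustration.

```python
# Toy deterministic single-agent problem (hypothetical, for illustration only):
# states 0..4 on a line, state 4 is terminal. Actions: 0 = left, 1 = right.
# Every step costs -1, so the optimal behavior is to always move right.
N_STATES = 5
GOAL = 4
GAMMA = 0.9
ACTIONS = (0, 1)


def step(state, action):
    """Return (next_state, reward) for the toy chain."""
    nxt = max(state - 1, 0) if action == 0 else min(state + 1, GOAL)
    return nxt, -1.0


def evaluate(policy, sweeps=200):
    """Iterative policy evaluation: estimate the value of following `policy`."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(GOAL):  # the terminal state keeps value 0
            nxt, reward = step(s, policy[s])
            values[s] = reward + GAMMA * values[nxt]
    return values


def improve(values):
    """Greedy improvement: pick the action with the best one-step lookahead."""
    policy = []
    for s in range(GOAL):
        def lookahead(action, s=s):
            nxt, reward = step(s, action)
            return reward + GAMMA * values[nxt]
        policy.append(max(ACTIONS, key=lookahead))
    return policy


def policy_iteration():
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * GOAL  # deliberately bad start: always move left
    iterations = 0
    while True:
        values = evaluate(policy)
        new_policy = improve(values)
        iterations += 1
        if new_policy == policy:
            return policy, values, iterations
        policy = new_policy


if __name__ == "__main__":
    policy, values, iterations = policy_iteration()
    print("final policy:", policy)
    print("state values:", [round(v, 3) for v in values])
    print("improvement rounds:", iterations)
```

Starting from a policy that always moves the wrong way, each round evaluates the current behavior and then improves on it greedily; the loop stops once no further improvement is possible, at which point the policy moves right in every state.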
Clips
Reinforcement learning can help correct a system's errors and delusions: the system learns from its own mistakes and adjusts its behavior to improve its performance.
1:21:46 - 1:25:10 (03:23)
Episode: #86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning
Podcast: Lex Fridman Podcast
David Silver explains how self-improvement can lead to optimal behavior in the single-agent case, and how AlphaZero was able to beat AlphaGo Zero, AlphaGo, and even the world's strongest computer chess program, using the same principles.
1:25:10 - 1:28:38 (03:28)