Chapter

Reinforcement Learning and AlphaZero
1:21:46 - 1:28:38 (06:52)

In the single-agent case, continued self-improvement can lead all the way to optimal behavior. Through repeated iterations of reinforcement learning, AlphaZero learned to play Go well enough to beat both AlphaGo Zero and AlphaGo, and it learned to play chess as well.

Clips
Reinforcement learning can help correct the errors and delusions a system may have: the system learns from its own mistakes and adjusts its behavior to improve its performance.
1:21:46 - 1:25:10 (03:23)
Reinforcement Learning
Episode
#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning
Podcast
Lex Fridman Podcast
David Silver explains how self-improvement can lead to optimal behavior in single-agent games, and how AlphaZero, using its own principles, was able to beat AlphaGo Zero, AlphaGo, and even the world's strongest computer chess program.
1:25:10 - 1:28:38 (03:28)
AI