Reinforcement Learning and AlphaZero

Chapter

1:21:46 - 1:28:38 (06:52)

In the single-agent case, continued self-improvement leads all the way to the optimal possible behavior. Through repeated iterations of reinforcement learning, AlphaZero learned to play Go well enough to beat both AlphaGo Zero and AlphaGo, and it learned to play chess as well.

Clips

How Reinforcement Learning Can Correct Delusions

Reinforcement learning can correct a system's errors and delusions by letting it learn from its own mistakes and adjust its behavior to improve performance.

1:21:46 - 1:25:10 (03:23)

Reinforcement Learning

Chapter
Reinforcement Learning and AlphaZero

Episode
#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning

Podcast
Lex Fridman Podcast

The Power of Self-Improvement in Gaming AI

David Silver explains how self-improvement can lead to the optimal possible behavior in single-agent games, and how AlphaZero, applying the same principles, beat AlphaGo Zero, AlphaGo, and even the world's strongest computer chess program.

1:25:10 - 1:28:38 (03:28)

