Improving AI's approximation of human play through self-play

Clip

Improving AI's approximation of human play through self-play

2:07:30 - 2:09:20 (01:50)

Researchers use self-play processes to create a better model of human play by allowing deviations from an anchor policy for actions with high expected value. The neural net trained to imitate human data is referred to as the anchor policy.

Clip

Improving AI's approximation of human play through self-play

2:07:30 - 2:09:20 (01:50)

Similar Clips