Chapter
Maximizing Reward Sum in Sequential Decision Trees using Bayesian Framework
The Bayesian framework involves assigning a priori probability to any given stochastic program and evaluating what policies or action sequences lead to the maximum reward sum in expectation by replacing the true distribution with a universal distribution. The reward signal is occasionally given to the agent to maximize the reward sum, while avoiding greedy approaches.
Clips
This podcast explains the reinforcement learning framework where an agent is rewarded positively or negatively or sometimes not at all in every time step and the agent seeks to maximize rewards over the lifetime by choosing actions that lead in expectation to the maximum reward sum.
43:41 - 44:53 (01:12)
Summary
This podcast explains the reinforcement learning framework where an agent is rewarded positively or negatively or sometimes not at all in every time step and the agent seeks to maximize rewards over the lifetime by choosing actions that lead in expectation to the maximum reward sum.
ChapterMaximizing Reward Sum in Sequential Decision Trees using Bayesian Framework
Episode#75 – Marcus Hutter: Universal Artificial Intelligence, AIXI, and AGI
PodcastLex Fridman Podcast
In game AI, expecting max strategy involves calculating probabilities and back propagating the best possible move by assuming the opponent plays the move that is worst for the player.
44:54 - 46:24 (01:30)
Summary
In game AI, expecting max strategy involves calculating probabilities and back propagating the best possible move by assuming the opponent plays the move that is worst for the player. This replaces the classic mini max strategy.
ChapterMaximizing Reward Sum in Sequential Decision Trees using Bayesian Framework
Episode#75 – Marcus Hutter: Universal Artificial Intelligence, AIXI, and AGI
PodcastLex Fridman Podcast
The Bayesian framework uses a priori probability to model a distribution, which can be used to replace unknown distributions in sequential decision trees.
46:24 - 48:08 (01:43)
Summary
The Bayesian framework uses a priori probability to model a distribution, which can be used to replace unknown distributions in sequential decision trees. This approach involves considering shorter programs with higher probability and longer programs with lower probability.
ChapterMaximizing Reward Sum in Sequential Decision Trees using Bayesian Framework
Episode#75 – Marcus Hutter: Universal Artificial Intelligence, AIXI, and AGI
PodcastLex Fridman Podcast
The Universal Distribution, also known as the Solomonoff prior, is a probability distribution that's weighed by the simplicity of a program and the likelihood.
48:08 - 49:15 (01:06)
Summary
The Universal Distribution, also known as the Solomonoff prior, is a probability distribution that's weighed by the simplicity of a program and the likelihood. Planning problems up to a certain horizon, denoted as M, are exponential in time, making them computable but intractable.