The Four Elements of Reinforcement Learning

Updated: 2023-06-02 19:04:09

Reinforcement learning has four elements: a policy, a reward signal, a value function, and a model of the environment.

1. Policy

A policy defines the learning agent's way of behaving at a given time: given the current situation of the environment, it determines which action the agent takes.
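As a minimal sketch of this idea, a policy can be written as a lookup from situation to action. The state names, the action names, and the epsilon-greedy variant below are assumptions made for illustration; none of them come from the original text.

```python
import random

# Hypothetical actions for a tiny illustrative problem.
ACTIONS = ["left", "right"]

# A deterministic tabular policy: exactly one action per (hypothetical) state.
deterministic_policy = {"s0": "right", "s1": "right", "s2": "left"}


def act(policy, state):
    """Return the action the policy prescribes in the given state."""
    return policy[state]


def epsilon_greedy(policy, state, epsilon, rng):
    """Follow the policy, but with probability epsilon pick a random action.

    This is one common way to make a policy stochastic; it is an
    illustrative choice, not something the text above prescribes.
    """
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return policy[state]
```

With `epsilon=0.0` the stochastic version behaves exactly like the table; larger values trade some of that behavior for exploration.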

2. Reward Signal

A reward signal defines the goal in a reinforcement learning problem. The agent's objective is to maximize the total reward it receives.
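To make "total reward" concrete, the sketch below sums the rewards collected along one episode. The discount factor `gamma` is an assumption added for illustration; with the default `gamma=1.0` the function returns the plain total described above.

```python
def discounted_return(rewards, gamma=1.0):
    """Sum a sequence of rewards, weighting the reward at step t by gamma**t.

    gamma is a hypothetical parameter: gamma=1.0 gives the plain total
    reward, while gamma<1.0 makes later rewards count for less.
    """
    g = 0.0
    # Accumulate from the last reward backwards: g_t = r_t + gamma * g_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For the reward sequence `[1.0, 0.0, 2.0]`, the undiscounted total is 3.0, and with `gamma=0.5` it is 1.0 + 0.5 * 0.0 + 0.25 * 2.0 = 1.5.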

3. Value Function

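A value function specifies what is good in the long run. For example, an agent may choose an action whose immediate reward is low because the total reward over the period that follows is larger. A value function can be viewed as an estimate of future reward, and it is a core part of reinforcement learning algorithms.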
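The difference between immediate reward and long-run value can be sketched with a tiny, made-up decision problem. The states, actions, and numbers in `TRANSITIONS` are hypothetical. The values here are computed exactly from a known table, whereas a learning agent would usually have to estimate them from experience.

```python
# Hypothetical deterministic problem:
# TRANSITIONS[state][action] = (immediate_reward, next_state)
TRANSITIONS = {
    "start": {"grab": (1.0, "end"), "wait": (0.0, "rich")},
    "rich": {"collect": (10.0, "end")},
    "end": {},  # terminal state: no actions available
}


def state_value(state, gamma=1.0):
    """Best achievable return from a state (the graph is acyclic, so this ends)."""
    actions = TRANSITIONS[state]
    if not actions:
        return 0.0
    return max(r + gamma * state_value(nxt, gamma) for r, nxt in actions.values())


def action_value(state, action, gamma=1.0):
    """Immediate reward plus the value of the state the action leads to."""
    r, nxt = TRANSITIONS[state][action]
    return r + gamma * state_value(nxt, gamma)


def greedy_by_reward(state):
    """Pick the action with the highest immediate reward."""
    return max(TRANSITIONS[state], key=lambda a: TRANSITIONS[state][a][0])


def greedy_by_value(state, gamma=1.0):
    """Pick the action with the highest long-run value."""
    return max(TRANSITIONS[state], key=lambda a: action_value(state, a, gamma))
```

In this toy problem, choosing by immediate reward picks `"grab"` from `"start"`, while choosing by value picks `"wait"`, which earns nothing now but leads to the larger reward afterwards.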

4. Model of the Environment

A model of the environment describes how the environment changes in response to the agent's actions. It is something that mimics the behavior of the environment or, more generally, that allows inferences to be made about how the environment will behave.
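One simple way to picture such a model is a table that remembers what happened after each state and action, and can then be queried without touching the real environment. The class and function names below are hypothetical, and the sketch assumes a deterministic environment where each (state, action) pair has a single outcome.

```python
class TabularModel:
    """Remembers the last observed outcome for each (state, action) pair."""

    def __init__(self):
        self._table = {}

    def update(self, state, action, reward, next_state):
        """Record one observed transition from the real environment."""
        self._table[(state, action)] = (reward, next_state)

    def predict(self, state, action):
        """Return (reward, next_state), or None if the pair was never observed."""
        return self._table.get((state, action))


def simulate(model, state, actions):
    """Roll a sequence of actions through the model and total the predicted rewards.

    Stops early if the model has no prediction for a step.
    """
    total = 0.0
    for action in actions:
        outcome = model.predict(state, action)
        if outcome is None:
            break
        reward, state = outcome
        total += reward
    return total, state
```

After recording a few transitions with `update`, calling `simulate` lets the agent infer where a plan of actions would lead before trying it for real.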

Article link: https://www.wtabcd.cn/fanwen/fan/82/836244.html
