Reinforcement Learning with MATLAB Understanding the Basics and Setting Up
the Environment
What Is Reinforcement Learning?
Reinforcement learning is learning what to do—how to map
situations to actions—so as to maximize a numerical reward
signal. The learner is not told which actions to take, but instead
must discover which actions yield the most reward by trying them.
—Sutton and Barto, Reinforcement Learning: An Introduction
Reinforcement learning (RL) has successfully trained
computer programs to play games at a level higher than the
world’s best human players.
The programs find the best action to take in games with
large state and action spaces, imperfect world information,
and uncertainty around how short-term actions pay off in the
long run.
Engineers face the same types of challenges when designing
controllers for real systems. Can reinforcement learning also
help solve complex control problems like making a robot walk
or driving an autonomous car?
This ebook answers that question by explaining what RL is
in the context of traditional control problems and by helping
you understand how to set up and solve the RL problem.
The Goal of Control
Broadly speaking, the goal of a control system is to determine the
correct inputs (actions) into a system that will generate the desired
system behavior.
With feedback control systems, the controller uses state observations
to improve performance and correct for random disturbances and
errors. Engineers use that feedback, along with a model of the
plant and environment, to design the controller to meet the system
requirements.
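To make the feedback idea concrete, here is a minimal sketch (in Python, for illustration only) of a proportional controller closing the loop around a toy first-order plant. The plant model and the gains `a`, `b`, and `kp` are hypothetical example values, not anything from this ebook.

```python
# Minimal feedback-control sketch: a proportional controller drives a
# first-order plant toward a setpoint. All parameters are toy values.

def simulate_p_control(setpoint, steps=500, dt=0.01, a=1.0, b=1.0, kp=5.0):
    """Euler-integrate x' = -a*x + b*u with feedback law u = kp * (setpoint - x)."""
    x = 0.0  # plant state (e.g., a joint angle)
    for _ in range(steps):
        error = setpoint - x          # feedback: compare observation to reference
        u = kp * error                # action chosen by the controller
        x += dt * (-a * x + b * u)    # plant dynamics
    return x

print(simulate_p_control(1.0))  # settles near kp*b / (a + kp*b) of the setpoint
```

A pure proportional controller leaves a steady-state error; that gap is one reason real designs add integral action or a plant model, as described above.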
This concept is simple to put into words, but it can quickly become
difficult to achieve when the system is hard to model, is highly
nonlinear, or has large state and action spaces.
Imagine developing a control system for a walking robot.
To control the robot (i.e., the system), you command potentially dozens
of motors that operate each of the joints in the arms and legs.
Each command is an action you can take. The state observations
come from multiple sources, including a camera vision sensor,
accelerometers, gyros, and encoders for each of the motors.
The controller must:
• Determine the right combination of motor torques to get the robot
walking and keep it balanced.
• Operate in an environment that has random obstacles that
need to be avoided.
• Reject disturbances like wind gusts.
A control system design would need to handle these as well as any
additional requirements like maintaining balance while walking down
a steep hillside or across a patch of ice.
Typically, the best way to approach this problem is to break it up into smaller discrete sections that can be solved independently.
For example, you could build a process that extracts features from the camera images. These might be things like the location and type of obstacle, or the location of the robot in a global reference frame. Combine those states with the processed observations from the other sensors to complete the full state estimation.
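One common way to combine observations like these is a complementary filter, which blends a gyro's short-term rate measurement with an accelerometer's long-term tilt estimate. The sketch below is an illustrative assumption, not a design from this ebook; the sensor values and the blend factor `ALPHA` are made up.

```python
# Hypothetical state-estimation sketch: a complementary filter fusing a
# gyro rate with an accelerometer tilt angle. ALPHA and the sensor
# readings are illustrative toy values.

ALPHA = 0.98  # trust the gyro on short timescales, the accelerometer long-term

def fuse_angle(prev_angle, gyro_rate, accel_angle, dt):
    """Blend the integrated gyro rate with the accelerometer's angle estimate."""
    gyro_angle = prev_angle + gyro_rate * dt
    return ALPHA * gyro_angle + (1.0 - ALPHA) * accel_angle

# Example: a stationary robot leaning at 0.1 rad; the noiseless gyro reads 0,
# so the estimate is pulled toward the accelerometer's angle over time.
angle = 0.0
for _ in range(1000):
    angle = fuse_angle(angle, gyro_rate=0.0, accel_angle=0.1, dt=0.01)
print(angle)  # converges toward 0.1 rad
```

A production robot would more likely use a Kalman filter for this fusion step, but the idea of weighting each sensor by where it is trustworthy is the same.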
The estimated state and the reference would feed into the controller, which would likely consist of multiple nested control loops. The outer loop would be responsible for managing high-level robot behavior (like maybe maintaining balance), and the inner loops would manage low-level behaviors and individual actuators.
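The cascaded structure can be sketched as two tiny functions, where the outer loop's output becomes the inner loop's reference. This is a toy illustration of the pattern; the gains, dynamics, and loop contents are assumptions, not the ebook's actual design.

```python
# Illustrative nested-loop sketch: an outer balance loop produces a velocity
# reference that an inner motor loop tracks. All values are toy numbers.

def outer_loop(tilt, kp_outer=2.0):
    """High-level behavior: map body tilt to a corrective velocity reference."""
    return -kp_outer * tilt

def inner_loop(vel_ref, vel, kp_inner=4.0):
    """Low-level behavior: map velocity error to a motor torque command."""
    return kp_inner * (vel_ref - vel)

# One pass through the cascade: tilted 0.05 rad forward, currently stationary.
vel_ref = outer_loop(tilt=0.05)        # corrective velocity reference
torque = inner_loop(vel_ref, vel=0.0)  # torque command sent to the motor
print(vel_ref, torque)
```

The inner loop typically runs much faster than the outer loop, which is exactly the interaction between loops that makes tuning hard, as the next paragraph notes.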
All solved? Not quite.
The loops interact with each other, which makes design and tuning challenging. Also, determining the best way to structure the loops and break up the problem is not simple.