
Introduction to Reinforcement Learning

Shijiang Lu. Reinforcement learning (RL) is the problem faced by an agent that must learn, through trial and error, how to interact with a dynamic environment so as to maximize a scalar reward.


Presentation Transcript


  1. Introduction to Reinforcement Learning Shijiang Lu

  2. What Is Reinforcement Learning • Reinforcement learning (RL) is the problem faced by an agent that must learn, through trial and error, how to interact with a dynamic environment so as to maximize a scalar reward.

  3. Agent And Its Environment • [Figure: agent-environment loop with labels a, s, r, sT, Fs(sT), and Fr(sT)] • a: the agent’s action • sT: the true state of the environment • s: the state of the environment as perceived by the agent • r: the immediate reward perceived by the agent • Fs(sT) and Fr(sT): functions that map sT to s and r, respectively
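The interface on this slide can be sketched in Python. This is a minimal illustration, not the presenter's code: the toy environment (an integer position on a line), the particular choices of Fs and Fr, and the names `Environment`, `agent`, and `run_episode` are all assumptions made for the example.

```python
import random

class Environment:
    """Toy environment (hypothetical): the true state sT is an integer position."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.true_state = 0  # sT, never shown to the agent directly

    def f_s(self):
        # Fs(sT): the agent perceives only the sign of the position (-1, 0, or 1)
        return (self.true_state > 0) - (self.true_state < 0)

    def f_r(self):
        # Fr(sT): the reward is higher the closer sT is to zero
        return -abs(self.true_state)

    def step(self, a):
        # The action a shifts sT; a random drift also changes it on its own
        self.true_state += a + self.rng.choice([-1, 0, 1])
        return self.f_s(), self.f_r()

def agent(s, r):
    # Hypothetical fixed rule: push against the perceived sign of the position
    return -s

def run_episode(steps=20, seed=0):
    env = Environment(seed)
    s, r = env.f_s(), env.f_r()
    total = 0
    for _ in range(steps):
        a = agent(s, r)      # the agent maps (s, r) to an action a
        s, r = env.step(a)   # the environment returns the new s and r
        total += r
    return total
```

Here the agent never sees sT itself, only s and r, which is the separation the slide draws between the true state and the perceived state.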

  4. Characteristics of RL • The agent has a goal (or goals) to achieve. • The agent can take actions, and its actions affect its environment. • The agent learns by trial and error, i.e., it has no teacher and must learn by itself.

  5. Characteristics of RL (Cont.) • The agent’s action should be chosen based on its perception of its environment and its evaluation of how well its needs have already been fulfilled. • The agent may or may not have knowledge about its environment initially. Either way, it must interact with its environment.

  6. Characteristics of RL (Cont.) • The agent may not know everything about the environment, i.e., there can be hidden states that the agent has no knowledge of. • The environment may change independently of the agent’s actions.

  7. Characteristics of RL (Cont.) • The environment may be non-deterministic, i.e., when the agent takes the same action in the same state, the environment may respond differently. • The reward for an action may come instantaneously, or it may be delayed, i.e., it may not arrive immediately after the agent’s action.
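One common way to credit an action for rewards that arrive later is to score a whole sequence of rewards with a discounted sum. The sketch below assumes a discount factor `gamma`, which the slides do not mention; the function name is also hypothetical.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where a reward t steps in the future is weighted by gamma**t.

    gamma is an assumed discount factor in [0, 1]; it does not appear in the slides.
    """
    g = 0.0
    # Work backwards so each step is g_t = r_t + gamma * g_{t+1}
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# A reward of 1 that arrives two steps after the action still counts, but less:
# discounted_return([0, 0, 1], gamma=0.5) is 0.25
```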

  8. Tradeoff Between Exploration and Exploitation • Exploration: Finding new knowledge by trying new actions, etc. • Exploitation: Using learned knowledge to find the best action. • Tradeoff: Neither exploration nor exploitation alone will yield satisfactory results. Pure exploration never uses what has been learned, and pure exploitation can lock onto a suboptimal action.
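A simple and widely used way to balance the two is epsilon-greedy action selection: with probability epsilon the agent explores a random action, otherwise it exploits its current best estimate. The slides do not name a specific method, so the sketch below is an assumption; the Gaussian bandit used to exercise it and all function names are hypothetical.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-looking one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                        # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])    # exploit

def run_bandit(true_means, epsilon, steps=2000, seed=0):
    """Hypothetical test bed: each action pays a noisy reward around its true mean."""
    rng = random.Random(seed)
    q = [0.0] * len(true_means)   # estimated value of each action
    n = [0] * len(true_means)     # how often each action was tried
    total = 0.0
    for _ in range(steps):
        a = epsilon_greedy(q, epsilon, rng)
        r = rng.gauss(true_means[a], 1.0)
        n[a] += 1
        q[a] += (r - q[a]) / n[a]  # incremental average of observed rewards
        total += r
    return q, n, total
```

Setting `epsilon=0.0` gives pure exploitation and `epsilon=1.0` gives pure exploration, so the two extremes on the slide can be compared against an intermediate value on the same bandit.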

  9. Four Components of an RL Agent • Policy π. At each time step, the policy π takes s and r as input and outputs an action a. • Reward function R(s, a). The reward function takes s and a as input and returns a scalar value (the expected immediate reward) for taking action a in state s.
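As a rough illustration of these two components, the sketch below estimates R(s, a) as a running average of observed rewards and builds a policy that picks the action with the highest estimate. The class and function names are hypothetical, and the policy shown is deliberately myopic: it looks only at immediate reward, which is one possible policy rather than the one the slides prescribe.

```python
from collections import defaultdict

class RewardEstimate:
    """Tabular estimate of R(s, a): the average immediate reward seen so far."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    def update(self, s, a, r):
        self.total[(s, a)] += r
        self.count[(s, a)] += 1

    def __call__(self, s, a):
        c = self.count.get((s, a), 0)
        return self.total[(s, a)] / c if c else 0.0  # 0.0 for unseen pairs

def greedy_policy(reward_fn, actions):
    """Return a policy pi(s, r) that maximizes the estimated immediate reward."""
    def pi(s, r=None):
        # r is accepted to match the slide's signature but unused by this rule
        return max(actions, key=lambda a: reward_fn(s, a))
    return pi
```

For example, after `R.update("s0", "left", 1.0)`, `R.update("s0", "left", 3.0)`, and `R.update("s0", "right", 1.0)`, the estimate `R("s0", "left")` is 2.0 and the greedy policy chooses `"left"` in state `"s0"`.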

  10. Four Components of an RL Agent (Cont.) • Value function V. The expected total return from state s, given that the agent follows policy π. • Model. The model predicts the behavior of the environment, i.e., for a given s and a, what the immediate reward will be and how the state will change.
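When a model is available, the value function of a fixed policy can be computed by repeatedly backing up predicted rewards. The sketch below assumes a deterministic model stored as a dictionary and a discount factor `gamma`; neither detail comes from the slides, and the two-state example is hypothetical.

```python
def evaluate_policy(model, policy, states, gamma=0.9, sweeps=200):
    """Iterative policy evaluation for a deterministic model.

    model[(s, a)] -> (immediate reward, next state)   (assumed representation)
    policy[s]     -> action chosen in state s
    gamma         -> assumed discount factor, must be < 1 for convergence here
    """
    v = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            r, s_next = model[(s, policy[s])]
            v[s] = r + gamma * v[s_next]  # reward now plus discounted future value
    return v

# Hypothetical two-state example: from A, "go" leads to B with no reward;
# in B, "go" stays in B and pays 1 each step.
model = {("A", "go"): (0.0, "B"), ("B", "go"): (1.0, "B")}
policy = {"A": "go", "B": "go"}
values = evaluate_policy(model, policy, ["A", "B"])
# values["B"] approaches 1 / (1 - 0.9) = 10 and values["A"] approaches 0.9 * 10 = 9
```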

  11. RL for Adaptive Clustering • Actions: changing clustering algorithms, parameters, attributes/features, etc. • Immediate reward: how good the clustering result is. • By using a trial and error approach, we can learn which clustering algorithm is best, which attributes/features to choose, etc.
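One way to make this concrete is to treat each clustering configuration as an action in a bandit problem and the clustering quality as the reward. The sketch below is only an assumption about how such a setup might look: it uses a tiny 1-D k-means, a made-up quality score (negative squared error minus a per-cluster penalty), and epsilon-greedy selection over candidate values of k. None of these specifics come from the slides.

```python
import random

def kmeans_1d(points, k, rng, iters=10):
    """Tiny 1-D k-means (illustrative only); returns the final cluster centers."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a cluster ends up empty
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def quality(points, centers, penalty=1.0):
    """Hypothetical reward: negative squared error minus a penalty per cluster."""
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return -sse - penalty * len(centers)

def learn_best_k(points, candidate_ks, trials=60, epsilon=0.2, seed=0):
    """Epsilon-greedy search over k: action = choice of k, reward = quality."""
    rng = random.Random(seed)
    # Rewards are always negative here, so the initial estimate of 0.0 is
    # optimistic and untried values of k look attractive to the greedy step.
    q = {k: 0.0 for k in candidate_ks}
    n = {k: 0 for k in candidate_ks}
    for _ in range(trials):
        if rng.random() < epsilon:
            k = rng.choice(candidate_ks)                   # explore
        else:
            k = max(candidate_ks, key=lambda c: q[c])      # exploit
        r = quality(points, kmeans_1d(points, k, rng))
        n[k] += 1
        q[k] += (r - q[k]) / n[k]  # running average reward for this k
    return q, n
```

A call such as `learn_best_k(points, [1, 2, 3])` returns the average reward and trial count for each candidate k. The same loop could in principle range over algorithms or feature subsets instead of k.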

  12. References • [Sutton98] Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press. http://citeseer.nj.nec.com/sutton98reinforcement.html • [Kaelbling96] Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research, 4, 237-285. http://citeseer.ist.psu.edu/kaelbling96reinforcement.html
