Reinforcement Learning | Reinforcement Learning In Python | Machine Learning Tutorial | Simplilearn

This presentation on Reinforcement Learning will help you understand the basics of reinforcement learning. You will learn concepts such as rewards and states, and you will learn about actions in reinforcement learning. Finally, you'll look at a demo of Tic Tac Toe using reinforcement learning in Python.

Simplilearn

Presentation Transcript


  1. What’s in it for you? • Why Reinforcement Learning? • What is Reinforcement Learning? • Supervised vs Unsupervised vs Reinforcement Learning • Important Terms • Reinforcement Learning Example • Markov Decision Process

  4. Why Reinforcement Learning? Training a Machine Learning model requires a lot of data, which might not always be available to us. Further, the data provided might not be reliable.

  5. Why Reinforcement Learning? Learning from a small subset of actions will not help expand the vast realm of solutions that may work for a particular problem. Example task: learning to walk.

  7. Why Reinforcement Learning? This is going to slow the growth that technology is capable of. Machines need to learn to perform actions by themselves, not just learn from humans. Example objective: climb a mountain.

  8. What is Reinforcement Learning?

  9. What is Reinforcement Learning? Reinforcement learning is a sub-branch of Machine Learning that trains a model to return an optimum solution for a problem by making a sequence of decisions by itself. Consider a robot learning to go from one place to another.

  10. What is Reinforcement Learning? The robot is given a scenario and must arrive at a solution by itself. The robot can take different paths to reach the destination.

  11. What is Reinforcement Learning? It will learn the best path from the time taken on each path. It might even come up with a unique solution all by itself.

  13. Supervised vs Unsupervised vs Reinforcement Learning
      Supervised Learning: • Data provided is labeled data, with output values specified • Used to solve regression and classification problems • Labeled data is used • External supervision • Solves problems by mapping labeled input to known output
      Unsupervised Learning: • Data provided is unlabeled data; the outputs are not specified and the machine makes its own prediction • Used to solve association and clustering problems • Unlabeled data is used • No supervision • Solves problems by understanding patterns and discovering output
      Reinforcement Learning: • The machine learns from its environment using rewards and errors • Used to solve reward-based problems • No predefined data is used • No supervision • Follows a trial-and-error problem-solving approach

  14. Important Terms

  15. Important Terms in Reinforcement Learning Agent: The agent is the model that is being trained via reinforcement learning.

  16. Important Terms in Reinforcement Learning Environment: The training situation that the model must optimize for is called its environment.

  17. Important Terms in Reinforcement Learning Action: The actions are all the possible steps that the model can take.

  18. Important Terms in Reinforcement Learning State: The current position or condition of the model in its environment.

  19. Important Terms in Reinforcement Learning Reward: To help the model move in the right direction, it is given a reward (points) that appraises an action it has taken.

  20. Important Terms in Reinforcement Learning Policy: The policy determines how an agent will behave at any time. It acts as a mapping from the present state to an action.
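
The terms above can be tied together in a few lines of Python. This is only a minimal sketch, not code from the presentation: the `Corridor` environment, its reward values (+1.0 at the goal, -0.1 per other step), and the `run_episode` helper are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

ACTIONS = ("left", "right")  # Action: every step the agent can take


@dataclass
class Corridor:
    """Environment (hypothetical): a 1-D corridor with cells 0..length-1."""
    length: int = 5
    position: int = 0  # State: the cell the agent currently occupies

    def step(self, action: str) -> Tuple[int, float, bool]:
        """Apply an action and return (new state, reward, done)."""
        move = 1 if action == "right" else -1
        # Keep the agent inside the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        # Reward (assumed values): +1.0 for reaching the last cell, -0.1 otherwise.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def run_episode(env: Corridor, policy: Dict[int, str], max_steps: int = 20):
    """Agent loop: look up the action for the current state, act, collect reward."""
    state, total = env.position, 0.0
    for _ in range(max_steps):
        action = policy[state]  # Policy: mapping from state to action
        state, reward, done = env.step(action)
        total += reward
        if done:
            break
    return state, total


# A fixed policy that always moves right, for every state in the corridor.
always_right = {s: "right" for s in range(5)}
final_state, total_reward = run_episode(Corridor(), always_right)
```

Here the policy is a plain dictionary, which makes the "mapping from state to action" idea literal. Training an agent would mean improving that mapping from the rewards received rather than fixing it by hand.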

  21. Reinforcement Learning Example

  22. Reinforcement Learning Example Consider a dog that we must house-train. Here the dog is the agent and the house is its environment.

  24. Reinforcement Learning Example We can get the dog to perform various actions by offering incentives, such as dog biscuits, as a reward. For example, the action is fetching and the reward is a biscuit.

  25. Reinforcement Learning Example The dog will follow a policy to maximize its reward and hence will follow every command. It might even learn a new action, like begging, by itself. Actions shown: begging, fetching, handshake.

  27. Reinforcement Learning Example The dog will also want to run around, play, and explore its environment, for example by exploring new parts of the house. This quality of a model is called Exploration.

  28. Reinforcement Learning Example The tendency of the dog to maximize rewards is called Exploitation. There is always a tradeoff between exploration and exploitation, as exploratory actions may lead to lower rewards. Example: the action is climbing on the sofa and the reward is none.
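
One common way to balance the two is an epsilon-greedy rule, which the slides do not name: with a small probability `epsilon` the agent tries a random action (exploration), otherwise it picks the action with the best estimated reward so far (exploitation). The sketch below applies it to the dog example. The reward values and the function names `choose_action` and `train` are assumptions made for illustration.

```python
import random

# Hypothetical rewards for the dog's actions (only "no reward for the sofa"
# comes from the slides; the other values are assumed).
REWARDS = {"climb_sofa": 0.0, "handshake": 0.5, "fetch": 1.0}


def choose_action(estimates, epsilon, rng):
    """Epsilon-greedy choice between exploring and exploiting."""
    if rng.random() < epsilon:
        return rng.choice(list(estimates))      # explore: any action at random
    return max(estimates, key=estimates.get)    # exploit: best estimate so far


def train(steps=200, epsilon=0.1, seed=0):
    """Estimate each action's reward as the running average of what was received."""
    rng = random.Random(seed)
    estimates = {action: 0.0 for action in REWARDS}
    counts = {action: 0 for action in REWARDS}
    for _ in range(steps):
        action = choose_action(estimates, epsilon, rng)
        reward = REWARDS[action]
        counts[action] += 1
        # Incremental mean update of the estimate for the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# With no exploration the dog never discovers that fetching pays better.
_, greedy_counts = train(epsilon=0.0)
# With some exploration it occasionally tries other actions.
explore_estimates, explore_counts = train(epsilon=0.1)
```

With `epsilon=0.0` every estimate starts tied at zero, so the agent keeps repeating the first action it tried and never collects a biscuit. That is the cost of pure exploitation in this toy setup.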

  29. Markov Decision Process

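  30. Markov Decision Process A Markov Decision Process (MDP) is a framework for describing a reinforcement learning problem in which the agent continuously interacts with the environment: in each state the agent takes an action, and the environment responds with a reward and a new state. A policy maps the current state to an action. (Diagram: the agent sends an action to the environment; the environment returns a reward and a state to the agent.)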

  31. Markov Decision Process In the diagram shown, we need to find the shortest path between node A and node D. Each path has a reward associated with it, and the path with the maximum reward is the one we want to choose. The nodes A, B, C, and D denote the states. Traveling from node to node (for example, A to B) is an action. The reward is the value on each edge, and the policy is the path taken. (Diagram: nodes A, B, C, D with edge values 15, 5, 10, 0, -20, 25.)

  32. Markov Decision Process In the diagram shown, we need to find the shortest path between node A and node D. Each path has a reward associated with it, and the path with the maximum reward is the one we want to choose. The process maximizes the outcome based on the reward at each step and traverses the path with the highest reward. This process does not explore; it only maximizes reward. (Diagram: nodes A, B, C, D, showing edge values 5 and 25.)
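
The step-by-step reward maximization described above can be sketched as a greedy walk over a graph. The transcript lists the edge values (15, 5, 10, 0, -20, 25) but not which edge carries which value, so the assignment in `REWARDS` below is purely hypothetical, as are the function names.

```python
# Hypothetical assignment of the slide's edge values to directed edges.
# States are nodes; moving along an edge is an action; the number is its reward.
REWARDS = {
    "A": {"B": 15, "C": 5, "D": 10},
    "B": {"C": 0, "D": -20},
    "C": {"D": 25},
    "D": {},
}


def greedy_path(rewards, start, goal):
    """Follow the highest-reward edge at every step, with no exploration."""
    path = [start]
    node = start
    while node != goal:
        options = rewards[node]
        if not options:
            raise ValueError(f"no way forward from {node}")
        node = max(options, key=options.get)  # exploit: best immediate reward
        path.append(node)
    return path


def path_reward(rewards, path):
    """Sum the rewards along consecutive edges of a path."""
    return sum(rewards[a][b] for a, b in zip(path, path[1:]))


chosen = greedy_path(REWARDS, "A", "D")
total = path_reward(REWARDS, chosen)
```

With these assumed edges the greedy walk visits A, B, C, D in turn. Picking the best immediate reward is not guaranteed to give the best total reward on every graph, which is one reason exploration matters.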

  33. Reinforcement Learning Demo
