
Against the Gods: Strategies for Robust Autonomous Behaviour

Subramanian Ramamoorthy, School of Informatics, The University of Edinburgh. 3 December 2008.


Presentation Transcript


  1. Against the Gods: Strategies for Robust Autonomous Behaviour. Subramanian Ramamoorthy, School of Informatics, The University of Edinburgh. 3 December 2008

  2. Autonomous Robots are Coming Here: Physically, Virtually

  3. Some Observations about Robotics
     • Autonomy is not useful unless it is also “robust”, a many-hued concept
     • My focus is on strategy: systems issues vs. task
     • Consider this origami robot [Balkcom & Mason at CMU]
     • Can we do such things autonomously using Nao/PR2 (robotic equivalents of the MITS Altair or Apple I from the mid-1970s) in a semi-structured home environment?
     Autonomous robots must act in an adversarial world

  4. What Problem is the Robot Solving?
     [Diagram labels: High-level goals; Perception; Action; Environment; Adversarial actions & other agents]
     Problem: How to generate actions, to achieve high-level goals, using limited perception and incomplete knowledge of environment & adversarial actions?

  5. Robust Control and Decision Problem
     • Robust control → play a differential game against nature or other self-interested/cooperative agents
     • w are adversarial actions (e.g., large deviations)
     • The constrained, high-dimensional, partially observed problem is hard!
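The slide's own equation is not preserved in this transcript. For orientation only, one standard way to write such a game is sketched below. The dynamics $f$, observation map $h$, running cost $\ell$, and horizon $T$ are assumptions introduced here, not symbols from the talk; $x \in X$ is the state, $u \in U$ the control, and $w \in W$ the adversarial action.

```latex
\dot{x} = f(x, u, w), \qquad y = h(x), \qquad
u^{*} = \arg\min_{u(\cdot)} \; \max_{w(\cdot)} \int_{0}^{T} \ell\bigl(x(t), u(t), w(t)\bigr)\, dt
```

In this reading, the controller sees only $y$ (partial observation) and chooses $u$ to do well against the worst admissible $w$.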

  6. Lattice of Control and Decision Problems
     Approach: Use this structure to devise abstractions & shape learning
     • (X,U,W): Robust control (game, adversary, strategy)
     • (X,W): Verification
     • (X,U): Feedback control & optimality
     • (X): Motion synthesis & planning
     Adversary: Constraints impact immediate moves (e.g., a state-space subset rendered infeasible) and longer-term ones (sequential decision making)
     Model incompleteness: Many constraints (e.g., c-space limits) play out at a slower time scale
     Can we combine such problem factorization and machine learning methods to learn solutions?

  7. A Worked Example: Global Control of the Cart-Pole System

  8. Introducing the Cart-Pole System
     • System consists of two subsystems: pendulum, and cart on a finite track
     • Only one actuator: the cart
     • We want global asymptotic stability of the 4-dim system
     • The Game: Experimenter hits the pole with arbitrary velocity at any time, system picks controls
     • What are the weak sufficient conditions defining this task?
     [Figure: Phase space of the pendulum]

  9. Dealing with the Adversary: Global Structure
     [Figure annotations]
     • Adversary could push system anywhere, e.g., here
     • Can describe global strategy as a qualitative transition graph
     • Larger disturbances could truly change quantitative details, e.g., any number of rotations around origin
     • The uncontrolled system converges to this point
     • We want to reach and stay here

  10. Describing Local Behaviour: Templates
     Lemma (Spring-Mass, Positive Damping): Let a system be described by [equation not preserved in transcript], where [conditions not preserved in transcript]. Then it is asymptotically stable at (0,0).
     Lemma (Spring-Mass, Negative Damping): Let a system be described by [equation not preserved in transcript], where [conditions not preserved in transcript]. Then it has an unstable fixed point at (0,0), and no limit cycle.
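The lemmas' equations are not preserved in this transcript, so the following is only a minimal sketch of the qualitative behaviour they describe. It assumes the simplest linear instance, x'' + c·x' + k·x = 0 with k > 0, which is an illustrative choice and not the lemmas' stated form. The names `simulate` and `energy` are hypothetical helpers.

```python
def simulate(c, k=1.0, x0=1.0, v0=0.0, dt=1e-3, steps=20000):
    """Integrate x'' + c*x' + k*x = 0 with classical RK4; return final (x, v).

    Assumed linear spring-mass-damper; c > 0 is positive damping,
    c < 0 is negative damping.
    """
    def deriv(x, v):
        return v, -c * v - k * x

    x, v = x0, v0
    for _ in range(steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x, v


def energy(x, v, k=1.0):
    """Kinetic plus spring energy of the assumed linear system."""
    return 0.5 * v * v + 0.5 * k * x * x


if __name__ == "__main__":
    # Positive damping: energy decays towards the fixed point (0, 0).
    print(energy(*simulate(c=0.5)))
    # Negative damping: a small perturbation from (0, 0) grows.
    print(energy(*simulate(c=-0.5, x0=0.01)))
```

With positive damping the energy shrinks towards zero, and with negative damping a small offset from the origin grows, which matches the stable and unstable fixed points the two lemmas assert.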

  11. Global Controller for Pendulum
     The control law: if [condition not preserved] Balance, else if [condition not preserved] Pump, else Spin
     Constraints: [not preserved in transcript]

  12. The Global Control Law
     The switching strategy: If [condition not preserved] then Balance, else if [condition not preserved] then Pump, else Spin
     [Ramamoorthy & Kuipers, HSCC 02 & 03]
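The switching conditions themselves are not preserved in this transcript. The sketch below shows one plausible shape for a Balance/Pump/Spin switch, assuming an energy-based rule on a normalised pendulum (theta'' = sin(theta) + u, with theta = 0 upright). Every threshold, gain, and function name here is hypothetical and should not be read as the published control law.

```python
import math

# Hypothetical thresholds and gains, chosen only for illustration.
BALANCE_ANGLE = 0.3   # rad: how close to upright before balancing
BALANCE_RATE = 0.5    # rad/s: how slow before balancing
K_SPRING, K_DAMP = 4.0, 2.0
C_PUMP, C_SPIN = 0.5, 0.5


def wrap(theta):
    """Map an angle into [-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))


def energy(theta, omega):
    """Energy relative to upright rest: 0 at upright, -2 hanging at rest."""
    return 0.5 * omega ** 2 + math.cos(theta) - 1.0


def select_mode(theta, omega):
    """Pick a local controller from the current state (assumed rule)."""
    if abs(wrap(theta)) < BALANCE_ANGLE and abs(omega) < BALANCE_RATE:
        return "balance"
    if energy(theta, omega) < 0.0:
        return "pump"   # too little energy to reach upright
    return "spin"       # too much energy: pendulum is spinning


def control(theta, omega):
    """Control input u for the assumed model theta'' = sin(theta) + u."""
    mode = select_mode(theta, omega)
    if mode == "balance":
        # Cancel gravity, then act as a spring with positive damping.
        return -math.sin(theta) - K_SPRING * wrap(theta) - K_DAMP * omega
    if mode == "pump":
        return C_PUMP * omega    # negative damping: dE/dt = u*omega >= 0
    return -C_SPIN * omega       # positive damping: dE/dt = u*omega <= 0
```

In this sketch Pump and Spin are the negative- and positive-damping templates applied to the pendulum's energy, and Balance takes over only near the upright state. At exact rest in the hanging position the pump term is zero, so a real controller would need some initial perturbation.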

  13. Demonstration on a physical set-up
     Result: Best Response computation for this game*
     * A few more technical steps are needed to ‘lift’ the pendulum strategy to the 4-dim system

  14. More Complex Examples

  15. Bipedal Walking on Irregular Terrain
     • Many constraints: dynamic stability, intermittent footholds
     • Incomplete models: No high-dim models, only data from randomized exploration
     The Game: Nature picks foothold (on-line), robot picks trajectory

  16. Structure of the Solution
     [Lattice diagram: (X,U,W), (X,W), (X,U), (X)]
     • Define qualitative strategy in low dimensions (finite-horizon optimal control)
     • Lift resulting strategy to the more complex c-space (presently unknown!)

  17. Data-driven Approximation of Strategy

  18. The Result: Humanoid Robot Simulation [Ramamoorthy & Kuipers, RSS 06, ICRA 08]

  19. Another Problem: (Un)tying Knots
     Task encoding:
     • Knot energy shape descriptor
     • For an n-edge polygonal knot: [equation not preserved in transcript]
     Manipulation planning:
     • (Offline) Learn multi-scale structure in energy functional
     • (Online) Navigate a hierarchical graph
     The Game: Nature/adversary picks ways to deform/disturb object, robot picks manipulation actions

  20. Simulation of Knot Untying
     • Action synthesis: shaped reinforcement learning (SARSA)
     • Optimality of MDP solution is not compromised: knot energy is a valid potential energy
     • 10x faster than uninformed RL
     • For large problems, RL simply doesn’t converge within acceptable time; ours does
     [Also see poster by Sandhya Prabhakaran]
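The claim that shaping does not compromise optimality is characteristic of potential-based reward shaping, where the extra reward has the form F = γ·Φ(s') − Φ(s). The sketch below shows that idea inside tabular SARSA. It assumes a toy 1-D chain in place of the knot configuration space, and a negative distance-to-goal potential standing in for (negative) knot energy; the environment, `potential`, and all hyperparameters are hypothetical.

```python
import random

N = 10               # goal state of a hypothetical 1-D chain 0..N
ACTIONS = (-1, +1)   # step left or right


def potential(s):
    """Hypothetical potential: negative distance to goal (0 at the goal)."""
    return -float(N - s)


def shaping(s, s_next, gamma):
    """Potential-based shaping term F = gamma * Phi(s') - Phi(s)."""
    return gamma * potential(s_next) - potential(s)


def step(s, a):
    """Toy environment: reward 1 on reaching the goal, 0 otherwise."""
    s_next = min(max(s + a, 0), N)
    done = s_next == N
    return s_next, (1.0 if done else 0.0), done


def epsilon_greedy(Q, s, eps, rng):
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])


def sarsa(episodes=200, alpha=0.5, gamma=0.95, eps=0.1,
          shaped=True, seed=0, max_steps=500):
    """Tabular SARSA, optionally with the shaping term added to the reward."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, eps, rng)
        for _ in range(max_steps):
            s_next, r, done = step(s, a)
            if shaped:
                r += shaping(s, s_next, gamma)
            a_next = epsilon_greedy(Q, s_next, eps, rng)
            # On-policy target: uses the action actually chosen next.
            target = r if done else r + gamma * Q[(s_next, a_next)]
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            if done:
                break
            s, a = s_next, a_next
    return Q
```

Calling `sarsa(shaped=True)` and `sarsa(shaped=False)` with the same seed gives a simple way to compare learning with and without the shaping term on this toy problem; no speed-up figure is implied for it.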

  21. Current Work: Learning Abstractions and Strategies

  22. Learning Abstractions
     • Not hard to acquire low-dimensional models from data
     • Simple tools like PCA/SVD have been around for a long time
     • Recent explosion of non/semi-parametric methods
     • Hard to summarize this information for use in the larger planning and control framework
     • In order to reason about adversaries and actions
     • My approach:
     • Define notions of system equivalence (many geometric ideas)
     • Sampling-based algorithms to induce abstractions, with dim(A) << dim(Q)

  23. Learning Global Strategies
     • Shaping (PO)MDP and related models
     • How to combine this with abstraction concepts and algorithms in previous slide?
     • Multi-scale formulations of learning algorithms
     • Risk-sensitive control: beyond simple best response
     • Control learning is often driven by metrics related to predictive accuracy
     • For robust control, we may be interested in quite different issues, e.g., large reachable sets from all c-space points
     • Particularly relevant in electronic markets and competitive scenarios, i.e., agents with conflicting interests

  24. Acknowledgements
     • Pendulum and bipedal walking problems are from my PhD thesis, work with Benjamin Kuipers (U. Texas, Austin)
     • Knots work was done by Sandhya Prabhakaran (MSc thesis)
     • Collaborators in my current & future work:
     • Ioannis Havoutis, Thomas Larkworthy (PhD students)
     • Sethu Vijayakumar, Taku Komura, Michael Herrmann (IPAB)
     • Rahul Savani (Warwick): algorithms for automated trading
     • Ram Rajagopal (Berkeley): sampling & non-parametric learning
     * The title of this talk is taken from a wonderful book by Peter Bernstein
