
Distributed Watchpoints: Debugging Very Large Ensembles of Robots

An approach to debugging distributed errors in large robot ensembles: distributed watchpoints detect, and trigger on, error conditions that span multiple robots and are invisible in any single robot's local state, extending the reach of conventional debugging tools.


Presentation Transcript


  1. Distributed Watchpoints: Debugging Very Large Ensembles of Robots
  De Rosa, Goldstein, Lee, Campbell, Pillai. Aug 19, 2006

  2. Motivation
  • Distributed errors are hard to find with traditional debugging tools
  • Centralized snapshot algorithms
    • Expensive
    • Geared towards detecting one error at a time
  • Special-purpose debugging code is difficult to write, may itself contain errors

  3. Expressing and Detecting Distributed Conditions
  “How can we represent, detect, and trigger on distributed conditions in very large multi-robot systems?”
  • Generic detection framework, well suited to debugging
  • Detect conditions that are not observable via the local state of one robot
  • Support algorithm-level debugging (not code/HW debugging)
  • Trigger arbitrary actions when the condition is met
  • Asynchronous, bandwidth/CPU-limited systems

  4. Distributed/Parallel Debugging: State of the Art
  Modes:
  • Parallel: powerful nodes, regular (static) topology, shared memory
  • Distributed: weak, mobile nodes
  Tools:
  • GDB
  • printf()
  • Race detectors
  • Declarative network systems with debugging support (à la P2)

  5. Example Errors: Leader Election
  Scenario: One Leader Per Two-Hop Radius

  6. Example Errors: Token Passing
  Scenario: If a node has the token, exactly one of its neighbors must have had it last timestep

  7. Example Errors: Gradient Field
  Scenario: Gradient Values Must Be Smooth

  8. Expressing Distributed Error Conditions
  Requirements:
  • Ability to specify the shape of trigger groups
  • Temporal operators
  • Simple syntax (reduce programmer effort/learning curve)
  A Solution:
  • Inspired by Linear Temporal Logic (LTL)
  • A simple extension to first-order logic
  • Proven technique for single-robot debugging [Lamine01]
  • Assumption: Trigger groups must be connected
    • For practical/efficiency reasons

  9. Watchpoint Primitives
  Example watchpoint: nodes(a,b,c); n(b,c) & (a.var > b.var) & (c.prev.var != 2)
  • Modules (implicitly quantified over all connected sub-ensembles): nodes(a,b,c);
  • Topological restrictions (pairwise neighbor relations): n(b,c)
  • Boolean connectives: &
  • State variable comparisons (distributed): (a.var > b.var)
  • Temporal operators: (c.prev.var != 2)

  10. Distributed Errors: Example Watchpoints
  • Leader election: nodes(a,b,c); n(a,b) & n(b,c) & (a.isLeader == 1) & (c.isLeader == 1)
  • Token passing: nodes(a,b,c); n(a,b) & n(a,c) & (a.token == 1) & (b.prev.token == 1) & (c.prev.token == 1)
  • Gradient field: nodes(a,b); (a.state - b.state > 1)
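The slides give only the watchpoint expressions themselves. As a minimal sketch of what evaluating the gradient-field watchpoint over a single snapshot could look like, the Python below enumerates neighbor pairs (applying the slide-8 assumption that trigger groups are connected) and reports bindings on which the condition holds. The module IDs, the state/neighbors layout, and the gradient_violations helper are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch: check the gradient-field watchpoint
    #   nodes(a,b); (a.state - b.state > 1)
    # against one snapshot of the ensemble. Data layout is an assumption.
    from itertools import permutations

    state = {1: 0, 2: 1, 3: 3, 4: 2}                    # gradient value per module
    neighbors = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}  # undirected adjacency

    def gradient_violations(state, neighbors):
        """Yield ordered neighbor pairs (a, b) on which the watchpoint fires."""
        for a, b in permutations(state, 2):
            if b in neighbors[a] and state[a] - state[b] > 1:
                yield a, b

    for a, b in gradient_violations(state, neighbors):
        print(f"watchpoint fired on modules ({a}, {b}): {state[a]} - {state[b]} > 1")

With this toy topology and data, only the pair (3, 2) fires (3 - 1 > 1), matching the "gradient values must be smooth" scenario on slide 7.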

  11. Watchpoint Execution
  [Figure: animation of a watchpoint beginning nodes(a,b,c)… being matched against a numbered grid of modules]
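The animation summarized above shows matchers being created for the connected groups of modules a watchpoint can bind to. Purely as an illustration of what those matchers range over (and assuming the connectivity restriction from slide 8), the sketch below exhaustively enumerates the connected ordered triples for a watchpoint that starts with nodes(a,b,c); the real matchers are grown incrementally rather than enumerated up front.

    # Hypothetical sketch: enumerate the connected, ordered triples of modules
    # that a watchpoint beginning "nodes(a,b,c); ..." is implicitly quantified
    # over. Exhaustive for clarity; the actual system builds matchers
    # incrementally and prunes them as conditions fail.
    from itertools import permutations

    neighbors = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}  # illustrative topology

    def is_connected(group, neighbors):
        """True if the modules in `group` induce a connected subgraph."""
        members = set(group)
        seen, stack = set(), [group[0]]
        while stack:
            m = stack.pop()
            if m not in seen:
                seen.add(m)
                stack.extend(neighbors[m] & members)
        return seen == members

    def trigger_groups(neighbors, size=3):
        """Yield every ordered binding of distinct modules to (a, b, c)
        whose underlying set of modules is connected."""
        for group in permutations(neighbors, size):
            if is_connected(group, neighbors):
                yield group

    print(list(trigger_groups(neighbors)))  # e.g. (1, 2, 3), (2, 3, 4), ...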

  12. Performance: Watchpoint Size
  • 1000 modules, running for 100 timesteps
  • Simulator overhead excluded
  • Application: data aggregation with landmark routing
  • Watchpoint: are the first and last robots in the watchpoint in the same state?

  13. Performance: Number of Matchers
  • This particular watchpoint never terminates early
  • Number of matchers increases exponentially
  • Time per matcher remains within a factor of 2
  • Details of the watchpoint expression matter more than its size

  14. Performance: Periodically Running Watchpoints

  15. Future Work
  • Distributed implementation
  • More optimization
  • User validation
  • Additional predicates

  16. Conclusions
  • Simple, yet highly descriptive syntax
  • Able to detect errors missed by more conventional techniques
  • Low simulation overhead

  17. Thank You

  18. Backup Slides

  19. Optimizations
  • Temporal span
  • Early termination
  • Neighbor culling
  (one backup slide per optimization)
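The transcript stops before the per-optimization backup slides, so the following is only a guess at what early termination and neighbor culling could look like in practice, sketched in Python for the leader-election watchpoint from slide 10; none of the names or data below come from the paper.

    # Hypothetical sketch of "early termination" and "neighbor culling" for
    #   nodes(a,b,c); n(a,b) & n(b,c) & (a.isLeader == 1) & (c.isLeader == 1)
    # Bindings grow one module at a time: candidates come only from neighbor
    # sets (culling), and a partial binding is abandoned as soon as any term
    # that is already decidable fails (early termination).
    def leader_election_matches(modules, neighbors, is_leader):
        hits = []
        for a in modules:
            if not is_leader[a]:                 # (a.isLeader == 1) already false: stop
                continue
            for b in neighbors[a]:               # n(a,b): extend along neighbor links only
                for c in neighbors[b] - {a, b}:  # n(b,c), distinct modules
                    if is_leader[c]:             # (c.isLeader == 1)
                        hits.append((a, b, c))
        return hits

    # Two leaders within two hops of each other should trigger the watchpoint.
    modules = [1, 2, 3, 4]
    neighbors = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    is_leader = {1: True, 2: False, 3: True, 4: False}
    print(leader_election_matches(modules, neighbors, is_leader))  # [(1, 2, 3), (3, 2, 1)]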
