Françoise André IRISA – Prof. University of Rennes 1 Jérémy Buisson IRISA – INSA of Rennes

Dynamic adaptability. Phenix workshop on self-healing and fault tolerant systems, December 7-8, 2006 – IRISA, Rennes.

Presentation Transcript


  1. Dynamic adaptability • Phenix workshop on self-healing and fault tolerant systems • December 7-8, 2006 – IRISA, Rennes • Françoise André, IRISA – Prof. University of Rennes 1 • Jérémy Buisson, IRISA – INSA of Rennes

  2. Outline • Dynamic adaptability • Dynaco: generic framework for adaptability • Afpac: tool for the adaptation of SPMD codes • Evaluations • Conclusion and future work

  3. Outline • Dynamic adaptability • Dynaco: generic framework for adaptability • Afpac: tool for the adaptation of SPMD codes • Evaluations • Conclusion and future work

  4. Adaptability • A functionality of applications • Ability of an application to modify itself (reconfigure) at runtime (dynamically) according to its execution environment • Some synonyms for “adaptability” • Autonomous computing, autonomic computing • More or less adaptability • Sometimes structured as provided functionalities, such as self-healing, self-optimization, … • Adaptivity, autonomicity • Another similar area • Application steering • More or less adaptability triggered by users

  5. Need for adaptability • When resources vary in the execution environment • Some resources may appear • Some resources may disappear • Possible causes • Faults; administrative tasks; resource sharing among users • When an application has several configurations that use resources differently • Different possible algorithms • Some parameters that can be tuned • Adaptability ensures that the application continuously executes the “best” configuration • According to the actual execution environment

  6. Overall goal • Benefit from appearing resources • Terminate sooner • Support disappearing resources • Avoid expected crashes

  7. Adaptability in the PARIS team • Studied for some years • Initially • Mobile computing • Distributed computing • Recent work • Parallel computing • Framework approach • Ad-hoc implementations should be avoided • The structure should highlight reusable tools • Current prototypes • Dynaco: generic framework for adaptability • Afpac: tool for adapting SPMD codes

  8. Other work on adaptability • Many ad-hoc implementations • Specific to one kind of adaptation • E.g. adapting the number of processes to the number of processors/machines, redistributing tasks [Paul et al., 1998] • Specific to one application • E.g. video streaming [Plasma] • Some (more or less generic) frameworks • [EPSN] • Some compiler approaches • [ASSIST] • Some semantic models • [Zhang and Cheng, 2005]

  9. Outline • Dynamic adaptability • Dynaco: generic framework for adaptability • Afpac: tool for the adaptation of SPMD codes • Evaluations • Conclusion and future work

  10. Dynaco: a generic adaptability framework • Decomposition of adaptability into 4 steps • Observe the execution environment as it evolves • Decide that the component should adapt • Plan how to achieve the adaptation • Schedule and execute planned actions
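
The four steps above can be read as a simple control loop. The following is a minimal Python sketch of that reading, not Dynaco's actual API: the `Framework` class, its field names, and the strategy and action names are all hypothetical, chosen only to mirror the observe, decide, plan and execute steps.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Framework:
    """Hypothetical four-step adaptation loop (not the real Dynaco API)."""
    observe: Callable[[], dict]              # step 1: read the environment
    decide: Callable[[dict], Optional[str]]  # step 2: pick a strategy, or None
    plan: Callable[[str], List[str]]         # step 3: strategy -> list of actions
    execute: Callable[[List[str]], None]     # step 4: run the planned actions

    def step(self) -> bool:
        """Run one pass of the loop; return True if an adaptation happened."""
        env = self.observe()
        strategy = self.decide(env)
        if strategy is None:
            return False
        self.execute(self.plan(strategy))
        return True


# Illustrative wiring: every value below is made up for the example.
log: List[str] = []
fw = Framework(
    observe=lambda: {"processors": 8},
    decide=lambda env: "grow" if env["processors"] > 4 else None,
    plan=lambda strategy: ["spawn_processes", "redistribute_data"],
    execute=log.extend,
)
adapted = fw.step()
```

Keeping each step behind its own callable is one way to make the engines replaceable independently of each other, which is the point of the decomposition.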

  11. Adaptability step 1: observe • Collect information about the execution environment • Connect to the monitoring infrastructure of the environment • Detect relevant changes • Trigger adaptability when the adaptable component may not be well adapted anymore
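
As an illustration of "detect relevant changes", here is a small Python sketch of an observer that polls a monitoring source and triggers only when the value moves by at least a threshold. The `Observer` class, the processor-count metric and the threshold are assumptions made for the example; the slides do not say how Dynaco filters events.

```python
class Observer:
    """Hypothetical observer: polls a metric and reports relevant changes."""

    def __init__(self, read_metric, on_change, threshold=1):
        self.read_metric = read_metric  # callable returning the current value
        self.on_change = on_change      # callable invoked on a relevant change
        self.threshold = threshold      # minimum difference considered relevant
        self.last = read_metric()       # value at the last trigger (or at start)

    def poll(self):
        """Read the metric once; trigger and return True on a relevant change."""
        current = self.read_metric()
        if abs(current - self.last) >= self.threshold:
            self.last = current
            self.on_change(current)
            return True
        return False


# Simulated monitoring source: the number of available processors over time.
samples = iter([4, 4, 6, 6, 2])
events = []
observer = Observer(lambda: next(samples), events.append, threshold=2)
results = [observer.poll() for _ in range(4)]
```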

  12. Adaptability step 2: decide • Find the best strategy • With regard to a developer- or user-provided criterion • E.g. performance model • Depending on information collected in the observe phase • Possible implementations • Any optimization algorithm • Depending on the properties of the criterion that should be optimized • Expert systems and decision diagrams
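
To make the "criterion plus optimization algorithm" idea concrete, the sketch below picks the number of processes that minimizes a toy performance model by exhaustive search. The model (`work / p + overhead * p`) and its constants are invented for illustration and are not taken from the presentation; any optimizer suited to the criterion could replace the exhaustive search.

```python
def predicted_time(work, processes, overhead_per_process):
    """Toy performance model: ideal speedup plus a linear coordination cost."""
    return work / processes + overhead_per_process * processes


def decide(available_processors, work=1000.0, overhead=2.0):
    """Return the process count with the lowest predicted execution time."""
    candidates = range(1, available_processors + 1)
    return min(candidates, key=lambda p: predicted_time(work, p, overhead))


few = decide(8)    # with few processors, using all of them is predicted best
many = decide(64)  # with many, the overhead term caps the useful count
```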

  13. Adaptability step 3: planning • Find how the decided strategy can be achieved • Starting from the currently executing configuration • Assembling predefined actions with some control flow • Possible algorithms • Planning algorithms • May be costly if too much expressivity is required • Collection of predefined plans • Difficult to construct a sufficient collection
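
The "collection of predefined plans" option can be sketched as a lookup table keyed by the current configuration and the decided strategy. The configuration names and action names below are hypothetical. The missing-entry case shows the drawback noted on the slide: the collection has to cover every transition that can actually occur.

```python
# Hypothetical plan collection: (current configuration, target) -> actions.
PLANS = {
    ("4_procs", "8_procs"): ["spawn_processes", "redistribute_data"],
    ("8_procs", "4_procs"): ["redistribute_data", "terminate_processes"],
}


def plan(current, target):
    """Return the ordered list of actions leading from current to target."""
    if current == target:
        return []  # nothing to do
    try:
        return list(PLANS[(current, target)])  # copy so callers cannot mutate
    except KeyError:
        raise LookupError(f"no predefined plan from {current} to {target}")


grow_plan = plan("4_procs", "8_procs")
```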

  14. Adaptability step 4: execution • Execute generated plans • Schedule according to the dependencies highlighted in plans • Synchronize with the applicative execution flows • Possible implementations • Hooks in the applicative code • Called “adaptation points” • Rendezvous at the next hook in applicative code • Rollback to the previous hook in applicative code • Applicative code suspension
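
The "rendezvous at the next hook" option can be illustrated with a single-threaded Python sketch: planned actions are queued and only run when the applicative code reaches its next adaptation point. `AdaptationManager`, `run` and `grow` are hypothetical names; a real implementation would receive requests asynchronously, which the `request_at` parameter merely simulates.

```python
class AdaptationManager:
    """Hypothetical executor that defers actions to the next adaptation point."""

    def __init__(self):
        self._pending = []

    def request(self, actions):
        """Queue planned actions; they do not run immediately."""
        self._pending.extend(actions)

    def adaptation_point(self, state):
        """Hook called by the applicative code; runs any queued actions."""
        actions, self._pending = self._pending, []
        for action in actions:
            action(state)
        return bool(actions)


def grow(state):
    """Example action: double the number of processes."""
    state["processes"] *= 2


def run(manager, steps, request_at, actions):
    """Toy applicative loop with one adaptation point per iteration."""
    state = {"processes": 4}
    history = []
    for i in range(steps):
        manager.adaptation_point(state)  # hook in the applicative code
        history.append(state["processes"])
        if i == request_at:
            manager.request(actions)     # stands in for an asynchronous request
    return history


history = run(AdaptationManager(), steps=4, request_at=1, actions=[grow])
```

In this run the request arrives during iteration 1 but only takes effect at the hook that opens iteration 2.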

  15. Dynaco: a generic adaptability framework • In order to instantiate the framework • Choose implementations for the generic engines • Implement policy, guide and actions

  16. Dynaco: a generic adaptability framework • Integrate the framework instance within the adaptable component • Bind “actions” and “execute” to the content of the component • Bind the framework to the monitoring infrastructure

  17. Integration in the development cycle

  18. Outline • Dynamic adaptability • Dynaco: generic framework for adaptability • Afpac: tool for the adaptation of SPMD codes • Evaluations • Conclusion and future work

  19. Adaptation for parallel components • Parallel components • Components that encapsulate parallel codes • Case of parallel components • In the execute phase • Synchronize adaptation actions with the execution of the applicative code • Hook the applicative execution threads • Adaptation points are global states

  20. Afpac: adaptation for SPMD components • Rendezvous at the upcoming global state hook • Locally to each process, adaptation points are indicated by developers • Call to an Afpac function • Globally, adaptation points are built as the identity relation over local adaptation points • SPMD code assumption

  21. Afpac • Distributed algorithm to find the upcoming adaptation point • Iterative • Each process locally predicts upcoming local adaptation points • If prediction is impossible, wait for the applicative execution thread to progress • E.g. in case of conditional instructions • Each process gathers other processes’ predictions • As long as at least one process does not agree, rerun the algorithm • Each process computes a least upper bound according to other processes’ predictions • Concurrent with the applicative execution thread

  22. Afpac • Requirements for the applicative code • Tracking the progress of the execution in each process • Upon local adaptation points • Upon control structures containing adaptation points • Predicting upcoming adaptation points • Control flow model of the applicative code • With the same granularity as above
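
One possible way to track progress at this granularity is to record, at each adaptation point, the iteration counters of the enclosing loops followed by the point's identifier. The `ProgressTracker` class below is a hypothetical sketch of that idea and does not reproduce Afpac's interface. Positions are tuples, so Python's lexicographic tuple ordering gives a "later than" relation for points reached at the same nesting depth.

```python
class ProgressTracker:
    """Hypothetical per-process tracker of the position in the control flow."""

    def __init__(self):
        self._loops = []     # iteration counters of the enclosing loops
        self.position = ()   # last adaptation point reached

    def enter_loop(self):
        self._loops.append(0)

    def next_iteration(self):
        self._loops[-1] += 1

    def leave_loop(self):
        self._loops.pop()

    def adaptation_point(self, point_id):
        """Record and return the current position."""
        self.position = (*self._loops, point_id)
        return self.position


tracker = ProgressTracker()
tracker.enter_loop()
first = tracker.adaptation_point(0)   # iteration 0, point 0
tracker.next_iteration()
second = tracker.adaptation_point(0)  # iteration 1, point 0
tracker.leave_loop()
```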

  23. Taco: AOP tool easing the use of Afpac • Specific aspect weaver • Handling of control structures • Source code transformation for inserting calls upon control structures • Extraction of the control flow model • Task still belonging to developers • Indicating local adaptation points

  24. Outline • Dynamic adaptability • Dynaco: generic framework for adaptability • Afpac: tool for the adaptation of SPMD codes • Evaluations • Conclusion and future work

  25. Examples of using Dynaco • FT (from the NAS Parallel Benchmarks suite): numerical kernel • Adapting the number of processes to the number of available processors • i.e. implementing malleability • Gadget 2: N-body simulator • Adapting the data distribution to load imbalance • i.e. revisiting load balancing • Dad: home-made genetic algorithm • Adapting the implementation to the underlying architecture • Including its communication facilities

  26. Progress of one adaptation

  27. Progress of one adaptation

  28. Outline • Dynamic adaptability • Dynaco: generic framework for adaptability • Afpac: tool for the adaptation of SPMD codes • Evaluations • Conclusion and future work

  29. Summary • Dynaco: a generic framework for adaptability • Independent of the application • E.g.: numerical algorithms, transactional systems • Independent of formalisms and technologies • E.g.: 3 interchangeable formalisms for the policy in the current implementation • Objective function, optimized by a genetic algorithm • Collection of condition-action rules, interpreted by the Jess expert system • Plain Java code, executed by a JVM
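
The interchangeability of policy formalisms can be illustrated by putting two of them behind the same call signature. The sketch below is in Python for consistency with the other examples in this transcript, whereas the slide refers to Jess rules and Java code. The rule set, the environment keys and the strategy names are hypothetical.

```python
# Formalism A: a collection of condition-action rules, tried in order.
RULES = [
    (lambda env: env["processors"] > env["processes"], "grow"),
    (lambda env: env["processors"] < env["processes"], "shrink"),
]


def rule_policy(env):
    """Return the strategy of the first rule whose condition holds."""
    for condition, strategy in RULES:
        if condition(env):
            return strategy
    return None  # no rule fired: stay in the current configuration


# Formalism B: the same policy written as plain code.
def code_policy(env):
    if env["processors"] > env["processes"]:
        return "grow"
    if env["processors"] < env["processes"]:
        return "shrink"
    return None


environments = [
    {"processors": 8, "processes": 4},
    {"processors": 2, "processes": 4},
    {"processors": 4, "processes": 4},
]
decisions = [rule_policy(env) for env in environments]
```

Because both functions map an observed environment to a strategy, the rest of the loop does not need to know which formalism produced the decision.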

  30. Ongoing and future work • Trying to reduce applicative code suspension while selecting a global adaptation point • Designing a speculative algorithm • Guessing what other processes will do, rather than waiting for those processes to do it • Compensating for small desynchronizations • Using rollback in case of wrong prediction • Designing a dialogue between grid resource managers and adaptable applications • Investigating how resource managers and adaptable applications can benefit from each other • Better resource management • Avoid treating rescheduling as a fault • Avoid using checkpoint/restart

  31. Long term goals • Connections with fault tolerance • Making Dynaco resilient • Dynaco is centralized • Even if able to command the adaptation of parallel applications • The process executing the Dynaco framework must never fail • Furthermore, it should be able to execute actions • Implementing fault tolerance with Dynaco • One adaptation action may be “restart from checkpoint” • Adaptability would allow restarting with a different behavior/implementation • Using fault tolerance features for adaptability • Several adaptability implementations use checkpoint/restart • It can be useful for implementing speculative adaptation point selection

  32. Long term goals • Adaptability in the context of systems • Not restricted to well-suited resource management • Resource management • Data management • Replication • Consistency • Adaptation of the system • According to the underlying platform • According to hosted applications