Context-Sensitive Domain-Independent Algorithm Composition and Selection Troy A. Johnson and Rudi Eigenmann Purdue University. Motivation. Increasing programmer productivity Typical language approach: increase abstraction do more with less code reduce development and maintenance costs
Context-Sensitive Domain-Independent Algorithm Composition and Selection Troy A. Johnson and Rudi Eigenmann Purdue University PLDI 2006
Motivation
• Increasing programmer productivity
• Typical language approach: increase abstraction
  • do more with less code
  • reduce development and maintenance costs
• Domain-specific languages / libraries (DSLs) provide a high level of abstraction
  • e.g., domains such as biology, chemistry, or physics
• But, typically a sequence of library calls is needed
A Library Designer's Problem
• Consider two useful library procedures A and B
  • the designer is reluctant to include a third procedure C that simply calls A and B (i.e., has composite behavior)
  • though convenient, C is redundant
  • including only C prevents reuse of A and B
  • including all three, for all such procedures, greatly increases library size and complexity
• Library user is expected to compose sequences of “fundamental” calls
A Library User's Problem
• Novice users don't know these call sequences
  • procedures are documented independently
  • tutorials provide a few example call sequences
    • not an exhaustive list
    • may need to be adjusted for each calling context
• User knows what they want to do, but not how to do it
• Can the compiler let them specify a goal, and insert an appropriate call sequence? Yes!
Our Solution
• Add an “abstract algorithm” (AA) construct to the programming language
  • named and defined (once) by the programmer
  • definition is the programmer's goal
  • called like a procedure (any number of times)
  • compiler replaces each AA call with a library call sequence
• How does the compiler do this?
  • short answer: it uses a domain-independent planner that accepts procedure specifications as operators
Our Major Contributions
• Explain how composition of DSL procedures can be implemented as a language feature
• First compiler to use a planner to insert a sequence of library calls, while considering the calling context
• A novel application of planning that motivates research in planning theory
• Support incomplete, abstract procedure specifications that can be written for many domains
  • use programmer-compiler interaction to clarify ambiguity
Outline
• Example: The BioPerl DSL for Bioinformatics
• Brief Introduction to Planning
• Mapping Composition onto Planning
• Related Work (very abbreviated; see paper)
• Conclusions & Future Work
A Common BioPerl Call Sequence
• Query a remote database and save the result to local storage:

    Query q = bio_db_query_genbank_new("nucleotide",
        "Arabidopsis[ORGN] AND topoisomerase[TITL] AND 0:3000[SLEN]");
    DB db = bio_db_genbank_new();
    Stream stream = get_stream_by_query(db, q);
    SeqIO seqio = bio_seqio_new(">sequence.fasta", "fasta");
    Seq seq = next_seq(stream);
    write_seq(seqio, seq);

5 data types, 6 procedure calls
Example adapted from http://www.bioperl.org/wiki/HOWTO:Beginners
Describing the Library User's Goal
• Library author provides a domain glossary
  • query_result(result, db, query): result is the outcome of sending query to the database db
  • contains(filename, data): the file named filename contains data
  • in_format(filename, format): the file named filename is in format format
• Glossary terms are properties (facts), whereas procedure calls are actions
• Library user lists the properties of their goal
Defining and Calling an AA
• AA (goal) defined using the glossary...

    algorithm save_query_result_locally(db_name, query_string, filename, format)
    => { query_result(result, db_name, query_string),
         contains(filename, result),
         in_format(filename, format) }

  The order of the properties does not matter. These are not procedure calls.
• ...and called like a procedure

    Seq seq = save_query_result_locally("nucleotide",
        "Arabidopsis[ORGN] AND topoisomerase[TITL] AND 0:3000[SLEN]",
        ">sequence.fasta", "fasta");

1 data type, 1 AA call
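To make the "order does not matter" point concrete, the goal above can be modelled as an unordered set of property tuples. This is only a Python sketch, not DIPACS syntax: the helper name `make_goal`, the tuple encoding, and the placeholder string `"result"` are assumptions made for illustration.

```python
from typing import FrozenSet, Tuple

# A fact is (property name, arg1, arg2, ...), using the glossary's property names.
Fact = Tuple[str, ...]

def make_goal(db_name: str, query_string: str, filename: str, fmt: str) -> FrozenSet[Fact]:
    """Hypothetical encoding of the save_query_result_locally goal.

    "result" is a placeholder for the object a planner would have to produce.
    """
    return frozenset({
        ("query_result", "result", db_name, query_string),
        ("contains", filename, "result"),
        ("in_format", filename, fmt),
    })

goal = make_goal("nucleotide", "Arabidopsis[ORGN]", ">sequence.fasta", "fasta")
```

Because the goal is a set, listing the three properties in a different order yields an equal goal, which matches the slide's remark that these are facts to be made true rather than calls to be executed in sequence.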
Call-Sequence Selection
• Compiler's planner may find multiple sequences
  • programmer can select a sequence
  • or the compiler can select one heuristically
    • by using library annotations as a guide; may require knowing typical program values to compare sequences
    • by selecting the sequence with the fewest calls
• Incomplete specifications may cause undesirable sequences to be suggested
  • incompleteness will occur in practice
  • cannot eliminate the option for programmer review
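The "fewest calls" heuristic mentioned above is simple enough to sketch. The Python below is an illustration under assumptions, not the compiler's actual selection code: `select_shortest` is a hypothetical helper, and the second candidate contains a made-up call name purely to give the heuristic something to reject.

```python
from typing import List, Optional, Sequence

def select_shortest(plans: Sequence[List[str]]) -> Optional[List[str]]:
    """Return the candidate sequence with the fewest calls.

    Ties keep the candidate that was found first. Returns None if the
    planner produced no candidates.
    """
    if not plans:
        return None
    return min(plans, key=len)

candidates = [
    # Hypothetical longer candidate with a redundant, invented call.
    ["bio_db_query_genbank_new", "bio_db_genbank_new", "hypothetical_extra_call",
     "get_stream_by_query", "bio_seqio_new", "next_seq", "write_seq"],
    # The six-call sequence from the BioPerl example.
    ["bio_db_query_genbank_new", "bio_db_genbank_new", "get_stream_by_query",
     "bio_seqio_new", "next_seq", "write_seq"],
]
chosen = select_shortest(candidates)
```

A length-based choice ignores the cost of each call, which is why the slide also lists annotation-guided selection and programmer review as alternatives.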
Making Interaction Unobtrusive
• Compilation is normally non-interactive
  • what programmers expect
  • interaction should be minimized
• Cache programmer's responses
  • most code does not change between compiles
  • avoids repeatedly selecting the same sequence
  • compiler flag to clear or ignore the cache
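One way such a response cache could work is sketched below in Python. Everything here is an assumption for illustration: the talk does not say how the cache is keyed or stored, so the key (a hash of the call site plus the candidate list), the in-memory dictionary, and the names `choice_key`, `choose`, and `ignore_cache` are all hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict, List

def choice_key(call_site: str, candidates: List[List[str]]) -> str:
    """Hash the call site together with the candidates offered there."""
    payload = json.dumps({"site": call_site, "candidates": candidates}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def choose(call_site: str,
           candidates: List[List[str]],
           cache: Dict[str, int],
           ask: Callable[[List[List[str]]], int],
           ignore_cache: bool = False) -> List[str]:
    """Return the selected sequence, asking the programmer only on a cache miss."""
    key = choice_key(call_site, candidates)
    if not ignore_cache and key in cache:
        index = cache[key]          # unchanged code: reuse the earlier answer
    else:
        index = ask(candidates)     # interactive step
        cache[key] = index
    return candidates[index]
```

Keying on the candidate list as well as the call site means that a change to the surrounding code, which changes what the planner proposes, naturally invalidates the old answer.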
Advantages of Our AA Approach
• The following can remain unknown
  • library procedure names
  • order of calls
  • intermediate variables
• Can teach library users call sequences
• AA calls adapt under different calling contexts
Why Calling Context Matters
• save_query_result_locally is replaced with library calls
• In general, would use the 6-call sequence
  • creates 5 data objects
• What if some objects already exist?
  • suppose there is a live DB object
  • then DB db = bio_db_genbank_new(); is unnecessary
• What if the goal is already satisfied?
  • then the AA call does not generate any code
Composing a Call Sequence
• A planner discovers a sequence of instantiated operators (actions; calls), known as a plan
• Given
  • initial state <= calling context, from compiler
  • goal state <= AA definition, from programmer
  • operator set <= library procedure specifications, from librarian
Greatly-Simplified View of Planning
• World is composed of objects
• Actions modify objects' properties and relationships
• Planner deals with a symbolic model
[Diagram: the Planner (compiler) takes Operators (library specs.), an Initial State (call context), and a Goal State (AA definition) and produces a Plan; the Plan User (executable) carries out the plan's Actions on the World. Planner labels shown: a domain-INDEPENDENT planner; a domain-dependent planner.]
Why Planning Is Difficult
• Operators define a state-transition system
  • precondition: when an operator can be used
  • effects: what an operator does
• Planner finds a path through the system from the initial state to the goal state
• What's difficult?
  • typically too many states to enumerate
  • search intelligently using reasonable time & space
  • danger that planner may not terminate
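The precondition/effect model above can be sketched as a tiny breadth-first forward search in Python. This is a toy under stated assumptions, not the DIPACS planner: the operators are ground (no parameter binding or object creation), they have no delete effects, and the property names such as `have_db` and the specifications attached to the six BioPerl calls are invented for illustration.

```python
from collections import deque
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

# name -> (preconditions, effects); hypothetical, simplified specifications.
Operator = Tuple[FrozenSet[str], FrozenSet[str]]

OPERATORS: Dict[str, Operator] = {
    "bio_db_query_genbank_new": (frozenset(), frozenset({"have_query"})),
    "bio_db_genbank_new": (frozenset(), frozenset({"have_db"})),
    "get_stream_by_query": (frozenset({"have_query", "have_db"}), frozenset({"have_stream"})),
    "bio_seqio_new": (frozenset(), frozenset({"have_seqio", "in_format"})),
    "next_seq": (frozenset({"have_stream"}), frozenset({"have_seq", "query_result"})),
    "write_seq": (frozenset({"have_seqio", "have_seq"}), frozenset({"contains"})),
}

GOAL = frozenset({"query_result", "contains", "in_format"})

def plan(initial: Iterable[str], goal: Iterable[str],
         operators: Dict[str, Operator]) -> Optional[List[str]]:
    """Breadth-first search from the initial state; returns a shortest plan or None."""
    start, target = frozenset(initial), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if target <= state:                 # every goal property holds
            return steps
        for name, (pre, eff) in operators.items():
            if pre <= state:                # operator is applicable
                nxt = state | eff
                if nxt not in seen:         # skip states already reached
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None

full_plan = plan(set(), GOAL, OPERATORS)           # empty calling context
with_live_db = plan({"have_db"}, GOAL, OPERATORS)  # a DB object is already live
already_done = plan(GOAL, GOAL, OPERATORS)         # goal already satisfied
```

Under these assumed specifications, an empty context yields all six calls, a context with a live DB object drops `bio_db_genbank_new`, and a context that already satisfies the goal yields an empty plan. Exhaustive search like this is only workable because the toy state space is tiny, which is the difficulty the slide points to.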
Is Planning Necessary for Composition?
• Many possible actions
  • libraries contain 10s to 100s of procedures (operators)
  • each procedure has several parameters
  • 10s to 100s of live variables (objects) at a call site
  • many ways to bind variables to parameters
• Ex: 128 procs, 2 params each, 8 objs, 4 calls
  • assume all objects & params have the same type
  • $(128 \times 8^2)^4 = (2^7 \times 2^6)^4 = 2^{13 \times 4} = 2^{52}$ potential plans
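The count in the example can be checked with a few lines of Python. The variable names are ours; the numbers are the ones given on the slide, and the model (every parameter may be bound to any of the objects, independently for each call) follows the slide's same-type assumption.

```python
procedures = 128          # candidate procedures per call
params_per_procedure = 2  # parameters to bind for each procedure
objects = 8               # live objects that could be bound to a parameter
calls = 4                 # length of the call sequence

# One call: pick a procedure, then bind each parameter to one of the objects.
choices_per_call = procedures * objects ** params_per_procedure  # 128 * 8^2 = 2^13

# A sequence of independent calls multiplies the choices.
potential_plans = choices_per_call ** calls
```

Even this modest configuration gives `potential_plans == 2 ** 52`, roughly 4.5 quadrillion candidate sequences, which is why brute-force enumeration is not an option.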
Overall System & Challenges
[Diagram: procedure specifications for the domain-specific library supply operators to the DIPACS planner; the DIPACS compiler takes the application code and supplies the initial state and goal state; the resulting plan is merged into C++ code, which gcc compiles to a binary whose actions change the run-time program state (world). Challenges #1 to #4 are marked on the diagram.]
1. Ontological Engineering: choosing a glossary for the domain
2. Determine Initial & Goal States: requires flow analysis; translation to a planning language
3. Object Creation: most planners assume a fixed set of objects
4. Merge the Plan into the Program: destructive vs. non-destructive plans
DIPACS = Domain-Independent Planned Algorithm Composition and Selection
Related Work
• Languages and Compilers
  • Jungloids
    • David Mandelin et al. Jungloid Mining: Helping to Navigate the API Jungle. PLDI, June 2005.
  • Broadway
    • Samuel Z. Guyer and Calvin Lin. Broadway: A Compiler for Exploiting the Domain-Specific Semantics of Software Libraries. Proceedings of the IEEE, 93(2):342–357, February 2005.
  • Speckle
    • Mark T. Vandevoorde. Exploiting Specifications to Improve Program Performance. PhD thesis, Massachusetts Institute of Technology, 1994.
Related Work (continued)
• Automatic Programming
  • Robert Balzer. A 15-year Perspective on Automatic Programming. IEEE Transactions on Software Engineering, 11(11):1257–1268, November 1985.
  • David R. Barstow. Domain-Specific Automatic Programming. IEEE Transactions on Software Engineering, 11(11):1321–1336, November 1985.
  • Charles Rich and Richard C. Waters. Automatic Programming: Myths and Prospects. IEEE Computer, 21(8):40–51, August 1988.
• Automated (AI) Planning
  • Keith Golden. A Domain Description Language for Data Processing. Proc. of the International Conf. on Automated Planning and Scheduling, 2003.
  • M. Stickel et al. Deductive Composition of Astronomical Software from Subroutine Libraries. Proc. of the International Conf. on Automated Deduction, 1994.
  • Other work at NASA Ames
Conclusions & Future Work
• A DSL compiler can use a planner to implement a useful language feature
• We provide an example using a real DSL
• Identified implementation challenges in this talk
  • for detailed solutions, see the paper
• Future work
  • call sequences that include branches and loops
  • develop examples using additional domains