1 / 44

Bard

Bard. ASLs. An Application-Specific Language (ASL) is a programming language that supports: a particular class of applications in a highly idiosyncratic domain In order to understand a particular ASL, it is usually necessary to understand the domain

burke
Download Presentation

Bard

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Bard

  2. ASLs • An Application-Specific Language (ASL) is a programming language that supports: • a particular class of applications • in a highly idiosyncratic domain • In order to understand a particular ASL, it is usually necessary to understand the domain • An ASL can be extremely effective when used for its intended purpose

  3. Bard • Bard is designed to perform pattern matching and manipulation of Abstract Syntax Trees (ASTs) in the reengineering domain. • In order to understand Bard, it is necessary to understand a little about ASTs and reengineering

  4. Outline of the talk • This talk will cover: • What reengineering is all about • What an Abstract Syntax Tree (AST) is • What an "idiom" is • What the Idiom Tool does • How Bard and the Idiom Tool work together • The design decisions behind Bard

  5. Legacy code • The legacy code problem: • About 80% of programmer time goes to maintenance • New software methodologies are the preferred solution (that is, do it over from scratch!) • BUT • 40 years of working systems can't be abandoned.

  6. Software reengineering • Software reengineering is any activity that • improves one's understanding of software • improves the software documentation • improves the software itself • brings legacy code up to more current standards • aids in a new implementation of the software

  7. Why reengineer? • We reengineer code for many reasons: • add functionality • ease maintenance • move to new platforms and/or new languages • facilitate reuse • improve performance • decrease cost of keeping obsolete systems • combat software rot

  8. The reengineering process • Reverse engineering moves from "software assets" (code, documentation, etc.) to a higher level of abstraction • Transformations modify the system at some level of abstraction • Forward engineering moves from a higher level of abstraction back to code

  9. Reengineering levels • Reengineering can be carried out at many levels:

  10. Idiom tool • Idiom Tool is the part of the reengineering suite that performs transformations on the AST

  11. Creating the Abstract Syntax Tree • Parsers convert programs written in various languages into their AST representations • This AST can be manipulated by Idiom Tool, or used as input for various other tools (to create flowcharts, data flow diagrams, concordances, etc.) • Idiom Tool is a programmable subsystem • The programming language used is Bard

  12. Example of an AST (simplified) • In C: if (n < 0) return -n;

  13. Gray • The ASTs can expressed in a language called Gray • Gray is itself an ASL (Application-Specific Language) • Since ASTs are trees, Gray borrows the syntax (but not the operators) of LISP

  14. Example Gray code • (If_Op (Less Identifier:n Number_Literal:0) (Return_Value (Negate Identifier:n)) Epsilon) )

  15. Idioms • An idiom is a conventional way of expressing an idea • Programming languages, like natural languages, have idioms • Examples: • for (i = 0; i < n; i++) A[i] = 0; • temp = x; x = y; y = temp;

  16. Idioms change • Idioms vary over time and across languages • Arrays may be zero-based or one-based • FORTRAN used parallel arrays to simulate structs • Languages may use array indices to simulate pointers • C lacks multiply-dimensioned arrays • No early languages were object-oriented • COBOL typically represented years as two digits

  17. Recap • To summarize, • ReEngineer is a tool that uses • parsers to translate programs in various languages into • Abstract Syntax Trees, which can be represented in a linear fashion by • Gray code, and which can be manipulated by • Idiom Tool, which uses programs written in • the Bard language.

  18. The reengineering flow

  19. Regular expressions • Recognizing idioms in a program is a matter of recognizing certain kinds of patterns • Regular expressions (regexps) are a standard, well-developed mechanism for doing pattern recognition on strings • Regular expressions are used by grep, Perl, Tcl, awk, Python, vi, emacs, and a number of other languages and tools

  20. Common regexp operators

  21. More regexp operators

  22. Regular expressions in Bard • Regular expressions in Bard are modelled after regular expressions in grep • Because we are doing pattern matching on ASTs rather than strings, • Literals are represented by Gray expressions, • Regular expression syntax must be modified to accomodate arbitrary Gray expressions, and • Regular expressions may occur within Gray expressions, and vice versa

  23. Some regexps in Bard

  24. More regexps in Bard

  25. Inner and outer languages • Regular expressions are not enough -- we need some way to control pattern matching • We also need to manipulate the parts of the AST found by pattern matching • Solution: wrap a more-or-less conventional "outer language" around the "inner language" of regular expressions • This is also the approach taken in Perl

  26. Conventional statements in Bard • Conventional statements in Bard include: • assignment statements • if statements • while loops • print statements • function and procedure calls • calls to native (Ada) code • tracing and debugging facilities

  27. Special-purpose statements in Bard • Examining and manipulating the AST: • match pattern [at position] • insert subtree as relative-position • for example, insert identifier:x as first child; • delete position • replace position with subtree • find pattern [up to position] • go to position

  28. Efficiency • ReEngineer has been used to process hundreds of thousands of lines of code • Pattern-matching languages are not efficient enough for this purpose • Bard procedures are inefficient • Bard procedures are called only when there is a high likelihood that pattern matching will succeed

  29. Fast pre-screening of nodes • Each Bard routine specifies the types of nodes at which it might be applicable • Idiom Tool walks the AST and calls Bard when it finds such a node • Since Idiom Tool is written in Ada, it can screen nodes very, very fast

  30. The structure of Bard procedures • A "procedure" in Bard, like a "rule" in an expert system, consists of two parts: • The test part determines whether this particular procedure is applicable at this point in the tree • Tests must not have have side effects • The action part consists of arbitrary Bard code • Actions may modify the AST and/or may collect and store information about the AST

  31. Skeleton of a Bard procedure • procedure name (pass, node_types) tests; -- may not have side effects commit; -- separates tests from actions actions; -- manipulate the ASTend procedure;

  32. Bard example idiom remove_null_statements; -- Remove null statements, except empty loop bodies. procedure remove_null_statements (1, Null_Statement); go to first child of parent; not match While_Op; not match Do_While; not match C_For_Loop; commit; go to @trigger; delete; end procedure; end idiom;

  33. Diffuse patterns • A pattern is compact if it is a single recognizable unit • A pattern is diffuse if it consists of multiple pieces scattered throughout the AST • Diffuse patterns may overlap in complex ways • The problem is to collect the various pieces • This is not easily solved with only simple data structures such as arrays

  34. Dealing with diffuse patterns • Idiom Tool makes multiple passes over the data base • Each Bard procedure indicates the pass for which it is active • This allows some procedures to collect information and other procedures to process the information. • Idiom tool uses a fact base to store information between passes

  35. Modifying the fact base • Bard's fact base is modeled after that of Prolog • Bard allows negative assertions as well as positive assertions • assert fact; • deny fact; • retract fact;

  36. Interrogating the fact base • Because Bard does not used the Closed World Assumption, facts may be "true", "false", or (unlike Prolog) "unknown" • true fact; • false fact; • known fact; • unknown fact;

  37. Multiple occurrences • It is sometimes necessary to recognize multiple occurrances of the same subtree. • For example, to recognize Pascal statements of the form x:=x+1, you have to realize that the identifier x is repeated • (Assignment Identifier:?id (Integer_Addition Identifier:?id Numeric_Literal:1))

  38. Unification in Bard • When pattern matching variable ?id against value v, • If ?id has no prior value, it is unified with v • If id? has a prior value, then the match succeeds if and only if id? == v • If a test succeeds, all unifications performed during the test are retained • If a test fails, all unifications performed during the test are discarded

  39. Summary I • Idiom tool does multiple preorder traversals of the AST; at each node it may trigger one or more Bard procedures • A Bard procedure consists of • a header that specifies appropriate node types • a test part that performs any additional tests • the keyword commit, and • an action part

  40. Summary II • Special features of Bard include • an "inner language" for pattern matching • procedures modeled after rules in expert systems • powerful tests that can comprise arbitrary actions • unification and unification unwinding • an integrated fact base • These features make Bard far better for its purpose than a general language such as Ada

  41. Summary III • Bard has been used • in a Pascal to C translation system • to assist in converting CMS-2 to C • in a prototype Y2K system

  42. Current status of Bard • Currently owned by either Lockheed or Unisys • No longer under active development • Lost and presumed dead as a result of corporate mergers and reorganizations

  43. The End

More Related