Tabled Prolog David S. Warren XSB, Inc. Stony Brook University
Outline • Introduction • Symmetric, Transitive Relations • Basic Tabling Uses • Databases (and Datalog) • Grammars • Automata Theory • Dynamic Programming • Advanced Tabling • Evaluating Recursive Definitions • Program Processing • Interpreters • Abstract Interpreters • Beyond Simple Tabling • Negation • Aggregation • Constraints
Family Relations
• Siblings in my family:
    sibling(nancy,david).
    sibling(david,jane).
    sibling(jane,rick).
    sibling(rick,emily).
• If my sister is my sibling, then I’m hers (symmetric, a problem in Prolog):
    sibling(X,Y) :- sibling(Y,X).
Symmetry • A symmetric rule will always cause a loop in Prolog. • Why? (Explore the Prolog program…) • What can we do to “fix” it? • Prolog hackery: cuts, asserts, extra arguments, … • Is there a more general/universal “fix”? • What is the general problem to be fixed? • Logic is OK, but Prolog’s evaluation is problematic.
Repeated Computation • The problem is that Prolog repeats computations again and again. • So we can use “tables” to store the fact that we’ve done a computation, and its results. • Then if we’re about to do a computation and it is already in the table, we just use the results from there and don’t redo it.
Observations • Does it solve sibling/2’s problem? (Explore) • When will it eliminate loops in Prolog? • Always? • Sometimes? • When? Can we say something general?
Transitivity
• If I am nancy’s sibling and jane is my sibling, then jane is nancy’s sibling:
    sibling(X,Y) :- sibling(X,Z), sibling(Z,Y).
• Add this rule to the Prolog program. (explore)
• There is a problem here. Will tabling solve it as well?
Tabling Intuition • A Prolog program is executed by a growing and shrinking set of virtual procedural machines: • When calling a predicate, a machine duplicates itself once for each matching clause. • When an operation fails, that machine disappears. • For tabling, when calling a predicate, look in the table to see if it’s already been called: • If not, record the call in the table, call it, and for each machine that returns, record its answer with its call. • If so, duplicate self for every answer, and suspend self waiting for more answers; when one shows up, duplicate self for it. • Asynchronicity is necessary!
Summary • Many simple rules cause Prolog to loop. • Tabling eliminates redundant computation by saving previous computations and their results in a table. • All programs that don’t use structures (lists, or function symbols) will terminate under tabled evaluation.
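The sibling rules from the earlier slides can be put together as a small tabled program. This is a minimal sketch, assuming a Prolog with tabling (for example XSB or SWI-Prolog) and the `:- table` directive; the facts and rules are the ones shown on the slides.

```prolog
% Tabled version of the sibling program from the slides.
:- table sibling/2.

sibling(nancy,david).
sibling(david,jane).
sibling(jane,rick).
sibling(rick,emily).
sibling(X,Y) :- sibling(Y,X).                 % symmetry
sibling(X,Y) :- sibling(X,Z), sibling(Z,Y).   % transitivity

% Example query: terminates under tabling, loops in plain Prolog.
% ?- sibling(nancy,Who).
```

Under tabling the query should return each of the five family members exactly once. Note that symmetry plus transitivity also makes everyone their own sibling, so nancy appears among the answers.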
Basic Applications - Datalog • Prolog without data structures is a natural relational language: Databases • Explore examples • Extends relational databases by including recursion: • supports transitive closure, span-of-control • How does a DB Query language differ from a Programming language? • To be a “Database,” evaluation must be guaranteed to terminate. Prolog evaluation doesn’t; Tabled Prolog does. • Relational Databases include negation (or set difference.) • With recursion, negation is more complicated. • We’ll look at negation later
Context-free Grammars
• Easy to write grammars in Prolog:
• CFG Rule: A → B C
• Prolog Rule:
    a(S0,S) :- b(S0,S1), c(S1,S).
[Figure: A spans the input string from S0 to S; B spans S0 to S1 and C spans S1 to S.]
CF Grammars
• Normally represent the input string as a list:
• A position is represented by the list of input remaining from that point
• CF Rule: A → B t C
• Prolog rule:
    a(S0,S) :- b(S0,S1), connect(S1,t,S2), c(S2,S).
• With general “connect” fact (often called 'C'/3):
    connect([Term|S],Term,S).
Example CF Grammar
[Prolog has a DCG preprocessor that adds the “input variables” to --> rules for convenience. (See example.)]
Simple Expression Grammar 1 (explore)
    expr --> term, [+], expr.
    expr --> term.
    term --> factor, [*], term.
    term --> factor.
    factor --> [X], {integer(X)}.
    factor --> ['('], expr, [')'].
CF Grammar Example 2
Simple Expression Grammar 2 (explore)
    :- auto_table.
    expr --> expr, [+], term.
    expr --> term.
    term --> term, [*], factor.
    term --> factor.
    factor --> [X], {integer(X)}.
    factor --> ['('], expr, [')'].
What’s the difference from Example 1? Why does it matter?
CF Grammar Discussion • Prolog loops infinitely when given left-recursive rules (it parses by “recursive descent”). • Tabled Prolog handles all CFGs (it parses by “chart parsing,” a variant of “Earley recognition”). • Complexity? • Polynomial (whereas recursive descent is exponential) • In theory cubic, if the grammar is in Chomsky normal form • But there is an issue with representing the input as lists. • For tabling, better to represent the input with facts of the form: word(Loc,Word,Loc+1).
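The fact-based input representation can be sketched with the left-recursive expression grammar written directly over word/3 facts instead of lists. This is an illustrative sketch, assuming a Prolog with tabling (XSB or SWI-Prolog); the sample input "1 + 2 * 3" and the choice to table only expr/2 and term/2 are assumptions made for the example.

```prolog
:- table expr/2, term/2.

% Input "1 + 2 * 3" as word(Loc, Word, NextLoc) facts (sample input, assumed).
word(0, 1,   1).
word(1, '+', 2).
word(2, 2,   3).
word(3, '*', 4).
word(4, 3,   5).

% Left-recursive grammar over string positions instead of list suffixes.
expr(S0,S) :- expr(S0,S1), word(S1,'+',S2), term(S2,S).
expr(S0,S) :- term(S0,S).
term(S0,S) :- term(S0,S1), word(S1,'*',S2), factor(S2,S).
term(S0,S) :- factor(S0,S).
factor(S0,S) :- word(S0,X,S), integer(X).
factor(S0,S) :- word(S0,'(',S1), expr(S1,S2), word(S2,')',S).

% ?- expr(0,5).   % the whole input is an expression
% ?- expr(0,S).   % every position where an expression starting at 0 can end
```

With positions as integers, each table entry is a small tuple rather than a copy of a list suffix, which is the point of the representation suggested on the slide.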
Grammar Questions What we’ve seen is recognition: accepting or rejecting the input string. • Is this a Datalog problem? • How do we parse? I.e., construct a parse tree. • Why can’t we do it in the same time as recognition? • Can we do it in the same time as recognition + linear time for each parse? How?
Grammar Parsing
• Consider grammar:
    A → A A
    A → a
• DCG for parsing:
    :- auto_table.
    a(r1(P1,P2)) --> a(P1), a(P2).
    a(a) --> [a].
• Input: aaaaa…aaaaab
• Complexity, O(n³) but no parse!!
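Parsing (better)
    :- auto_table.
    % Recognizer: these DCG rules expand to a/2.
    a --> a, a.
    a --> [a].
    % Parser a/3: build a parse only for spans the recognizer a/2 accepts.
    a(r1(P1,P2),S0,S) :- a(S0,S1), a(S1,S), a(P1,S0,S1), a(P2,S1,S).
    a(a,S0,S1) :- 'C'(S0,a,S1).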
PTQ: A More Complex Grammar • “The Proper Treatment of Quantification in Ordinary English,” by Richard Montague • “There are no significant differences between logical languages and natural languages.” • Proposed a formal grammar for (a fragment of) English, and a formal (model theoretic) semantics! • Examples: • “John seeks a unicorn” vs. “John finds a unicorn” • “John seeks a woman” de dicto/de re ambiguity • “The temperature is ninety and rising” should not imply “ninety is rising.” • “Every man loves a woman” is ambiguous. • … more …
Montague Grammar • My Thesis, took 2-3 years to develop in Lisp • With XSB, took 2-3 days… • Syntax: • Not Context-free, Left-recursive, infinitely many parses (variants), … • Complex parsing • Semantics: • By translation to Intensional Logic (a type theory) • Required simplifications: • β-reduction, • “extensionalization” (explore…)
Automata Theory (cursory) • Represent Finite State Machines by facts: • Transition relation: m(MId,Q1,S,Q2). • Initial state: mis(MId,Qi). • Final state: mfs(MId,Qf). • Strings by facts: • String contents: string(SId,Loc0,S,Loc1). • String length: stringlen(SId,Len).
FSA Accepts a String
    accept(MId,StrName) :-
        mis(MId,StateStart),
        recog(MId,StrName,StateStart,StateFinal,0,StrFinal),
        mfs(MId,StateFinal),
        stringlen(StrName,StrFinal).
    % empty move: stay in the same state at the same location
    recog(_,_,MState,MState,SLoc,SLoc).
    % regular transitions
    recog(MId,StringName,MState0,MState,SLoc0,SLoc) :-
        string(StringName,SLoc0,Symbol,SLoc1),
        m(MId,MState0,Symbol,MState1),
        recog(MId,StringName,MState1,MState,SLoc1,SLoc).
    % Epsilon transitions
    recog(MId,StringName,MState0,MState,SLoc0,SLoc) :-
        m(MId,MState0,'',MState1),
        recog(MId,StringName,MState1,MState,SLoc0,SLoc).
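To experiment with the recognizer, it helps to have a concrete machine and strings. The following is a sketch under stated assumptions: the machine m1, the strings s1 and s2, the stringlen/2 predicate name, and the base clause of recog/6 are all chosen here for illustration, and a Prolog with tabling (XSB or SWI-Prolog) is assumed. The machine accepts (ab)* and deliberately contains an epsilon cycle.

```prolog
:- table recog/6.

% Hypothetical machine m1 accepting (ab)*, with an epsilon cycle q0 -> q2 -> q0.
mis(m1, q0).
mfs(m1, q0).
m(m1, q0, a,  q1).
m(m1, q1, b,  q0).
m(m1, q0, '', q2).
m(m1, q2, '', q0).

% Hypothetical strings: s1 = "abab", s2 = "a".
string(s1, 0, a, 1).
string(s1, 1, b, 2).
string(s1, 2, a, 3).
string(s1, 3, b, 4).
stringlen(s1, 4).
string(s2, 0, a, 1).
stringlen(s2, 1).

accept(MId,StrName) :-
    mis(MId,StateStart),
    recog(MId,StrName,StateStart,StateFinal,0,StrFinal),
    mfs(MId,StateFinal),
    stringlen(StrName,StrFinal).

% Base case (assumed): zero moves leave state and location unchanged.
recog(_,_,MState,MState,SLoc,SLoc).
% Regular transitions consume one symbol.
recog(MId,StringName,MState0,MState,SLoc0,SLoc) :-
    string(StringName,SLoc0,Symbol,SLoc1),
    m(MId,MState0,Symbol,MState1),
    recog(MId,StringName,MState1,MState,SLoc1,SLoc).
% Epsilon transitions consume nothing.
recog(MId,StringName,MState0,MState,SLoc0,SLoc) :-
    m(MId,MState0,'',MState1),
    recog(MId,StringName,MState1,MState,SLoc0,SLoc).

% ?- accept(m1,s1).   % succeeds
% ?- accept(m1,s2).   % fails finitely under tabling
```

Without the table declaration, the epsilon cycle would be expected to send plain Prolog into an infinite loop on the rejected string s2, since the failing search keeps re-entering q0 at the same location.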
FSA Discussion • Does recognition need tabling? Why or why not? • Exercises: Write programs that: • Generate an FSA equivalent to the intersection of two FSAs. • Generate an epsilon-free FSA equivalent to a given FSA. • Generate a deterministic FSA equivalent to a given FSA. • Generate a minimal-state FSA equivalent to a given FSA. • Can you write a program that determines whether the intersection of languages of a CFG and an FSA is non-empty? • Hint: Note the representation of a string is the same as that of an FSA that recognizes exactly that string. • What would be the difference in the programs written in Prolog vs. Tabled Prolog?
Dynamic Programming: Knapsack Problem (trad.)
• Given a set of items, find whether a packing of a knapsack with items with total weight w exists.
• Items numbered 1 to n: item(I,K) means item #I weighs K
• O(2ⁿ) queries to ks/2, but only O(n*w) different ones.
• explore
    :- table ks/2.
    % ks(I,K) if a subset of items 1..I sums to K
    ks(0,0).                                 % empty set sums to 0
    ks(I,K) :- I>0, I1 is I-1, ks(I1,K).     % exclude Ith element
    ks(I,K) :- I>0, item(I,Ki), K1 is K-Ki,  % include Ith element
               K1 >= 0, I1 is I-1, ks(I1,K1).
    :- ks(n,w).
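A concrete instance makes the knapsack program easy to try. This sketch assumes a Prolog with tabling (XSB or SWI-Prolog); the three item weights are made up for illustration, and the ks/2 clauses are the ones from the slide.

```prolog
:- table ks/2.

% Hypothetical item weights: item(I,K) means item #I weighs K.
item(1, 2).
item(2, 3).
item(3, 7).

% ks(I,K) if a subset of items 1..I sums to K
ks(0,0).                                 % empty set sums to 0
ks(I,K) :- I>0, I1 is I-1, ks(I1,K).     % exclude Ith element
ks(I,K) :- I>0, item(I,Ki), K1 is K-Ki,  % include Ith element
           K1 >= 0, I1 is I-1, ks(I1,K1).

% ?- ks(3,9).   % succeeds: 2 + 7 = 9
% ?- ks(3,6).   % fails: no subset of {2,3,7} sums to 6
```

Both arguments must be bound when ks/2 is called, because the clauses use arithmetic on I and K. The table then holds at most one entry per distinct (I,K) pair, which is where the O(n*w) bound comes from.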
A General Evaluation Strategy for Recursive Definitions • Have seen examples of tabling as an extension of Prolog evaluation. • Now consider tabling in a more general context: • As a General Evaluation Strategy for Recursive Definitions • Functions and Evaluation • Tabled Evaluation • Multi-valued Functions and Relations
Mathematical Induction
• High school math class…
• Define functions on natural numbers
    f(0) = 0
    f(n) = f(n-1) + 2*n - 1
• We proved that f(n) = n²
• But how did we know f was well-defined?
• How did we evaluate it?
Evaluating Inductive Definitions
• f(n) = if n=0 then 0 else n+f(n-1)
• Evaluate f(8)
• Bottom-up evaluation
• Top-down (demand-driven) evaluation
    Top-down:   36 28 21 15 10  6  3  1  0 : f(n)
                 8  7  6  5  4  3  2  1  0 : n
    Bottom-up:  36 28 21 15 10  6  3  1  0 : f(n)
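Fibonacci
    fib(n) = if n=0 then 1 else if n=1 then 1 else fib(n-1)+fib(n-2)
• Bottom-up good for fib.
    Top-down:   8   5   3 3   2 2 2   1 1 1 1 1   1 1 1 : fib(n)
                5   4    3      2         1         0   : n
    Bottom-up:  8   5    3      2         1         1   : fib(n)
  (Top-down evaluates fib(3) twice, fib(2) three times, fib(1) five times, and fib(0) three times.)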
log2
• log2: lg(n) = if n = 1 then 0 else 1 + lg(n div 2)
• Top-down is good for log2
     n : 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
    td :  4                       3           2     1  0 : lg(n)
    bu :  4  3  3  3  3  3  3  3  3  2  2  2  2  1  1  0 : lg(n)
Bottom-up vs. Top-down Summary • Bottom-up and Top-down evaluation are “incommensurate”: neither one is uniformly better than the other. • Bottom-up is exponentially better than top-down for fib • Top-down is exponentially better than bottom-up for log2 • Can we get the best of both strategies?
Tabled Evaluation • Top-down demand-driven, but: • Save intermediate results in a table, so • Future requests use the table. • Combines top-down demand-driven with bottom-up non-redundancy.
Tabled Evaluation (fib)
    fib(n) = if n=0 then 1 else if n=1 then 1 else fib(n-1)+fib(n-2)
• Tabled evaluation similar to bottom-up on fib.
    Top-down w/ tabling:  8 5 3 2 1 1 : fib(n)
                          5 4 3 2 1 0 : n
    Bottom-up:            8 5 3 2 1 1 : fib(n)
Tabled Evaluation (log2)
• log2: lg(n) = if n = 1 then 0 else 1 + lg(n div 2)
• Tabled evaluation similar to top-down on log2
             n : 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
    td w/ tabl :  4                       3           2     1  0 : lg(n)
            td :  4                       3           2     1  0 : lg(n)
A Theoretical Oddity? • Tabled evaluation was proposed in 60’s by D. Michie, but never pursued. • Is the overhead too high? • If a functional programmer is dumb enough to write fib as doubly recursive, s/he gets what s/he deserves! Write a better program. • But …
On to Recursive Definitions • Inductive definitions are: • Defined on the natural numbers (or other well-founded set) • Required to be defined… • Explicitly for minimal argument(s) • In terms of smaller elements for non-minimal arguments • Recursive definitions aren’t!
New Problems with Recursive Definitions • Computationally problematical • No ordering, so where is the “bottom” for bottom-up evaluation? • Demand evaluation loops, e.g., when f(17) = ……f(17)…… • Semantically problematical for functions • May be no solutions: e.g., f(17) = f(17)+1 • May be many: e.g., f(17) = f(17)
One Approach: Multi-valued Functions • Permit functions to be multi-valued: • Result of a function is a set of values • Allow multiple “definitions” for a function • Functions are composed point-wise • Nondeterminism is a useful construct in a variety of applications • Naturally resolves semantic problems with self-loops: • Define the least fixed point, on the lattice of sets • f(17)=f(17) interpreted as f(17) = {}
Self-Loops and Top-Down vs. Tabled Evaluation • Self-loops, when definitions unfold to: • f(17) = … f(17) … • TD, not remembering anything, can’t avoid loops in definitions with this form. • Tabled can terminate those loops, since they don’t contribute to defining a result, and look for other ways to determine f(17). • So tabled evaluation will terminate for definitions for which TD will infinitely loop.
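The behavior on self-loops can be seen in a tiny relational example. This is a sketch, assuming a Prolog with tabling (XSB or SWI-Prolog); the predicates f/2 and g/2 and the value 4 are invented here to mirror the f(17) examples on the slides.

```prolog
:- table f/2, g/2.

% f(17) is defined partly in terms of itself and partly by another rule.
f(17, V) :- f(17, V).   % self-loop: contributes no new values
f(17, 4).               % another way to determine f(17)

% g(17) = g(17) and nothing else: the least fixed point is the empty set.
g(17, V) :- g(17, V).

% ?- f(17,V).   % V = 4, and the query terminates
% ?- g(17,V).   % fails finitely
```

In plain Prolog both queries would be expected to loop, since the first clause of each predicate immediately calls itself again.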
Recursive Definitions of Relations
• Multi-valued functions?
• or Relations?
• Differences:
• Syntax, Modes, Higher Order, …
• Prolog uses relations, and that is our interest here.
    fib(n) = if n=0 then 1 else if n=1 then 1 else fib(n-1)+fib(n-2)
    fib(0,1).
    fib(1,1).
    fib(N,F) :- N > 1, N1 is N-1, N2 is N-2,
                fib(N1,F1), fib(N2,F2), F is F1+F2.
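Adding a table declaration to the relational fib is enough to get the tabled evaluation discussed earlier. A minimal sketch, assuming a Prolog with tabling (XSB or SWI-Prolog); the clauses are the ones on the slide, with fib(0) = fib(1) = 1.

```prolog
:- table fib/2.

fib(0,1).
fib(1,1).
fib(N,F) :- N > 1, N1 is N-1, N2 is N-2,
            fib(N1,F1), fib(N2,F2), F is F1+F2.

% ?- fib(5,F).    % F = 8, as in the evaluation tables
% ?- fib(30,F).   % each fib(K,_) for K =< 30 is evaluated only once
```

The first argument must be bound at call time because of the arithmetic. With the table, the doubly recursive definition needs a number of calls linear in N instead of exponential.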
Programming Language Implementations • Programming Languages use recursive definitions as programs and top-down evaluation as the execution strategy. • Evaluation of recursive definitions was (initially) hard to implement • Fortran didn’t implement it • PL/1 said not to use it for efficiency reasons • But it made programming MUCH easier • E.g., C.A.R. Hoare’s experience with quicksort
Invention of Quicksort (Reconstructed anecdote inspired by C.A.R. Hoare) • Quicksort: • Choose a “random” element from array • Partition array with all elements greater than the chosen one to top and all less to bottom • Recurse on each partition • Hoare invented and implemented it in Fortran, explicitly handling all the stacks in arrays. Very complicated and difficult. • Was amazed at how “trivial” it became when it was expressed as a recursive function in Algol60.
CLAIM: • Tabled Evaluation can make relational programming MUCH easier. • As recursion made programming (and algorithm development) much easier. • And it ain’t that easy to implement either…. • (but that’s another talk)
Evidence: Applications • Grammars and Automata Theory, as I hope we have seen. • Program Analysis • Abstract Interpretation • (Model Checking)
Homework Assignment 1: Program Interpretation • Use XSB to write an interpreter for Pascal (a subset). Hints: • Represent state as a list of environments: • An environment is a set of [variable,value] pairs • State contains envs containing variables accessible at the current program point: one env for each (statically) enclosing block.
Example of Program State Representation
    prog P:
      var A, B
      proc Q:
        var C, D
        proc R:
          var E,F
        beginR …call(Q)… end
      beginQ …call(R)… end
    beginP …call(Q)… end.
Program state at program point:
    in R: [[E,F],[C,D],[A,B]]
    in Q: [[C,D],[A,B]]
    in P: [[A,B]]
Abstract Syntax Tree • You will be given a routine that, given a file with Pascal program text, will return its AST • Each variable use will be represented in the AST by its name and its “scope” • Scope = 1 if local, • Scope = 2 if declared in immediately enclosing block • Scope = 3 if declared in next outer block …
More Hints:
• Write functions like:
• getVariableValue: VarName × Scope × State → Value
• setVariableValue: VarName × Scope × Value × State → State
• interpExpr: AST × State → Value
• interpStmt: AST × State → State
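The two state-access functions might be written as relations along the following lines. This is only one possible sketch, not the assignment's reference solution: it assumes the state is a list of environments with the innermost block first, that each environment is a list of [Var,Value] pairs, and that Scope = 1 selects the first environment. The helper predicates nth_env/3, member_pair/3 and replace_pair/4 are hypothetical names introduced here.

```prolog
% getVariableValue(+Var, +Scope, +State, -Value)
getVariableValue(Var, Scope, State, Value) :-
    nth_env(Scope, State, Env),
    member_pair(Var, Value, Env).

% setVariableValue(+Var, +Scope, +Value, +State0, -State)
setVariableValue(Var, 1, Value, [Env0|Envs], [Env|Envs]) :-
    replace_pair(Env0, Var, Value, Env).
setVariableValue(Var, Scope, Value, [Env|Envs0], [Env|Envs]) :-
    Scope > 1, Scope1 is Scope-1,
    setVariableValue(Var, Scope1, Value, Envs0, Envs).

% nth_env(+N, +State, -Env): Env is the N-th environment (1 = innermost).
nth_env(1, [Env|_], Env).
nth_env(N, [_|Envs], Env) :- N > 1, N1 is N-1, nth_env(N1, Envs, Env).

% member_pair(+Var, -Value, +Env): look up Var in one environment.
member_pair(Var, Value, [[Var,Value]|_]).
member_pair(Var, Value, [[Other,_]|Rest]) :-
    Other \== Var, member_pair(Var, Value, Rest).

% replace_pair(+Env0, +Var, +Value, -Env): rebind Var in one environment.
replace_pair([[Var,_]|Rest], Var, Value, [[Var,Value]|Rest]).
replace_pair([[Other,V]|Rest0], Var, Value, [[Other,V]|Rest]) :-
    Other \== Var, replace_pair(Rest0, Var, Value, Rest).

% ?- State = [[[e,1],[f,2]], [[c,3],[d,4]], [[a,5],[b,6]]],
%    getVariableValue(a, 3, State, V).
```

Because the state is an ordinary term, "updating" it means building a new state that shares the unchanged environments with the old one.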
Homework 1: (cont) • In addition to submitting working code, please answer the following discussion questions: • How did you handle procedure invocation and return? Describe in particular how the structure of the state was changed. • How did you use the nondeterministic aspects of the XSB language? I.e., what would have been different had you used Standard ML? • Due Date: 2 weeks.
Francesco’s HW1 Submission • On entry to a procedure: • I created a new state by: • taking a “tail” of the current state, keeping the envs for the block where the called proc is declared, • And adding a new env to the front for the proc being entered, with (value) parameters initialized. • On exit from a procedure: • I replaced the “tail” of the state on entry by the tail of the state returned from the procedure. • I didn’t use the nondeterministic aspects at all. • (I’ve got better things to do with my time than learn new irrelevant languages. Is the Prof getting old and losing it?)
Homework 2: Program Analysis • Write an abstract interpreter for our Pascal subset that will abstract integers to even/odd. I.e., given any program, it determines for each variable whether it will contain only even integers or only odd or might contain either. • Extra credit: Use the same idea to determine for each variable whether it might be used before it is assigned a value.
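The core idea of the even/odd abstraction can be sketched on two hand-translated loops, without the full interpreter. This is an illustration under assumptions: the program fragments, the predicate names abs_add/3, x_parity/1 and y_parity/1, and the hand translation are all invented here, and a Prolog with tabling (XSB or SWI-Prolog) is assumed.

```prolog
:- table x_parity/1, y_parity/1.

% Abstract addition on the domain {even, odd}.
abs_add(even, even, even).
abs_add(even, odd,  odd).
abs_add(odd,  even, odd).
abs_add(odd,  odd,  even).

% Abstraction of:  x := 0; while ... do x := x + 2
x_parity(even).                                      % x := 0
x_parity(P) :- x_parity(P0), abs_add(P0, even, P).   % x := x + 2

% Abstraction of:  y := 1; while ... do y := y + 3
y_parity(odd).                                       % y := 1
y_parity(P) :- y_parity(P0), abs_add(P0, odd, P).    % y := y + 3

% ?- findall(P, x_parity(P), Ps).   % only even
% ?- findall(P, y_parity(P), Ps).   % both parities: y might contain either
```

The loop turns into a recursive rule over a finite abstract domain, so tabled evaluation reaches a fixed point and stops, whereas plain Prolog would be expected to loop on the left-recursive rules.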