400 likes | 521 Views
Artificial Intelligence Inference in first-order logic. Fall 2008 professor: Luigi Ceccaroni. Role of first-order logic (FOL). Knowledge representation (KR) is an interdisciplinary subject that applies theories and techniques from three fields:
E N D
Artificial IntelligenceInference in first-order logic Fall 2008 professor: Luigi Ceccaroni
Role of first-order logic (FOL) • Knowledge representation (KR) is an interdisciplinary subject that applies theories and techniques from three fields: • Logic provides the formal structure and rules of inference. • Ontology defines the kinds of things that exist in the application domain. • Computation supports the applications that distinguish KR from pure philosophy. 2
Effective procedures for answering questions posed in FOL • More or less anything can be stated in FOL. • It is important to have algorithms that can answer any answerable question stated in FOL. • Three major families of first-order inference algorithms: • forward chaining and its applications to deductive databases and production systems • backward chaining and logic programming systems • resolution-based theorem-proving systems
Forward chaining • Forward chaining can be applied to first-order definite clauses. • First-order definite clauses are disjunctions of literals of which exactly one is positive. • Definite clauses such asSituation 1 ∧ Situation 2 ⇒ Response are especially useful for systems that make inference in response to newly arrived information.
Forward chaining • The idea: • start with the atomic sentences in the knowledge base • apply Modus Ponens (A, A⇒B |- B) in the forward direction, adding new atomic sentences, until no further inferences can be made • Reasoning with forward chaining can be much more efficient than resolution theorem proving. • This idea needs to be applied efficiently to first-order definite clauses. 6
First-order definite clauses • Consider the following problem: • The law says that it is a crime for a Spanish to sell weapons to other nations. The country Israel has some missiles and all of its missiles were sold to it by Rodríguez Zapatero, who is Spanish. • To prove that Rodríguez Zapatero is a criminal, first, we have to represent the facts as first-order definite clauses. • Then we can solve the problem with forward-chaining. 7
First-order definite clauses • “... it is a crime for a Spanish to sell weapons to other nations”: Spanish(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ NotInSpain(z) ⇒ Criminal(x). • “...Israel has some missiles” is transformed into two definite clauses by Existential Elimination, introducing a new constant AGM12: Owns(Israel, AGM12) Missile(AGM12) 8
First-order definite clauses • “All of its missiles were sold to it by Rodríguez Zapatero”: Missile(x) ∧ Owns(Israel, x) ⇒ Sells(ZP, x, Israel). • We need to know that missiles are weapons: Missile(x) ⇒ Weapon(x) • We also need to know that Israel is not in Spain: NotInSpain(Israel) • “Rodríguez Zapatero, who is Spanish...”: Spanish(ZP) 9
A simple forward-chaining algorithm function FOL-FC-ASK (KB,a) returns a substitution or falselocalnew, the new sentences inferred on each iteration repeat until new is emptynew <- {}for each sentence rin KBdo(p1… pnq ) <- STANDARDIZE-APART(r)for each t such that SUBST(t , p1… pn)=SUBST(t , p1’ … pn’) for some p1’ ,…, pn’ in KBq’ <- SUBST(t ,q)if q’ <- is not a renaming of some sentence already in KB or newthen do add q’ to newf <- UNIFY(q’, a) if f is not failthen return fadd new to KBreturnanswers 10
A simple forward-chaining algorithm • Starting from the known facts, • it triggers all the rules whose premises are satisfied, • adding their conclusions to the known facts. • The process repeats until • the query is answered (assuming that just one answer is required) • or no new facts are added. 11
A simple forward-chaining algorithm • Notice that a fact is not “new” if it is just renaming a known fact. • One sentence is a renaming of another if they are identical except for the names of the variables. • Likes(x, IceCream) and Likes(y, IceCream) are renamings of each other • their meanings are identical: • everyone likes ice cream 12
A simple forward-chaining algorithm • In the application of forward chaining to the crime problem (with three rules), two iteration are required: • Iteration 1 • “Spanish(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ NotInSpain(z) ⇒ Criminal(x).” has unsatisfied premises. • “Missile(x) ∧ Owns(Israel, x) ⇒ Sells(ZP, x, Israel).” is satisfied with {x/AGM12} and “Sells(ZP, AGM12, Israel).” is added. • “Missile(x) ⇒ Weapon(x).” is satisfied with {x/AGM12} and “Weapon(AGM12)” is added. 13
A simple forward-chaining algorithm • Iteration 2 • “Spanish(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ NotInSpain(z) ⇒ Criminal(x).” is satisfied with {x/ZP, y/AGM12, z/Israel} and Criminal(ZP) is added. • Fixed point: no new inferences are possible at this point because every sentence that could be concluded by forward chaining is already contained explicitly in the KB. • Algorithm not designed for efficiency of operation. 14
Efficient forward chaining • There are three possible sources of complexity: • The “inner loop” involves finding all possible unifiers such that the premise of a rule unifies with a suitable set of facts in the KB. • This pattern matching can be very expensive. • The algorithm rechecks every rule on every iteration, even if very few additions are made to the KB on each iteration. • The algorithm might generate many facts that are irrelevant to the goal. 15
Matching rules against known facts • Consider the two rules: Missile(x) ∧ Owns(Israel, x) ⇒ Sells(ZP, x, Israel). Missile(x) ⇒ Weapon(x). • If the KB contains many objects owned by Israel and very few missiles, • it would be better to find all the missiles first • then check whether they are owned by Israel • This is the conjunct ordering problem: • find an ordering to solve the conjuncts of the rule premise so that the total cost is minimized. 16
Matching rules against known facts • Finding the optimal ordering is NP-hard, but good heuristics are available: • for example, the most constrained variable heuristic • The connection between pattern matching and constraint satisfaction is very close. • Most rules in real-world KBs are small and simple rather than large and complex. 17
Incremental forward chaining • To avoid redundant rule matching: • Every new fact inferred on iteration t must be derived from at least one new fact inferred on iteration t-1. • This leads to an incremental forward chaining algorithm where, at iteration t, a rule is checked only if its premise includes a conjunct that unifies with a fact newly inferred at iteration t-1. 18
Rete algorithm • Only a small fraction of the rules in a knowledge base are triggered by the addition of new facts. • A great deal of redundant work is done in constructing partial matches repeatedly that have some unsatisfied premises. • It would be better to retain and gradually complete the partial matches as new facts arrive, rather than discarding them. • The rete algorithm was the first to address this problem seriously. • The algorithm preprocesses the set of rules in the knowledge base to construct a sort of dataflow network in which each node is a literal from a rule premise. • Variable bindings flow through the network and are filtered out when they fail to match a literal.
Production systems • Rete networks have been a key component of so-called production systems, which were among the earliest forward chaining systems in widespread use. • The word “production” denotes a condition-action rule. • The steps to solve a problem are described as a chain of deductions.
Production systems • The representation is based on two elements: • facts: definite clauses with no negative literals, which simply assert a given proposition; they can include variables • rules: conditional formulas where the consequent is exactly one literal • A problem is defined by: • fact base: atomic sentences describing the specific problem • rule base: rules describing the reasoning mechanisms • inference engine: algorithm executing the rules in a reasoning chain
Irrelevant facts • Forward chaining makes all allowable inferences, even if they are irrelevant to the goal at hand. • One solution is to restrict forward chaining to a selected subset of rules. • Another solution is to use backward chaining. 22
Backward chaining • This algorithm works backward from the goal, chaining through rules to find known facts that support the proof. • It is called with a list of goals containing a single element, the original query. • It returns the set of all substitutions satisfying the query. • If all of them can be satisfied, then the current branch of the proof succeeds. 23
Backward chaining • Set of sentences: • S1: for each x1,y1 child(x1,y1) ⇒ parent(y1,x1) • S2: for each x2,y2 parent(x2,y2) ∧ female(x2) ⇒ mother(x2,y2) • S3: child(Lisa,Homer) • S4: child(Lisa,Marge) • S5: female(Marge) • Goal: mother(x0,Lisa) • Note: variables have already been standardized apart using subscripts 24
Backward chaining • Based on the inductive method: • The objective/hypothesis is validated reconstructing the reasoning chain backwards • Each step implies new sub-objectives: • New hypotheses to be validated
Backward chaining • Algorithm • The FB is initialized with a set of initial facts. • The hypothesis set (HS) is initialized with the objectives. • While there are hypotheses in HS, one is chosen and validated: • Facts and right-end-side of rules are compared to the hypotheses • If a hypothesis is in the FB, then eliminate it from HS. • Else, look for rules with the hypothesis as conclusion. Select one and add its non-satisfied premises to HS.
Backward chaining • Advantages: • Only what is necessary for the resolution of the problem is considered. • The resolution process consist of a tree exploration • Most representative language: Prolog • W. F. Clocksin y C. S. Mellish . Programming in Prolog: Using the ISO Standard. Springer, 2003 (first edition: 1981).
Hybrid chaining • Parts of the reasoning chain from facts to objectives are deductively constructed. • Other parts are inductively constructed. • Bi-directional exploration • The combinatorial explosion of deductive reasoning is avoided.
Hybrid chaining • Meta-rules decide about strategy change as a function of: • Number of initial and final states • Better from smaller to larger • Branching factor • Better in the direction of lower factor • Necessity of justifying the reasoning process • Better what reflects typical user reasoning
Inference engine • The inference engine or control mechanism is made of two elements: • Rule interpreter or inference mechanism • It determines which rules can be applied • Control strategy of conflict-resolution strategy • The function of the inference engine is to execute actions to solve the problem (objective) from an initial set of facts, possibly via some interaction with the user.
Inference engine • Phases of a basic engine cycle: • Detection (filter): relevant rules • Obtaining from the KB the set of executable rules for a given situation (state) of the FB (rule interpreter) • Creation of the conflict set • Selection: which rule? • Conflict resolution: selection of a rule to be executed (control strategy) • Execution • Execution of the rule with an instantiation of the variables: modification of the working memory • Return to 1, or stop if the problem is solved • If a solution had not been found and no rule is executable: failure
Detection • Creation of the set of executable rules • The rule interpreter obtains possible instantiations of the LHS of the rules via matching. • A rule can be instantiated more than once with different values matching the variables.
Selection • Rules are selected according to the control strategy, which can be: • Fixed • Dynamic with criteria decided in advance • Guided by dynamic meta-rules
Selection • Strategies for the selection of the “best” rule/instantiation: • First rule according to KB ordering • Least/most used rule • Most specific (with more literals) / most general rule • Rule with the highest certainty value • The instantiation satisfying • Facts with highest priority • Oldest facts • Most recent facts • Most recently used rule • Dynamic meta-rules • Combination of different criteria
Execution • The execution of a rule can cause: • A modification of the FB (in forward chaining only) • New calculations, new actions, questions to the user • New sub-objectives (in backward chaining) • Instantiations are propagated • Certainty degrees are propagated
End of deduction/induction process • The conclusion (objective) is found/demonstrated: success • No rule is executable: success? / failure?
Prolog: lenguaje de programación lógica • Programa Prolog: • conjunto de aserciones lógicas • cada aserción es una cláusula de Horn • proceso de comparación: unificación • el orden de las aserciones (reglas) es significativo • Ejemplo • gato (bufa). • animaldomestico (X) :- gato (X). • animaldomestico (X) :- pequeño (X), conplumas (X). • p → q ⇔ q :- p • Consultas: ?- predicado. • La negación se representa con la falta de la aserción correspondiente: asunción de mundo cerrado.
Cláusula de Horn • De Wikipedia: Las cláusulas de Horn (instrucciones ejecutables de PROLOG) tienen el siguiente aspecto: hija (*A, *B) :- mujer (*A), padre (*B, *A). que podría leerse así: "A es hija de B si A es mujer y B es padre de A". Obsérvese que, en PROLOG, el símbolo :- separa la conclusión de las condiciones. En PROLOG, las variables se escriben comenzando con un asterisco. Todas las condiciones deben cumplirse simultáneamente para que la conclusión sea válida. Por tanto, la coma que separa las distintas condiciones es equivalente a la conjunción copulativa (en algunas versiones de PROLOG se sustituye la coma por el símbolo &). La disyunción, en cambio, no se representa mediante símbolos especiales, sino definiendo reglas nuevas, como la siguiente: hija (*A, *B) :- mujer (*A), madre (*B, *A). que podría leerse así: "A es hija de B si A es mujer y B es madre de A". Cláusulas con como mucho un literal positivo.