Raisonnement Automatis´e: Principes etApplications (partie I: logique du premier ordre)N. Peltier (CNRS, Universit´e de Grenoble)2009This document contains the first part of the M2R course RAPA: AutomatedReasoning: Principle and Applications. It presents the basis of first-order logicand automated deduction: syntax, semantics, transformation into clausal form,unification and the Resolution calculus (with selection functions and atom or-dering). Some basic properties of the Resolution calculus are also investigated(w.r.t. complexity and termination).This document is self-contained but additional references are provided forthe interested reader. More details and additional explanations can be found in[5,6]. [8]isanadvancedtextboookontheResolutioncalculusandtheHandbookof Automated Reasoning [12] covers the main lines of research in this field.1 First Order LogicFirst-order logic (FOL) is a formal language for expressing properties. Propo-sitional logic allows one to express basic statements (s.t. “Paris is a town” or“Berlinis atown”or“Parisisthe capitalofFrance”)andto combinethem withlogical connectives:¬ (not),∨ (or),∧ (and),⇒ (implies) and⇔ (equivalence).First-orderlogic extends this languageby using predicate symbols and quantifi-cation over individuals. For instance, the property “to be a town” may be ex-pressedby a predicatesymbol Town, which can be applied to differentindividu-als: Town(Paris),Town(Berlin),...Usingquantification ...
RaisonnementAutomatise´:Principeset Applications (partie I: logique du premier ordre)
N.Peltier(CNRS,Universit´edeGrenoble) 2009
This document contains the first part of the M2R course RAPA:Automated Reasoning: Principle and Applications. It presents the basis of first-order logic and automated deduction: syntax, semantics, transformation into clausal form, unification and the Resolution calculus (with selection functions and atom or-dering). Some basic properties of the Resolution calculus are also investigated (w.r.t. complexity and termination). This document is self-contained but additional references are provided for the interested reader. More details and additional explanations can be found in [5, 6]. [8] is an advanced textboook on the Resolution calculus and the Handbook of Automated Reasoning [12] covers the main lines of research in this field. 1 First Order Logic First-order logic (FOL) is a formal language for expressing properties. Propo-sitional logic allows one to express basic statements (s.t. “Paris is a town” or “Berlin is a town” or “Paris is the capital of France”) and to combine them with logical connectives:¬(not),∨(or),∧(and),⇒(implies) and⇔(equivalence). First-order logic extends this language by using predicate symbols and quantifi-cation over individuals. For instance, the property “to be a town” may be ex-pressed by a predicate symbolTown, which can be applied to different individu-als:Town(Paris)Town(Berlin), . . . Using quantification, it is possible to express the property: “all countries have a capital”:∀x[Country(x)⇒ ∃yCapital(y x)] (meaning: “for everyx, ifxis a country, then there exists aysuch thatyis the capital ofx”). However, it is not possible in first order logic to express quantification over setsof individuals or over functions. For instance, the induction principle is not expressible in first-order logic:∀P[P(0)∧ ∀xP(x)⇒P(succ(x))]⇒ ∀xP(x) is nota sentence of FOL, due to the quantification over the sets of natural numbers P the property. Similarly,∀f∃xf(x) =x(every function has a fixpoint) is not expressible in FOL.
1
1.1 Syntax Afirst-order languageLis a set containing: •A set ofconstant symbols, usually denoted bya b c •A set offunction symbolsf g h •A set ofpropositional variablesP Q R •A set ofpredicate symbols, also denoted byP Q R Each function or predicate symbol is associated to a unique natural number called itsarity(number of arguments). Throughout this document, we assume that a first-order languageLis given, together with a set ofvariablesV, disjoint from the symbols inL. Terms and formulae are defined relatively to these sets. Definition 1(Terms) The set oftermsis thesmallestset that satisfies the following properties: •Every constant symbol is a term. •Every variable is a term. •Ift1 tnare terms and iffis a function symbol of aritynthen f(t1 tn)is a term. We emphasize that this definition isinductive(“. . . smallest set that . . . ”). All the terms must be of one of the above forms and infinite terms of the form f(f(f( ))) are forbidden. Alternatively, terms can be seen astreeslabeled by symbols inL. Remark 2For the sake of uniformity, one can also view constant symbols as function symbols of arity0(nullary functions). this convention is used, the If first line in Definition 1 may be deleted. An expression of the formf(t1 tn) wheren= 0is to be read as the constantf. We denote byVar(t) the set of variables occurring in the termt set is. This inductively defined as follows: Var(t) ={t}iftis a variable Var(t) =∅iftis a constant symbol Var(t) =Sin=1Var(ti) iftis of the formf(t1 tn) First-order formulae are built inductively from terms using the predicate symbols and propositional variables inLand the logical symbols∨(or),∧ (and),⇒(implication),⇔(equivalence),∀∃(quantification). Definition 3(Formulae) The set of formulae is the smallest set that satisfies the following properties: 2
•Every propositional variable is a formula. •“true” and “false” are formulae. •Ift1 tnare terms andPis a predicate symbol of aritynthen P(t1 tn)is a formula. •Ifφis a formula then(¬φ)is also a formula. •Ifφ1 φ2are formulae, then(φ1∧φ2),(φ1∨φ2),(φ1⇒φ2)and(φ1⇒φ2) are formulae. •Ifφis a formula andxis a variable, then(∀xφ)and(∃xφ)are formulae. In the above definition, formulae are always written with parentheses. In practice, some parentheses may of course be omitted, and the usual priority rules are used to reconstruct the corresponding formula. The priority rank is as follows:∀∃>¬>∧>∨>⇒ instance. For¬P∧Q∨R⇒Pshould be read as (((¬P)∧Q)∨R)⇒R. A formulaψis said to be asubformulaofφifψis eitherφor a formula occurring insideφ(the formal definition is left to the reader). Formulae that contain no logical symbols (i.e. that are propositional vari-ables or of the formP(t1 tn) wherePis a predicate symbol) are called atoms(oratomic formulae formula that is either an atomic formula or the). A negation of a atomic formula is called aliteral. For instance, the formula (P∨(∀xQ(f(x))))⇒(∃x¬R(x x)) contains 3 atoms:P Q(f(x)) andR(x x). All these atoms are literals, as well as¬R(x x). We say that a variablexisfreein a formulaφif it occurs inφ, but not on the scope of a quantifier∀xor∃x we denote by. Formally,FVar(φ) the set of free variables ofφ, defined as follows:
FVar(φ) =∅Ifφis equal totrueorfalse FVar(φ) =∅ifφis a propositional variable FVar(φ) =Sin=1Var(ti) ifφis of the formP(t1 tn) FVar(φ) =FVar(ψ) ifφis¬ψ FVar(φ) =FVar(ψ1)∪FVar(ψ2) ifφisψ1⋆ ψ2with⋆∈ {∨∧⇒⇔} FVar(φ) =FVar(ψ)\ {x}ifφis∃xψor∀xφ A formula is said to beclosed A variable occurringif it has no free variable. on the scope of a quantifier is said to bebound that the same variable. Note may be free and bound simultaneously, for instancexinp(x)∨ ∀x p(x this). In case, the two occurrences ofxdenote different objects (the first occurrence of xis a free variable, whose value is unknown, whereas the second one ranges over the whole domain). In practice, to avoid confusion, the variables should be renamed:p(x)∨ ∀y p(y).
3
1.2 Substitutions Asubstitution domain ofis a function mapping every variable to a term. The a substitutionσis the set of variablesxs.t.σ(x)6=x(usually the domain is assumed to be finite). A substitutionσof domainx1 xnis denoted as follows:{x17→σ(x1) xn7→σ(xn)}. A substitution is said to be arenamingif for every variablex,σ(x) is a variable, and ifσis injective. Ifσandθare substitutions of disjoint domains, thenσ∪θdenotes the union ofσandθ: (σ∪θ)(x)d=efσ(x) ifx∈dom(σ) and (σ∪θ)(x)d=efθ(x) ifx∈dom(θ). The image of a termtby a substitutionσis obtained by replacing any variablexoccurring intbyσ(x). It is usually denoted bytσ. Formally:
aσd=efaifais a constant symbol. xσd=efσ(x) ifxis a variable. def f(t1 tn)σ=f(t1σ tnσ) Substitutions can also be applied to formulae, exactly in the same way. However, a difficulty occurs when the formula contains quantifiers, because in this case there can be conflicts between the variables in the domain of the substitution and the bound variables occurring in the formula. The following example will clarify this. Assume that the substitution{x7→a}is applied to the formula∀x p(x). The variablexoccurring in the formula shouldnotbe replaced bya since it occurs on . Indeed,the scope of a quantifier, it has no link to the variablexoccurring in the substitution which is a free variable (although both variables have the same name they do not represent the same objects -from a programming point of view one could write that they do not occur in the same environment). Thus, the formula should be renamed into (for example):∀y p(y) before the substitution can be applied. The result is∀y p(y). Similarly, assume that a substitution{x7→f(y)}is applied to the formula ∀x∃yP(x y again, the variable). Onceyoccurring in the substitution (more precisely in the termf(y)) is different from the bound variableyin the formula. The result should be:∀x∃y′P(f(y) y′) (and not∀x∃yP(f(y) f(y)) which has a different meaning). Formally, we denote byVar(σ) the set of variables occurring either in the domain ofσor in a termt=σ(x), for somex∈dom(σ the bound variables). If do not occur inVar(σthe substitution is applied normally (as for terms),) then by replacing any variablexbyσ(x):
4
pσd=efpifpis a propositional variable φσd=efφifφis equal totrueorfalse p(t1 tn)σd=efp(t1σ tnσ) (¬φ)σd=ef¬(φσ) (φ1⋆ φ2)σd=efφ1σ ⋆ φ2σwhere⋆is either∧∨⇒or⇔ (∃xφ)σd=ef(∃x)φσifx6∈Var(σ) (∀xφ)σd=ef(∀x)φσifx6∈Var(σ) If a bound variable occurs inVar(σ), then it should be renamed before applyingσ, as shown that the following definitions:
(∃xφ)σd=ef(∃x′)φ{x7→x′}σ ifx∈Var(σ) andx′is a new variable not occurring inφor inVar(σ) (∀xφ)σd=ef(∀x′)φ{x7→x′}σ ifx∈Var(σ) andx′is a new variable not occurring inφor inVar(σ) Iftis a term (or formula) andσis a substitution, thentσis called aninstance oft. Ifσ θare two substitutions, thenσθdenotes the composition ofσandθ(σ is applied first). Ifγ=σθthenσis said to bemore generalthanγandγis said to be aninstanceofσ. 1.3 Semantics Semantics associate a truth value (true or false) to a given formula. It is of course impossible in general to associate a unique truth value to a given formula because this value depends on the meaning of the non logical symbols occurring in the formula. For instance, the formulaP∧¬Pshould obviously have the truth valuefalse, regardless of the meaning ofPbecausePcannot be false and true simultane-ously (in boolean algebraPP Similarly,is 0).Q(a)⇒Q(a) should betrue. But what is the meaning of the formula:∃xf(x) =x? Obviously it depends onf for instance. Iffis the successor functionx7→x then the formula is+ 1 false(there is noxthat is its own successor), but iffisx7→2×x, then the formula should betrue(x0). = Thus, before affecting a truth value to a formula, it is necessary to specify the meaning of the symbols inL(the meaning of the logical symbols∨∧⇒⇔∀∃ is fixed). This is what we call aninterpretation. More formally: Definition 4(Interpretation) Aninterpretationis defined by adomainDIthat is anon emptyset and by a function mapping: 5
•Each constant symbolato an elementaI∈DI. •Each function symbolfof aritynto a functionfIfromDInintoDI. •Each propositional variablePto a truth valuePI(PIis either true or false). •Each predicate symbolPof aritynto a functionPIfromDIninto {truefalse}(since this function has only two possible values, one could simply denote it by the set of the tuples that are associated to true). •Each variablexto an elementxIofDI. Once the meaning of the constant and function symbols and of the variables are known, it is easy to define the value of all terms, by induction: Definition 5(Value of a Term) Iftis a term andIis an interpretation then [t]I(the value of the termtinI) is inductively defined as follows: •Iftis a constant symbolathen[t]Id=efaI. •Iftis a variablexthen[t]Id=efxI. •Iftis of the formf(t1 tn)then[t]Id=effI([t1]I [tn]I). Note that the value of a term is always an element of the domain, by defini-tion. Once the range of the variables (domain) and the meaning of terms are known, it is easy to define the truth value of a formula. We need to introduce a notation: ifIis an interpretation, then the expression I{x←v}(wherexis a variable or a constant andvis an element of the domain DIofI) denotes an interpretationJwhich is identical toI(same domain and same interpretation of all symbols), except for the interpretation ofxthat is defined as follows:xJd=efv. Definition 6(Value of a Formula) Ifφis a formula andIis an interpretation, then[φ]I(the truth value ofφinI) is defined as follows: •Ifφis a propositional variableP, then[φ]Id=efPI. •Ifφis of the formP(t1 tn)then[φ]Id=efPI([t1]I [tn]I). •Ifφis of the form¬ψthen[φ]d=efthfalseorteufi[eψr]Iwi=flsaese I •Ifφis of the formψ1∧ψ2then: [φ]Id=effueioftsetarl[hψer1]wIis=euterand[ψ2]I=true
6
•Ifφis of the formψ1∨ψ2then: [φ]Id=eftrue if[ψ1]I=trueor[ψ2]I=true false otherwise •Ifφis of the formψ1⇒ψ2then: [φ]Id=eflsfatheotieurf[eψr1]wIis=eslafeor[ψ2]I=true This means thatψ1⇒ψ2is equivalent to¬ψ1∨ψ2. •Ifφis of the formψ1⇔ψ2then: [φ]Id=eftrue if[ψ1]I= [ψ2]I false otherwise •Ifφis of the form∀x ψthen: [φ]Id=efeufiofarynlemenesetrslafiwrehtoetv∈DI,[ψ]I{x←v}=true Essentially,φis true iffψis true regardless of the value ofx. •Ifφis of the form∃x ψthen: [φ]Id=efesiwrehtoeslafifueerthtrtsseeixv∈DIsuch that[ψ]I{x←v}=true Notice that for the sake of simplicity we assume thatIgives a value to all constant/function/predicate symbols in the language and to all variables. But, obviously, the truth value of a formulaφin an interpretationIdepends only on the value of the symbols and of the variables (freely) occurring inφ(the values of the remaining symbols are irrelevant). In particular, the value of a closed formula does not depend on variables. We writeI|=φ(orIvalidatesφ) iff [∀x1 ∀xnφ]I=true, where {x1 xn}is the set of the free variables inφ. ThenIis called amodel ofφ. Notice implies that Thisthe free variables are universally quantified. that the interpretation of the variables in the model is irrelevant (the formula must be true for all possible values of the variables). IfSis a set of formulae, then we writeI|=Siff for allφ∈S,I|=φ. A formula (or set of formulae)φis said to besatisfiableif it has a model, unsatisfiable otherwise. It is said to bevalidif every interpretation is a model. Obviously,φis valid iff¬φ Twois unsatisfiable. formulaeφ ψare said to be equivalent(writtenφ≡ψ) iff for any interpretationI, [φ]I= [ψ]I(i.e. if I|= (φ⇔ψ formula)). Aψis said to be alogical consequenceofφiff for any interpretationIs.t. [φ]I=truewe have [ψ]I=true is written. Thisφ|=ψ Beware that the same notation denotes two distinct relations: “the formulaψ is a logical consequence ofφ” is denoted byφ|=ψand “the interpretationIis a model ofφ” byI|=φ. Two important points deserve to be emphasized because they are often mis-understood:
7
Remark 7IfIis an interpretation andφ1 φ2are formulae, then the fact that I|=φ1∨φ2doesnotimply that there existsi∈ {12}s.t.I|=φi, although we have[φ1∨φ2]I=true iff[φ1]I=true or[φ2]I=true. Indeed,φ1 φ2 may contain free variables and the indexis.t.φiis true can depend on the interpretation of these variables. For instance if Even and Odd are interpreted as the usual predicates on natural numbers, then we haveI|=Even(x)∨Odd(x) butI6|=Even(x)andI6|=Odd(x)(every natural number is either even or odd, but it is not true that all natural numbers are even, nor that all natural numbers are odd). Of course, ifφ1∨φ2is closed then the above property is true. Remark 8Similarly, ifψ φ1 φ2are formulae, then we can haveψ|=φ1∨φ2 butψ6|=φ1andψ6|=φ2, even ifφ1 φ2 reason is that Theare closed.ψ|=φ1∨φ2 means that for any modelIofψthere existsi∈ {12}s.t.I|=φi, but the index isuch thatφiis truecan depend onI. Assume for instance thatψis true, thatφ1is a propositional variablePand thatφ2d=ef¬P. Thenψ|=P∨ ¬Pbut ψ6|=Pandψ6|=¬P. There are alternative (equivalent) ways of defining the semantics of a first-order formula. Sometimes, the interpretation of the free variables is not included in the interpretation itself, but is given separately by a another function (often called avaluation). Our definition is more uniform. In order to avoid having to assign values to variables, we could also replace quantified variables by new (“fresh”) constant symbols before interpreting them. The value of an existential formula would then be defined as follows: [∃xφ]I= trueiff there exists an elementv∈DIs.t. [φ{x7→c}]I{c←v}=true, where cnew constant symbol, not occurring inis a φ. The definition for universal quantifiers is similar. All these definitions are of course equivalent (this can be shown by an easy induction on the formulae). The following lemmata state easy consequences of the definition, namely the possibility of replacing, in a formula, a subformula by an equivalent one, without affecting the truth value. Similarly, a variable may be replaced by a term having the same value. Lemma 9(Replacement Lemma) Letφbe a formula. Letψbe a subformula occurring inφ. Letψ′be a formula equivalent toψand letφ′be a formula obtained fromφby replacing the formulaψbyψ′. ′ φis equivalent toφ. Proofby an easy induction on the size of the formula. It is leftThe proof is to the reader as an exercise. Lemma 10LetIbe an interpretation,xbe a variable,tbe a term andφbe a formula (or term). IfxI= [t]Ithen[φ]I= [φ{x7→t}]I. 8
Proofan immediate consequence of the definition (the detailed proofThis is is by an easy induction on the size ofφ). Corollary 11Letφbe a formula,xbe a variable andtbe a term. The formulae(∀x φ)⇒φ{x7→t}andφ{x7→t} ⇒(∃x φ)are valid. Proof LetWe only give he proof for the first formula.Ibe an interpretation. We have to show that [∀xφ⇒φ{x7→t}]I=true that assume, w.l.o.g.. Wex does not occur int(if it is the case then we simply rename the formula∀xφinto the equivalent formula∀x′φ{x7→x′}wherex′occurs neither inφnot int). By definition [∀xφ⇒φ{x7→t}]I=trueiff either [∀xφ]I=falseor [φ{x7→ t}]I=true. Thus we assume that [∀xφ]I=true. Letv= [t]Iand letJ= I{x←v}. By definition of the interpretation of∀xφwe have [φ]J=true. By definition ofJwe havexJ= [t]I= [t]J(sincexdoes not occur int, the value oftis the same inIand inJ). By Lemma 10 we deduce that [φ{x7→t}]J=true since. Butφ{x7→t}does not containx, we have [φ{x7→t}]I= [φ{x7→t}]J=true. 2 Clausal Normal Form 2.1 Clauses Definition 12Aclauseis a formula of the formWin=1Liwhere theL1 Ln are literals (i.e. atoms or negations of atoms). We may haven= 0, then the clause is empty. The empty clause is denoted by convention, it is equivalent to. Byfalse(empty disjunction). Alternative notations are commonly used to denote clauses in the literature. A clause is often considered as a set (or multiset) of literals (the disjunction is then implicit). This is possible since disjunction is commutative and associative. Thus the meaning of a clause does not depend on the order of the literals. A clause may be also considered as an implication, by regrouping the negative literals before an implication sign:¬φ1∨ ∨ ¬φn∨ψ1∨ ∨ψmis equivalent toφ1∧ φn⇒ψ1∨ ∨ψm. For instance the clause¬P(x)∨Q(x y)∨ ¬R(y) may be written{¬P(x) Q(x y)¬R(y)}orP(x)∧R(y)⇒Q(x y Prolog). In notation, this is written:Q(x y) :−P(x) R(y). 2.2 Transformation into Clausal Form It is possible to transform any formulaφinto a set (conjunction) of clausesS that is sat-equivalent toφ that such, i.e.φis satisfiable iffSis satisfiable.Sis called aclausal formofφ(the clausal form is non unique). In order to transform formulae into sets of clauses, one has to: 1. Eliminate all occurrences of the logical symbols⇒and⇔. Fortunately these symbols can easily by expressed using∨and∧only.