Cet ouvrage fait partie de la bibliothèque YouScribe
Obtenez un accès à la bibliothèque pour le lire en ligne
En savoir plus

AN EFFICIENT GRAPH ALGORITHM FOR DOMINANCE CONSTRAINTS

22 pages
AN EFFICIENT GRAPH ALGORITHM FOR DOMINANCE CONSTRAINTS? ERNST ALTHAUS† , DENYS DUCHIER‡ , ALEXANDER KOLLER , KURT MEHLHORN† , JOACHIM NIEHREN‡ , AND SVEN THIEL† Abstract. Dominance constraints are logical descriptions of trees that are widely used in computational linguistics. Their general satisfiability problem is known to be NP-complete. Here we identify normal dominance constraints and present an efficient graph algorithm for testing their satisfiablity in deterministic polynomial time. Previously, no polynomial time algorithm was known. 1. Introduction. The dominance relation of a tree is the ancestor relation be- tween its nodes. Dominance constraints are logical descriptions of trees talking about the dominance relation. Dominance based tree descriptions were first used in automata theory in the six- ties [TW67], rediscovered in computational linguistics in the early eighties [MHF83], and investigated from a logical point of view in the early nineties [BRVS95]. Since then, they have found numerous applications in computational linguistics: they have been used for grammar formalisms [VS92, RVSW95, DT99, Per00], in natural lan- guage semantics [Mus95, ENRX98], and for discourse analysis [GW98]. The two most important computational tasks for dominance constraints are sat- isfiability testing – does the constraint describe a tree? – and enumerating solu- tions, i.e. the described trees. But as shown recently [KNT01], testing satisfiability is an NP-complete problem.

  • variable names

  • dominance constraints

  • annotate variable

  • let ?

  • problem

  • labeled variables

  • normal dominance

  • identify normal

  • algorithm back

  • graph algorithm


Voir plus Voir moins

AN EFFICIENT GRAPH ALGORITHM
FOR DOMINANCE CONSTRAINTS
y z x yERNST ALTHAUS , DENYS DUCHIER , ALEXANDER KOLLER , KURT MEHLHORN ,
z yJOACHIM NIEHREN , AND SVEN THIEL
Abstract. Dominance constraints are logical descriptions of trees that are widely used in
computational linguistics. Their general satis abilit y problem is known to be NP-complete. Here
we identify normal dominance constraints and present an e cien t graph algorithm for testing their
satis ablit y in deterministic polynomial time. Previously, no polynomial time algorithm was known.
1. Introduction. The dominance relation of a tree is the ancestor relation be-
tween its nodes. Dominance constraints are logical descriptions of trees talking about
the dominance relation.
Dominance based tree descriptions were rst used in automata theory in the six-
ties [TW67], rediscovered in computational linguistics in the early eighties [MHF83],
and investigated from a logical point of view in the early nineties [BRVS95]. Since
then, they have found numerous applications in computational linguistics: they have
been used for grammar formalisms [VS92, RVSW95, DT99, Per00], in natural lan-
guage semantics [Mus95, ENRX98], and for discourse analysis [GW98].
The two most important computational tasks for dominance constraints are sat-
is ability testing { does the constraint describe a tree? { and enumerating solu-
tions, i.e. the described trees. But as shown recently [KNT01], testing satis abilit y
is an NP-complete problem. Earlier attempts at processing dominance constraints
[Cor94, VSWR95, DN00] all su er from this fact. This has shed doubt on their
practical usefulness.
In this article, we identify normal dominance constraints, a natural subclass of
dominance constraints whose restrictions should be unproblematic for many applica-
tions. We present an e cien t graph algorithm that decides satis abilit y of normal
dominance constraints in deterministic polynomial time. Previously, no polynomial
time algorithm was known.
We derive the graph algorithm for testing satis abilit y as follows. First, we in-
troduce dominance graphs and de ne their con guration problem (investigated in
+[ADK 01]). Second, we show that the con gurabilit y of dominance graphs is lin-
ear time equivalent to the satis abilit y of normal dominance constraints ( rst shown
in [KMN00]). Third, we characterize the y of dominance graphs as the
absence of certain cycles, which we nally test for by reduction to a matching problem.
We also discuss how to use the e cien t satis abilit y test to enumerate solutions.
We apply a choice rule exhaustively while checking for satis abilit y after each step.
Both procedures have been implemented in C++ using the LEDA library [MN99] and
applied to scope ambiguities in natural language semantics in the CLLS framework
[ENRX98, EKN01, EKN02].
+This article joins the results from two conference publications [ADK 01] and [KMN00]. The
authors are partially supported by the IST Programme of the EU under contract number IST-
1999-14186 (ALCOM-FT), and by the Collaborative Research Centre (SFB) 378 of the Deutsche
Forschungsgemeinschaft.
yMax-Planck-Institute for Computer Science, Saarbruc ken, Germany
zProgramming Systems Lab, Fachbereich Informatik, Universit at des Saarlandes, Saarbruc ken,
Germany
xDepartment of Computational Linguistics, Universit at des Saarlandes, Saarbruc ken, Germany
18x 9 y 2
! ^
linguist lang
x y
speak
x y
Fig. 2.1. A simple dominance constraint.
8x 9 y 2
! ^
linguist 9 y lang 8x 2
x ^ y !
lang speak linguist speak
x y x y
y x
Fig. 2.2. Readings represented by the constraint in Fig. 2.1.
To complement our results, we nally investigate a close variant of the con gura-
tion problem of dominance graphs where closed leaves are permitted in addition. This
variant is more general but also relevant for applications in computational linguistics
[Bos96, CFS97]. We show that con gurabilit y is already NP-complete for the more
general dominance graphs. Nevertheless, the presented algorithms can still help to
solve this alternative problem more e cien tly.
Plan of the paper. The rst part of the paper introduces dominance constraints:
We motivate using them in computational linguistics in Section 2; then we de ne
them in Section 3, discuss their satis abilit y problem, and introduce the concepts of
normal dominance constraints and of solved forms. In the second part of the pa-
per, we turn to a discussion of dominance graphs. We de ne them and relate them
to normal dominance constraints in Section 4. Section 5 presents a basic algorithm
for enumerating the solved forms of a dominance graph. Then we derive the above-
mentioned characterization of con gurabilit y in Section 6, show how to test for this
property e cien tly in Section 7, and plug this e cien t algorithm into the enumera-
tion algorithm in Section 8. In the nal part of the article, we apply this e cien t
enumeration algorithm back to normal dominance constraints and discuss an imple-
mentation (Section 9), and prove that the more general con gurabilit y with closed
leaves is NP-complete (Section 10). Section 11 concludes and discusses further work.
2. Motivation. As one of the many applications of dominance constraints in
computational linguistics, we will give a brief introduction to scope underspeci c ation
[EKN01, AC92, Rey93, Bos96].
This application is concerned with coping with ambiguous sentences such as the
following:
(2.1) Every linguist speaks two languages.
Sentence (2.1) is ambiguous because it has two di eren t possible meanings, indi-
cated by the continuations
(2.2) . . .namely, English and Chinese.
2say
john
9u 8w 9x 9y 8z
! ! ^ ^ !
comp ^ ^ ^ prod
dept repr spl
u z
w x y
in of see of
w u x w x y y z
Fig. 2.3. A dominance constraint describing the meaning of (2.6).
(2.3) . . .but not necessarily the same ones.
In the rst reading, each linguist must speak the same two languages. In the second, no
two linguists necessarily speak a common language, but each speaks at least two. We
can represent the two possible meanings logically as the following rst-order formulas,
which can be represented as trees as in Fig. 2.2.
(2.4) 8x:(linguist(x)!9 y:lang(y)^ speak(x; y))2
(2.5) 9 y:lang(y)^8x:(linguist(x)! speak(x; y))2
Ambiguity is a real problem to language processing because the number of read-
ings of a sentence grows quickly with the number of \quanti ers" such as \every
linguist" and \two languages", and interacts with other sources of ambiguity besides.
5The sentence (2.6) has already 56 readings, and larger examples are easy to construct.
(2.6) John says that some representative of every department in a company saw a
sample of each product.
The key observation to scope underspeci cation is that the di erences between
the readings are very systematic; all contain the same \semantic material" (e.g. repre-
sentations of the constituents \every linguist", \two languages", and \speak"), which
is only combined in di eren t ways. The constraints on these combinations can be
speci ed using dominance constraints.
An example is Fig. 2.1. This constraint graph is a description of the two readings
of (2.1), shown in Fig. 2.2; it can be seen as a graphical representation of a dominance
constraint. Similarly, the 56 readings of (2.6) can beted by the graph in
Fig. 2.3. In the paper, we will use constraint graphs to link the (logic) work on
dominance constraints to graph algorithms.
Pictures as in Fig. 2.3 are being drawn in most modern approaches to scope
underspeci cation. However, they are not always interpreted as dominance constraints
[Rey93, Bos96]. The subtle di erence in meaning has the surprising e ect of making
these other approaches NP-complete even when the graphs fall into the class where
dominance constraints have polynomial satis abilit y. We will show this in Section 10.
3. Satis abilit y of Dominance Constraints. In this section, we de ne the
the language of dominance constraints and recall known results on satis abilit y. The
variant of dominance constraints we employ describes constructor trees { ground
5The following sentence from [Hob83], which is interesting both in form and in content, has around
200 readings: \Many people feel that most sentences exhibit too few quanti er scope ambiguities for
much e ort to be devoted to this problem, but a casual inspection of several sentences from any text
should convince almost everyone otherwise."
3terms over a signature of function symbols { rather than feature trees as considered
in [Smo92, BRVS95, ST94, MNP00].
3.1. Trees and Constructor Trees. We assume a nite or in nite signature
with function symbols f; g; : : :, each of which is equipped with an arity ar(f) 0.
Constants are function symbols of arity 0 denoted by a; b. We assume that contains
at least one constant and one symbol of arity at least 2.
A constructor tree can be de ned either as a term or equivalently on the basis
of directed graphs. The ground term f(g(a; a)), for instance, corresponds to the
directed graph in Figure 3.1. Throughout this article, we will employ the graph based
de nition.
An (unlabeled) tree is a forest with exactly one root. f
A forest is a nite directed graph (V; E) where V is a
g nite set of nodes denoted by u; v; w, and E V V
a set of edges such that the indegree of each node is at a a
most 1 and there is no cycle. Each forest has at least one
Fig. 3.1. f(g(a; a))
root, i.e. a node with indegree 0. We call the nodes with
outdegree 0 the leaves of the forest.
A ( nite) constructor tree is a triple (V; E; L) consisting of a tree (V; E) and a
labeling function L : E[V ! [N s.t. L(E)N (edge labels) and L(V ) (node
labels). The edge labels in a constructor tree determine the order of the children of
a node: for each node u 2 V and each natural number 1 k ar(L(u)), there is
exactly one edge (u; v)2 E with L((u; v)) = k.
We draw constructor trees as in Figure 3.1, by annotating nodes with their labels
and ordering the edges such that their labels increase from left to right.
3.2. Constraint Language. The language of dominance constraints is a logical
language that is interpreted over the class of tree structures. Tree structures are
rst-order model structures which specify certain relations between the nodes of a
construtor tree.
Let = (V; E; L) be a constructor tree with nodes u; v; v ; : : : v 2 V . The1 n
dominance relation uC v holds in i there is a path from u to v; the labeling relation
u:f(v ; : : :; v ) holds in i u is labeled by the n-ary symbol f and has the children1 n
v ; : : :; v in this order; that is, L(u) = f, ar(f) = n, f(u; v ); : : :; (u; v )g E, and1 n 1 n
L((u; v )) = i for all 1 i n.i
Definition 3.1 (Tree Structure). The tree structure of a constructor tree with
node set V is a rst-or der structure with domain V which provides the dominance
relationC of and the labeling relation of for each function symbol f 2 .
Let Vars be an in nite set of (node) variables X; Y; Z; : : : A dominance constraint
’ is a conjunction of dominance, inequality, and labeling literals of the following form
where ar(f) = n:
0 ’ ::= ’^ ’ j XC Y j X=Y j X:f(X ; : : :; X )1 n
We freely identify a constraint with the set of its literals. Let Var(’) be the set of
variables of ’. A pair of a tree structure with node set V and a variable assignment
: Var(’)! V satis es ’ i it satis es all its literal in the obvious way. We say that
(; ) is a solution of ’ in this case; ’ is satis able if it has a solution.
For instance, the following constraint that happens to be unsatis able:
X:f(X ; X )^ X C Y ^ X C Y:1 2 1 2
4
6It requires that node values of X and X are sisters that are both ancestors of1 2
the node value of Y . This is clearly impossible in a tree, since trees cannot branch
upwards.
3.3. Constraint Graphs. We usually draw a dominance constraint as a con-
straint graph. For instance, the unsatis able constraint from above is drawn in Figure
3.2. It illustrates clearly that the constraint requires upward branching and thus
cannot be satis ed by any tree.
The nodes of a constraint graph are the
f X
variables of the corresponding constraint. La-
X Xbeling constraints relate to solid edges called 1 2
tree edges. Dominance constraints are drawn Y
as dashed lines called dominance edges. As for
trees, we annotate labels to nodes of the graph Fig. 3.2. The unsatis able constraint
X:f(X ; X )^ X C Y ^X C Y1 2 1 2and order tree edges from left to right. Note
that we ignore inequalities in constraint graphs.
We sometimes annotate variable names to the graph nodes. This is not always
necessary since all occurences of the same variable are always represented by a single
node in a constraint graph. We may thus freely omit variable names. In the motivating
example (Figure 2.3), for instance, we have omitted all variable names.
Constraint graphs motivate the following notions to talk about constraints. We
call a variable X labeled in a constraint ’ if there exists a literal X:f(: : :) in ’. A
(solid) fragment of constraint ’ is a maximal set of variables in ’ that are pairwise
connected by labeling literals. A variable X is called a root of a fragment in ’ if it
does not occur in child position of a labeling literal in ’, i.e. if there is no Z such that
Z:f(: : :X : : :) belongs to ’. A hole of a fragment is a variable in ’ that is unlabeled
in ’. A leaf of a fragment is either a hole or a variable labeled by a constant, i.e. a
variable X with X:a in ’.
3.4. Satis abilit y. We are interested in two natural problems concerning dom-
inance constraints that are both motivated by our application: rst of all we would
like to test satis ablit y, and second, we would like to enumerate all solutions of a
satis able dominance constraint.
The complexity of the satis abilit y problem of dominance constraints was inves-
tigated in [KNT01] and shed doubts on their usefulness.
Theorem 3.2. Satis ablitiy of dominance constraints is NP-complete.
Deciding satis abilit y in non-deterministic polynomial time is quite simple: In a rst
step one guesses whether XC Y or :XC Y for each two variables X; Y in a given
constraint. In a second step, one tests the consistency of these relationships.
The NP-hardness proof relies on the fact that solid fragments of a constraint
graph may overlap in a solution. This means that distinct labeled variables may be
assigned to the same node of a tree.
For illustration, consider the constraint X:f(X ; X ) ^1 2
Y fY :f(Y ; Y ) ^ YC X ^ XC Y whose graph is shown in1 2 1 X
fFigure 3.3. Every solution must map X to the same node
Y Y1 2as either Y or Y . We say that X overlaps with Y or Y in1 1
X X
1 2a solution of this constraint.
We call an overlap proper if it involves two labeled vari-
Fig. 3.3. Overlap
ables. In the applications in computational linguistics, we
typically do not want proper overlap (but may accept overlaps of roots with holes).
5The subclass of dominance constraints that excludes proper overlap (and xes some
minor inconveniences) is the class of normal dominance constraints.
3.5. Normal Dominance Constraints. We next distinguish a fragment of
dominance constraints which we will show to have a polynomial time satis abilit y
problem.
Definition 3.3 (Normal dominance constraint). A dominance constraint ’ is
called normal i for all variables X; Y 2 Var(’):
1. There is no proper overlap in solutions of ’: X=Y in ’ if X and Y are
distinct variables that are labeled in ’.
2. Solid fragments are tree shaped or cyclic: every variable in ’ appears at most
once as a parent and at most once as a child in a labeling literal of ’.
3. Dominance edges go from holes to roots: if XC Y in ’ then X is unlabeled
in ’ whereas Y is labeled but does not occur in child position in ’.
4. There are no empty fragments: every hole of ’ occurs in some child position.
Conditions 1 and 4 say that only roots and holes have the permission to overlap
in a solution of a normal constraint. Distinct holes cannot overlap since they must
have distinct parents, which are labeled variables that cannot overlap. For a similar
reason, it is impossible that a hole overlaps with a labeled node that has a parent.
Condition 2 requires acyclic fragments to be tree shaped. This excludes many
constraints, as for instance X:f(Y; Y ), X:f(Y ; Y )^X:f(Z ; Z ) or Y :f(X)^Z:f(X).1 2 1 2
The last two examples are particularly di cult to treat when subsumed by a larger
constraint: they entail equations (Y =Z , Y =Z , respectively Y =Z ) whose global1 1 2 2
consequences are di cult to predict. W.l.o.g we can always restrict ourselves to
normal constraint with acyclic fragments. Other constraints are unsatis able anyway.
Condition 3 forbids to express equality through two side dominance: XC Y ^
YC X is not normal since a variable cannot be at the same time a root and a hole.
Condition 3 is also violated by the dominance edge XC Y in the constraint from1
overlap example in Fig. 3.3. It goes from a root to a hole, instead vice versa.
In the following theorem we state the main result of this article, which will follow
from the results presented in the succeeding sections.
Theorem 3.4. Satis ability of normal dominance constraints can be decided in
deterministic polynomial time.
3.6. Solved Forms. We stated above that we would like to have an algorithm
that enumerates all solutions of a given normal dominance constraint. Interpreting
this proposition literally makes not much sense as the reader already might have
noticed. For instance, we can solve the constraint X:a by all trees that have a node
labeled by a. Indeed, every satis ablet has an in nite number of solutions,
so that we probably do not want to enumerate all of them.
What we want to do is to enumerate all solved forms of a normal dominance
constraint instead of all solutions. The idea behind a solved form is that it should
be similar to a solution but not describe its irrelevant parts. For instance, X:a is a
perfect solved form since all its solutions can be easily read o from this constraint.
We will now de ne an appropriate notion of solved forms. In particular, it should
hold that a normal dominance constraint has a solution if and only if it has a solved
form. Given a constraint ’ we de ne a relation R on the variables of ’ that we call’
the reachability relation of ’. This relation is the transitive closure of the following
relation:
f(X; Y )j X:f(: : : ; Y; : : :)2 ’ or XC Y 2 ’g
6
6We say that Y can be reached from X if (X; Y ) 2 R . In this case, it clearly holds’
that ’ entails the dominance XC Y .
Definition 3.5 (Solved Form). A normal dominance constraint ’ is in solved
form if it satis es the following two properties for all variables X; Y; Z in Var(’):
1. Dominance edges do not branch upwards: if X and Y are distinct then not
both XC Z in ’ and YC Z in ’.
2. The graph of ’ is acyclic: (X; X)62 R .’
In other words, a normal dominance constraint ’ is in solved form if and only if
its constraint graph is a forest.
0A solved form of a normal constraint ’ is a normal constraint ’ that is in solved
form, contains the same labeling literals as ’, and has a stronger reachability relation,
which means R R 0.’ ’
Lemma 3.6. Every normal dominance constraint in solved form has a solution.
Proof. We have to construct a tree solution for a solved form ’. The idea is
that the constraint graph of a solved form is already a forest. It is su cien t to
transform this forest into a tree without dominance edges. This is quite simple given
the transformation illustrated in Figure 3.4. (Note that we assumed to contain a
function symbol f of arity at least two and a constant a.)
In the rst step, we turn the forest into a tree by adding a top most fragment.
Let Y ; : : :; Y be the minimal elements in the reachability order R which exist since1 m ’
R is acyclic. We can then de ne a new solved form ’ with a top most fragment,’ 1
by adding a new fragment to ’ with a single hole with dominance edges towards all
roots Y ; : : :; Y .1 m
In the second step we repeatedly transform dominance edges into tree edges. We
stop, once no dominance edge is left. The idea is illustrated in Figure 3.4. Recall
that we assumed that our signature contains a constant a and a function symbol
f of arity n 2. We will also use a function dist on constraints, which is de nedV
0 0 0 0for all constraints ’ by dist(’ ) = ’ ^ fX = Y j X; Y 2 Var(’ ) distinctg.
Suppose that there still exists a hole X in ’ from where dominance edges start.1
Let Y ; : : :; Y be all the roots of ’ such that the dominance literal XC Y belongs1 m 1 i
to ’ for i = 1; : : :; m. We construct ’ by removing all these literals from ’ and1 2 1
distinguish two cases:
If m > n, we x a fresh variable Z and de ne:
^m
’ = dist(’ ^ X:f(Y ; : : : ; Y ; Z)^ ZC Y )3 2 1 n 1 i
i=n
’ is a solved form that entails ’, and it contains n 1 dominance literals3
less than ’ .1
If m n, we x fresh variables Z ; : : :; Z and de ne:m+1 n
^n
’ = dist(’ ^ X:f(Y ; : : :; Y ; Z ; : : :; Z )^ Z :a)3 1 1 m m+1 n j
j=m+1
’ is a solved form that entails ’, and it contains m dominance literals less3
than ’ .1
By applying the above transformation repeatedly, we obtain a solved form ’ which
entails ’ and contains no dominance literals.
In the third step we can easily satisfy ’ by the constructor tree that corresponds
to ’ itself.
In the construction of a solution of ’ we had to "invent" variables that are not
present ’. Thus the constructor tree in the solution contains nodes that do not
7
6f X
X
f
Y Y1 2
a
Y Y Y Y1 2 3 4 Y Y3 4
Fig. 3.4. Transforming dominance edges into tree edges. Here, 4 dominance edges are trans-
formed while using a function symbol of arity 3.
correspond to variables of ’. In the following lemma we will show that every satis -
able normal constraint has a solved form and that it can essentially be obtained by
removing the invented material.
Lemma 3.7. Every solution of a normal dominance constraint ’ also satis es
some solved form of ’.
Proof. Let (; ) be a solution of ’. In order to construct a solved form, we de ne
a partial function hole on the root variables of ’. Consider a root Y , the function
is de ned if there is a hole X with (X)C (Y ). Since is a tree, there is hole Z
6such that (Z)C (Y ) and (X)C (Z) for all holes X with (X)C (Y ) . We set
hole(Y ) = Z. Let ’ denote the conjunction of the labeling and inequality literals ofl
’, then the following is a solved form of ’:
^
0 ’ = ’ ^ fhole(Y )C Y j Y is a root for which hole is de ned gl
0
0Clearly, ’ is a normal constraint in solved form. We have to show R R .’ ’
Since both constraints have the same labeling literals, it su ces to prove for every

0dominance literal XC Y in ’ that (X; Y )2 R . We will show a stronger statement:’
if X is a hole and Y is a root with (X)C (Y ), then (X; Y )2 R 0.’
We proceed by induction on the length of the path from (X) to (Y ). For
Z = hole(Y ), we have (X)C (Z). If X = Z, the claim holds. Otherwise, let R
denote the root of the fragment of Z. The hole X can only overlap with a root, so
we get (X)C (R). As (R) = (Z), we can apply the induction hypothesis and
obtain (X; R) 2 R 0. Since (R; Z) and (Z; Y ) belong to R 0, the claim follows from’ ’
the transitivity of the reachability relation.
The validity of Lemma 3.7 heavily depends on the f X
absence of proper overlaps (Condition 1 of Def. 3.3).
X X 1 2This is illustrated by the following example:
f Y2^
X:f(X ; X )^Y :f(Y ; Y )^ (XC Z ^YC Z ^Z :a)1 2 1 2 i i i i i Y Y 1 2
i=1
a Z a Z1 2
This constraint satis es all normality conditions except
for the overlap restriction. It also has a solution but no
solved form. The reason for this problem is that X and Y overlap properly in all
solutions of this constraint.
The combination of Lemmas 3.6 and 3.7 yields the following proposition, which
justi es computing with solved forms instead of solutions:
Proposition 3.8. A normal dominance constraint has a solved form if and only
if it is satis able.
6So either (Z) = (Y ) (i.e. Y is plugged into Z) or (Z) is the lowest proper ancestor of (Y )
1for which is de ned (i.e. which is not "invented").
8
6






































Un pour Un
Permettre à tous d'accéder à la lecture
Pour chaque accès à la bibliothèque, YouScribe donne un accès à une personne dans le besoin