
Incremental generation of plural descriptions: Similarity and partitioning
Albert Gatt and Kees van Deemter
Department of Computing Science
University of Aberdeen
{agatt,kvdeemte}@csd.abdn.ac.uk
Abstract

Approaches to plural reference generation emphasise simplicity and brevity, but often lack empirical backing. This paper describes a corpus-based study of plural descriptions, and proposes a psycholinguistically motivated algorithm for plural reference generation. The descriptive strategy is based on partitioning. An exhaustive evaluation shows that the output closely matches human data.

      TYPE   COLOUR   ORIENTATION   SIZE    X   Y
e1    desk   red      back          small   3   1
e2    sofa   blue     back          small   5   2
e3    desk   red      back          large   1   1
e4    desk   red      front         large   2   3
e5    desk   blue     right         large   2   4
e6    sofa   red      back          large   4   1
e7    sofa   red      front         large   3   3
e8    sofa   blue     back          large   3   2

Table 1: A visual domain
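
Table 1 can be encoded directly as a knowledge base, and the non-disjunctive incremental procedure that the paper presents as Algorithm 1 can then be run over it. The following is a minimal sketch, not the authors' implementation. The names `KB`, `extension`, `preference_order` and `ia_plur` are hypothetical, the alphabetical ordering of values within each attribute is an assumption, and the test applied to each property follows the prose description of the algorithm (the property is true of all referents and excludes at least one remaining distractor).

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Property = Tuple[str, object]  # an attribute-value pair <A : v>

# Attribute order doubles as the preference order (top row of Table 1).
ATTRIBUTES = ("TYPE", "COLOUR", "ORIENTATION", "SIZE", "X", "Y")

# Rows of Table 1.
ROWS = {
    "e1": ("desk", "red", "back", "small", 3, 1),
    "e2": ("sofa", "blue", "back", "small", 5, 2),
    "e3": ("desk", "red", "back", "large", 1, 1),
    "e4": ("desk", "red", "front", "large", 2, 3),
    "e5": ("desk", "blue", "right", "large", 2, 4),
    "e6": ("sofa", "red", "back", "large", 4, 1),
    "e7": ("sofa", "red", "front", "large", 3, 3),
    "e8": ("sofa", "blue", "back", "large", 3, 2),
}

# Knowledge base: entity -> {attribute: value}.
KB: Dict[str, Dict[str, object]] = {
    entity: dict(zip(ATTRIBUTES, values)) for entity, values in ROWS.items()
}


def extension(prop: Property, kb: Dict[str, Dict[str, object]] = KB) -> Set[str]:
    """[[<A : v>]]: the entities for which attribute A has value v."""
    attr, val = prop
    return {e for e, props in kb.items() if props[attr] == val}


def preference_order(
    kb: Dict[str, Dict[str, object]] = KB,
    attributes: Iterable[str] = ATTRIBUTES,
) -> List[Property]:
    """Flatten the attributes, in preferred order, into <A : v> pairs.

    Values within one attribute are sorted alphabetically (an assumption).
    """
    po: List[Property] = []
    for attr in attributes:
        values = {props[attr] for props in kb.values()}
        po.extend((attr, val) for val in sorted(values, key=str))
    return po


def ia_plur(
    referents: Iterable[str],
    kb: Dict[str, Dict[str, object]] = KB,
    po: Optional[List[Property]] = None,
) -> List[Property]:
    """Incremental search for a conjunctive description of a set."""
    targets = set(referents)
    po = preference_order(kb) if po is None else po
    description: List[Property] = []   # D <- empty
    distractors = set(kb) - targets    # C <- U - R
    for prop in po:
        ext = extension(prop, kb)
        # Keep the property if it holds of every referent and
        # rules out at least one distractor that is still in C.
        if targets <= ext and distractors - ext:
            description.append(prop)
            distractors &= ext
            if not distractors:        # equivalent to [[D]] = R
                return description
    return description                 # may be non-distinguishing
```

Under these assumptions, `ia_plur({"e1", "e2"})` returns `[("ORIENTATION", "back"), ("SIZE", "small")]`. ORIENTATION precedes SIZE in the preference order and removes some distractors, so it is included even though SIZE alone would be distinguishing. This is an instance of the overspecification that the incremental approach permits.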
1 Introduction

Content Determination for the Generation of Referring Expressions (GRE) starts from a Knowledge Base (KB) consisting of a set of entities U and a set of properties P represented as attribute-value pairs, and searches for a description D ⊆ P which distinguishes a referent r ∈ U from its distractors. For example, the KB in Table 1 represents 8 entities in a 2D visual domain, each with 6 attributes, including their location, represented as a combination of horizontal (X) and vertical (Y) numerical coordinates. To refer to an entity, an algorithm searches through values of the different attributes.

GRE has been dominated by Dale and Reiter's (1995) Incremental Algorithm (IA), one version of which, generalised to deal with non-disjunctive¹ plural references, is shown in Algorithm 1 (van Deemter, 2002). After initialising the description D and the distractor set C [1.1–1.2], IA_plur traverses an ordered list of properties, called the preference order (PO) [1.3], which reflects general or domain-specific preferences for attributes. For instance, with the PO in the top row of the Table, the algorithm first considers values of TYPE, then COLOUR, and so on, adding a property to D if it is true of the intended referents R, and excludes some distractors [1.4]. The description and the distractor set C are updated accordingly [1.5–1.6], and the description is returned if it is distinguishing [1.7].

Algorithm 1 IA_plur(R, U, PO)
 1: D ← ∅
 2: C ← U − R
 3: for ⟨A : v⟩ ∈ PO do
 4:   if R ⊆ [[⟨A : v⟩]] ∧ [[⟨A : v⟩]] − C ≠ ∅ then
 5:     D ← D ∪ {⟨A : v⟩}
 6:     C ← C ∩ [[⟨A : v⟩]]
 7:     if [[D]] = R then
 8:       return D
 9:     end if
10:   end if
11: end for
12: return D

¹ Non-disjunctive descriptions, such as the large red chairs, are logically a conjunction of literals. In disjunctive descriptions such as the chair and the table, the and represents set union (of things which are chairs or tables).

Compared to some predecessors which emphasised brevity (Dale, 1989), the IA is highly efficient, because the use of the PO avoids exhaustive combinatorial search, potentially overspecifying the description. Overspecification and the use of a PO have been justified on psycholinguistic grounds. Speakers overspecify their descriptions because they begin their formulation without exhaustively scanning a domain (Pechmann, 1989), terminating the process as soon as a referent is distinguished (Belke and Meyer, 2002). They prioritise the basic-level category (TYPE) of an object, and salient, absolute properties like COLOUR (Pechmann, 1989; Eikmeyer and Ahlsèn, 1996), as well as locative properties in the vertical dimension (Arts, 2004). Relative attributes like SIZE are avoided unless absolutely required for identification (Belke and Meyer, 2002). This evidence suggests speakers conceptualise referents as gestalts (Pechmann, 1989) whose core is the basic-level TYPE (Murphy, 2002) and some other salient attributes like COLOUR. Note that the IA does not fully mirror these human tendencies, since it only includes preferred attributes in a description if they remove some distractors, whereas psycholinguistic research suggests that people include them irrespective of contrastiveness (cf. van der Sluis and Krahmer, 2005).

More recent research on plural GRE has de-emphasised these issues, especially in the case of disjunctive plural reference. The first concrete proposal in this area, IA_bool (van Deemter, 2002), first tries to find a non-disjunctive description using Algorithm 1. Failing this, it searches through disjunctions of properties of increasing length, generating a description in Conjunctive Normal Form (CNF). For example, calling the algorithm with R = {e1, e2} would result in a non-disjunctive description, since both referents can be distinguished using ⟨SIZE : small⟩. However, a conjunction wouldn't suffice to distinguish R = {e1, e8}, and IA_bool would consider combinations such as ⟨TYPE : desk⟩ ∨ ⟨COLOUR : blue⟩. This generalised algorithm has three consequences:

1. Efficiency: Searching through disjunctive combinations results in a combinatorial explosion (van Deemter, 2002).

2. Gestalts and content: The notion of a 'preferred attribute' is obscured, since it is difficult to apply the same reasoning that motivated the PO in the IA to combinations like (COLOUR ∨ SIZE).

3. Form: Descriptions can become logically very complex (Gardent, 2002; Horacek, 2004).

Some proposals to deal with (3) include Gardent's (2002) non-incremental, constraint-based algorithm to generate the briefest available description of a set. An alternative, by Horacek (2004), combines best-first search with optimisation to reduce logical complexity. Neither approach benefits from empirical grounding, and both leave open the question of whether previous psycholinguistic research on singular reference is at all applicable to the plural disjunctive case.

This paper starts with an empirical analysis of plural descriptions using a semantically transparent corpus of elicited descriptions in well-defined domains, of which Table 1 is an example. Based on the data analysis, we propose and evaluate an efficient algorithm for the generation of references to arbitrary sets. Our starting point is the assumption that plurals, like singulars, evince preferences for certain attributes. Based on previous work in Gestalt perception (Wertheimer, 1938; Rock, 1983), we propose an extension of Pechmann's Gestalts Principle, whereby plural descriptions are preferred if (a) they maximise the similarity of their referents, using the same attributes to describe them as far as possible; (b) they prioritise salient ('preferred') attributes which are central to the conceptual representation of an object. We address (3) above by investigating the logical form of plurals in the corpus. One strong determinant of descriptive form is the basic-level category of objects. For example, to refer to {e1, e2} in the Table, an author has at least the following options:

(1) (a) the small desk and sofa
    (b) the small red desk and the small blue sofa
    (c) the small desk and the small blue sofa
    (d) the small objects

We refer to (1a) as an aggregated disjunctive description, in that the property small has wide scope over the coordinate NP desk and sofa (which is logically a disjunction). By contrast, (1b,c) are non-aggregated and overspecified because they contain COLOUR when SIZE alone suffices. The most economical description is (1d), which is non-disjunctive. This is possible because it contains a superordinate TYPE (object). Since basic-level categorisation is preferred on independent grounds (Rosch et al., 1976), we expect (1a–c) to be more frequent. Note that (1b,c) represent a partition of R and describe each element separately. In (1b), there is considerable redundancy in including COLOUR twice. The potential benefit of this is that the elements of the partition are described in a parallel fashion, using exactly the same attributes (SIZE and COLOUR). This is not the case in (1c), which is non-parallel. By hypothesis, parallelism adds to the perceptual cohesion of the set. Given the psycholinguistic evidence, the hypothesised tendency to emphasise similarity may also be somewhat dependent on the attributes involved, so that COLOUR should be more likely to be redundantly propagated across disjuncts than a relatively dispreferred attribute like SIZE.

<DESCRIPTION num='pl'>
  <DESCRIPTION num='singular'>
    <ATTRIBUTE name='size' value='small'>small</ATTRIBUTE>
    <ATTRIBUTE name='colour' value='red'>red</ATTRIBUTE>
    <ATTRIBUTE name='type' value='desk'>desk</ATTRIBUTE>
  </DESCRIPTION>
  and
  <DESCRIPTION num='sg'>
    <ATTRIBUTE name='size' value='small'>small</ATTRIBUTE>
    <ATTRIBUTE name='colour' value='blue'>blue</ATTRIBUTE>
    <ATTRIBUTE name='type' value='sofa'>sofa</ATTRIBUTE>
  </DESCRIPTION>
</DESCRIPTION>

(⟨SIZE : small⟩ ∧ ⟨COLOUR : red⟩ ∧ ⟨TYPE : desk⟩)
∨
(⟨SIZE : small⟩ ∧ ⟨COLOUR : blue⟩ ∧ ⟨TYPE : sofa⟩)

Figure 1: Corpus annotation examples

                  VS                VDS
             +Disj   −Disj     +Disj   −Disj
+aggr         20.2    15.5       2.4     3.7
−aggr         64.3    –         93.9     –
% overall     84.5    15.5      96.3     3.7

Table 2: % disjunctive and non-disjunctive plurals

[...] domains, referents were identifiable using identical values of the minimally distinguishing attributes. In the remaining 6 Value Dissimilar (VDS) domains, the minimally distinguishing values were different. Table 1 represents a VS domain, where {e1, e2} can be minimally distinguished using the same value of SIZE (small). Thus, MD in VS was a logical conjunction. In VDS, it was a disjunction since, if two referents could be minimally distinguished by different values v and v′ of an attribute A, then MD had the form ⟨A : v⟩ ∨ ⟨A : v′⟩. However, even in VS, referents had different basic-level types. Thus,
