Learning thematic role relations for lexical semantic nets [electronic resource] / by Andreas Wagner
275 pages
English



Information

Published on 1 January 2005
Language: English
File size: 1 MB

Excerpt

LEARNING THEMATIC ROLE RELATIONS FOR
LEXICAL SEMANTIC NETS
by
ANDREAS WAGNER
Doctoral dissertation (Philosophische Dissertation)
accepted by the Neuphilologische Fakultät
of the Universität Tübingen
on 8 December 2004
Tübingen
2005
Printed with the permission of the Neuphilologische Fakultät
of the Universität Tübingen
Principal reviewer: Prof. Dr. Erhard Hinrichs
Second reviewer: Prof. Dr. Martin Volk
Dean: Prof. Dr. Joachim Knape

Acknowledgements
This work would not have been possible without the intellectual and/or moral support of a number of
persons.
First of all, I wish to thank my main supervisor Prof. Erhard Hinrichs and my second supervisor Prof.
Martin Volk. Erhard Hinrichs was the person who made this thesis possible by accepting me as a PhD
student in the Graduiertenkolleg Integriertes Linguistik-Studium at the University of Tübingen. His
supervision was characterised by a spirit of great open-mindedness and benevolence, especially during
demanding periods of my work. Martin Volk kindly agreed to take on the job of being second
supervisor, at a point where my research was already at quite an advanced state. He was thus faced
with the difficulty of becoming familiar with a sophisticated system of interdependent data types,
processing approaches, and motivations behind these. The necessity of remote communication between
Tübingen and Stockholm introduced a further challenge. Consequently, Martin gave invaluable hints
from an outside perspective. Both supervisors provided comprehensive comments on the first drafts
of the individual chapters, which helped to enhance the readability and comprehensibility of the final
text. Last but not least, they managed to write their official reports about this thesis within the shortest
time conceivable so that its defence could take place as soon as five weeks after its submission.
Apart from my supervisors, numerous other people inside and outside of Tübingen have supported me
by discussing my research, giving feedback on my talks, or providing technical aid. I am grateful to
Steve Abney and Marc Light, who worked here when I started my work. Steve used to thoroughly read
and comment on my exposés, which comprised my initial efforts to delimit the appropriate aim and
scope of the thesis. From him, I have learned the fundamental terms and techniques of statistical NLP.
Steve and Marc provided me with the program code they had implemented and the data they used for
selectional preference acquisition within their research project. In this way, I could adopt those parts
of their implementation which were independent of the actual acquisition method. This substantially
facilitated the setup of my early experiments. In the meantime, Steve has made his CASS software
package publicly available, which also contains a module for efficiently collecting and processing
co-occurrence statistics. I have employed this module for all experiments related to this thesis. I also
thank the members of the Stuttgart/Tübingen reading group on statistical NLP, namely Stefan Riezler,
Glenn Carroll, Marc Light, Detlef Prescher, and Helmut Schmid. This group was an invaluable forum
for improving our expertise and for discussing our own work. More general discussions, which
helped me to think out of the box and locate my research within general linguistics, took place in
the Graduiertenkolleg. In particular, I would like to thank Petra Gretsch, Laura Kallmeyer, Anke
Lüdeling, and Doris Stolberg for these discussions and, more importantly, for their friendship.
After my time in the Graduiertenkolleg, I joined the projects EuroWordNet-II and GermaNet at the
Seminar für Sprachwissenschaft in Tübingen. This environment significantly broadened my perspective
on resources like WordNet and EuroWordNet, which were essential to finally fix the scope and
the limits of this thesis. Special thanks go to Claudia Kunze, Karin Naumann, and Lothar Lemnitzer
for valuable discussions about wordnets and lexical acquisition in general. Moreover, the participation
in EuroWordNet was crucial for me to have unrestricted access to the EWN database, which might
well have been too costly otherwise.
When EuroWordNet-II was about to come to an end, I joined the Collaborative Research Centre
Linguistic Data Structures (SFB 441) in Tübingen, where my duties and responsibilities covered
very interesting aspects of corpus linguistics, which, however, hardly overlapped with the work of
this thesis. I nevertheless enjoyed the inspiring and stimulating atmosphere and, in particular, the
solidarity of the numerous fellow sufferers who worked on their own PhD theses. Among these, I
would like to give special thanks to Sandra Kübler, who was always ready to discuss my ideas and to
listen to my professional and personal complaints. Moreover, she proof-read large parts of the final
text. In general, I am grateful to all my colleagues for their patience and moral support during periods
when I was completely occupied by my PhD project. This especially holds for Dirk Wiebel and Reiner
Link, who came to my aid by taking over tasks of mine (primarily related to system administration)
in such phases, and to my boss Prof. Marga Reis, who persistently urged me (as well as all other staff
members in the same situation) not to neglect PhD research despite our everyday duties.
I am very grateful to a number of people outside of Tübingen for valuable discussions and feedback,
which greatly helped me to develop and clarify my ideas. In the first place, I have to mention Diana
McCarthy here. We had the opportunity to discuss our PhD projects in great detail and very
productively. Our research interests were so similar that we were able to exchange our ideas at a very
fine-grained level, yet our ultimate goals were so different that there was no danger of
counterproductive rivalry. For me, this cooperation was a stroke of luck. I thank Gerald Gazdar for arranging the
contact with Diana. With Philip Resnik, I had an interesting email exchange concerning the general
task of this thesis. I also thank Sabine Schulte im Walde for providing me with the training data which
I used for my detailed evaluation experiments.
Regarding financial support, I thank the Deutsche Forschungsgemeinschaft (DFG) and the European
Union. Without the funding granted by these institutions, it would not have been possible to
accomplish this work.
Special thanks go to the people from my circle of friends and relatives who have intensively
accompanied my PhD project, sharing my pleasures and sufferings and reminding me that there are still
other things in the world than corpora and statistical NLP. In this regard, I would like to
emphasise my Tübingen friends, especially Friedhelm Panteleit, Christian H ppler, Wolfgang Huber, Diana
Marquardt, Susanne Neuhäusler, Norbert Tausch, and Esther Wedeniwski, who were co-singers of
mine in the KHG choir, and, most notably, my parents and my sister Martina. Finally, I am
particularly grateful to the most important person in my life, Regina. Although we did not meet until the
latest stages of this thesis, her love, cheerfulness, and patience provided a crucial contribution to my
eventually finishing it.

Summary
This thesis presents a strategy for the acquisition of thematic role relations (such as AGENT,
PATIENT, or INSTRUMENT) by means of statistical corpus analysis, for the purpose of
semi-automatically extending lexical-semantic nets. In particular, this work focuses on resources in the
style of WordNet (Fellbaum 1998) and EuroWordNet (Vossen 1999). Lexical-semantic nets represent
the meanings of words via semantic relations between words and/or word concepts. Semantic
(thematic) role relations are conceptual relations which hold between verbs and their nominal arguments
(e.g. <eat> AGENT → <human> or <eat> PATIENT → <food>). Such relations capture selectional
restrictions of verbs. Therefore, the task of acquiring thematic role relations is intrinsically related to
the task of acquiring selectional restrictions.
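
To make the link between role relations and selectional restrictions concrete, the following is a minimal sketch, not the representation used in the thesis. It stores role relations as (verb concept, role, noun concept) triples and treats a noun concept as admissible if it, or one of its hypernyms, is linked to the verb by that role. The hierarchy, the names HYPERNYM, ROLE_RELATIONS, ancestors, and satisfies, and all concepts other than those in the two examples above are hypothetical illustrations, not WordNet data.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy hypernym hierarchy (child -> parent); NOT actual WordNet content.
HYPERNYM: Dict[str, Optional[str]] = {
    "entity": None,
    "physical_object": "entity",
    "human": "physical_object",
    "food": "physical_object",
    "stone": "physical_object",
    "cake": "food",
    "child": "human",
}

# Role relations as (verb concept, role, noun concept) triples,
# mirroring the two examples given in the summary.
ROLE_RELATIONS: Set[Tuple[str, str, str]] = {
    ("eat", "AGENT", "human"),
    ("eat", "PATIENT", "food"),
}


def ancestors(concept: str) -> List[str]:
    """Return the concept itself followed by all of its hypernyms, most specific first."""
    chain: List[str] = []
    current: Optional[str] = concept
    while current is not None:
        chain.append(current)
        current = HYPERNYM[current]
    return chain


def satisfies(verb: str, role: str, noun: str) -> bool:
    """True if the noun concept is subsumed by a concept linked to the verb via the role."""
    allowed = {n for (v, r, n) in ROLE_RELATIONS if v == verb and r == role}
    return any(c in allowed for c in ancestors(noun))
```

Under these assumptions, satisfies("eat", "PATIENT", "cake") holds because <cake> is subsumed by <food>, whereas satisfies("eat", "PATIENT", "stone") does not. Real selectional preferences are graded rather than binary, so this boolean check is only a simplification.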
Consequently, the core of a strategy for learning role relations consists in a method for learning
selectional restrictions (or, more precisely, selectional preferences). For the latter task, a number of
methods have been proposed which utilise syntactically analysed corpora and WordNet. To acquire
the selectional preferences of a certain verb for a certain argument, the respective complement nouns
of that verb are extracted from the corpus, and statistical methods are applied to generalise over
these nouns; these generalisations are expressed as a set of WordNet noun concepts. One of these
approaches, namely the method proposed by Abe & Li (1996), constitutes the starting point of my
research. However, this approach is not immediately applicable for learning role relations, but requires
modifications and extensions for that task. In particular, two aspects have to be taken into account.
Firstly, it is crucial that the WordNet concepts acquired to represent selectional preferences of a verb
are located at an appropriate level of generalisation (e.g. <food> as PATIENT of <eat>, rather
than <cake> or <physical_object>). I develop a modification of the approach which substantially
improves its performance in this respect. Secondly, as the existing methods generalise over syntactic
complements, they acquire selectional preferences for syntactic rather than semant
