Text and context in machine translation

EUROPEAN-COMMISSION - Directorate-General For Translation European Commission


156 pages

English



Description

Aspects of discourse representation and discourse processing
Information technology and telecommunications
Information - Documentation


STUDIES IN MACHINE TRANSLATION
AND NATURAL LANGUAGE PROCESSING
TEXT AND CONTEXT IN MACHINE TRANSLATION:
ASPECTS OF DISCOURSE REPRESENTATION
AND PROCESSING
Edited by
Wiebke RAMM
Studies in machine translation
and natural language processing
Published by:
Office for Official Publications
of the European Communities
Managing editor
Erwin Valentini (EC), Luxembourg
Editorial board
Doug Arnold
(Department of Language and Linguistics, United Kingdom)
Nicoletta Calzolari
(Istituto di Linguistica Computazionale, Italia)
Frank Van Eynde
(Nationaal Fonds voor Wetenschappelijk Onderzoek, België)
Steven Krauwer
(Rijksuniversiteit Utrecht, Nederland)
Bente Maegaard
(Center for Sprogteknologi, Danmark)
Paul Schmidt
(Institut für Angewandte Informationsforschung, Deutschland)
Luxembourg: Office for Official Publications of the European Communities, 1994
ISSN 1017-6568
© ECSC-EC-EAEC, Brussels · Luxembourg, 1994
Printed in Germany
Volume 6
Text and context in machine translation:
Aspects of discourse representation
and processing
Edited by
Wiebke Ramm
CONTENTS
WIEBKE RAMM
Introduction and Overview
JOKE DORREPAAL
Meaning and Structure in Logic-Based Discourse Theories
SUSANNE PREUSS, BIRTE SCHMITZ, CHRISTA HAUENSCHILD, CARLA UMBACH
Anaphora Resolution in Machine Translation
CHRISTINE DEFRISE
The Treatment of Discourse in Knowledge-Based Machine Translation
ERICH STEINER
A Fragment of a Multilingual Transfer Component and its Relation to Discourse Knowledge
MICHAEL GRABSKI
Uncalled-for Repetition of Lexical Items. A Discourse-Related Problem in Transfer

WIEBKE RAMM
Introduction and Overview
1 Context and Motivation of this Volume
Compared to their predecessors of some years ago, the machine translation (MT)
systems available today, commercial systems as well as research prototypes,
have improved significantly in their linguistic capabilities. One reason
for this is certainly that modern natural language processing (NLP)
technology is now reflected in the architecture of MT systems, resulting
in linguistically more ambitious processing strategies and more sophisticated
accounts of various types of lexico-grammatical and semantic phenomena.
Relative to the abilities of human translators, however, the translation results
produced by MT systems are still in need of improvement.
One thing that still differs significantly between human and machine
translation is the kind of linguistic unit on which the translation processes
operate: a human translator hardly ever translates single sentences in
isolation. Instead, sentences and expressions to be translated are interpreted in
the context of other sentences and expressions and against the background of
some situational and cultural environment. These environments provide various
types of information that a human translator exploits during the translation
process. However, what is natural for human language processing, i.e., the
interpretation of natural language expressions with respect to the textual and
situational context in which they occur, poses very hard problems for
computational modelling and processing.
A prerequisite for any computational processing is the formal representation of
the information resources that come into play. Much progress has been made
in the representation and processing of syntactic and semantic information
at sentence level. This area of research is by now relatively well
understood, as reflected by the number of computational grammar theories and
formalisms that have been developed and successfully applied in different fields
of NLP in the past few years. Pollard and Sag's HPSG [Pollard and Sag,
1987] could be mentioned here as a prototypical example. With respect to the
modelling of linguistic information that becomes effective beyond the boundary
of single sentences, i.e., the treatment of text or discourse phenomena
(we will use the notions 'text' and 'discourse' more or less synonymously here),
however, the situation still looks quite different.
In NLP in general, investigations into discourse phenomena have been performed
from various methodological, theoretical, and applicational perspectives,
such as artificial intelligence (AI), formal semantics, or text linguistics,
to mention just a few. Due to the complexity of the field and the types of
information involved, however, most of them tend to be relatively restricted or
specifically tailored with respect to aspects like
• the discourse phenomena being addressed, and/or
• the type of NLP application they are being designed for, and/or
• the type of text they can be applied to.
In machine translation, the application area we are concerned with in this
volume, approaches to discourse representation and processing are even less
frequent and less developed than in other fields of NLP. Although a human
translator hardly ever translates sentences in isolation, currently more or less
all existing MT systems operate in a sentence-based mode without taking into
account the textual context in which the sentences to be translated occur.
This also holds for the EUROTRA machine translation system, which was
likewise developed as a sentence-based MT system, although the necessity of
taking textual and contextual information into account in the machine
translation process had long been recognized here as well. In the EUROTRA context,
for instance, investigations into the relevance of discourse information for the
translation process were performed in two project-internal research
tenders (documented in [ANAPH89, 1989], [Dorrepaal et al., 1990b],
and [Dorrepaal et al., 1990a]). This research led, among other things, to the
formulation of a number of general requirements that an MT architecture must
meet in order to cope with discourse phenomena. Some of the research presented in
the contributions to this volume originated from this EUROTRA activity. One
conclusion the EUROTRA discourse research arrived at was that simply taking
over approaches to discourse representation that had been developed in
NLP application areas other than MT would not be an appropriate solution, since
the requirements posed by the process of MT differ significantly from those of
other NLP areas. Another conclusion was that various linguistic-theoretical
and methodological positions can be motivated as a starting point for addressing
discourse phenomena in a (machine) translation context, each providing a
different perspective on the topic and leading to different suggestions as to how
to approach it. Moreover, approaches building on different linguistic foundations
typically focus on different aspects of the topic, leading, e.g., to different
discourse phenomena being at the center of investigations. Furthermore, it was
concluded that, due to the complexity of the problems involved, it is unlikely
that the outcome would be a one-shot solution to discourse processing in MT.
Therefore one aim of this volume is to make explicit some of the theoretical
and methodological options at hand.