Adaptive processing of structural data: from sequences to trees and beyond [Elektronische Ressource] / Andreas Küchler, Andreas

universitat_ulm

Le téléchargement nécessite un accès à la bibliothèque YouScribe
Tout savoir sur nos offres

233 pages

English

Le téléchargement nécessite un accès à la bibliothèque YouScribe
Tout savoir sur nos offres

A propos
Informations
Extrait

Description

Sujets

Informatik

Informations

Publié par	universitat_ulm
Publié le	01 janvier 2000
Nombre de lectures	35
Langue	English
Poids de l'ouvrage	1 Mo

Extrait

Abteilung Neuroinformatik
Prof. Dr. Gun ther Palm
Adaptive Processing of Structural Data
From Sequences to Trees and Beyond
Dissertation zur Erlangung des Doktorgrades
Dr.rer.nat. der Fakult at fur Informatik der Universit at Ulm
von
Andreas Kuc hler
aus Roth
1999Amtierender Dekan: Prof. Dr. Uwe Sch oning
Erster Gutachter: Prof. Dr. Gun ther Palm
Zweiter Gutachter: Prof. Dr. Uwe Sch oning
Externer Gutachter: Prof. Dr. Marco Gori
Tag der Promotion: 15.02.2000Abstract
There are several challenging real-world problems, for instance in the elds
of medical and technical diagnosis, molecular biology, or document and image
processing, where the objects of interest are signi c antly structured, and their
component parts have a continuous nature and are subjected to di eren t types
of noise. For a number of practical tasks, the solution can be described by
example data, but there is no or only partial and uncertain prior expert knowl-
edge aboutthe relevant structural concepts. Thus, it would beadvantageous to
have methods and tools to (automatically) infer the solutions, i.e. the desired
input-output mapping, from the given example data.
In computer science, structural (for causal, topological, or hierar-
chical) relationships between parts of a object are commonly represented by
symbolic formalisms such as graphs, terms or diagrams. Symbolic machine
learning approaches can deal with these representations, but fail if the range
of the intended mapping is of continuous nature. On the other hand, exist-
ing analog models of computation andlearning are tailored to the processing of
continuous information. However, thesemodelsassumethat dataare organized
according relatively poor structures, by and large, arrays and sequences.
This work contributes in bridging this gap. We propose tree-recursive dy-
namical systems (TRDS), a new class of deterministic state machines that op-
erate in a continuous state space. These machines enable the representation
and the inductive inference of structure mappings. The most general admissi-
ble domain is characterized by rooted labeled ordered trees (and a certain class
of rooted labeled directed ordered acyclic graphs)whose vertices can be labeled
by continuous feature vectors. The range of these mappings may either be (a
subspace of) the Euclidean vector space or a nite set of categorical values.
Adaptivity is incorporated into TRDS by choosing parameterized functions
forthestate transitionmapandtheoutputmap. Inductivelearningtasks, such
as the classi cation or the regression of tree structures, can be re-formulated
as the optimization (minimization) of an error criterion. If the given error cri-
terion is stated by a continuously di eren tiable function, then gradient-based
optimization methods are usually taken into account to solve the learning task.
We develop and analyze two di eren t algorithms for the calculation of gradi-
ent information, backpropagation through structure (BPTS) and tree-recursive
gradient computation (TRGC). Both algorithms can be used to calculate the
rst-or der gradient for arbitrary continuous and di eren tiable criteria where
tree structures are embedded via TRDS mappings. This enables attacking in-
ductive learning tasks on structural data by means of TRDS and a variety ofiv
well-established gradient-based optimization methods.
Theadequacyof thisframework isdemonstratedwiththehelpof theneural
folding architecture (FA), an appealing \neural network" instance of TRDS.
Thestate transition mapandthe output mapare implemented by conventional
multilayer feedforward neural networks. Recently, the FA has been proven to
possess universal approximation power for structure mappings. The algorithm
BPTS is believed to be optimal for the FA concerning the computational e ort
in time for a given input tree.
In order to get an initial understanding of the computational power of the
FA, we view these models as recognizers of formal languages. We relate several
variations of the architecture to nite deterministic bottom-up tree automata
(DTA),thediscretecounterpartofTRDS.Weexploretheminimalrequirements
for the FA to achieve DTA power. By borrowing known results on the circuit
complexity of linear threshold circuits, the resource e ciency of di eren t FA
implementations of DTA is measured. We identify the tree languages used
in former experiments to belong to di eren t classes of tree pattern languages.
This provides a reasonable explanation for some phenomena (in regards to
the intrinsic \hardness" of certain inductive learning tasks) observed with the
empirical results.
Since tree-recursive dynamical systems generalize a well-known class of
discrete-time dynamical systems, we are able to apply many concepts, proof
techniques and known results from sequence to tree processing. Conversely,
novel results are obtained for discrete-time dynamical systems (and discrete-
time recurrent neural networks) andthe processingof sequential information as
a special case.
On a broader perspective, this thesis contributes indirectly to the long-
standing discussion about the fundamental relationships between \symbolic"
(or discrete), \subsymbolic" (or continuous, analog), and \hybrid" information
processing. Both symbolic and subsymbolic aspects of the application domain
can be integrated into one hybrid data structure. TRDS and FA are state
machines operating in a continuous state space. Learning is understood as a
continuous optimization problem in a continuous parameter space. However,
state transitions are controlled by the symbolic and recursive nature of the
input structures.
Theproposedapproach ts somewherebetweentheclassicalschoolsofstruc-
tural/syntactical and statistical pattern recognition, and between the elds of
symbolic and statistical machine learning. The presented results, techniques
and algorithms provide a solid foundation and a new toolkit for building ma-
chine learning devices that directly operate on structural data. We survey
some fascinating application areas that have been explored (and are under in-
vestigation) by several research groups, ranging from document processing and
computationalchemistrytotheadaptivecontrolofsymbolicdeductionsystems.Zusammenfassung
Zur L osung einiger interessanter Aufgabenstellungen, zum Beispiel aus den
Bereichen der medizinischen und technischen Diagnose, der Molekularbiologie,
oder der Dokumenten- und Bildverarbeitung, ist es notwendig, die \inh arente
Struktur" der in der Dom ane relevanten Objekte zu erfassen, und zudem die
\kontinuierlichen" MerkmaleihrerKomponenten,diem oglicherweisedurchver-
schiedene Rauschquellen gest ort sind, geeignet zu beruc ksichtigen. Fur eine
Reihe von Aufgabenstellungen kann die L osung zwar anhandvon Beispieldaten
beschrieben werden, jedoch liegt kein oder nur partielles und unsicheres Wis-
sen ub er die zugrundeliegenden strukturellen Konzepte vor. Zur Behandlung
dieserF allew arenMethodenundWerkzeuge vonVorteil, dieesgestatten, mehr
oder weniger automatisch die gewunsc hte Eingabe-Ausgabe Abbildung aus den
vorhandenen Beispieldaten zu inferrieren.
In der Informatik werden strukturelle Beziehungen (zum Beispiel kausaler,
topologischer, oder hierarchischer Natur) zwischen den Komponenten des zu
beschreibendenObjektesublic herweisedurchFormalismenweGraphen,Terme,
oder Diagramme repr asentiert. Symbolische Verfahren des Maschinellen Ler-
nens sind zur Verarbeitung derartiger Repr asentationen geeignet; sie versagen
jedoch, wenn der Bildbereich der intendierten Abbildung als kontinuierlicher
Raumgegeben ist. Bestehende analoge Berechnungs- undLernmodellesindauf
die Verarbeitung von Information kontinuierlicher Natur zugeschnitten. Diese
Modelle verlangen jedoch, dass die Daten relativ \schwach" struktiert sind, im
grossenundganzeninVektorenfesterL ange,undSequenzenvonVektoren. Die
vorliegende Arbeit tr agt dazu bei, diesen Graben zu ub erbruc ken.
Wir stellen eine neue Klasse von deterministischen Zustandsmaschinen vor,
sogenannte Tree-Recursive Dynamical Systems (TRDS), die in einem kon-
tinuierlichen Zustandsraum operieren. Diese Zustandsmaschinen erm oglichen
dieRepr asentationunddieinduktiveInferenzvonStrukturabbildungen. Derall-
gemeinste Urbildbereich dieser Abbildungenist durchgewurzelte und geordnete
B aume (und durch eine Teilklasse von gerichteten, geordneten und azyklischen
Graphen) charakterisiert, deren Knoten durch kontinuierliche Merkmalsvek-
toren annotiert sind. Als Bildbereich ist sowohl der Euklid’sche Vektorraum
(oder ein Teilraum davon), als auch eine endliche Menge kategorialer Werte
zugelassen.
Adaptivit at und \Lernverm ogen" wird in das TRDS Modell eingebracht,
indem parameterisierte Funktionen als Zustandsub ergangsfunktion und Aus-
gabefunktion verwendet werden. Induktive Lernaufgaben, wie zum Beispiel
die Klassi kation oder Regression von Baumstrukturen, lassen sich nun alsvi
Optimierung (Minimierung) eines Fehlerkriteriums reformulieren. Wenn das
Fehlerkriterium als stetig di erenzierbare Funktion ausdruc kbar ist, kommen
ublic herweisesogenannte Gradienten-basierte OptimierungsmethodeninBetra-
cht um die Lernaufgabe anzugehen.
Wir entwickeln, undvergleichen zwei verschiedene Algorithmen zur Berech-
nung von Gradienteninformation fur TRDS: Backpropagation Throug