Gaussian process models for robust regression, classification, and reinforcement learning [electronic resource] / submitted by Malte Kuß

205 pages

Gaussian Process Models
for Robust Regression, Classification, and
Reinforcement Learning
Submitted by
Diplom-Informatiker Malte Kuß
from Wolfsburg
March 2006
Approved dissertation for the award of the academic degree
Doctor rerum naturalium (Dr. rer. nat.)
at the Department of Computer Science, Technische Universität Darmstadt
(university code D17)
First referee: Prof. Dr. Thomas Hofmann
Co-referees: Carl E. Rasmussen, PhD; Prof. Dr. Bernt Schiele
Submitted on 13 February 2006
Date of the defence: 21 March 2006
Declaration
I hereby declare that I wrote this thesis independently, with the exception of
the aids explicitly named in it.
Academic Career
10/96 – 02/02 Studies in computer science at Technische Universität Berlin
Minor subject: economics (VWL)
Areas of focus: statistics, machine learning, software engineering,
databases, microeconomics, game theory
Diploma thesis at the Chair of Business Mathematics and Statistics
on "Non-linear Multivariate Analysis with Geodesic Kernels"
(Prof. Kockelkorn)
Diploma with distinction
07/02 – 03/06 Doctoral student at the Max Planck Institute for Biological
Cybernetics, Tübingen
Empirical Inference group (Prof. Schölkopf)
Research interests: Bayesian statistics, decision theory,
Monte Carlo methods
Doctorate at Technische Universität Darmstadt
(Prof. Hofmann)
Reference
@PhdThesis{Kuss:06,
author = {M. Kuss},
title = {Gaussian Process Models for Robust Regression,
Classification, and Reinforcement Learning},
school = {Technische Universit{\"a}t Darmstadt},
year = {2006}
}
Abstract
This thesis deals with extensions and applications of a class of statistical models
known as Gaussian process models. Supervised learning methods, as used for example in
regression and discriminant analysis, aim to identify dependencies between variables
and to use the resulting understanding of the data-generating process for prediction.
The models studied in this thesis rest on the assumption that these dependencies can
be decomposed into a systematic relationship and a random component, where the
systematic relationship can be described by a latent function. Gaussian process models
are statistical models in which a Gaussian process is used to describe the Bayesian
a priori uncertainty about this latent function.
After a brief introduction to Bayesian statistics in Chapter 2, Chapter 3 describes
the class of Gaussian process models in detail. It also discusses how the Gaussian
process can be understood as a description of a priori uncertainty.
The conceptual clarity of the Bayesian approach is often offset by practical
difficulties, because the integrals that arise cannot be solved analytically.
Approximation techniques are therefore of central importance for applying Bayesian
methods in practical data analysis. Chapter 4 describes Laplace's method, Expectation
Propagation, and Markov chain Monte Carlo methods, together with their application to
Gaussian process models.
Among Gaussian process models, the regression model with normally distributed noise
stands out, because under these assumptions Bayesian inference is analytically
tractable and the a posteriori uncertainty about the latent function can again be
described by a Gaussian process. However, the normality assumption makes the model
sensitive to outliers, that is, observations that deviate strongly from the systematic
structure. Chapter 5 describes several Gaussian process models for nonlinear robust
regression analysis. In these robust regression models the noise distribution is
described by a leptokurtic (heavy-tailed) distribution.
Chapter 6 deals with the Gaussian process model for binary classification. The
literature offers various approaches to approximating Bayesian inference in this
model. Until now, however, it has been unclear how accurate these approximation
methods are and which one should be preferred in practice. These questions are
answered both theoretically, by examining the structure of the posterior distribution,
and experimentally, by comparison with computationally expensive Markov chain Monte
Carlo simulations.
Reinforcement learning refers to adaptive learning in sequential decision problems.
Chapter 7 describes applications of Gaussian process regression models to
reinforcement learning in problems with continuous state spaces. Several ways are
presented in which Gaussian processes can be used to predict the effects of decisions
and to represent the so-called value function.
Summary
Gaussian process models constitute a class of probabilistic statistical models in which
a Gaussian process (GP) is used to describe the Bayesian a priori uncertainty about a
latent function.
After a brief introduction to Bayesian analysis, Chapter 3 describes the general
construction of GP models with the conjugate model for regression as a special case
(O'Hagan, 1978). Furthermore, it will be discussed how GPs can be interpreted as priors
over functions and what beliefs are implicitly represented by this.
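As a rough illustration of the conjugate regression case mentioned above, the following sketch computes the closed-form posterior mean and variance of the latent function at a test input. It is a minimal example under stated assumptions, not the thesis's implementation: the zero prior mean, the squared-exponential covariance, the hyperparameter values, the toy data, and all function names are chosen here for illustration only.

```python
import math

def sq_exp_kernel(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward and then back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gp_predict(x_train, y_train, x_star, noise_var=0.1):
    """Posterior mean and variance of the latent function at x_star.

    Assumes a zero-mean GP prior and independent normal noise with
    variance noise_var on the observations.
    """
    n = len(x_train)
    # Covariance of the noisy observations: K + noise_var * I
    k_noisy = [[sq_exp_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
                for j in range(n)] for i in range(n)]
    low = cholesky(k_noisy)
    # Covariances between the training inputs and the test input
    k_star = [sq_exp_kernel(x, x_star) for x in x_train]
    alpha = solve_cholesky(low, y_train)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_cholesky(low, k_star)
    var = sq_exp_kernel(x_star, x_star) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

# Hypothetical toy data, for illustration only
mean, var = gp_predict([-1.0, 0.0, 1.0], [0.5, 1.0, 0.5], 0.5)
print(mean, var)
```

Far from the training inputs the prediction falls back to the prior (mean zero, variance equal to the signal variance), which is one concrete sense in which the prior encodes beliefs about the function.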
The conceptual clarity of the Bayesian approach is often in contrast with the
practical difficulties that result from its analytically intractable computations.
Therefore approximation techniques are of central importance for applied Bayesian
analysis. Chapter 4 describes Laplace's method, the Expectation Propagation
approximation, and Markov chain Monte Carlo sampling for approximate inference in GP
models.
The most common and successful application of GP models is in regression problems
where the noise is assumed to be homoscedastic and distributed according to a normal
distribution. In practical data analysis this assumption is often inappropriate and
inference is sensitive to the occurrence of more extreme errors (so-called outliers).
Chapter 5 proposes several variants of GP models for robust regression and describes
how Bayesian inference can be approximated in each. Experiments on several data sets
are presented in which the proposed models are compared with respect to their
predictive performance and practical applicability.
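To make the sensitivity to outliers concrete, the sketch below compares how strongly a normal noise density and a heavy-tailed one penalise a large residual. The Student-t density, its degrees of freedom, and the residual values are assumptions chosen for illustration; the Summary does not state which noise models Chapter 5 uses.

```python
import math

def normal_logpdf(r, sigma=1.0):
    """Log density of a zero-mean normal distribution at residual r."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * (r / sigma) ** 2

def student_t_logpdf(r, nu=4.0, sigma=1.0):
    """Log density of a zero-mean Student-t with nu degrees of freedom and scale sigma."""
    z = r / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi * sigma ** 2)
            - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))

# Hypothetical residuals: the last one plays the role of an outlier
for r in (0.5, 2.0, 10.0):
    print(r, normal_logpdf(r), student_t_logpdf(r))
```

The normal log density falls quadratically in the residual, while the Student-t log density falls only logarithmically, so a single extreme observation pulls the fit far less under the heavy-tailed model.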
Gaussian process priors can also be used to define flexible, probabilistic
classification models. Again, exact Bayesian inference is analytically intractable and
various approximation techniques have been proposed, but no clear picture has yet
emerged as to when and why which algorithm should be preferred. Chapter 6 presents a
detailed examination of the model, focusing on the question of which approximation
technique is most appropriate by investigating the structure of the posterior
distribution. An experimental study is presented which corroborates the theoretical
insights.
Reinforcement learning deals with the problem of how an agent can optimise its be-
haviour in a sequential decision process such that its utility over time is maximised.
Chapter 7 addresses applications of GPs for model-based reinforcement learning in con-
tinuous domains. If the environment’s response to the agent’s actions can be predicted
using GP regression models, probabilistic planning and an approximate policy iteration
algorithm can be implemented. A core concept in reinforcement learning is the value
function, which describes the long-term strategic value of a state. Using GP models we
are able to solve an approximate continuous equivalent of the Bellman equations, and
it will be shown how this can be used to estimate value functions.
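For readers unfamiliar with the Bellman equations, the following sketch evaluates a fixed policy on a small finite-state problem by iterating the Bellman equation to its fixed point. This is a plain discrete stand-in and not the GP-based continuous method of Chapter 7; the transition matrix, rewards, discount factor, and function name are all hypothetical.

```python
def evaluate_policy(transitions, rewards, gamma=0.9, tol=1e-10, max_iter=10000):
    """Iterate V(s) <- r(s) + gamma * sum_t P(t | s) V(t) until the update is tiny.

    transitions[s][t] is the probability of moving from state s to state t
    under the fixed policy; rewards[s] is the reward received in state s.
    """
    n = len(rewards)
    values = [0.0] * n
    for _ in range(max_iter):
        new_values = [rewards[s] + gamma * sum(transitions[s][t] * values[t]
                                               for t in range(n))
                      for s in range(n)]
        if max(abs(a - b) for a, b in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values

# Hypothetical 3-state chain under a fixed policy; state 2 is absorbing and rewarding
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]
r = [0.0, 0.0, 1.0]
V = evaluate_policy(P, r)
print(V)
```

In a continuous state space the sum over successor states becomes an integral, which is where a regression model of the value function and of the dynamics comes in.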
Contents
Acknowledgements
Symbols & Abbreviations
1. Introduction
2. Bayesian Analysis
2.1. Bayesian Inference
2.2. Bayesian Decision Theory
2.3. Model Comparison and Model Selection
2.3.1. Bayesian Model Comparison
2.3.2. Model Selection by Evidence Maximisation
2.4. Bibliographical Remarks
3. Gaussian Process Models
3.1. Structure of Gaussian Process Models
3.2. Regression with Normal Noise
3.2.1. Model Selection
3.2.2. Preprocessing of Data and Nonzero Mean Functions
3.3. Gaussian Processes & Covariance Functions
3.3.1. Gaussian Processes
3.3.2. Covariance Functions
3.3.3. Geometrical Properties of Gaussian Processes
3.3.4. Examples of Covariance Functions
3.4. Alternative Interpretations of Gaussian Process Priors
3.4.1. The Weight Space View & Kernel Machines