Bayesian inference and experimental design for large generalised linear models [Elektronische Ressource] / vorgelegt von Hannes Nickisch

technische_universitat_berlin - Hannes Nickisch

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

190 pages

English

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

A propos
Informations
Extrait

Description

Sujets

Informatik

Informations

Publié par	technische_universitat_berlin
Publié le	01 janvier 2010
Nombre de lectures	2
Langue	English
Poids de l'ouvrage	6 Mo

Extrait

Bayesian Inference and Experimental Design for
Large Generalised Linear Models
vorgelegt von
Dipl.-Inf. Hannes Nickisch
aus Leipzig
Von der Fakultät IV - Elektrotechnik und Informatik
der Technischen Universität Berlin
zur Erlangung des akademischen Grades
Doktor der Naturwissenschaften
– Dr. rer. nat. –
genehmigte Dissertation.
Promotionsausschuss:
Vorsitzender: Prof. Dr. Klaus-Robert Müller
Berichter: Prof. Dr. Manfred Opper
Berichter: PhD. Matthias W. Seeger PhD. Carl E. Rasmussen
Sachverständiger: Prof. Dr. Klaus Obermayer Prof. Dr. Felix Wichmann
Tag der wissenschaftlichen Aussprache: 17. September 2010
Berlin 2010
D83Acknowledgements
The most important ingredient for this work was the open, cordial and scientiﬁc atmosphere
at the Empirical Inference Department of the Max Planck Institute for Biological Cybernetics,
Tübingen. I am thankful to Bernhard Schölkopf for creating this stimulating as well as produc-
tive environment and giving input and feedback whenever necessary.
Thanks to Carl Rasmussen, I learned about Gaussian processes, approximate inference, dif-
ferentiating between the “big picture” and “technical details” and concise scientiﬁc program-
ming, experimenting and writing.
Matthias Seeger’s enthusiasm, experience in probabilistic modelling and knowledge on
mathematics and numerical optimisation where a constant driving force and source of ideas.
Without his supervision, this thesis wouldn’t have been possible.
Manfred Opper provided formal supervision, valuable discussions and convivial recep-
tions in Berlin.
Last but not least, I would like to thank my family; especially Susanne and Oskar for en-
during and also mentally supporting the writing and the defence.
iiiZusammenfassung
Zu Entscheidungen zu gelangen trotz unsicherer und unvollständiger Informationen, ist eines
der zentralen Themen der Statistik und des maschinellen Lernens. Probabilistische Bayesiani-
sche Modelle stellen dabei einen strengen mathematischen Rahmen für die Formalisierung der
Datengewinnung zur Verfügung, in dem getroffene Annahmen sowie vorhandenes Vorwissen
explizit gemacht werden. Die resultierende a-posteriori-Verteilung repräsentiert den Wissens-
stand des Modells und ist Ausgangspunkt für sich anschließende Entscheidungen.
Trotz aller begrifﬂichen Klarheit der Bayesianischen Inferenz haben die notwendigen Be-
rechnungen meist die Form analytisch unlösbarer hochdimensionaler Integrale, was in der
Praxis zu einer Reihe von randomisierten und deterministischen Näherungsverfahren führt.
Die vorliegende Arbeit entwickelt, studiert und wendet Algorithmen zur näherungsweisen
Inferenz und Versuchsplanung auf generalisierte lineare Modelle (GLM) an. Ein besonderer
Schwerpunkt liegt auf algorithmischen Eigenschaften wie Konvexität, numerische Stabilität
und Skalierbarkeit hin zu großen Mengen an wechselwirkenden Größen.
Nach einer Einführung in GLMs stellen wir die vielversprechendsten Ansätze zum Schät-
zen, zur näherungsweisen Inferenz und zur Versuchsplanung vor.
Wir untersuchen detailliert einen speziellen Ansatz und leiten Konvexitäts-Eigenschaften
her, was zu einem generischen und skalierbaren Inferenzverfahren führt. Desweiteren sind wir
in der Lage, den Zusammenhang zwischen Bayesianischer Inferenz und dem regularisierten
statistischen Schätzen genau zu beschreiben: Schätzen ist ein Spezialfall von Inferenz und In-
ferenz kann durch eine Folge von geglätteten Schätzern berechnet werden.
Im Anschluss daran vergleichen wir eine Reihe von Inferenzverfahren, angewendet auf
die binäre probabilistische Klassiﬁkation mittels eines kernbasierten GLMs, dem sogenannten
Gauß-Prozess-Modell. Eine Reihe empirischer Experimente ermittelt den EP-Algorithmus als
das genaueste Näherungsverfahren.
In einem nächsten Schritt wenden wir den EP-Algorithmus auf die sequenzielle Optimie-
rung der Messarchitektur eines Bilderfassungssystems an. Dies unter Verwendung von Com-
pressive Sampling (CS), bei dem die intrinsische Redundanz in Signalen benutzt wird, um den
Messprozess zu beschleunigen. In vergleichenden Experimenten beobachten wir Unterschiede
zwischen dem Verhalten von adaptivem CS in der Praxis und dem theoretisch untersuchten
Szenario.
Durch Kombination der gewonnenen Erkenntnisse über adaptives CS mit unserem konve-
xen Inferenzverfahren sind wir in der Lage, die Messsequenz von Magnetresonanztomographie-
Systemen (MRT) zu verbessern, indem wir das Bayesianische Kriterium zur Versuchsplanung
optimieren. Unsere MRT-Anwendung auf Bildern realitischer Größe ermöglicht kürzere Mess-
zeiten bei gleichbleibender Bildqualität.
ivAbstract
Decision making in light of uncertain and incomplete knowledge is one of the central themes
in statistics and machine learning. Probabilistic Bayesian models provide a mathematically
rigorous framework to formalise the data acquisition process while making explicit all relevant
prior knowledge and assumptions. The resulting posterior distribution represents the state of
knowledge of the model and serves as the basis for subsequent decisions.
Despite its conceptual clarity, Bayesian inference computations take the form of analytically
intractable high-dimensional integrals in practise giving rise to a number of randomised and
deterministic approximation techniques.
This thesis derives, studies and applies deterministic approximate inference and experi-
mental design algorithms with a focus on the class of generalised linear models (GLMs). Special
emphasis is given to algorithmic properties such as convexity, numerical stability, and scalabil-
ity to large numbers of interacting variables.
After a review of the relevant background on GLMs, we introduce the most promising
approaches to estimation, approximate inference and experiment design.
We study in depth a particular approach and reveal its convexity properties naturally lead-
ing to a generic and scalable inference algorithm. Furthermore, we are able to precisely char-
acterise the relationship between Bayesian inference and penalised estimation: estimation is a
special case of inference and inference can be done by a sequence of smoothed steps.
We then compare a large body of inference algorithms on the task of probabilistic binary
classiﬁcation using a kernelised GLM: the Gaussian process model. Multiple empirical com-
parisons identify expectation propagation (EP) as the most accurate algorithm.
As a next step, we apply EP to adaptively and sequentially design the measurement ar-
chitecture for the acquisition of natural images in the context of compressive sensing (CS),
where redundancy in signals is exploited to accelerate the measurement process. We observe
in comparative experiments differences between adaptive CS results in practise and the setting
studied in theory.
Combining the insights from adaptive CS with our convex variational inference algorithm,
we are able – by sequentially optimising Bayesian design scores – to improve the measurement
sequence in magnetic resonance imaging (MRI). In our MRI application on realistic image sizes,
we achieve scan time reductions for constant image quality.
vContents
Acknowledgements iii
Zusammenfassung iv
Summary v
Contents vi
List of Figures x
List of Algorithms x
List of Tables xi
Notation xii
1 Introduction 1
1.1 Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Probabilistic models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Summary of the contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Outline of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Publication record . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Inference and Design in Linear Models 5
2.1 Statistical inference and decision theory . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.1 Frequentist decision theory . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.2 Bayesian perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 The Gaussian linear model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Frequentist estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2 Bayesian inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 The generalised linear model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3.1 Frequentist estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.2 Bayesian inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 The Gaussian process model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.5 Approximate Bayesian inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5.1 Modelling framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5.2 Inference algorithm properties . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5.3 Approximations to achieve tractability . . . . . . . . . . . . . . . . . . . . 14
2.5.4 The Gaussian linear model . . . . . . . . . . . . . . . .