Inference of Large Phylogenetic
Trees on Parallel Architectures

Michael Ott

TECHNISCHE UNIVERSITÄT MÜNCHEN
Lehrstuhl für Rechnertechnik und Rechnerorganisation /
Parallelrechnerarchitektur

Inference of Large Phylogenetic Trees on Parallel Architectures

Michael Ott
Complete reprint of the dissertation approved by the Fakultät für Informatik of the
Technische Universität München for the award of the academic degree of
Doktor der Naturwissenschaften (Dr. rer. nat.)
Chair: Univ.-Prof. Dr. H. M. Gerndt
Examiners of the dissertation: 1. Univ.-Prof. Dr. A. Bode
2. TUM Junior Fellow Dr. A. Stamatakis
The dissertation was submitted to the Technische Universität München on 15.07.2010 and
accepted by the Fakultät für Informatik on 08.10.2010.

Abstract
Due to high computational demands, the inference of large phylogenetic trees from molecular
sequence data requires the use of HPC systems in order to obtain the necessary computational
power and memory. The continuous, explosive accumulation of molecular data, driven by the
development of cost-effective sequencing techniques, further amplifies this requirement.
Moreover, a continuously increasing degree of parallelism is necessary to exploit the
performance of emerging multi-core processors efficiently.
This dissertation describes scalable parallelization schemes for the inference of large phylo-
genetic trees, as well as tangible implementations thereof, which also eliminate memory require-
ments as a limiting factor for phylogenetic analyses. Additionally, it pinpoints the properties of
current multi-core shared and distributed memory architectures and describes novel approaches
for their efficient exploitation.

Acknowledgments
Many people have contributed to the success of this work. First of all, I would like to thank
Prof. Dr. Arndt Bode for providing such a supportive working environment at the Lehrstuhl
für Rechnertechnik und Rechnerorganisation (LRR) and for the freedom he granted me for my
research over the last years. I am particularly grateful to Dr. Alexandros Stamatakis, who has
been accompanying and supporting me since my undergraduate studies and who devoted great
effort to supervising my work. Furthermore, I am very thankful to the numerous helpful
colleagues at the LRR, too many to be named individually, who have offered their advice when
it was needed and supported me in many different ways.
My position at the LRR was funded by KONWIHR, the Bavarian Competence Network
for Technical and Scientific High Performance Computing. KONWIHR is an initiative of the
Bavarian State Ministry of Sciences, Research and the Arts that supports research in compu-
tationally challenging scientific domains which require the use of High Performance Computing
in order to obtain new knowledge and insights.

Contents
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Scientific Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Structure of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Phylogenetic Tree Inference 5
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 The Optimization Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Models of Sequence Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3.1 The Substitution Rate Matrix . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.2 The Probability Matrix . . . . . . . . . . . . . . . . . . . . . 11
2.3.3 Rate Heterogeneity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Distance Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.1 Least-Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.2 UPGMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.3 Neighbor-Joining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.4 Objections Against Distance Methods . . . . . . . . . . . . . . . . . . . . 14
2.5 Maximum Parsimony . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6 Maximum Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6.1 The Phylogenetic Likelihood Function . . . . . . . . . . . . . . . . . . . . 16
2.6.2 The Pulley Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.6.3 Optimization of the Likelihood Score . . . . . . . . . . . . . . . . . . . . . 21
2.7 Bootstrapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.8 Bayesian Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.9 Programs for Phylogenetic Tree Inference . . . . . . . . . . . . . . . . . . . . . . 27
2.9.1 RAxML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.9.2 PHYLIP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.9.3 GARLI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.9.4 MrBayes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.9.5 PAUP* . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3 Shared Memory Multiprocessors 31
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 Multi- and Many-Cores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2.1 The Power Wall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2.2 From Single- to Multi-Core . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 The Memory Subsystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.1 The Memory Wall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3.2 Memory Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3.3 UMA and NUMA Architectures . . . . . . . . . . . . . . . . . . . . . . . 38
3.4 Shared Memory Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.5 Distributed Memory Programming on Shared Memory Architectures . . . . . . . 40
3.6 Efficient Exploitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.6.1 Shared Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.6.2 Dealing with the Memory Wall . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6.3 Thread Pinning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.7 An Outlook on Automated Thread Pinning: autopin . . . . . . . . . . . . . . . 49
4 Distributed Memory Systems 55
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.3 The Message-Passing Interface MPI . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3.1 MPI-1 Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.2 The MPI Profiling Interface . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4 An Outlook on Automated Performance Analysis . . . . . . . . . . . . . . . . . . 63
4.4.1 Automated Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . 63
4.4.2 The Architecture of Periscope . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.4.3 Identifying MPI Performance Properties . . . . . . . . . . . . . . . . . . . 67
5 Parallel Phylogenetic Tree Inference 71
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 Sources of Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.2.1 Embarrassing Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.2.2 Inference Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.2.3 Loop-level Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.3 Parallel Implementations of RAxML . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.3.1 Exploitation of Loop-level Parallelism . . . . . . . . . . . . . . . . . . . . 75
5.3.2 PThreads Parallelization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.3.3 Simultaneous Exploitation of Loop-level and Embarrassing Parallelism
with MPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84