
Reverse engineering of genetic networks with time delayed recurrent neural networks and clustering techniques [electronic resource] / presented by David Camacho Trujillo

149 pages




Reverse engineering of genetic networks
with time delayed recurrent neural networks
and clustering techniques


Dissertation

submitted to the
Combined Faculties
for the Natural Sciences and for Mathematics
of the Ruperto-Carola University of Heidelberg, Germany

for the degree of


Doctor of Natural Sciences










presented by

M. Sc. David Camacho Trujillo
born in México City, México
Oral-examination: ................................................

.............................................................
..................................................
..................................................






















Referees: Prof. Dr. Ursula Kummer
          P.D. Dr. Ursula Klingmüller






Dedicated to:

Sarah

&

Tere

&


Arturito




INDEX

Summary
Zusammenfassung
Personal Words
List of abbreviations
General Motivation
1. Biological context
1.1 Gene regulation
1.2 Basal transcription apparatus
1.3 Transcription factors
1.4 Enhancers-Insulators
1.5 Post-transcriptional regulation of the mRNA
1.5.1 Alternative splicing
1.5.2 RNA interference
1.5.3 Dimensional in-homogeneities
2. Reverse engineering and modelling of genetic network modules
2.1 Related work
2.2 General concepts
2.3 Dimensionality reduction by data selection
2.4 Theoretical works
2.4.1 Boolean Networks
2.4.2 Differential equation systems
2.4.3 Stochastic Models
2.4.4 Bayesian networks
3. Methods
3.1 Workflow
3.2 Data pre-processing, Quality control
3.3 Data normalization
3.4 Dimensionality problem. The use of interpolation approaches
3.5 Data fitting
3.6 Models
3.6.1 The CTRNN model
3.6.2 The TDRNN model
3.6.3 Robust parameter determination
3.6.4 Graph generation and error distance measurements
3.6.5 Clustering of results
3.6.6 Dynamic Bayesian Network
4. Results
4.1 Synthetic benchmark: The Repressilator
4.1.1 Parameter space selection
4.1.2 Required data length
4.1.3 Robustness against noise
4.1.4 Robustness against incomplete information: Clustering improves the standard reverse engineering task, quantitatively and qualitatively
4.2 The yeast cell cycle
4.2.1 TDRNN shows superior inference and predictive power than previous models on experimental data
4.2.2 Bootstrapping validation
4.2.3 Clustering improves the RE process with real data
4.3 Reverse engineering of keratinocyte-fibroblast communication
5. Discussion
5.1 Model choice and data driven experiments
5.2 Data selection
5.3 Data interpolation, implications
5.4 Data fitting and inference power relationship
5.5 Reverse engineering framework, improving the robust parameter selection
6. Conclusions
7. Bibliography



Summary

The iterative process of experimentally probing biological networks and
computationally inferring models of them requires fast, accurate and flexible
computational frameworks for modeling and reverse engineering. In this
dissertation, I propose a novel model that simulates gene regulatory networks
using a specific type of time-delayed recurrent neural network. I also introduce
a parameter clustering method that selects, from the simulations, groups of
parameter sets representing biologically reasonable networks. Additionally, a
general-purpose adaptive function is used to reduce and study the connectivity
of small gene regulatory network modules.
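
To make the idea of a time-delayed recurrent neural network concrete, the following is a minimal sketch of a generic form, not the specific model proposed in this dissertation. It assumes the dynamics tau_i * dx_i/dt = -x_i(t) + sigmoid(sum_j w_ij * x_j(t - d_ij) + b_i), integrated with a simple Euler scheme. The function name `simulate_tdrnn` and all parameter values (weights, delays, biases, time constants) are hypothetical and chosen only for illustration.

```python
import math

def simulate_tdrnn(weights, delays, biases, taus, x0, dt=0.1, steps=200):
    """Euler integration of a generic time-delayed recurrent neural network.

    Assumed form (an illustration, not the dissertation's exact model):
        tau_i * dx_i/dt = -x_i(t) + sigmoid(sum_j w_ij * x_j(t - d_ij) + b_i)
    delays[i][j] is expressed in integration steps; the history before the
    first step is held constant at x0.
    """
    n = len(x0)
    history = [list(x0)]                      # history[t] is the state at step t
    for t in range(steps):
        current = history[-1]
        nxt = []
        for i in range(n):
            drive = biases[i]
            for j in range(n):
                k = max(0, t - delays[i][j])  # index of the delayed state
                drive += weights[i][j] * history[k][j]
            activation = 1.0 / (1.0 + math.exp(-drive))
            nxt.append(current[i] + dt * (activation - current[i]) / taus[i])
        history.append(nxt)
    return history

# Hypothetical three-gene ring in which each gene represses the next one.
weights = [[0.0, 0.0, -10.0],
           [-10.0, 0.0, 0.0],
           [0.0, -10.0, 0.0]]
delays = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]    # five integration steps each
trajectory = simulate_tdrnn(weights, delays, biases=[5.0, 5.0, 5.0],
                            taus=[1.0, 1.0, 1.0], x0=[0.9, 0.1, 0.1])
print(trajectory[-1])
```

In a reverse engineering setting, the weights, delays, biases and time constants would be the parameters fitted to time series data, and the signs of the fitted weights would suggest activating or repressing interactions.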

I show the performance of this model in simulating the dynamics and inferring
the topology of gene regulatory networks from synthetic and experimental time
series gene expression data. I assess the quality of the inferred networks with
graph edit distance measurements against the synthetic and experimental
benchmarks. I also compare the edit costs of the networks inferred with
time-delayed recurrent neural networks against those obtained with previously
described reverse engineering methods based on continuous-time recurrent neural
networks and dynamic Bayesian networks. Furthermore, I address questions of
network connectivity and of the correlation between data fitting and inference
power by simulating common experimental limitations of the reverse engineering
process, such as incomplete and highly noisy data.
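
As an illustration of how an inferred network can be scored against a benchmark, the sketch below computes a simple edit cost between two signed, directed gene networks. It assumes that nodes are identified by gene name, so that only edges need editing, and that every edge insertion, deletion or sign flip costs one unit. These unit costs, the function name `edit_cost` and the example inferred network are assumptions made for illustration; the cost definition actually used in this work may differ. The reference network is the Repressilator ring, in which LacI represses tetR, TetR represses cI and CI represses lacI.

```python
def edit_cost(inferred, reference):
    """Count the edge edits needed to turn `inferred` into `reference`.

    Each network is a dict mapping (regulator, target) -> sign (+1 or -1).
    An edge insertion, deletion or sign flip each costs 1 (assumed unit costs).
    """
    cost = 0
    for edge in set(inferred) | set(reference):
        # A missing edge yields None, which differs from any sign.
        if inferred.get(edge) != reference.get(edge):
            cost += 1
    return cost

# Reference topology: the Repressilator, a ring of three repressions.
repressilator = {("lacI", "tetR"): -1, ("tetR", "cI"): -1, ("cI", "lacI"): -1}

# Hypothetical inferred network: one correct edge, one wrong sign,
# one reversed edge (counted as a deletion plus an insertion).
inferred = {("lacI", "tetR"): -1, ("tetR", "cI"): +1, ("lacI", "cI"): -1}

print(edit_cost(inferred, repressilator))  # prints 3
```

A lower cost means the inferred topology is closer to the benchmark, which allows different reverse engineering methods to be compared on the same reference network.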

In combination with parameter clustering, this specific type of time-delayed
recurrent neural network model substantially improves the inference power of
reverse engineered networks. Finally, some suggestions for future improvements
are discussed, particularly from the data-driven perspective as the approach to
modeling complex biological systems.

Zusammenfassung

The iterative process of experimentally investigating biological networks and
computationally deriving models for these networks requires fast, error-free
and flexible programming frameworks for modeling and reconstructing biological
networks. In this work, I present a novel model that represents gene regulatory
networks using time-delayed recurrent neural networks. I also introduce a
parameter clustering method that selects, from the simulations, groups of
parameter sets representing biologically meaningful solutions. In addition, a
general adaptive function is used here to reduce and study the connectivity of
small gene regulatory networks.

This dissertation demonstrates the ability of this novel model to simulate the
dynamics of gene regulatory networks from synthetic and experimental time
series gene expression data sets and to infer their topology. I determine the
quality of the inferred networks using graph edit measurements in comparison
with the synthetic and experimental reference values. I also compare the edit
costs of the networks inferred by the time-delayed recurrent networks with
those of other previously described reconstruction methods based on
continuous-time recurrent and dynamic Bayesian networks. Furthermore, I address
questions of network connectivity and of the correlation between data fitting
and the statistical power of the inference by simulating well-known
experimental limitations of the reconstruction process, such as incomplete or
highly noisy data sets.

This novel and specific time-delayed recurrent neural network, together with
parameter clustering, substantially improves the inference power of the
reconstructed networks. In addition, some suggestions for future improvements
are discussed, particularly from the data-driven perspective as the solution
strategy for modeling complex biological systems.