A chemoinformatics approach to identify bioisosteric scaffold replacements in compounds of PubChem bioassays [Electronic resource] / submitted by Kerstin Höhfeld

Published: Friday, 1 January 2010
Source: D-NB.INFO/100125385X/34
Number of pages: 128

A Chemoinformatics Approach to
Identify Bioisosteric Scaffold
Replacements in Compounds of
PubChem Bioassays

Submitted to the Faculty of Natural Sciences
of the Friedrich-Alexander-Universität Erlangen-Nürnberg
for the award of the doctoral degree

by

Kerstin Höhfeld

from Wipperfürth

Approved as a dissertation by the Faculty of Natural Sciences
of the Universität Erlangen-Nürnberg

Date of the oral examination: 28.10.2009
Chair of the doctoral committee: Prof. Dr. Eberhard Bänsch
First reviewer: Prof. Dr. Timothy Clark
Second reviewer: Prof. Dr. Harald Gröger
List of Contents

Lists of Figures and Tables ..... ii
Zusammenfassung (Summary) ..... 1
Abstract ..... 4
Introduction ..... 6
Methods ..... 11
Search Method ..... 11
Selection of Examples ..... 19
Scaffold Alignment ..... 20
Results and Discussion ..... 25
Results of Systematic Search ..... 25
Special Scaffold Replacements ..... 28
Selection of Examples ..... 31
Factor Xa ..... 34
Pyruvate Kinase Assay (Assay ID 361) ..... 37
Protein Kinase A (Assay ID 548) ..... 44
Protein Tau (Assay ID 596) ..... 52
Steroidogenic Factor 1 (Assay ID 600) ..... 55
Estrogen Receptor-Beta (Assay ID 733) ..... 57
Factor XIa (Assay ID 846) ..... 62
Factor XIIa (Assay ID 852) ..... 66
Hydroxysteroid (17-Beta) Dehydrogenase 4 (Assay ID 893) ..... 69
Acetylcholine Muscarinic M1 Receptor (Assay ID 944) ..... 74
Thrombin (Assay ID 1215) ..... 76
Neuropeptide Y Receptor Y2 (Assay ID 1272) ..... 82
B-cell Lymphoma 2 Family proteins (Assay ID 1320) ..... 84
Signal Transducer and Activator of Transcription (STAT) Family (Assay ID 1397) ..... 86
Conclusions ..... 88
Scaffold Properties and Substituent Orientation ..... 88
Evaluation Scheme ..... 89
Chemical Descriptors ..... 93
General Discussion ..... 99
Outlook ..... 101
Acknowledgement ..... 104
References ..... 105
Appendix ..... 112
List of Figures

Figure 1 Example of a successful scaffold hop for a Factor Xa inhibitor. ..... 6
Figure 2 Example of exit vectors. ..... 7
Figure 3 Example of the longest shortest path in a molecule and resulting exterior atoms. ..... 13
Figure 4 Example of two common substructures (orange) of two molecules, compounds 648322 and 660787 from assay ID 548. ..... 15
Figure 5 Connecting part between two common substructures: [CH2][S]. ..... 15
Figure 6 Scheme of search workflow. ..... 17
Figure 7 Start and endpoint of two corresponding pairs of exit vectors and the resulting PyMOL fit of the corresponding four atoms (RMS: 0.275 Å). ..... 19
Figure 8 Left: Example of the Cresset field points representation for compound 648322. Right: The ParaSurf isodensity surface of the same molecule (CID 648322). ..... 21
Figure 9 Scheme for investigation of selected examples. ..... 23
Figure 10 Histogram of connecting structure sizes; all results of the search method are considered (n=685). ..... 26
Figure 11 Histogram of replacement types; only replacements with an atom number greater than zero in both molecules are considered (n=410). ..... 26
Figure 12 Histogram of size differences of corresponding connecting structures; all results of the search method are considered (n=684). ..... 27
Figure 13 Molecule with original substituents (left) and scaffold saturated with methyl groups (right). ..... 31
Figure 14 Adenosine triphosphate (ATP). ..... 47
Figure 15 Two different representations of the same docking pose of compound 648322 in the crystal structure of protein 2F7E. ..... 48
Figure 16 GOLD docking results for compound 648322 (orange) and 660787 (turquoise) in the protein with PDB code 2FE7. ..... 49
Figure 17 Best FieldAlign result for compound 648322-S (orange) and 660787-S (turquoise). ..... 49
Figure 18 Compound 124727 (green) and 648322 (turquoise). ..... 50
Figure 19 Evaluation scheme. ..... 90


List of Tables

Table 1 List of selected PubChem bioassays that show at least one example mentioned in the chapter Results and Discussion. ..... 12
Table 2 Short excerpt of a MySQL table with the information stored for each exterior atom. ..... 14
Table 3 Top 10 replacements of connecting structures found by the search method. ..... 28
Table 4 Replacements for amides ([C]=O[NH]) in all results of the PubChem Search. ..... 29
Table 5 Identical replacements in different assays. ..... 30
Table 6 Tanimoto coefficients of selected examples; the column "match no" distinguishes between several matches of one assay. ..... 32
Table 7 Structures of two Factor Xa inhibitors with activity information. ..... 34
Table 8 FieldAlign results for the two selected Factor Xa inhibitors. ..... 35
Table 9 ParaFit results shown together with the FieldAlign result for the Factor Xa inhibitors. ..... 36
Table 10 Match of compounds 657882 and 893238 and activity information. ..... 37
Table 11 FieldAlign results for match of compound 657882-S and 893238-S. ..... 38
Table 12 ParaFit alignment and FieldAlign result for compound 657882-S and 893238-S. ..... 38
Table 13 Matched compounds 619837 and 657882 with activity information. ..... 39
Table 14 Best results of FieldAlign and ParaFit for 619837-S and 657882-S. ..... 39
Table 15 Matched compounds 346489 and 2999802 and activity information. ..... 40
Table 16 FieldAlign results for compounds 2999802-S and 346489-S. ..... 41
Table 17 Best results of FieldAlign and ParaFit for compounds 2999802-S and 346489-S. ..... 41
Table 18 Match of 2357867 and 3236025 with activity information. ..... 42
Table 19 FieldAlign results for compounds 2357867-S and 3236025-S. ..... 42
Table 20 ParaFit result with options IEL plus surface for compounds 2357867-S and 3236025-S. ..... 43
Table 21 Match of CID 660787 and CID 1247272 and activity information. ..... 44
Table 22 FieldAlign results for compounds 1247272-S and 660787-S. ..... 45
Table 23 FieldAlign and ParaFit alignments for compounds 1247272-S and 660787-S. ..... 45
Table 24 Scaffolds of CID 648322 and CID 660787 with activity information. ..... 46
Table 25 FieldAlign results for compounds 648322-S and 660787-S. FieldAlign alignment with best match of exit vectors. Ranking position 6. ..... 46
Table 26 ParaFit and FieldAlign results for 648322-S and 660787-S. ..... 47
Table 27 Scaffold conformations resulting from GOLD and results of FieldAlign. ..... 50
Table 28 Match of compounds 1213867 and 5307622 with activity information. ..... 52
Table 29 Compounds 3233726 and 1113965 with activity information. ..... 53
Table 30 FieldAlign results for compounds 3233726-S and 1113965-S. ..... 53
Table 31 FieldAlign and ParaFit alignments for compounds 3233726-S and 1113965-S. ParaFit is performed with the FieldAlign result conformation. ..... 54
Table 32 Match of compounds 1301263-S and 2537704-S with activity information. ..... 55
Table 33 FieldAlign results for compounds 1301263-S and 2537704-S. ..... 56
Table 34 FieldAlign and ParaFit alignments for compounds 1301263-S and 2537704-S. ..... 56
Table 35 Compounds 2228581 and 2375583 and activity information. ..... 57
Table 36 FieldAlign results for 2228581-S and 2375583-S. ..... 58
Table 37 Query structure 2375583-S and MOE minimized query structure. ..... 58
Table 38 ParaFit alignment of query 2375583-S with the best FieldAlign output conformation of 2228581-S. ..... 59
Table 39 Match of 848717 and 2346231 with activity information. ..... 59
Table 40 Match of compounds 662711 and 5309163 with activity information. ..... 60
Table 41 FieldAlign results for compounds 662711-S and 5309163-S. ..... 60
Table 42 ParaFit results for compounds 662711-S and 5309163-S. ..... 61
Table 43 Match of compounds 1121074 and 3243128 with activity information. ..... 62
Table 44 FieldAlign results for compounds 1121074-S and 3243128-S. ..... 63
Table 45 ParaFit alignment of 1121074-S and 3243128-S; left: results for the initial conformation, right: results with the FieldAlign output conformation. ..... 63
Table 46 Match of compounds 1472474 and 5069358 with activity information. ..... 64
Table 47 FieldAlign results for compounds 1472474-S and 5069358-S. ..... 64
Table 48 ParaFit alignments of compounds 1472474-S and 5069358-S. ..... 65
Table 49 Match of compounds 753280 and 4189325 with activity information. ..... 66
Table 50 FieldAlign results for 753280-S and 4189325_1-S. ..... 67
Table 51 ParaFit alignments of 753280-S and the FieldAlign conformation of 4189325-S. ..... 67
Table 52 Match of compounds 753277 and 418932 and activity information. ..... 67
Table 53 FieldAlign results for compounds 753277-S and 418932_3-S. ..... 68
Table 54 Match of compounds 664194 and 3586643 with activity information. ..... 69
Table 55 FieldAlign results for compounds 664194-S and 3586643-S. ..... 70
Table 56 ParaFit alignments of compounds 664194-S and FieldAlign output conformation of 3586643-S. ..... 70
Table 57 Match of compounds 743142 and 3235480 with the Log of AC50. ..... 71
Table 58 FieldAlign results for compounds 743142-S and 3235480-S. ..... 71
Table 59 ParaFit alignments of compounds 743142-S and 3235480-S. ..... 72
Table 60 Match of compounds 389699 and 6604860 with activity information. ..... 72
Table 61 FieldAlign results of compounds 389699-S and 6604860-S. ..... 72
Table 62 FieldAlign and ParaFit alignments of compounds 389699-S and 6604860-S. ..... 73
Table 63 Match of compounds 682802 and 6603723 with activity information. ..... 74
Table 64 FieldAlign results for compounds 682802-S and 6603723-S. ..... 75
Table 65 ParaFit alignments of compounds 682802-S and 6603723-S. ..... 75
Table 66 Match of compounds 828654 and 974955 with activity information. ..... 76
Table 67 FieldAlign results for compounds 828654-S and 974955-S. ..... 77
Table 68 ParaFit alignments of 828654-S and 974955-S with two different options: IEL/surf and EAL/surf. ..... 77
Table 69 Match of compounds 1088428 and 2972465 with activity information. ..... 78
Table 70 FieldAlign results for 1088428-S and 2972465-S. ..... 78
Table 71 ParaFit alignment of 2972465-S with the FieldAlign output conformation of 1088428-S. ..... 79
Table 72 Match of compounds 746912 and 2972880 with activity information. ..... 79
Table 73 FieldAlign results for compounds 746912-S and 2972880-S. ..... 80
Table 74 ParaFit alignments of query 746912-S and 2972880-S with the initial input conformation and the FieldAlign result of 2972880-S. Optimized for IEL and surface. ..... 80
Table 75 Match of compounds 2725318 and 5854588 with activity information. ..... 82
Table 76 FieldAlign results for 2725318-S and 5854588-S. ..... 83
Table 77 ParaSurf alignment of 2725318-S and the FieldAlign output conformation of 5854588-S. ..... 83
Table 78 Match of compounds 665592 and 705445 with activity information. ..... 85
Table 79 FieldAlign results for 665592-S and 705445-S. ..... 85
Table 80 Match of compounds 754802 and 923600 with activity information. ..... 86
Table 81 FieldAlign results for compounds 754802-S and 923600-S. ..... 87
Table 82 ParaFit alignments of compound 754802-S and the initial conformation of 923600-S as well as the FieldAlign output conformation of 923600-S. ..... 87
Table 83 Active compounds of assay ID 1215. Each column contains compounds with nearly identical scaffolds. ..... 89
Table 84 Similarity scores and RMSD of exit vectors for FieldAlign and ParaFit results of selected examples. The final confidence level is given in the last column: no (-), moderate (±) and high (+) confidence for a bioisosteric scaffold replacement. ..... 92
Table 85 Identified potential bioisosteric replacements with a high confidence level and their differences in chemical properties. ..... 94
Table 86 Identified potential bioisosteric replacements with a moderate confidence level and their differences in chemical properties. ..... 96
Table 87 Examples with no confidence for a bioisosteric potential and their differences in chemical properties. ..... 97
Zusammenfassung (Summary)

The bioisosteric replacement of central or core elements of a lead structure, so-called
scaffolds, plays an important role in lead optimization. A bioisosteric replacement aims
to achieve a similar biological function with a different structure. Such a structural
replacement can have various effects: for example, it can improve the bioavailability of
the lead structure, prevent toxic effects, or replace or circumvent structures that are
already patented. Several programs suggest possible scaffold replacements while
preserving both the interaction potential of the scaffold and the orientation of adjacent
substituents or functional groups. These programs follow different theoretical
approaches; for example, they use electronic fields or pharmacophores to determine the
similarity of different scaffolds. Retrospective examples for validating these methods
are rare, however, so assessing the quality of these tools is not a simple task.

The aim of this work is to increase the number of known retrospective examples of
successful bioisosteric scaffold replacements. To this end, a systematic search for such
replacements was developed. The search was carried out on the data of active ligands
from 71 selected bioassays of the PubChem database; together, these bioassays contain
11020 molecules. The assay data are searched for pairs of molecules that contain at
least two common but unconnected substructures. If the remaining, differing structures
connect all common structures of the two molecules, the pair may represent a scaffold
replacement.
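The pair criterion described above can be illustrated with a minimal Python sketch. It is not the search method of this work; it only shows one possible reading of the criterion on toy data. The assumptions are: a molecule is given as a number of atoms plus a list of bonds (index pairs), the common substructures have already been found and are passed in as sets of atom indices, and the leftover atoms must form a single non-empty fragment bonded to every common substructure. The function names `is_connected` and `connecting_structure` and both toy molecules are hypothetical.

```python
from collections import deque


def is_connected(atoms, bonds):
    """Return True if `atoms` form one connected fragment using only bonds inside the set."""
    atoms = set(atoms)
    if not atoms:
        return False
    neighbours = {atom: set() for atom in atoms}
    for a, b in bonds:
        if a in atoms and b in atoms:
            neighbours[a].add(b)
            neighbours[b].add(a)
    start = next(iter(atoms))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in neighbours[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == atoms


def connecting_structure(n_atoms, bonds, common_parts):
    """Return the leftover atoms that link all common substructures, or None.

    Assumption of this sketch: the leftover atoms must be a single,
    non-empty fragment that is bonded to every common substructure.
    """
    if len(common_parts) < 2:
        return None
    owner = {atom: i for i, part in enumerate(common_parts) for atom in part}
    # The common substructures must be unconnected: no bond joins two of them directly.
    if any(a in owner and b in owner and owner[a] != owner[b] for a, b in bonds):
        return None
    rest = set(range(n_atoms)) - set(owner)
    if not is_connected(rest, bonds):
        return None
    # Record which common substructures the leftover fragment is bonded to.
    touched = set()
    for a, b in bonds:
        if a in rest and b in owner:
            touched.add(owner[b])
        elif b in rest and a in owner:
            touched.add(owner[a])
    return rest if len(touched) == len(common_parts) else None


# Hypothetical toy molecules: two shared end groups joined by different linkers.
chain_a = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]  # two-atom linker (atoms 2, 3)
chain_b = [(0, 1), (1, 2), (2, 3), (3, 4)]          # one-atom linker (atom 2)
link_a = connecting_structure(6, chain_a, [{0, 1}, {4, 5}])
link_b = connecting_structure(5, chain_b, [{0, 1}, {3, 4}])
is_candidate = link_a is not None and link_b is not None
```

In this sketch a pair counts as a candidate scaffold replacement only if a connecting fragment is found in both molecules. Finding the common substructures themselves is the harder step and is deliberately left out.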

To determine how closely the interaction potentials of the two molecules of a scaffold
pair agree, some selected examples were superimposed with two different programs,
FieldAlign and ParaFit. Different programs were chosen in order to obtain independent
results.