Sequence-structure relations of pseudoknot RNA
19 pages
English

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

Sequence-structure relations of pseudoknot RNA

-

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris
Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus
19 pages
English
Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

Description

The analysis of sequence-structure relations of RNA is based on a specific notion and folding of RNA structure. The notion of coarse grained structure employed here is that of canonical RNA pseudoknot contact-structures with at most two mutually crossing bonds (3-noncrossing). These structures are folded by a novel, ab initio prediction algorithm, cross, capable of searching all 3-noncrossing RNA structures. The algorithm outputs the minimum free energy structure. Results After giving some background on RNA pseudoknot structures and providing an outline of the folding algorithm being employed, we present in this paper various, statistical results on the mapping from RNA sequences into 3-noncrossing RNA pseudoknot structures. We study properties, like the fraction of pseudoknot structures, the dominant pseudoknot-shapes, neutral walks, neutral neighbors and local connectivity. We then put our results into context of molecular evolution of RNA. Conclusion Our results imply that, in analogy to RNA secondary structures, 3-noncrossing pseudoknot RNA represents a molecular phenotype that is well suited for molecular and in particular neutral evolution. We can conclude that extended, percolating neutral networks of pseudoknot RNA exist.

Informations

Publié par
Publié le 01 janvier 2009
Nombre de lectures 1
Langue English
Poids de l'ouvrage 3 Mo

Extrait

BioMed CentralBMC Bioinformatics
Open AccessResearch
Sequence-structure relations of pseudoknot RNA
Fenix WD Huang, Linda YM Li and Christian M Reidys*
Address: Center for Combinatorics, LPMC-TJKLC, Nankai University, Tianjin 300071, PR China
Email: Fenix WD Huang - fenixprotoss@163.com; Linda YM Li - liyanmei@cfc.nankai.edu.cn; Christian M Reidys* - reidys@nankai.edu.cn
* Corresponding author
from The Seventh Asia Pacific Bioinformatics Conference (APBC 2009)
Beijing, China. 13–16 January 2009
Published: 30 January 2009
BMC Bioinformatics 2009, 10(Suppl 1):S39 doi:10.1186/1471-2105-10-S1-S39
<supplement> <title> <p>Selected papers from the Seventh Asia-Pacific Bioinformatics Conference (APBC 2009)</p> </title> <editor>Michael Q Zhang, Michael S Waterman and Xuegong Zhang</editor> <note>Research</note> </supplement>
This article is available from: http://www.biomedcentral.com/1471-2105/10/S1/S39
© 2009 Huang et al; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: The analysis of sequence-structure relations of RNA is based on a specific notion
and folding of RNA structure. The notion of coarse grained structure employed here is that of
canonical RNA pseudoknot contact-structures with at most two mutually crossing bonds (3-
noncrossing). These structures are folded by a novel, ab initio prediction algorithm, cross, capable
of searching all 3-noncrossing RNA structures. The algorithm outputs the minimum free energy
structure.
Results: After giving some background on RNA pseudoknot structures and providing an outline
of the folding algorithm being employed, we present in this paper various, statistical results on the
mapping from RNA sequences into 3-noncrossing RNA pseudoknot structures. We study
properties, like the fraction of pseudoknot structures, the dominant pseudoknot-shapes, neutral
walks, neutral neighbors and local connectivity. We then put our results into context of molecular
evolution of RNA.
Conclusion: Our results imply that, in analogy to RNA secondary structures, 3-noncrossing
pseudoknot RNA represents a molecular phenotype that is well suited for molecular and in
particular neutral evolution. We can conclude that extended, percolating neutral networks of
pseudoknot RNA exist.
other hand, RNA, being less structurally constrained thanBackground
Three decades ago, Michael Waterman pioneered the its chemical relative DNA, does fold into 3D-structures,
combinatorics and ab initio prediction of the at that time representing the phenotypic executive. Therefore one
rather exotic ribunucleic acid (RNA) secondary structures molecule stands for both: geno- and phenotype.
[1-5]. The motivation for this work was coming from a
fundamental dichotomy represented by RNA. On one Indeed, a vast variety of RNA activities was found: the dis-
hand RNA is described by its primary sequence, a linear covery of catalytic RNAs, or ribozymes, in 1981 proved
string composed of the nucleotides A, G, U and C. The pri- that RNA could catalyze reactions just as proteins. RNA
mary sequence embodies the genotypic legislative. On the can act also as a messenger between DNA and protein in
Page 1 of 19
(page number not for citation purposes)BMC Bioinformatics 2009, 10(Suppl 1):S39 http://www.biomedcentral.com/1471-2105/10/S1/S39
20 ribosomal RNA [10]. They are conserved in the catalytic
core of group I introns, in plant viral RNAs pseudoknots
mimic tRNA structure and in in vitro RNA evolution [11],
where experiments produced families of RNA structures
30 with pseudoknot motifs, when binding HIV-1 reverse
10
transcripts. Important mechanisms like ribosomal frame
3’end shifting [12] also involve pseudoknot interactions.
50 56
For prediction algorithms the implications of cross-serial
40 dependencies are severe-they imply a higher level of for-
mal language: context-sensitive. In general, on this level of
formal languages it is not clear whether or not polynomial
time ab initio folding algorithms exist. Indeed, Lyngsø et
1
al. [13] showed that "reasonable" classes of RNA pseudo-5’end
knots require exponential time algorithms. There exist
however, polynomial time folding algorithms, capable of
the energy based prediction of certain pseudoknots: Rivas
3’end et al. [14], Uemura et al. [15], Akutsu [16] and Lyngsø5’end 1 10 20 30 40 50 56
[13]. The output of these algorithms, however, remains
.((((((((...............[[[[[[[[[[.))))))))..]]]]]]]]]].
somewhat "mysterious"-it is not clear which types of
pseudoknots can be generated.Figure 1RNA pseudoknot structures
RNA pseudoknot structures. Three representations of
In analogy to the case of RNA secondary structures, thethe UTR-pseudoknot structure of the mouse hepatitis virus.
identification of key combinatorial properties of the out-First, the planar graph representation, second the diagram
representation and finally the output produced by cross. put class offers deeper understanding. The combinatorial
properties of RNA pseudoknot structures discussed in the
following have indeed profound implications: first
the form of transfer RNA. The realization that RNA com- sequence-structure maps will generate exponentially
bines features of proteins with DNA led to the "RNA many structures with neutral networks of exponential
world" hypothesis for the origin of life. The idea was that size. Second, the latter will come close to each other in
DNA and the much more versatile proteins took over sequence space, thereby allowing for efficient evolution-
RNA's functions in the transition from the "RNA-world" ary search. None of these findings depend on the particu-
to the "DNA/protein-world". lar choice of loop-energies or the partition function [17].
Furthermore, without combinatorial specification, as it is
Let us have a closer look at RNA phenotypes. RNA mole- the case for the above mentioned DP based pseudoknot
cules form "helical" structures by folding, i.e. pairing their folding algorithms [14], one arrives at an impossibly large
nucleotides and thereby lowering their minimum free configuration space.
energy (mfe). Originally, these bonds were subject to strict
combinatorial constraints, for instance "noncrossing" in For instance, the inductive generation of gap-matrices
RNA secondary structures. For the latter, dynamic pro- produces arbitrarily high number of mutually crossing
gramming (DP) algorithms, predicting the minimum free arcs. The results in [18] prove, that the exponential growth
energy configuration were given 1980 [5,6]. It is well- rate of pseudoknot structures is linear in the crossing
known, however, that RNA structures are far more com- number. Accordingly, via gap-matrices, an uncontrollably
plex than secondary structures. One particularly large output class is being generated. Nevertheless, the
prominent feature is the existence of cross-serial depend- DP-routine using pairs of gap-matrices cannot generate
encies [7], that is crossing arcs or pseudoknots, see Figure any 3-noncrossing nonplanar pseudoknot structure.
1, where we display the natural UTR-pseudoknot structure
of the mouse hepatitis virus. Cross also folds into the nat- We will show that the notion of k-noncrossing diagrams
ural structure given in Figure 1. In Figure 2 we present [19] allows us to specify a suitable output-class for pseu-
another RNA pseudoknot structure, the HDV-pseudo- doknot folding algorithms. Recall that a diagram is a
knot. We present here the structure as folded by cross and graph over the vertex set [n] = {1, ..., n} with vertex degree
also its natural structure [8]. less than or equal to one. It is represented by drawing the
vertices in a horizontal line and its arcs (i, j), where i <j, in
In fact, RNA pseudoknots are "everywhere". They occur in the upper half-plane. The vertices and arcs correspond to
functional RNA, like for instance RNAseP [9] as well as nucleotides and Watson-Crick (A-U, G-C) and (U-G) base
Page 2 of 19
(page number not for citation purposes)BMC Bioinformatics 2009, 10(Suppl 1):S39 http://www.biomedcentral.com/1471-2105/10/S1/S39
12 3 4 5 6 7 8 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87
a
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87
b
Figure 2HDV structure
HDV structure. (a) Diagram representation of Hepatitis Delta Virus structure folded by our algorithm. (b) Diagram repre-
sentation of natural Hepatitis Delta Virus.
pairs, respectively. A diagram is k-noncrossing if it con- structures λ is greater than or equal to four. We call dia-
tains at most k - 1 mutually crossing arcs. Diagrams have grams with a minimum stack-length τ, τ-canonical and if
t

  • Univers Univers
  • Ebooks Ebooks
  • Livres audio Livres audio
  • Presse Presse
  • Podcasts Podcasts
  • BD BD
  • Documents Documents