Volume 13 Supplement 3
Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2011: Genomics
Simultaneous prediction of RNA secondary structure and helix coaxial stacking
 Pooya Shareghi^{1}Email author,
 Yingfeng Wang^{1},
 Russell Malmberg^{2, 3} and
 Liming Cai^{1, 2}Email author
DOI: 10.1186/1471216413S3S7
© Shareghi et al.; licensee BioMed Central Ltd. 2012
Published: 11 June 2012
Abstract
Background
RNA secondary structure plays a scaffolding role for RNA tertiary conformation. Accurate secondary structure prediction can not only identify doublestranded helices and single strandedloops but also help provide information for potential tertiary interaction motifs critical to the 3D conformation. The average accuracy in ab initio prediction remains 70%; performance improvement has only been limited to short RNA sequences. The prediction of tertiary interaction motifs is difficult without multiple, related sequences that are usually not available. This paper presents research that aims to improve the secondary structure prediction performance and to develop a capability to predict coaxial stacking between helices. Coaxial stacking positions two helices on the same axis, a tertiary motif present in almost all junctions that account for a high percentage of RNA tertiary structures.
Results
This research identified energetic rules for coaxial stacks and geometric constraints on stack combinations, which were applied to developing an efficient dynamic programming application for simultaneous prediction of secondary structure and coaxial stacking. Results on a number of noncoding RNA data sets, of short and moderately long lengths, show a performance improvement (specially on tRNAs) for secondary structure prediction when compared with existing methods. The program also demonstrates a capability for prediction of coaxial stacking.
Conclusions
The significant leap of performance on tRNAs demonstrated in this work suggests that a breakthrough to a higher performance in RNA secondary structure prediction may lie in understanding contributions from tertiary motifs critical to the structure, as such information can be used to constrain geometrically as well as energetically the space of RNA secondary structure.
Introduction
RNA secondary structure plays the critical role of scaffolding the tertiary structure (i.e., 3D conformation) [1–5]. In the secondary structure, WatsonCrick (AU and GC) and wobble GU pairs form doublestranded helices that enclose unpaired, singlestrand loops [6]. The distinguishable pattern of canonical base pairs has enabled ab initio prediction of the secondary structure, typically by minimization of the global free energy associated with involved structure elements [7–10]. In the past three decades, considerable success has been made in secondary structure prediction, e.g., with average accuracy of about 70% [11–13], and offered a viable venue toward RNA tertiary structure prediction [4, 5, 13, 14]. However, prediction performance breakthroughs have been limited to short RNA sequences; improvements on the accuracy for longer RNA sequences have relied on multiple related sequences [15, 16], which are often not available, or profile based alignments [17, 18], which can only be effective for known structures.
Elements of the secondary structure are interrelated with tertiary interaction motifs [2, 4, 19], which consist of less understood noncanonical base pairs, with some just being revealed recently [19, 20]. Such motifs bundle and connect helices to form and stabilize the tertiary structure. As a common local motif, two helices sharing a contiguous backbone strand may coaxially stack resulting in an energetically more stable pseudocontiguous helix [21, 22]. Coaxial helices are prevalent in known RNA tertiary structures, for instance accounting for 32% of 613 tertiary interactions in 54 highresolution RNA structures investigated by Schlick group [23]. In particular, they are present at about 84% of multiple loop junctions involved in these structures. Since junctions are singlestrand loops joined and enclosed by helices, computational methods effective on prediction of coaxial stacking would substantially improve the performance of the secondary structure prediction as well.
There were only a few previous results in computational investigation of RNA helix coaxial stacking. Walter et al [24] demonstrated in a case study that basepair to basepair stacks between terminal base pairs of two neighboring helices provide free energy improvement for the predicted secondary structure. Tyagi and Mathews [22] tested the idea of predicting coaxial stacking by free energy minimization using nearestneighborhood parameters on known RNA secondary structures. They showed the potential to predict coaxial stack with free energy minimization when the number of intervening mismatches between stacked helices is small. In the comparative analysis of 3way junctions joined by three helices, Lescoute and Westhof measured distance distributions between the two coaxially stacked helices within the junctions [25]. For junctions of fourways and of higher orders, it was observed by Schlick group [26, 27] that coaxial stacking occurs preferentially in helices adjacent to loops of small size and rich in adenine. In this paper, we present a new method for the prediction of RNA secondary structure and coaxial stacking. Different from previous secondary structure prediction methods, ours can produce information of coaxially stacked helices included in the predicted structure. Unlike prediction of coaxial stacking upon an already predicted secondary structure, the new method offers the simultaneous prediction of the two. We discovered and applied rules of coaxial stacking, including both sequential and structural patterns, to the prediction of secondary structure. Such rules constrain possible energetic and geometric relationships between helices to be predicted, resulting in a reduced space of alternative structures and a potential improvement in secondary structure prediction.
The new method has been developed into a dynamic programming application (called RNAcoast). We conducted tests on five families of ncRNAs, of total 386 sequences, to evaluate the capability of the new method in simultaneous prediction of secondary structure and coaxial stacking. These ncRNAs were retrieved from Rfam database [28], of short to moderately long lengths, with and without coaxial stacking in the tertiary structure. RNAcoast produced comparable predictions as the stateoftheart program RNAfold on all cases, it outperformed the latter on tRNAs, where coaxial stacks are present, by an additional 17% accuracy, a significant leap from the average performance (i.e., 6070% of the number of correct base pairs) achievable by previous energy based models on tRNAs. The test results demonstrate that coaxial stacking rules can successfully narrow down a possibly large number of alternative structures within 510% of the predicted minimum energy, which would otherwise be difficult to distinguish.
Results
We implemented the algorithm into a program named RNAcoast. We tested five ncRNA sets of sequences on our program and compared the predicted secondary structure with the original Rfam annotation to evaluate its accuracy. We also tested these ncRNA sequences on the stateoftheart secondary structure prediction program RNAfold [9, 29], and made performance comparisons between the mentioned programs. All test data and results are available at: http://www.cs.uga.edu/~shareghi/RNAcoast.
Data preparation
Sensitivity based on the number of correctly predicted base pairs
ncRNA  Num. of sequences  Avg len.  Min len.  Max len.  Sensitivity (RNAcoast)  Sensitivity (RNAfold) 

Hh3  84  55  40  82  85.04%  95.71% 
tRNA  95  74  66  93  81.67%  64.59% 
IntrongII  98  87  42  154  81.94%  83.71% 
P4P6  30  126  58  191  57.42%  64.62% 
HCV  79  112  85  116  83.01%  78.43% 
Performance in secondary structure prediction
We conducted two types of evaluations on the predicted structures. One is to consider the percentage of base pairs correctly predicted by the programs. The other is to consider the number of sequences whose overall structure topology is correctly predicted. Shown in the next section, we also evaluated the capability of RNAcoast in predicting coaxial stacks.
where TP is the number of true positives (i.e. correctly predicted base pairs) and FN the number of false negatives (i.e. missed base pairs). The results show that for short sequences of simpler secondary structures, i.e., Hammerhead type III and Intron Group II, both RNAcoast and RNAfold performed well, with RNAcoast slightly less accurate than RNAfold. Also, for longer sequences in HCV IRES domain III dataset, both programs performed well, with RNAcoast slightly more accurate than RNAfold.
Test results on the tRNA data set demonstrates the true advantage of incorporating coaxial stacking into prediction of ncRNAs that may contain coaxial stacking motifs. RNAcoast outperformed RNAfold by more than 17% accuracy, a significant leap from the average performance (i.e., 6070%) achievable by previous energy based models on tRNAs. The coaxial stacking rules successfully narrowed down a possibly large number of alternative structures within 510% of the predicted minimum energy, which would otherwise be difficult to distinguish [12].
Sensitivity based on the number of correctly predicted topologies
ncRNA  Topology sen. (%)  Adjusted topology sen. (%)  

RNAcoast  RNAfold  RNAcoast  RNAfold  
Hh3  75  92.86  N/A  N/A 
tRNA  72.63  24.21  86.32  27.37 
IntrongII  75.51  84.69  N/A  N/A 
P4P6  30  56.67  66.67  86.67 
HCV  74.68  75.95  N/A  N/A 
We point out that the relatively low sensitivity for RNAcoast on Hammerhead ribozyme type III shown in Table 2 was due to the extra stemloop it predicted within the threeway junction, much as the situation of the variable loop of tRNAs.
For longer sequences of P4P6, counting correctly predicted base pairs appeared to distance RNAcoast a little more from RNAfold; but neither programs achieved a satisfactory sensitivity. The underperformance may be explained by the nature of the P4P6 sequences and the reference consensus structure from Rfam. Out of the thirty sequences tested, 12 of them have lengths exceeding 150 (but under 191), 6 sequences have lengths below 90, and another 12 have lengths in between. The consensus structure from Rfam was based on the smallest group of short sequences, leaving a long inserted region for others. Though both programs were able to predict the substructure formed in the inserted region, but the small number of base pairs annotated in the consensus made them easy to be missed by both programs. However, in spite of the low number of base pairs correctly predicted by RNAcoast, the program was able to achieve an adjusted 67% sensitivity in topology prediction.
Performance in coaxial stacking prediction
where FP stands for false positive, the number of incorrectly predicted coaxial stacks.
PPV and sensitivity based on the number of correctly predicted coaxial stackings
ncRNA  Num of sequences  TP  FP  PPV(%)  Sensitivity(%) 

tRNA  95  130  44  74.71  68.42 
Hh3  84  59  37  61.45  70.23 
HCV  79  74  38  66.07  46.83 
We compare these results with a previous work by Tyagi and Mathews who tested the idea of coaxial stack prediction using the energy minimization with nearestneighbor parameters [22] on 31 ncRNAs (with known secondary structures and crystal tertiary structures). We notice that there were 17 tRNA sequences among these 31 sequences, for which the average PPV and sensitivity reported in the literature [22] were 58% and 66%, respectively on k = 0 and k = 1, where k is the number of unpaired nucleotides at the point of backbone joining of the two coaxially stacked helices.
Discussion
While our program, RNAcoast, produced comparable predictions as the stateoftheart program RNAfold on all cases, it outperformed the latter on tRNAs, where coaxial stacks are present, by more than 17% accuracy, a significant leap from the average performance (i.e., 6070% of the number of correct base pairs) achievable by previous energy based models on tRNAs. Furthermore, RNAcoast predicted 86% of secondary structure topologies correctly, while RNAfold only predicted 27% of topologies correctly. Our coaxial stacking rules can successfully pick out the most plausible one from a possibly large number of alternative structures within 510% of the predicted minimum energy, which would otherwise be difficult to distinguish [12]. Such a performance is encouraging to solving the problem of RNA tertiary structure prediction.
We point out the small differences in performance between RNAcoast and RNAfold on Hammerhead type III and Intron Group II were most likely due to the simple strategy to exclude loop energies which was built into the current version RNAcoast. This was a little more of an issue for long sequences in the P4P6 dataset, which became serious when the consensus structure did not include a long inserted region. While improving the performance of RNAcoast can be achieved by incorporating the dismissed loop energies, a strategy different from evaluating predictions against the consensus structure may help as well.
We did not use the positive predictive value (PPV) to measure the performance in the correctly predicted base pairs. This was because some base pairs not belonging to the consensus structure but predicted by the programs may be valid if they fall in inserted regions of the consensus structure. Counting such base pairs as false positives would be bias against sequences substantially longer than the consensus. The situation was evident by our tests on these sequences, typically tRNAs where the variable loop may contain an extra stemloop.
We have also examined the coaxial stacking prediction on the P4P6 sequences by RNAcoast. In contrast to RNAfold that predicted 8 threeway junctions in the long inserted region, our program predicted 5 threeway junctions in that region, with the same left nested coaxial stack predicted for 3 out of the 5 threeway junctions. Such a predicted coaxial stack has yet to be verified as it was counted as a real motif in one work [25] while was not by another [22].
The outcome of the tests on tRNAs is most interesting. The secondary structures of tRNAs were difficult to predict from individual sequences with energybased methods, in spite of the conserved native structure across types and species. This is because a tRNA may have many alternative structures with free energies within 510% of the minimum free energy.
Conclusions
This work introduced a new method for simultaneous prediction of RNA secondary structure and coaxial stacking between helices. The aim of the incorporation of coaxial stacking detection included improving the performance of energybased ab initio secondary structure prediction. Our research identified sequential, energetic, and geometric rules for helix coaxial stacking to apply to a dynamic programming algorithm for secondary structure prediction. Results from testing the implemented program RNAcoast on five ncRNA datasets obtained from Rfam demonstrated the effectiveness of our method.
The significant leap of performance on tRNAs in this work suggests that a breakthrough to a higher performance in RNA secondary structure prediction may lie in understanding contributions from tertiary motifs critical to the structure, as such information can be used to constrain geometrically as well as energetically the space of RNA secondary structure. Since coaxial stacking is still a local tertiary motif, incorporating information of tertiary motifs of higher orders, such junctions, may further improve the prediction performance.
Methods
Coaxial stacking rules
Previous investigations on threeway junctions [25] and junctions of higher orders [22, 26, 27] have revealed the small number k of unpaired nucleotides present at the joining loop between the two helices involved in a coaxial stacking. To verify this phenomenon for a wider spectrum of ncRNAs, we conducted a survey on the 51 sets of ncRNA seed alignments from Rfam [28], which had been used by software Infernal [18] as benchmarks. We computed the thermodynamic free energy of every helix instance using the RNAeval component of the Vienna RNA Package [9, 29].
Based on this survey, we were able to identify two energy thresholds: less than 2.5 Kcal/mol for semistable helices, and less than 3.7 Kcal/mol for stable helices [30]. Both require at least three base pairs in which at least one is a GC pair. We discovered that semistable helices are overwhelmingly very close to other helices in backbone positions. This confirms our conjecture that semistable helices interact with other helices on a contiguous strand, i.e., through coaxial stacking [30]. This also suggests a small distance k between coaxial stacked helices, consistent with the findings by others [22, 25–27]. In this preliminary work, we used k ≤ 1 as a necessary condition for two neighboring helices to coaxially stack. In our method, coaxial stacking may occur in a twoway junction consisting of two helices sharing both connecting loops or in a multiway junction joined by multiple helices.
 1.
Coaxial stack (X, Y) is nested if max L(X) < min L(Y) and max H(Y) < min H(X).
 2.
Coaxial stack (X, Y) is parallel if max H(X) < min L(Y).
In particular, coaxial stacks in twoway junctions are always nested. In multipleway junctions, coaxial stacks may be either nested or parallel (see Figures 1 and 2). For example, Figure 1 shows two coaxial stacks: a parallel stack between the D helix and the anticodon helix, and a nested stack between the T_{Ψ}C helix and the acceptor helix.
The amount of reduced energy, attributed to a coaxial stack, is defined as the free energy contributed from the two stacked base pairs on the interface (see Figure 2). The amount of energy, thus computed via software RNAeval, ranges from 0.9 Kcal/mol to 3.4 Kcal/mol. This is close to the parameter used by Tyagi and Mathews [22].
Geometric constraints
 1.
Stacks (X, Y) and (Z, X) are nested stacks,
 2.
Stack (X, Y) is nested and stack (X, Z) or (Z, X) is parallel,
 3.
Stack (Y, X) or (X, Y) is parallel and stack (X, Z) is nested.
Algorithm
We developed our method into an algorithm for ab initio and simultaneous prediction of secondary structure and coaxial stacks. There are two major phases: preprocessing and prediction. Given a query RNA sequence, the preprocessing step finds all semistable, stable, and ultrastable helices (see the Algorithm overview section below), and also all potential coaxially stacked helix pairs. The computed information is then passed onto the prediction phase, which uses a dynamic programming algorithm in the spirit of Nussinov's algorithm. However, our algorithm is established at helixlevel instead of nucleotidelevel for the purpose of incorporating coaxial stacking. Since helices cannot be sorted in a linear order, the dynamic programming algorithm design became a nontrivial task.
Preprocessing of helices
The preprocessing step picks up helix candidates and identifies potential coaxial stacks. A semiglobal alignment algorithm is used for searching helix candidates [30]. In a helix candidate, either backbone is allowed to contain at most one unpaired nucleotide. The free energy of helix candidates is measured using RNAeval, a component of the Vienna RNA Package [9, 29].
Two helices are recognized as a potential coaxial stack if they share a contiguous singlestrand backbone with at most one unpaired nucleotide. Potential coaxial stacks are classified into parallel and nested stacks based on the conditions given in the section above about Coaxial stacking rules. The extra energy reduction of a coaxial stacking is computed from the two terminal base pairs of the helices as if they were two contiguous base pairs (see the same section).
Prediction via dynamic programming
We adopted the idea in Nussinov's algorithm [7] to develop a dynamic programming algorithm for simultaneous prediction of secondary structure and coaxial stacks. Nussinov's algorithm and alike [9, 11] use a simple dynamic programming approach, at the nucleotide level, to predict the secondary structure of an RNA. For each subsequence from position i to j, Nussinov's algorithm computes the substructure with the maximum number of base pairs. In contrast, however, our algorithm works at the helix level in order to incorporate coaxial stacking information. Since helix candidates cannot be sorted in a linear order, the dynamic programming is not straightforward. We addressed this issue by employing partial orderings.
Candidates and orderings
A helix consists of two base pairing regions; each region is a contiguous backbone consisting of a number consecutive nucleotides. A helix found by the preprocessing step can be viewed as two base pairing regions. Throughout this section we will refer to candidate regions simply as candidates.
On an RNA sequence x_{1} ... x_{ n }, for each subsequence from position x_{ a } to x_{ b } our algorithm goes through every pair of candidates i and j, where i starts at position x_{ a } and j ends at position x_{ b }. The preprocessing may generate several candidates that start at the same position or end at the same position. Therefore, the order in which we visit such overlapping candidates is important to ensure that we always move from smaller subproblems to larger ones. In other words, for the mentioned subsequence, we want to consider the longest candidate i and the longest candidate j before considering shorter ones. Hence, we assign two different indices to each candidate according to Starting Position Order (SPO) and Ending Position Order (EPO), i.e., for two candidate regions r and s, assuming b(r) gives the starting position of region r, and e(r) gives its ending position, we have:

r ≤ _{ SPO }s, if b(r) < b(s), or if b(r) = b(s) &e(r) < e(s).

r ≤ _{ EPO }s, if e(r) < e(s), or if e(r) = e(s) &b(r) < b(s).
If two candidates occupy the exact same region on the sequence, then one of them gets the lower index in a consistent manner throughout the algorithm.
The recurrence relations in our dynamic programming algorithm have the general form F(i, j), where F is a recursive function defined with specific semantic constraints; it gives the maximum score for the optimal substructure (following the mentioned constraints) of the subsequence that starts from the beginning of the candidate with SPO index i and ends at the end of the candidate with EPO index j. Henceforth, for convenience, i always refers to an SPO index, and j always refers to an EPO index.
Algorithm overview
Similar to Nussinov's algorithm, four different cases can happen when finding the optimal structure of the subsequence spanned from candidate i to j:

Region i forms a helix (or pairs) with region j.

Region i does not participate in the optimal structure.

Region j does not participate in the optimal structure.

The optimal structure is formed by putting together the optimal substructures of the subsequence from region i to region k, and of the subsequence from region k + 1 to region j, for some k.
 1.
An mway junction or a single helix not enclosed by any other helix. Such an mway junction is without a coaxial stacking.
 2.An mway junction enclosed by a helix:
 (a)
an mway junction, without coaxial stacking,
 (b)
a 2way junction where the helices coaxially stack,
 (c)
an mway junction, m > 2, with left or right nested coaxial stacking,
 (d)
an mway junction, m > 2, where two of the helices form a parallel stacking.
 (a)
Each of the three types of helices, defined earlier (see the Preprocessing section), contributes differently to building the above topological constructs. A semistable helix can appear in the predicted structure only if (a) it coaxially stacks with a stable helix and does not enclose any other helices, or (b) it participates in a 2way junction and the two helices together are strong enough to act as a stable helix. The only restriction for a stable helix is that it cannot immediately enclose an mway junction. In addition to semistable and stable helices, we also have ultrastable helices. An ultrastable helix has a free energy level lower than 4.6 Kcal/mol and has more than 5 base pairs. For an mway, m > 2, junction to exist, it needs to be enclosed by an ultrastable helix. An exception to this rule is when two helices involved in a 2way junction, put together, are strong enough to act as an ultrastable helix, then they also can enclose an mway junction. We define different types of recurrences for generating an optimal secondary structure made of the above topological constructs such that the geometric constraints are met and the coaxial stacking rules are followed as well.

M (i, j) in which the substructure for the subsequence from region i to j is not enclosed by any helix; it generates construct 1,

Functions of the form M_{ xy }(i, j) where it is assumed that the structure is enclosed by a helix outside the mentioned subsequence; henceforth, referred to as the enclosing helix. Such recurrences do not immediately cause a bifurcation, therefore, they do not immediately enclose an mway, m > 2, junction. Instead they may form a helix between i and j, or ignore i and/or j in order to move to a smaller subproblem. The subscript xy is used to determine the type of the recurrence, and it can be any of

2W: i and j form a helix, and together with the outside enclosing helix they form a 2way junction that may or may not involve a coaxial stacking.

@ϵ: the leftmost helix of the substructure coaxially stacks with the enclosing helix.

ϵ@: the rightmost helix of the substructure coaxially stacks with the enclosing helix.

ϵ : the rightmost helix of the substructure forms a parallel coaxial stack with a helix to the right of the subsequence from region i to region j.

 ϵ: the leftmost helix of the substructure forms a parallel coaxial stack with a helix to the left of the subsequence from region i to region j.

ϵϵ: none of the helices in the substructure coaxially stack with any outside helices.


Functions of the form ${M}_{xy}^{B}\left(i,\phantom{\rule{2.77695pt}{0ex}}j\right)$, where B stands for bifurcation, and different cases of the subscript xy are defined similar to the ones above. Here the important assumption is that the structure is surrounded by an ultrastable helix outside the mentioned subsequence. If the outside enclosing helix is not strong enough, but when put together with the possible i, j helix they can act as an ultrastable helix, that is also acceptable. In these recurrences, the substructure predicted for the subsequence from region i to region j will be an mway junction, m > 2, that may or may not be enclosed by a possible helix formed by i and j. The main difference between a function M_{ xy }(i, j) and a function ${M}_{xy}^{B}\left(i,\phantom{\rule{2.77695pt}{0ex}}j\right)$ is that the latter may immediately cause a bifurcation, whereas the former may not.
Notation
We use the following notation throughout this section:

i and i' are used for referring to indices of candidates in the Starting Position Order (SPO).

j and j' are used for referring to indices of candidates in the Ending Position Order (EPO).

d (a, b) is the distance between candidate regions a and b. It is the shortest nucleotide distance between the end of candidate a and the beginning of candidate b, assuming a ends before where b starts.

i + 1 is the candidate region after (and possibly overlapping with) i in the SPO.

j  1 is the candidate region before (and possibly overlapping with) j in the EPO.

s_{≥ x}(i) is the first nonoverlapping successor of i in SPO at a distance greater than or equal to x.

p_{≥ y}(j) is the first nonoverlapping predecessor of j in EPO at a distance greater than or equal to y.

${s}_{\le 2}^{\prime}\left(i\right)$ represents any nonoverlapping successor of i in SPO at distance at most 2 from i.

${p}_{\le 2}^{\prime}\left(j\right)$ represents any nonoverlapping predecessor of j in EPO at distance at most 2 from j.

A_{ ij }is the weight of any helix (semistable, stable, or ultrastable) formed by i and j, or  ∞ if no such helix exists.

S_{ ij }is the weight of a helix i, j that is stable or ultrastable, or  ∞ if no such helix exists.

U_{ ij }is the weight of a helix i, j that is ultrastable, or  ∞ if no such helix exists.

CS is the reward for a coaxial stacking. Its value depends on the terminal base pairs of the helices involved.

In rules of the form F(i, j) = A_{ ij }+ max_{i',j'}{M_{2W}(i', j')} the requirement is that the helices formed by candidates i < i' < j' < j do not coaxially stack, d(i, i')  d(j', j)  ≤ 4, d(i, i') ≤ 11, d(j', j) ≤ 11, and that d(i, i') and d(j', j) cannot both be 0.
Recurrences
In the above function, the first case, with M_{2W}, is only allowed when A_{ ij } and A_{i'j'}put together are strong enough to act as a stable helix. The second case, with ${M}_{2W}^{B}$, is only allowed when A_{ ij } and A_{i'j'}put together are strong enough to act as an ultrastable helix. The situation is similar for cases 3 and 4 where we have 2way junctions with coaxial stacking. In case 5, with ${M}_{@\u03f5}^{B}$, helix U_{ ij } forms a left nested coaxial stack with a helix that it encloses, but since the enclosed structure is an mway junction, m > 2, helix U_{ ij } has to be ultra stable. Cases 6 is defined and constrained similarly for a right nested coaxial stack. In cases 7 and 8, the helix U_{ ij } encloses an mway junction, m > 2, that may either immediately or later on include a coaxial stacking, therefore it has to be ultrastable. Case 7 results in an mway junction that may later on include a coaxial stacking, whereas case 8 results in an mway junction with a parallel coaxial stacking. Similar constraints are applied to the recurrences below.
Abbreviations
 SPO :

Starting Position Order
 EPO :

Ending Position Order.
Declarations
Acknowledgements
This article has been published as part of BMC Genomics Volume 13 Supplement 3, 2012: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2011: Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/13/S3.
This research project was supported in part by the NSF MRI 0821263 grant, the NIH BISTI R01GM07208001A1 grant, the NIH ARRA Administrative Supplement to NIH BISTI R01GM07208001A1, and the NSF IIS grant of award No: 0916250. We used the software Graphviz (available at http://www.graphviz.org) to draw the predicted secondary structures.
Authors’ Affiliations
References
 Cech TR, Damberger SH, Gutell RR: Representation of the Secondary and Tertiary Structure of Group I Introns. Nat Struct Biol. 1994, 1: 273280. 10.1038/nsb0594273.View ArticlePubMedGoogle Scholar
 Batey RT, Rambo RP, Doudna JA: Tertiary Motifs in RNA Structure and Folding. Angew Chem Int Ed Engl. 1999, 38: 23262343. 10.1002/(SICI)15213773(19990816)38:16<2326::AIDANIE2326>3.0.CO;23.View ArticlePubMedGoogle Scholar
 Tinoco I, Bustamante C: How RNA Folds. J Mol Biol. 1999, 293: 271281. 10.1006/jmbi.1999.3001.View ArticlePubMedGoogle Scholar
 Masquida B, Westhof E: A Modular and Hierarchical Approach for AllAtom RNA Modeling. The RNA World. Edited by: Gesteland R, Cech T, Atkins J. 2006, CSHL Press, 659681. 3Google Scholar
 Sim AYL, Levitt M: Clustering to Identify RNA Conformations Constrained by Secondary Structure. Proc Natl Acad Sci USA. 2011, 108: 35903595. 10.1073/pnas.1018653108.PubMed CentralView ArticlePubMedGoogle Scholar
 Mathews DH, Schroeder SJ, Turner DH, Zuker M: Predicting RNA Secondary Structure. The RNA World. Edited by: Gesteland R, Cech T, Atkins J. 2006, CSHL Press, 631657. 3Google Scholar
 Nussinov R, Jacobson AB: Fast Algorithm for Predicting the Secondary Structure of Singlestranded RNA. Proc Natl Acad Sci USA. 1980, 77: 630913. 10.1073/pnas.77.11.6309.PubMed CentralView ArticlePubMedGoogle Scholar
 Zuker M, Steigler P: Optimal Computer Folding of Larger RNA Sequences Using Thermodynamics and Auxiliary Information. Nucleic Acids Res. 1981, 9: 133148. 10.1093/nar/9.1.133.PubMed CentralView ArticlePubMedGoogle Scholar
 Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P: Fast Folding and Comparison of RNA Sequence Structures. Monatsh Chemistry. 1994, 125: 167168. 10.1007/BF00818163.View ArticleGoogle Scholar
 Mathews DH: Using an RNA Secondary Structure Partition Function to Determine Confidence in Base Pairs Predicted by Free Energy Minimization. RNA. 2004, 10: 11781190. 10.1261/rna.7650904.PubMed CentralView ArticlePubMedGoogle Scholar
 Zuker M: Calculating Nucleic Acid Secondary Structure. Curr Opin Struct Biol. 2000, 10: 303310. 10.1016/S0959440X(00)000889.View ArticlePubMedGoogle Scholar
 Eddy SR: How Do RNA Folding Algorithms Work?. Nat Biotechnol. 2004, 22: 14571458. 10.1038/nbt11041457.View ArticlePubMedGoogle Scholar
 Laing C, Schlick T: Computational Approaches to RNA Structure Prediction, Analysis and Design. Curr Opin Struct Biol. 2011, 21: 306318. 10.1016/j.sbi.2011.03.015.PubMed CentralView ArticlePubMedGoogle Scholar
 Shapiro BA, Yingling YG, Kasprzak W, E B: Bridging the Gap in RNA Structure Prediction. Curr Opin Struct Biol. 2007, 17: 15765. 10.1016/j.sbi.2007.03.001.View ArticlePubMedGoogle Scholar
 Knudsen B, Hein J: Pfold: RNA Secondary Structure Prediction Using Stochastic Contextfree Grammars. Nucleic Acids Res. 2003, 31: 34233428. 10.1093/nar/gkg614.PubMed CentralView ArticlePubMedGoogle Scholar
 Hamada K, Sato K, Kiryu H, Mituyama T, Asai K: CentroidAlign: Fast and Accurate Aligner for Structured RNAs by Maximizing Expected Sumofpairs Score. Bioinformatics. 2009, 25: 32363243. 10.1093/bioinformatics/btp580.View ArticlePubMedGoogle Scholar
 Durbin R, Eddy SR, Krogh A, Mitchison GJ: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. 1998, Cambridge, UK: Cambridge University PressView ArticleGoogle Scholar
 Nawrocki EP, Kolbe DL, Eddy SR: Infernal 1.0: Inference of RNA Alignments. Bioinformatics. 2009, 25: 13351337. 10.1093/bioinformatics/btp157.PubMed CentralView ArticlePubMedGoogle Scholar
 Leontis NB, Stombaugh J, Westhof E: The NonWatsonCrick Base Pairs and Their Associated Isostericity Matrices. Nucleic Acids Res. 2002, 30: 34973531. 10.1093/nar/gkf481.PubMed CentralView ArticlePubMedGoogle Scholar
 Leontis NB, Westhof E: Geometric Nomenclature and Classification of RNA Base Pairs. RNA. 2001, 7: 499512. 10.1017/S1355838201002515.PubMed CentralView ArticlePubMedGoogle Scholar
 Strobel SA, Doudna JA: RNA Seeing Double: Closepacking of Helices in RNA Tertiary Structure. Trends Biochem Sci. 1997, 22: 262266. 10.1016/S09680004(97)010566.View ArticlePubMedGoogle Scholar
 Tyagi R, Mathews DH: Predicting Helical Coaxial Stacking in RNA Multibranch Loops. RNA. 2007, 13: 939951. 10.1261/rna.305307.PubMed CentralView ArticlePubMedGoogle Scholar
 Xin Y, Laing C, Leontis NB, Schlick T: Annotation of Tertiary Interactions in RNA Structures Reveals Variations and Correlations. RNA. 2008, 14: 24652477. 10.1261/rna.1249208.PubMed CentralView ArticlePubMedGoogle Scholar
 Walter AE, Turner DH, Kim J, Matthew HL, Muller P, Mathews DH, Zuker M: Coaxial Stacking of Helices Enhances Binding of Oligoribonucleotides and Improves Predictions of RNA Folding. Proc Natl Acad Sci USA. 1994, 91: 92189222. 10.1073/pnas.91.20.9218.PubMed CentralView ArticlePubMedGoogle Scholar
 Lescoute A, Westhof E: Topology of Threeway Junctions in Folded RNAs. RNA. 2006, 12: 8393. 10.1261/rna.2208106.PubMed CentralView ArticlePubMedGoogle Scholar
 Laing C, T S: Analysis of Fourway Junctions in RNA Structures. J Mol Biol. 2009, 390: 547559. 10.1016/j.jmb.2009.04.084.PubMed CentralView ArticlePubMedGoogle Scholar
 Laing C, Jung S, Iabal A, Schlick T: Tertiary Motifs Revealed in Analyses of Higherorder RNA Junctions. J Mol Biol. 2009, 393: 6782. 10.1016/j.jmb.2009.07.089.PubMed CentralView ArticlePubMedGoogle Scholar
 GriffithsJones S, Moxon S, Marshall M, Khanna A, Eddy SR, Bateman A: Rfam: Annotating NonCoding RNAs in Complete Genomes. Nucleic Acids Res. 2005, 33: D121D141.PubMed CentralView ArticlePubMedGoogle Scholar
 Hofacker IL: Vienna RNA Secondary Structure Server. Nucleic Acids Res. 2003, 31: 34293431. 10.1093/nar/gkg599.PubMed CentralView ArticlePubMedGoogle Scholar
 Wang Y, Manzour A, Shareghi P, Shaw T, Li Y, Malmberg R, Cai L: Stable Stem Enabled Shannon Entropies Distinguish Noncoding RNAs from Random Backgrounds. BMC Bioinformatics. 2012, 13 (Suppl 5): S110.1186/1471210513S5S1.View ArticleGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.