Figure 1 | BMC Genomics

Figure 1

From: Widespread signatures of local mRNA folding structure selection in four Dengue virus serotypes

A. Flow diagram of the study. The study comprises the following general steps (details are in the main text):
I. Coding regions of 1,670 Dengue genomes from 4 different serotypes were collected.
II. The coding regions were aligned.
III. Each wild-type sequence was randomized 1,000 times based on two different randomization models (evolutionary-constrained and dinucleotide-constrained).
IV. Local minimum free folding energy (MFE) profiles were predicted separately for each wild-type and randomized sequence.
V. Profiles of sequence variability along the aligned coding regions were computed.
VI. Wild-type and randomized MFE profiles were compared to identify positions suspected to have a strong or weak local folding signal (p-value < 0.05).
VII. Positions with MFE signals significantly conserved across different viral variants were identified.

B. Evolutionary-constrained randomization model: synonymous codons in each column of the multiple alignment were permuted; if more than one amino acid was present in a column (different colors), the permutations were restricted to the corresponding sets of synonymous codons.

C. Prediction of MFE in 39 nt windows (red arrow) along the coding sequence (brown). Green arrow: the 44 nt sequence interval used in the signal conservation and sequence variability analyses (the size of the interval is the MFE prediction window size plus the allowed shift in signal position in the conservation analysis).

D. One-Versus-Rest (OVR) model: in each randomized variant, randomized MFE signals were identified by a position-wise comparison to the rest of the randomized variants from the same wild-type origin.

E. Signal conservation: suspected MFE-related signals (yellow) were defined as conserved if they appear, within a 5 nt vicinity of each other (red), in a significantly high number of different sequences (p-value < 0.001 with respect to randomized conservation levels based on OVR randomized signals). Two different clusters, each consisting of two positions with a conserved MFE-related signal, are illustrated (separated by vertical dotted lines); by definition, positions belong to the same cluster if they correspond to partially overlapping 44 nt genomic windows.
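The column-wise codon permutation in panel B and the wild-type versus randomized comparison in step VI can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `CODON_TABLE`, `permute_synonymous_columns`, and `empirical_p_value` are hypothetical, the alignment is assumed to be in-frame uppercase DNA, and the one-sided empirical p-value shown here is only one plausible way to score a "strong folding" position. MFE values would come from an RNA folding tool, which is not shown.

```python
import random
from collections import defaultdict

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def permute_synonymous_columns(aligned_cds, rng):
    """Shuffle codons within each alignment column, separately per amino acid.

    aligned_cds: equal-length, in-frame coding sequences (assumption).
    Codons not in the table (e.g. gap codons) are grouped with identical
    codons only, so they stay unchanged.
    """
    n_codons = len(aligned_cds[0]) // 3
    rows = [[seq[3 * i:3 * i + 3] for i in range(n_codons)] for seq in aligned_cds]
    for col in range(n_codons):
        # Group sequence indices by the amino acid encoded in this column.
        groups = defaultdict(list)
        for r, row in enumerate(rows):
            codon = row[col]
            groups[CODON_TABLE.get(codon, codon)].append(r)
        # Permute codons only among sequences sharing the same amino acid.
        for members in groups.values():
            codons = [rows[r][col] for r in members]
            rng.shuffle(codons)
            for r, codon in zip(members, codons):
                rows[r][col] = codon
    return ["".join(row) for row in rows]


def empirical_p_value(wild_type_mfe, randomized_mfes):
    """Fraction of randomized MFEs at least as low as the wild-type MFE.

    A small value suggests unusually strong (more negative) folding;
    flipping the comparison would test for unusually weak folding.
    """
    hits = sum(1 for x in randomized_mfes if x <= wild_type_mfe)
    return hits / len(randomized_mfes)


if __name__ == "__main__":
    alignment = ["ATGCTTAAA", "ATGCTCAAG", "ATGTTAAAA", "ATGATTAAG"]
    print(permute_synonymous_columns(alignment, random.Random(0)))
    print(empirical_p_value(-10.0, [-12.0, -9.0, -8.0, -11.0]))  # 0.5
```

Because codons are exchanged only within a column and only among sequences that encode the same amino acid there, every randomized sequence keeps its protein translation and every column keeps its codon composition, which is the constraint panel B describes.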
