- Research article
- Open Access
The mitochondrial genome of the ascalaphid owlfly Libelloides macaronius and comparative evolutionary mitochondriomics of neuropterid insects
BMC Genomicsvolume 12, Article number: 221 (2011)
The insect order Neuroptera encompasses more than 5,700 described species. To date, only three neuropteran mitochondrial genomes have been fully and one partly sequenced. Current knowledge on neuropteran mitochondrial genomes is limited, and new data are strongly required. In the present work, the mitochondrial genome of the ascalaphid owlfly Libelloides macaronius is described and compared with the known neuropterid mitochondrial genomes: Megaloptera, Neuroptera and Raphidioptera. These analyses are further extended to other endopterygotan orders.
The mitochondrial genome of L. macaronius is a circular molecule 15,890 bp long. It includes the entire set of 37 genes usually present in animal mitochondrial genomes. The gene order of this newly sequenced genome is unique among Neuroptera and differs from the ancestral type of insects in the translocation of trnC. The L. macaronius genome shows the lowest A+T content (74.50%) among known neuropterid genomes. Protein-coding genes possess the typical mitochondrial start codons, except for cox1, which has an unusual ACG. Comparisons among endopterygotan mitochondrial genomes showed that A+T content and AT/GC-skews exhibit a broad range of variation among 84 analyzed taxa. Comparative analyses showed that neuropterid mitochondrial protein-coding genes experienced complex evolutionary histories, involving features ranging from codon usage to rate of substitution, that make them potential markers for population genetics/phylogenetics studies at different taxonomic ranks. The 22 tRNAs show variable substitution patterns in Neuropterida, with higher sequence conservation in genes located on the α strand. Inferred secondary structures for neuropterid rrnS and rrnL genes largely agree with those known for other insects. For the first time, a model is provided for domain I of an insect rrnL. The control region in Neuropterida, as in other insects, is fast-evolving genomic region, characterized by AT-rich motifs.
The new genome shares many features with known neuropteran genomes but differs in its low A+T content. Comparative analysis of neuropterid mitochondrial genes showed that they experienced distinct evolutionary patterns. Both tRNA families and ribosomal RNAs show composite substitution pathways. The neuropterid mitochondrial genome is characterized by a complex evolutionary history.
Insect mitochondrial genomes (mtDNAs) are usually a double-strand circular molecule containing 13 protein-coding genes (PCGs), 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs, i.e., the small and large subunits . A notable exception is represented by the mtDNA of the human body louse Pediculus humanus, which consists of 18 miniature circular chromosomes containing one to three genes . Insect mtDNA size is typically in the range of 14 to 20 kbp, but genomes exceeding these values are known . In these latter cases, the size increase is connected to the expansion of the main non-coding region, named the AT-rich or control region . The gene order is a feature of mtDNA that can provide important evidence to establish evolutionary relationships among taxa at high and/or low taxonomic levels [4, 5]. The most widespread gene order in insect mtDNAs is shown in Figure 1. This gene order, initially determined for Drosophila yakuba mtDNA, is considered ancestral for the entire class Insecta [5–7]. Several gene orders departing from the ancestral arrangement exist and can be restricted to single species as in the strepsipteran Mengenilla australiensis or common to whole groups of higher taxonomic rank, ranging from family (e.g., Culicidae) to order (e.g., Lepidoptera) (Figure 1) [8–12]. In the latter cases, the peculiar gene order becomes an important marker to delimit taxonomic boundaries and constitutes a major signature of mtDNA. To date, full-length mtDNAs in insects have been sequenced in an imbalanced manner, with whole orders still lacking any published information or being poorly represented in data banks. In this respect, the neuropterid orders Megaloptera, Neuroptera (also named Planipennia), and Raphidioptera are underrepresented taxa. Only recently have sequences become available for these taxa [13–15]. Currently, full-length mtDNAs are known for the megalopterans Corydalus cornutus and Protohermes concolorus, both members of the Corydalidae family, and Sialis hamata, of the Sialidae [13–15]. Three mtDNAs are available for the neuropterans Ascaloptynx appendiculatus (Ascalaphidae), Ditaxis latistyla (Mantispidae), and Polystoechotes punctatus (Polystoechotidae) [13, 14]. A complete mtDNA sequence exists for the raphidiopteran Mongoloraphidia harmandi (Raphidiidae) . Finally, partial mtDNA sequences are available for the neuropteran Myrmelon immaculatus (Myrmelontidae) and the raphidiopteran Agulla sp. (Raphidiidae) .
Here we determined the complete mtDNA sequence of the ascalaphid owlfly Libelloides macaronius (Scopoli, 1763) (Neuroptera, Ascalaphidae). The newly determined genome was compared with available neuropterid mtDNAs (complete and partial) as well as with genomes obtained from other holometabolous insects (Table 1) [6, 9–52]. The analyses were performed following a comparative and evolutionary perspective.
Results and discussion
The mtDNA genome of L. macaronius is a circular molecule 15,890 bp long (Figure 2). It contains the entire set of 37 genes usually present in animal mtDNAs and 14 non-coding portions. Three of these spanning at least 15 bp are labeled as intergenic spacers (s1-s3) in Figure 2.
Atp8 and atp6 are the only genes, located on the same strand, that overlap. The atp8-atp6 partial superimposition is a common feature of neuropterid mtDNAs and, in general, of animal mtDNAs [1, 13, 15]. Other genes located on the same strand (both α and β) are contiguous (e.g., nad4L and nad4) or separated by a few nucleotides (e.g., nad3 and trnA). Genes on opposite strands overlap (e.g., nad2 and trnC), are contiguous (e.g., trnQ and trnM), or are separated by some nucleotides (e.g., trnT and trnP) or intergenic spacers (e.g., trnS2 and nad1) (Figure 2). Similar patterns can be observed in other neuropterid mtDNAs. It must be noted that nad4L and nad4 are contiguous in A. appendiculatus and L. macaronius (both members of the family Ascalaphidae). Conversely, in all other sequenced neuropterid mtDNAs, nad4L and nad4 overlap by 7 bp [13, 15].
The gene order of L. macaronius mtDNA differs from the insect ancestral gene order in the translocation of trnC, which is located upstream of trnW in contrast to its traditional location downstream of trnW (Figures 1, 2). This translocation is shared by and exclusive to all neuropteran mtDNAs that are fully or partly sequenced to date. Conversely, Megaloptera and Raphidioptera exhibit the typical insect ancestral gene order as shown in Figure 1[13, 15]. This figure presents the distribution of various gene orders in holometabolous insects, mapped on the tree obtained by Wiegmann et al. , but alternative phylogenetic hypotheses exist ([e.g., ]).
According to Aspöck et al., Neuroptera includes three major groups, Hemerobiiformia, Myrmelontiformia and Nevrorthiformia . The first two taxa are sister groups and encompass most neuropteran species, while Nevrorthiformia includes the single small family Nevrorthidae, which represents an early branching-off taxon within Neuroptera [55, 56]. This view has been challenged by Winterton et al., who favor a paraphyletic Hemerobiiformia with respect to Myrmelontiformia and identifies the Coniopterygidae (Hemerobiiformia) family as the sister taxon of all other Neuroptera . Irrespective to the arrangement of internal taxa, all studies strongly support a monophyletic Neuroptera order [55–57].
Available neuropteran mtDNAs, fully or partly sequenced, have been obtained from species belonging to distinct phyletic lines within Neuroptera. Indeed D. latistyla and P. punctatus belong to the divergent hemerobiiform families Mantispidae and Polystoechotidae, while M. immaculatus (Myrmelontidae), and A. appendiculatus + L. macaronius (Ascalaphidae) are members of two myrmelontiform families [13, 14], present paper]. The known distribution of trnC translocations was mapped (data not shown) based on the recent phylogeny of Neuropterida produced by Winterton et al., which also contains estimations of the divergence times of major phyletic lineages . It appears that such genomic re-arrangement occurred at minimum 238 million years ago. The estimation could be further extended back to 294 million years if this genomic change is confirmed for the whole order Neuroptera.
Further work is needed to establish whether the trnC translocation represents a strong molecular signature for the entire Neuroptera clade.
Within Endopterygota, the translocation of trnM characterizes all lepidopteran species sequenced to date and seems to be a strong molecular mitochondrial signature for the whole order (Figure 1) (e.g., [11, 12]). At the family level, the mosquitoes (family Culicidae) are characterized by the trnR-trnA arrangement instead of the traditional trnA-trnR (Figure 1) [9, 10]. Several other re-arrangements are known for endopterygote mtDNAs but are restricted to single species (e.g., Xenos vesparum) or not consistent within broad taxonomic groups (e.g., Hymenoptera) [37, 39, 58–63]. The insect ancestral gene order remains the most widespread arrangement among holometabolous (Figure 1) and heterometabolous insects, suggesting that the gene order cannot play a pivotal role in resolving the relationships at the interordinal level among pterygote insects . Conversely, mtDNA re-arrangements represent strong potential signatures to define taxa at the order and lower taxonomic levels (i.e., family, genus and lower) [9–14, 39], present paper].
Base composition and AT- and GC-skews in the mitochondrial genomes of endopterygote insects
The nucleotide-compositional behavior of mitochondrial genomes is usually investigated through the calculation of the three parameters: AT-skew, GC-skew, and A+T content (AT%), expressed in percentages (e.g., [8, 12]). Here these parameters were studied for 84 complete endopterygotan mtDNAs (Table 2; Additional file 1, Table S1) and combined, for the first time, in a single three-dimensional scatter-plot analysis (Figure 3).
Arbitrary partitioning schemes, further detailed below, were applied to the estimated values to facilitate the presentation/discussion of the results. Each empirical clustering scheme was defined in order to maximize the appreciation of the uneven distribution of various mtDNAs within the continuum of variation of the considered parameter.
The range of variation of AT% spanned from 66.99 to 87.41, with the average value equal to 77.77. Different mtDNAs exhibited an uneven distribution within this range. Five principal clusters (A1-A5) were obtained that divided the whole range of AT% into intervals encompassing 5% of the variation.
The majority (61/84) of endopterygote mtDNAs exhibited an AT% in the range of 74-82 (A2 partially; A3; A4 partially), with a strong concentration in the interval 75-80 (43/84) (A3). Most of neuropterids were included in cluster A3, while L. macaronius and C. cornutus belonged to cluster A2. The mtDNA of M. harmandi was comprised in cluster A4. Very high AT% (>84) contents (A4 partially; A5) were characteristic of and limited to a few hymenopterans and dipterans. Conversely, low AT%s (< 70) were restricted to coleopterans (A1).
The average AT-skew was 0.039, while the whole range of variation extended from -0.047 to 0.248. The endopterygote mtDNAs were grouped into six clusters (B1-B6) through the partitioning of the whole range of AT-skews in intervals spanning 0.050.
The majority of mtDNAs exhibited a positive AT-skew (71/84) with two peaks, in the intervals 0.000-0.050 (44/84; B2) and 0.050-0.100 (16/84; B3). In cluster B2 were placed the mtDNAs of C. cornutus, S. hamata, D. biseriata and M. harmandi. The cluster B3 contained also the mtDNAs of A. appendiculatus and L. macaronius. Only some coleopteran mtDNAs exceed values of 0.100 (B4-B6), and the skews were higher than 0.200 solely in C. fulgidissima and T. bruchi; ( B6). Negative AT-skews (13/84) were modest (>-0.005; B1), and lepidopteran taxa predominated in this group. The mtDNAs of P. punctatus and P. concolorus exhibited negative AT-skews.
The average GC-skew value was -0.213, and all 84 mtDNAs displayed negative skews. The whole set of genomes was grouped into six clusters (C1-C6), each spanning 0.050 of the range of GC-skew variation.
The majority of the GC-skews were located within the range -0.30/-0.15 (68/84; C3-C5), with two peaks represented by clusters C4 (30 genomes) and C5 (24 mtDNA). The neuropterids C. cornutus and P. concolorus were included in C3, M. harmandi and A. appendiculatus in C4, while D. biseriata, L. macaronius, S. hamata, and P. punctatus were inserted in cluster C5.
When the three variables analyzed above were considered together, some patterns emerged. Even if the parameter distribution followed a continuum range of variation, most of the sequenced endopterygote mtDNAs exhibited A+T contents in the range 70%-80%, with the highest density in the interval 75%-80%; AT-skews in the range 0.003-0.010; and GC-skews in the interval -0.300-0.150. This behavior can be represented by the cluster formula: A2/A3,B2/B3,C3-C5. Most neuropterid mtDNAs are included in this formula, with the exceptions of P. punctatus and P. concolorus (B1).
While no genome exhibited extreme values for any of the three parameters, some mtDNAs markedly departed from the most represented ranges for the behavior of one or two parameters. Very low AT content was coupled with high AT-skew in some coleopteran mtDNAs. Very high A+T contents (>84%) were restricted to a small group of hymenopterans and dipterans. Highly negative GC-skews (< -0.300) were found mostly in hymenopterans, plus the moth O. lunifer and the beetle T. castaneum.
The comparisons performed at the intra-ordinal level showed that in several cases the extremes of the ranges of variation of the analyzed parameters were occupied by the same mtDNAs. This behavior could be the result of limited taxon sampling. Thus, further data are necessary to verify if the observed patterns are consistent over a broader taxon sampling.
Generally, the A+T content, AT-skew, and GC-skew exhibited complex behaviors that do not appear to be tightly linked, at least at the order level.
Start/stop codons in neuropterid PCGs
The inferred start/stop codons for neuropterid PCGs are listed in Table 3. ATN, GTG, TTG, and GTT are accepted canonical mitochondrial start codons for invertebrate mtDNs and most of PCGs exhibited these start codons .
The cox1 start codon for L. macaronius was inferred to be the triplet ACG, a non-canonical start codon for mtDNA genes. Despite the fact that cox1 is one of more conserved mitochondrial genes, the identification of its start codon has been problematic (see ). An ATC codon located 27 bp upstream of ACG, within the trnY gene that is encoded on the β strand, could be a start codon for L. macaronius cox1. However, amino acid sequence comparison with a broad selection of endopterygote insects led us to the more probable conclusion that the ACG triplet was the start codon for L. macaronius cox1. An ACG start codon for cox1 has also been proposed for M. immaculatus. The non-canonical start codons TTA and TCG have previously been inferred for cox1 in A. appendiculatus and P. punctatus. The remaining neuropterid cox1 genes exhibited an ATT start codon. Non-canonical start codons in cox1 have been inferred repeatedly for endopterygote insects (e.g., [12, 65, 66]).
Stop codons for the 13 neuropterid PCGs were almost invariably complete (TAA) or incomplete TAa/Taa. The only exception was observed in nad1, where the two ascalaphids L. macaronius and A. appendiculatus and P. punctatus exhibited TAG as stop codon.
Codon usage and amino acids abundance
All analyzed neuropterid mtDNAs use the standard invertebrate mitochondrial genetic code. This result was determined using the server GenDecoder [67, 68]. The behavior of codon families in PCGs was investigated and the results are presented in Figures 4, 5. First codons, as well as stop codons, complete and incomplete, were excluded from the analysis to avoid biases due to unusual putative start codons and incomplete stop codons. The abundance of codon families and relative synonymous codon usage (RSCU) were calculated together with the other statistics listed below.
The A+T content, AT-skew, G+C content and GC-skew, which were calculated in pooled-α + β, pooled-α and pooled-β PCGs, exhibited behaviors identical, similar to or contrasting with those observed for the whole genomes (see Additional file 2, Table S2).
The total number of codons was different in analyzed species (Figure 4; Additional file 2, Table S2). The codon families exhibiting more than 50 codons per thousand cds varied (7-9) and the majority of them were two-fold degenerate in codon usage and AT-rich. In almost all mtDNAs, the four most represented codon families were Leu2, Ile, Phe and Met with Leu2 being the most common. The only exception was P. punctatus, in which Ser2 was the fourth most abundant codon family, while Met family occupied only the seventh position. Leu, Ile, Ser and Phe were the most abundant amino acids, and their sum varied from 42.75% (L. macaronius) to 46.62% (M. harmandi) of the total number of amino acids.
Four-fold degenerate codon usage was A/T biased in the third position, and T was the preferred nucleotide, except in the Arg, Gly and Ser1 codon families, where A was the most common nucleotide in the third position. Two-fold degenerate codon usage showed a similarly biased pattern, with A/T favored over G/C in the third position (Figure 5). Both patterns were in agreement with the AT-biased content exhibited by pooled-α+β PCGs.
In pooled-α PCGs, the four-fold degenerate codon usage followed a pattern similar to that of the whole PCG set, with the exceptions of Ala, Pro and Val codon families (Figure 5 vs. Additional file 3, Figure S1).
In the pooled-β PCGs, the four-fold degenerate codon usage partly mirrored the pattern observed for pooled-α+β PCGs. Among the discrepancies, the loss of synonymous GC-rich codons was the easiest to detect (Figure 5 vs. Additional file 4, Figure S2).
The behavior of pooled-α+β PCGs in different species varied with respect to the usage of the whole set (62) of codons encoding amino acids. All missed codons were GC-rich. Codon usage could not be directly linked to total number of codons nor to A+T content (Figure 5; Additional file 2, Table S2). The reduction in codon usage could be possibly linked to AT/GC-skews. Indeed, P. punctatus, which used the smallest number of codons, also exhibited the most negative AT-skewα+β and the highest GC-skewα+β. The number of used codons decreased when PCGs encoded on a single strand were analyzed (Additional file 2, Table S2; Additional file 3, Figure S1; Additional file 4, Figure S2). The pattern observed for pooled-α+β PCGs was mostly mirrored by pooled-α PCGs. On the β strand, codon reductions appeared to be species specific. The lost codons belonged to GC-rich and comparatively A-rich codon-families. Finally, on the β strand, the pattern of codon loss appeared to be linked to the single/combined effects of GC content and AT-/GC-skews.
The total number of α-codons vs. the total number of β-codons (ratioα/β) was calculated for every species (Table 4). The overall (o) average (avg) ratioα/β (1.59 ± 4 × 10-3) was used as a measure to identify possible distributional biases of different codon families. Several codon families exhibited ratioα/β values close to oavg-ratioα/β, suggesting that their distribution is not markedly diverse in the PCGs coded on different strands. Codon families His, Leu1, Pro, and Thr exhibited ratioα/β values strongly biased toward the α strand. Conversely, codon families Cys, Leu2 and Ser1 presented ratioα/β values clearly β strand-oriented.
The most obvious bias was due to structural/functional requirements of PCGs. This point is well represented by the distribution of Cys codon family. Cys was the rarest amino acid in mtDNA proteins and was more abundant in polypeptides (NADH subunits) encoded on the β strand than in the proteins encoded on the α strand. The Cys residues can produce intra- and inter-chain disulfide bridges (S-S) that play a fundamental structural role. Disulfide bonds and interactions between pairs of Cys residues have been postulated to have an important structural role in NADH subunits of primate mtDNA . The abundance of Cys residues in neuropterid NADHs, and in general in endopterygotes, appears to agree with findings derived from primates. Several marked departures from oavg-ratioα/β can be explained in terms of structural/functional requirements of single polypeptides and are not explored here.
A compositional bias effect was evident in the behavior of Leu1 (CTN) and Leu2 (TTR) codon families. As mentioned above, Leu was the most abundant amino acid in the PCGs. The pooled superfamily Leu = Leu1 + Leu2 exhibited an avg-ratioα/β = 1.28 ± 0.11. This result supports a moderate β strand bias in the distribution of Leu. However, Leu1 and Leu2 codon families exhibited an opposite bias in term of strand distribution. These strand biases can be linked to the AT-skew and GC content shown by pooled-α and pooled-β PCGs.
Behavior of PCGs in neuropterid mtDNAs
Different parameters have been used to evaluate the performance of mitochondrial single/combined PCGs for phylogenetic, population genetics, and taxon molecular identification purposes (e.g., [11, 12]). The suitability of different PCGs is linked to, influenced by the evolutionary history of each molecular marker. The calculation of p-distance values has been used to test the level of divergence within each PCG (e.g., ). Recently, codon usage has also been investigated to study the differentiation of mitochondrial PCGs . In the present paper, the evolutionary behavior of PCGs, at both the nucleotide and amino acid levels, was studied in a combined analysis based on the calculation/estimation of p-distances, effective number of codon usage (ENC), phylogenetic signal and the saturation of the substitution process present in single/combined PCGs.
The percent of fully resolved quartets (%FRQ), obtained in a maximum likelihood mapping analysis, was used as a measure of phylogenetic signal . The saturation of the substitution process was established by calculating the m-slope of the regression line passing through the origin in a scatter plot analysis of p-distances vs. GTR+I+G - mtREV24/JTT + G distances (see Methods for further details).
Parameters were estimated for pooled-α+β, pooled-α, pooled-β, and single PCGs at both the nucleotide and amino acid levels, with the exception of atp8. This gene was not analyzed individually because it contains a limited number of codons (< 55) to provide reliable estimations of ENC . Conversely, atp8 was included in the pooled-α+β/-α sets. The results of the global analysis are summarized in Figure 6 and Table 5.
As a general rule, proteins, individually or pooled, exhibited levels of saturation of the substitution process lower than those observed in the nucleotidic counterparts. This result can be explained in terms of the substitution process, which permits more alternatives in the amino acid replacements. The m-slope values show that all PCGs/proteins experienced some degree of saturation of the substitution process. This phenomenon was particularly marked at the third codon positions of some PCGs, where the GTR+I+G-distance values largely exceeded 1 (data not shown), a result strongly supporting a complete saturation of the substitution process .
Pooled-α+β PCGs/proteins and pooled-α PCGs/proteins showed the best phylogenetic signals and clearly surpassed pooled-β PCGs/proteins. Because pooled-α+β and pooled-α proteins also exhibited a less pronounced saturation of the substitution process than their nucleotide counterparts, they represented the first choice as markers to investigate the phylogenetic relationships among Neuropterida at the inter-order taxonomic level.
The single genes exhibiting the best phylogenetic signals at the inter-order taxonomic level were cob, cox1 and cox3 (%FRQ> 94.00). Intermediate values were observed in atp6, cox2, nad4, nad5 and nad1 (89.00 < %FRQ > 75.00) The lowest signals were present in nad2, nad3, nad4L and nad6 (%FRQ < 70.00).
Pooled and single PCGs having the best phylogenetic signals usually exhibited a combination of high avg-ENCs, small avg-pds, and high m-slopes in their protein counterparts (e.g., cox1), but exceptions occurred. Conversely, PCGs with low phylogenetic signals usually showed high avg-pds and low values of m-slope at both the nucleotide and amino acid levels (e.g., nad2).
Reduction of phylogenetic signal with respect to the best-performing PCGs can be explained in terms of three different processes/properties: a) accelerated evolution; b) retarded evolution; and c) possession of an intrinsically lower phylogenetic signal.
A marked accelerated evolution mechanism can be invoked for nad2, nad3 and nad6. These PCGs showed avg-pds, at both the nucleotide and amino acid levels, which were clearly higher than those observed in cox1, cox3 and cob. Furthermore, their m-slope values favored the occurrence of a more pronounced saturation of the substitution process, a finding corroborated by the lowest values of fully resolved quartets. All of these results are compatible only with an acceleration of the substitution process. Thus, nad2, nad3 and nad6 should be excellent markers for studies performed at medium/very low taxonomic levels (i.e., family, genus, species).
The mechanism of accelerated evolution, but less pronounced than that described above, can be invoked also for atp6, nad5 and nad4. These genes showed avg-pds higher than cox1, cox3 and cob, at both the nucleotide and amino acid levels. They exhibited phylogenetic signals higher than nad2, nad3 and nad6 but lower than the best-performing PCGs. Finally, their m-slope values were comparable to those of cox1, cox3 and cob. All these findings suggest that atp6, nad5 and nad4 evolved slightly faster than the best-performing PCGs and that these genes can be optimally used at the order or lower taxonomic levels.
The nad1 and cox2 genes had avg-pds similar to or even smaller than cox1, cox3 and cob, while their levels of saturation of the substitution process appeared to be less marked than PCGs exhibiting the highest %FRQ values. These results suggest that the reduction of phylogenetic signal in cox2 and nad1 was probably due to a slower rate of divergence of these two genes in comparison with cox1, cox3 and cob. Thus, they should be considered favored markers at high/very high taxonomic ranks.
The behavior of nad4L provided contradictory evidence. The low %FRQ and relatively high avg-pd value suggested that this is a fast evolving gene and that the loss of phylogenetic signal was due to an accelerated rate of multiple substitutions. However, the m-value was the highest obtained in the analysis and did not favor a strong saturation of the substitution process in the evolution of this gene. Thus, it could be that even evolving in a way not dissimilar to cox1, cox3 and cob, nad4L has an intrinsically lower phylogenetic signal due to its reduced size.
Taken together, these data indicate that the 13 PCGs encoded by mtDNA exhibit complex evolutionary trajectories that can be investigated using the combination of the parameters considered above. No single parameter seems able to fully describe the behavior of PCGs.
Comparison of tRNA structure in neuropterid genomes
The mtDNA of L. macaronius contained all 22 tRNAs genes found in other neuropterid mtDNAs and typical of animal mitochondrial genomes . All neuropterid mtDNAs had 14 α-strand tRNAs and 8 β-strand tRNAs (Figures 1, 2). Among the 22 tRNAs, only trnS1 did not exhibit the common cloverleaf structure, due to the absence of DHU stem. Loss of this stem in trnS1 is a feature common to many insect mtDNAs (e.g., ). The results of comparative analyses on secondary structures of neuropterid tRNAs are provided in Figures 7, 8. Multiple alignments of tRNAs genes were produced, and the percent of identical nucleotides (%INUC) was calculated for each group of orthologous sequences (Additional file 5, Table S3).
The pattern of nucleotide conservation was markedly α strand-biased. Indeed, trnG, trnK, trnN, trnM and trnE, which showed the highest levels of nucleotide conservation (%INUC ≥ 70), were all located on the α strand.
TrnL2, trnP, trnQ, trnS1, trnS2, trnT, trnW and trnY showed 60 < %INUC < 70. Only three of them were located on the β strand (Figure 2). Seven remaining tRNAs exhibited 50% < %INUC < 60. Four of them were located on the α strand, i.e., trnI, trnA, trnD, and trnR, while three were on the β strand, i.e., trnL1, trnH and trnV. %INUC values < 50% characterized the β-strand genes trnF and trnC. While the level of conservation was positively α strand-biased, no clear pattern could be identified for single tRNAs located at various points along the genome. For example, trnK, one of the most conserved tRNAs, was located immediately upstream of trnD (Figure 2), a much less conserved gene (see above).
The tRNAs closest to the control region (i.e., trnV and trnI), which is one of the start points of mtDNA replication, were not among most/less conserved. The same applied to trnS2, immediately upstream of the s2 spacer, where another center of mtDNA replication was located (see below). The abundance of codon families did not appear to be linked to the level of tRNA conservation. Indeed, none of the most abundant codon families (Leu2, Ile and Phe) exhibited the highest %INUCs.
Irrespective of the level of conservation, some tRNAs presented mismatched pairs in stems (e.g., A-A in the anticodon stem of trnK; T-T in the acceptor stem of trnM). These mismatches are common in arthropod mtDNAs and can be corrected through editing processes (e.g., ) or may represent unusual pairings . Further research at the transcript level is necessary to better characterize this feature in neuropterid mtDNAs.
In the most conserved tRNAs the nucleotide substitutions are largely restricted to TΨC and DHU loops and extra arms (Figures 7, 8), with changes on acceptor and anticodon stems reduced to 0-3 fully compensatory base changes (cbcs) (e.g., G-C vs. A-T in the anticodon stem of trnG) and/or hemi-cbc (e.g., G-T vs. A-T on the acceptor stem of trnM) . Note that in trnG, the most conserved tRNA, it was not possible to model the substitution pattern in the TΨC loop due to a high level of variation among orthologous sequences (Figure 7). With the increased variation, the number of cbcs and hemi-cbcs increased in stems, usually with a more marked substitution process in the TΨC stem (e.g., trnI, trnV). In several tRNAs it was not possible to model properly the substitution patterns in TΨC and DHU loops due to the high level of divergence among sequences. Cbcs and hemi-cbcs were restricted to a single species or characterized taxa at a higher taxonomic rank (family/order). An example of the first type is the G-C pair found in the trnH acceptor harm of P. concolorus, which was mirrored by A-T in all other neuropterids (Figure 7). An example of a full cbc characterizing a unique family is the G-C pair found in the acceptor stem of trnEs of family Ascalaphidae (A. appendiculatus and L. macaronius), while other taxa exhibited the A-T pair (Figure 7). A substitution pattern involving two full cbcs characterizing the trnS1 acceptor arm of C. cornutus and P. concolorus (Corydalidae megalopterans) is another example of high-taxonomic rank cbcs (Figure 8). Figures 7, 8 depict several more examples.
The presence of hemi-cbcs and, particularly, full cbcs characterizing taxa at different taxonomic levels underscores the potential phylogenetic value of tRNA sequences, especially when secondary structures are taken into account. Until now, tRNAs have been rarely considered in the study of insect phylogenetic relationships [76, 77]. Our results suggest that these markers deserve much more attention and should be more routinely used, as demonstrated by analyses performed in other animal groups (e.g., ).
The study of the level of conservation in tRNA families was extended to other endopterygotan orders in which at least two fully sequenced mtDNAs exist. The results of this analysis are summarized in Figure 9.
The pattern of %INUC variation exhibited by codon families was not consistent among different endopterygotan taxa. The highest level of conservation was observed in lepidopteran trnK (%INUC = 84.51), while the minimum was found in hymenopteran trnC (%INUC = 20.55). Lepidoptera had nine codon families with %INUC ≥ 70, whereas five such families were found in Neuropterida, two in Diptera, and none in Coleoptera or Hymenoptera. Lepidopteran codon families were the most conserved (avg-%INUC = 66.25 ± 8.49), immediately followed by neuropterid codon families (avg-%INUC = 61.46 ± 9.84). Dipteran codon families showed slightly more divergence (avg-%INUC = 60.84 ± 8.59), while coleopteran (avg-%INUC = 38.92 ± 8.84) and hymenopteran (avg-%INUC = 37.19 ± 8.22) codon families exhibited much lower levels of conservation.
Neuropterid %INUCs were in some cases very similar (e.g., trnA, trnH, trnM) or even surpassed (trnE, trnG, trnS1) their lepidopteran counterparts (Figure 9). This finding is striking because it indicates that the level of variation among the three neuropterid orders is very low, taking into account that Lepidoptera have very conserved tRNAs among Insecta . Conversely, Hymenoptera and Coleoptera had diverging codon families. Indeed, all hymenopteran tRNAs showed %INUC values ≤ 47.14, while no coleopteran tRNA exhibited %INUC > 58.21. The %INUC scores indicate a clear-cut discrepancy between the morphological diversity observed in neuropterid orders and the low rate of substitution that characterizes the evolution of their codon families. If the level of divergence is considered a measure of taxonomic diversity, then the %INUC values would suggest that Neuroptera, Megaloptera and Raphidioptera should be treated as members of a single order.
Comparison of neuropterid rrnS and rrnL structures
The inferred secondary structure model for rrnS of L. macaronius is provided in Figure 10. This is the first prediction for a neuropterid insect. The overall structure, including three domains and 34 helices (progressively numbered in Figure 10), is largely in agreement with those proposed for other endopterygote orders (i.e. Coleoptera, Diptera, Hymenoptera, and Lepidoptera) [11, 17, 74, 79]. A limited number of non-canonical pairings (e.g., G-A on helix H23-434) were present in the rrnS of L. macaronius. The multiple alignment of neuropterid rrnS s spanned 812 positions and contained 353 conserved (43.47%) and 459 variable (56.53%) positions, respectively. Nucleotide variability among domains and helices was unevenly distributed (Figure 10).
The inferred secondary structure model for the rrnL of L. macaronius is provided in Figure 11. This is the first prediction for a neuropterid insect. The structure of this rrnL largely overlaps with previously published structures for endopterygote insects (i.e. Coleoptera, Diptera, Hymenoptera, and Lepidoptera) [11, 17, 74, 79]. It presents the five canonical domains (I-II, IV-VI) of insect rrnLs that do not contain domain III . In the present paper domain I was fully modeled. This is the first time that a secondary structure has been provided for this domain, which in previous studies has been left unmodeled (e.g., [8, 11, 56, 74, 79]). The predicted secondary structure of domain I included eight helices and is consistent among all neuropterid taxa (Additional file 6, Figure S3). The proposed structure can also be fitted to rrnL of D. melanogaster (Additional file 6, Figure S3), further extending the taxonomic range. Domain I contained a limited number of non-canonical pairings (Additional file 6, Figure S3). More studies will be necessary to corroborate the validity of the new structure presented here. Taking into account the secondary structure of domain I, L. macaronius rrnL includes 50 helices. The multiple alignment of neuropterid rrnL s extended over 1376 positions and contains 591 conserved (42.95%) and 785 variable sites (57.05%). Conserved nucleotides were unevenly distributed throughout the rrnL secondary structure, with highest level of invariable positions located on domain IV and lowest level observed in domains I-II (Figure 11).
Comparison of neuropterid mtDNA genomic spacers
The L. macaronius mtDNA contained 14 non-coding portions (ncps), extending from 1 to 1049 nucleotides, interspersed through PCGs, tRNAs and rrnL and rrnS genes (Figure 2). Three of them spanned more than 15 bp and are labeled as spacers s1-s3 in Figures 2 and 12. More or less similar patterns, but not fully overlapping, were observed in other neuropteran mtDNAs, where the ncps varied from a maximum of 15 (D. biseriata) to a minimum of 10 (P. punctatus), with 13 non-coding portions found in A. appendiculatus. The dobsonflies C. cornutus and P. concolorus exhibited the same number (10) of ncps even if their distributional patterns did not coincide. Conversely, the other megalopteran, S. hamata, had 12 ncps. Finally, the raphidiopteran M. harmandi exhibited the most compact mtDNA, having only six npcs (Figure 12).
The spacers of L. macaronius are described more in detail below. The s1 spacer was located between trnI and trnQ and spanned 55 bp. Spacer s1 was present in all partly/fully sequenced mtDNAs available for Neuroptera, i.e., A. appendiculatus, P. punctatus, M. immaculatus and D. biseriata[13, 14], present paper)]. It was also found in the megalopteran S. hamata, while it was absent in the other megalopterans C. cornutus and P. concolorus and in the raphidiopteran M. harmandi (Figure 12) [13–15]. The L. macaronius s1 can be aligned confidently only with the counterpart found in A. appendiculatus.
The s1 sequences of L. macaronius and A. appendiculatus shared a CCCCCC repeat located near the boundary with trnI and the CAA(A/G)TTAA(A/C)TAAAT(TA/GT)A(C/T)GCA motif adjacent to trnQ. A. appendiculatus and L. macaronius belong to the same family, Ascalaphidae. Thus, it seems that s1 sequences, when present, diverge very fast among different families of the same order.
The L. macaronius s2 was 20 bp long and was located between trnS2 and nad1. The s2 spacer was present in all partly or fully sequenced neuropterid mtDNAs (Figure 12) [[13–15], present paper]. The s2 spacer is supposed to contain  the binding site for the DmTFF bidirectional transcription termination factor . Spacer s2 is a common feature of insect mtDNAs (e.g., ).
The L. macaronius s3 spacer was composed of a 1049-bp AT-rich region. The A+T content was 84.46%. This finding implies that s3 contained 10% more A+T than the whole α strand. Conversely, s3 was much less AT-skewed than the complete α strand (0.029 vs. 0.072). Strings of repeated single nucleotides (T)3-8, (A)3-11, (C)2-5, (G)2-5 characterized the s3 sequence. Furthermore, the abundance of A/T was mirrored by the presence of repeated motifs containing mostly/only A/T, a common feature of insect control regions . These motifs were represented by shorter identical strings as well as longer consensus patterns sharing 75% of identical positions. The motif AT was the most abundant, occurring 192 times. ATA occurred 70 times, ATAT 29, AATAATTA 5, and ATATAAATA 6. Finally, five distinct A/T-rich 12-mers were repeated twice each (data not shown). When a 75% minimum identity was allowed among repeated motifs, two strings were identified, each containing 37 nucleotides and extending, respectively, from base 551 to base 591 and from 811 to 847. Their consensus sequence was ATATTA (C/T) ATAT (GT/AA) ATA (AT/TG) TA (T/A) TTATTA (A/T) TATATAAAT. Thus, the L. macaronius AT-rich region contains several repeated motifs of different sizes. A similar trend also occurs in other neuropterids (data not shown).
Pair-wise as well as multiple alignments between/among neuropterid AT-rich regions were performed by applying very relaxed alignment parameters (i.e., gap-opening = 1 vs. standard 15; and gap-extending = 3 vs. standard 6.6). This strategy was applied to maximize the matches of these fast-evolving genomic portions (see below). This approach allowed us to align quite diverse sequences but did not guarantee that the positional homology principle was retained when the more diverging sequences were compared. The obtained results are summarized in Table 6.
What is immediately evident is that the level of conserved positions dropped rapidly when comparisons were extended above the taxonomic rank of family. Indeed, L. macaronius and A. appendiculatus, both members of the family Ascalaphidae, shared 71.59% nucleotide identity and a common GC-rich motif TCCCCGGCCCCCCAGGAT located 96 bp downstream of the 5'-end of rrnS in L. macaronius mtDNA. This motif could be a molecular signature for the whole family, but a better taxon sampling is necessary to assess this point.
When all Neuroptera were considered, the identical positions diminished to 30.15% in the multiple alignment. Thus, a substantial drop in nucleotide identity shared by AT-rich regions occurs at the order level, even permitting very permissive gap costs.
Similarly, C. cornutus and P. concolorus, both members of the family Corydalidae, shared 63.50% nucleotide identity in their control regions. Conversely, the identity level decreased to 45.91% when all megalopteran AT-rich regions were compared. Finally, the alignment at the interordinal taxonomic rank of neuropterid control regions produced only 14.80% identical positions. Provided that gaps are quite numerous, this result leads to the conclusion that the AT-rich region is a fast-evolving genomic region. This behavior of neuropterid control regions is a common feature of insect AT-rich regions (e.g., ) and is shared with other animal groups. In this respect, a well-documented group is represented by mammals (e.g., [82, 83]).
The utility of the AT-rich region as a phylogenetic marker should be most effective at low taxonomic rank (family level and below).
The mtDNA of L. macaronius, the fourth genome sequenced for the order Neuroptera, exhibits the peculiar translocation of the trnC gene with respect to the ancestral gene order of insects. This structural modification represents an exclusive feature of partly or fully sequenced neuropteran mtDNAs and could be a peculiar genomic marker characterizing the entire order Neuroptera.
The analysis of the AT%, AT-skew and GC-skew parameters, which were performed on a set of 84 holometabolous insect mtDNAs, shows that these characteristics exhibit complex patterns of variations. Such patterns can be linked or completely unlinked in their variation process.
The neuropterid mtDNAs sequenced to date use the standard invertebrate mitochondrial genome. The distribution of codon families in the PCGs located on the α and β strands is influenced by both structural/functional requirements of the corresponding proteins and by the base composition and AT-/GC-skews. The comparison among orthologous neuropterid PCGs shows that different genes have been subject to different rates of molecular evolution and form a pool of markers suitable for different phylogenetic purposes.
The evolution of neuropterid tRNAs seems to have been variable both in terms of sequence conservation and nucleotide substitution processes.
Neuropterid rrnL and rrnS structures, as demonstrated by the models produced for L. macaronius, appear similar to those determined for other insects. In their structural domains, they show diverse levels of conservation, influenced by different rates of substitution.
An important advance in our comprehension of the structure of rrnL is the production of a model for its domain I. To our knowledge, this is the first time that such a structural modeling has been attempted for an arthropod rrnL.
Neuropterid mtDNAs are punctuated by non-coding portions highly variable in size. Among them, the most extended segment is the control region (AT-rich region), which appears to be a fast-evolving genomic region characterized by short to medium-size repeated motifs/AT-rich patterns.
Sample origin and DNA extraction
The L. macaronius specimen used to determine the mtDNA was collected in June 2007 by EN on arid prairies (45°47'35.75"N 13°35'50.63"E, 98 m elevation) located in the Triestine karst along the road connecting San Giovanni al Timavo to Medeazza (Friuli Venezia Giulia region, northeastern Italy).
Total DNA was purified using the Invisorb DNA extraction kit (Invitec). The quality of DNA was assessed through electrophoresis in a 1% agarose gel and staining with SYBR-safe DNA gel stain (Invitrogen).
PCR amplification and sequencing of L. macaronius mtDNA
PCR amplification was performed using a mix of insect universal primers and primers specifically designed against the L. macaronius sequences [84, 85]. The complete list of successful primers is available upon request. The PCR products were visualized by electrophoresis in a 1% agarose gel and staining with SYBR-safe DNA gel stain. Each PCR product represented by a single electrophoretic band was purified with the ExoSAP-IT kit (Amersham Biosciences) and directly sequenced. Sequencing of both strands was performed at the BMR Genomics service (http://www.bmr-genomics.it/) on automated DNA sequencers mostly employing the primers used for PCR amplification.
Sequence assembly and annotation
The mtDNA final consensus sequence was assembled using the SeqMan II program from the Lasergene software package (DNAStar, Madison, WI). Gene and strand nomenclature used in this paper were described by Negrisolo et al. .
Sequence analysis was performed as follows. Initially, the mtDNA sequence was translated in silico into putative proteins using the Transeq program available at the EBI web site. The true identity of these polypeptides was established using the BLAST program available at the NCBI web site [87, 88]. Gene boundaries were determined as follows. The 5' ends of PCGs were inferred to be at the first legitimate in-frame start codon (ATN, GTG, TTG, GTT) in the open reading frame that was not located within the upstream gene encoded on the same strand [64, 89]. The only exceptions were atp6 and nad4, which overlap with their upstream genes (atp8 and nad4L, respectively) in many mtDNAs (e.g., ). The PCG terminus was inferred to be at the first in-frame stop codon encountered. When the stop codon was located within the sequence of a downstream gene encoded on the same strand, a truncated stop codon (T or TA) adjacent to the beginning of the downstream gene was designated the termination codon. This codon was thought to be completed by polyadenylation to a complete TAA stop codon after transcript processing. Finally, pair-wise comparisons with orthologous proteins were performed with ClustalW program to better define the limits of PCGs .
The transfer RNA genes were identified using the tRNAscan-SE program or recognized manually as sequences having the appropriate anticodon and capability of folding into the typical cloverleaf secondary structure [64, 93].
The boundaries of the ribosomal rrnL gene were assumed to be delimited by the ends of the trnV-trnL1 pair. The 3' end of the rrnS gene was assumed to be delimited by the start of trnV, while the 5' end was determined through comparison with orthologous genes of other previously sequenced neuropterid insect mtDNAs.
The published neuropterid genomes were re-annotated using the criteria listed above. This approach led us to propose different start/end positions for some genes.
Nucleotide composition was calculated with the EditSeq program included in the Lasergene software package. GC-skew = (G-C)/(G+C) and AT-skew = (A-T)/(A+T) were used to measure the base-compositional difference between the different strands or between genes coded on the alternative strands . The prediction of the genetic code used by each mtDNA was performed using the GenDecoder web server [67, 68].
Relative synonymous codon usage (RSCU) values were calculated with MEGA 4 program .
Codon usage in neuropterid mtDNAs was studied by calculating the effective number of codon used index (ENC) with the INCA 2.1 program [71, 96]. Multiple alignments of genes were produced with the ClustalW algorithm implemented in MEGA 4 [90, 95]. Both p-distances and the numbers of different nucleotides in pair-wise comparisons were calculated with the PAUP* program .
Sequence motifs in the AT-rich region were identified using the Spectral Repeat Finder program . Not only simple motifs were searched but also longer consensus patterns with a minimum match of 75% among different strings.
Testing for phylogenetic signal and saturation of the substitution process
An a priori estimation of the phylogenetic signal present in the multiple alignments was performed by maximum likelihood mapping . The phylogenetic signal was evaluated using the TREEPUZZLE 5.2 program .
For the nucleotide sequences, the level of saturation of the substitution process was estimated by plotting uncorrected p-distances (based on observed substitutions) against GTR+I+G estimated distances for multiple alignment of single PCGs as well as concatenated PCGs. In the case of amino acid sequences, the level of saturation was estimated by plotting uncorrected p-distances against mtREV24/JTT + G models implemented in MEGA4 and TREEPUZZLE 5.2. After fitting a regression line, its slope (m) was used as a measure of saturation. If m = 1, no saturation is inferred, while for m <<1, a high level of saturation has occurred.
rrnL and rrnS homology modeling
- atp6 and atp8:
ATP synthase subunits 6 and 8
- cob :
- cox1-3 :
cytochrome c oxidase subunits 1-3
- nad1-6 and nad4L:
NADH dehydrogenase subunits 1-6 and 4L
- rrnS and rrnL:
small and large subunit ribosomal RNA (rRNA) genes
- trnX :
transfer RNA (tRNA) genes: where X is the one-letter abbreviation of the corresponding amino acid
mitochondrial genomic spacers
- A+T region:
the putative control region
relative synonymous codon usage
effective number of codons used
Boore JL: Animal mitochondrial genomes. Nucleic Acids Res. 1999, 27: 1767-1780. 10.1093/nar/27.8.1767.
Shao R, Kirkness EF, Barker SC: The single mitochondrial chromosome typical of animals has evolved into 18 minichromosomes in the human body louse, Pediculus humanus. Genome Res. 2009, 19: 904-912. 10.1101/gr.083188.108.
Boyce TM, Zwick ME, Aquadro CF: Mitochondrial DNA in the bark weevils: size, structure and heteroplasmy. Genetics. 1989, 123: 825-836.
Boore, JL, Collins TM, Stanton D, Daehler LL, Brown WM: Deducing the pattern of arthropod phylogeny from mitochondrial DNA rearrangements. Nature. 1995, 376: 163-165. 10.1038/376163a0.
Boore JL, Lavrov DV, Brown WM: Gene translocation links insects and crustaceans. Nature. 1998, 392: 667-668. 10.1038/33577.
Clary DO, Wolstenholme DR: The mitochondrial DNA molecular of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J Mol Evol. 1985, 22: 252-271. 10.1007/BF02099755.
Cameron SL, Beckenbach AT, Dowton M, Whiting MF: Evidence from mitochondrial genomics on interordinal relationships in insects. Arthropods Syst Phyl. 2006, 64: 27-34.
McMahon DP, Hayward A, Kathirithamby J: The mitochondrial genome of the 'twisted-wing parasite' Mengenilla australiensis (Insecta, Strepsiptera): a comparative study. BMC Genomics. 2009, 10 (603): 1-15.
Beard CB, Hamm DM, Collins FH: The mitochondrial genome of the mosquito Anopheles gambiae: DNA sequence, genome organization, and comparisons with mitochondrial sequences of other insects. Insect Mol Biol. 1993, 2: 103-124. 10.1111/j.1365-2583.1993.tb00131.x.
Mitchell SE, Cockburn AF, Seawright JA: The mitochondrial genome of Anopheles quadrimaculatus species A: complete nucleotide sequence and gene organization. Genome. 1993, 36: 1058-1073. 10.1139/g93-141.
Cameron SL, Whiting MF: The complete mitochondrial genome of the tobacco hornworm, Manduca sexta, (Insecta: Lepidoptera: Sphingidae), and an examination of mitochondrial gene variability within butterflies and moths. Gene. 2008, 408: 112-123. 10.1016/j.gene.2007.10.023.
Salvato P, Simonato M, Battisti A, Negrisolo E: The complete mitochondrial genome of the bag-shelter moth Ochrogaster lunifer (Lepidoptera, Notodontidae). BMC Genomics. 2008, 9 (331): 1-15.
Beckenbach AT, Stewart JB: Insect mitochondrial genomics 3: the complete mitochondrial genome sequences of representatives from two neuropteroid orders: a dobsonfly (order Megaloptera) and a giant lacewing and an owlfly (order Neuroptera). Genome. 2009, 52: 31-38. 10.1139/G08-098.
Cameron SL, Sullivan A, Song H, Miller KB: A mitochondrial genome phylogeny of the Neuropterida (lace-wings, alderflies and snakeflies) and their relationship to the other holometabolous insect orders. Zool Scr. 2009, 38: 575-590. 10.1111/j.1463-6409.2009.00392.x.
Hua J, Li M, Dong P, Xie Q, Bu W: The mitochondrial genome of Protohermes concolorus Yang et Yang 1988 (Insecta: Megaloptera: Corydalidae). Mol Biol Rep. 2009, 36: 1757-1765. 10.1007/s11033-008-9379-0.
Sheffield NC, Song H, Cameron SL, Whiting MF: Nonstationary evolution and compositional heterogeneity in beetle mitochondrial phylogenomics. Syst Biol. 2009, 58: 381-394. 10.1093/sysbio/syp037.
Sheffield NC, Song H, Cameron SL, Whiting MF: A comparative analysis of mitochondrial genomes in Coleoptera (Arthropoda: Insecta) and genome descriptions of six new beetles. Mol Biol Evol. 2008, 25: 2499-2509. 10.1093/molbev/msn198.
Hong MY, Jeong HC, Kim MJ, Jeong HU, Lee SH, Kim I: Complete mitogenome sequence of the jewel beetle, Chrysochroa fulgidissima (Coleoptera: Buprestidae). Mitochondrial DNA. 2009, 20: 46-60.
Stewart JB, Beckenbach AT: Phylogenetic and genomic analysis of the complete mitochondrial DNA sequence of the spotted asparagus beetle Crioceris duodecimpunctata. Mol Phylogent Evol. 2003, 26: 513-526. 10.1016/S1055-7903(02)00421-9.
Kim KG, Hong MY, Kim MJ, Im HH, Kim MI, Bae CH, Seo SJ, Lee SH, Kim I: Complete mitochondrial genome sequence of the yellow-spotted long-horned beetle Psacothea hilaris (Coleoptera: Cerambycidae) and phylogenetic analysis among coleopteran insects. Mol Cells. 2009, 27: 429-441. 10.1007/s10059-009-0064-5.
Bae JS, Kim I, Sohn HD, Jin BR: The mitochondrial genome of the firefly, Pyrocoelia rufa: complete DNA sequence, genome organization, and phylogenetic analysis with other insects. Mol Phylogent Evol. 2004, 32: 978-985. 10.1016/j.ympev.2004.03.009.
Arnoldi FG, Ogoh K, Ohmiya Y, Viviani VR: Mitochondrial genome sequence of the Brazilian luminescent click beetle Pyrophorus divergens (Coleoptera: Elateridae): mitochondrial genes utility to investigate the evolutionary history of Coleoptera and its bioluminescence. Gene. 2007, 405: 1-9. 10.1016/j.gene.2007.07.035.
Li X, Ogoh K, Ohba N, Liang X, Ohmiya Y: Mitochondrial genomes of two luminous beetles, Rhagophthalmus lufengensis and R. ohbai (Arthropoda, Insecta, Coleoptera). Gene. 2007, 392: 196-205. 10.1016/j.gene.2006.12.017.
Friedrich M, Muqim N: Sequence and phylogenetic analysis of the complete mitochondrial genome of the flour beetle Tribolium castanaeum. Mol Phylogent Evol. 2003, 26: 502-512. 10.1016/S1055-7903(02)00335-4.
Yu DJ, Xu L, Nardi F, Li JG, Zhang RJ: The complete nucleotide sequence of the mitochondrial genome of the oriental fruit fly, Bactrocera dorsalis (Diptera: Tephritidae). Gene. 2007, 396: 66-74. 10.1016/j.gene.2007.02.023.
Nardi F, Carapelli A, Dallai R, Frati F: The mitochondrial genome of the olive fly Bactrocera oleae: two haplotypes from distant geographical locations. Insect Mol Biol. 2003, 12: 605-611. 10.1046/j.1365-2583.2003.00445.x.
Spanos L, Koutroumbas G, Kotsyfakis M, Louis C: The mitochondrial genome of the mediterranean fruit fly, Ceratitis capitata. Insect Mol Biol. 2000, 9: 139-144. 10.1046/j.1365-2583.2000.00165.x.
Junqueira AC, Lessinger AC, Torres TT, Da Silva FR, Vettore AL, Arruda P, Azeredo Espin AM: The mitochondrial genome of the blowfly Chrysomya chloropyga (Diptera: Calliphoridae). Gene. 2004, 339: 7-15.
Lessinger AC, Martins Junqueira AC, Lemos TA, Kemper EL, da Silva FR, Vettore AL, Arruda P, Azeredo-Espin AM: The mitochondrial genome of the primary screwworm fly Cochliomyia hominivorax (Diptera: Calliphoridae). Insect Mol Biol. 2000, 9: 521-529. 10.1046/j.1365-2583.2000.00215.x.
Cameron SL, Lambkin CL, Barker SC, Whiting MF: A mitochondrial genome phylogeny of Diptera: whole genome sequence data accurately resolve relationships over broad timescales with high precision. Syst Entomol. 2007, 32: 40-59. 10.1111/j.1365-3113.2006.00355.x.
Ballard JW: Comparative genomics of mitochondrial DNA in members of the Drosophila melanogaster subgroup. J Mol Evol. 2000, 51: 48-63.
Lewis DL, Farr CL, Kaguni LS: Drosophila melanogaster mitochondrial DNA: completion of the nucleotide sequence and evolutionary comparisons. Insect Mol Biol. 1995, 4: 263-278. 10.1111/j.1365-2583.1995.tb00032.x.
Oliveira MT, Grande Barau J, Martins Junqueira AC, Feijão PC, Coelho da Rosa A, Feix Abreu C, Azeredo-Espin AML, Lessinger AC: Structure and evolution of the mitochondrial genomes of Haematobia irritans and Stomoxys calcitrans: The Muscidae (Diptera: Calyptratae) perspective. Mol Phylogenet Evol. 2008, 48: 850-857. 10.1016/j.ympev.2008.05.022.
Stevens JR, West H, Wall R: Mitochondrial genomes of the sheep blowfly, Lucilia sericata, and the secondary blowfly, Chrysomya megacephala. Med Vet Entomol. 2008, 22: 89-91. 10.1111/j.1365-2915.2008.00710.x.
Beckenbach AT, Joy JB: Evolution of the Mitochondrial Genomes of Gall Midges (Diptera:Cecidomyiidae): Rearrangement and Severe Truncation of tRNA Genes. Genome Biol Evol. 2009, 1: 278-287.
Cameron SL, Dowton M, Castro LR, Ruberu K, Whiting MF, Austin AD, Diement K, Stevens J: Mitochondrial genome organization and phylogeny of two vespid wasps. Genome. 2008, 51: 800-808. 10.1139/G08-066.
Crozier RH, Crozier YC: The mitochondrial genome of the honeybee Apis mellifera: complete sequence and genome organization. Genetics. 1993, 133: 97-117.
Cha SY, Yoon HJ, Lee EM, Yoon MH, Hwang JS, Jin BR, Han YS, Kim I: The complete nucleotide sequence and gene organization of the mitochondrial genome of the bumblebee, Bombus ignitus (Hymenoptera: Apidae). Gene. 2007, 392: 206-220. 10.1016/j.gene.2006.12.031.
Dowton M, Cameron SL, Dowavic JI, Austin AD, Whiting MF: Characterization of 67 mitochondrial tRNA gene rearrangements in the Hymenoptera suggests that mitochondrial tRNA gene position is selectively neutral. Mol Biol Evol. 2009, 26: 1607-1617. 10.1093/molbev/msp072.
Wei SJ, Shi M, He JH, Sharkey M, Chen XX: The complete mitochondrial genome of Diadegma semiclausum (hymenoptera: ichneumonidae) indicates extensive independent evolutionary events. Genome. 2009, 52: 308-319. 10.1139/G09-008.
Wei SJ, Tang P, Zheng LH, Shi M, Chen XX: The complete mitochondrial genome of Evania appendigaster (Hymenoptera: Evaniidae) has low A+T content and a long intergenic spacer between atp8 and atp6. Mol Biol Rep. 2010, 37: 1931-1942. 10.1007/s11033-009-9640-1.
Silvestre D, Dowton M, Arias MC: The mitochondrial genome of the stingless bee Melipona bicolor (Hymenoptera, Apidae, Meliponini): Sequence, gene organization and a unique tRNA translocation event conserved across the tribe Meliponini. Genet Mol Biol. 2008, 31: 451-460. 10.1590/S1415-47572008000300010.
Hu J, Zhang D, Hao J, Huang D, Cameron S, Zhu C: The complete mitochondrial genome of the yellow coaster, Acraea issoria (Lepidoptera: Nymphalidae: Heliconiinae: Acraeini): sequence, gene organization and a unique tRNA translocation event. Mol Biol Rep. 2010, 37: 3431-3438. 10.1007/s11033-009-9934-3.
Lee ES, Shin KS, Kim MS, Park H, Cho S, Kim CB: The mitochondrial genome of the smaller tea tortrix Adoxophyes honmai (Lepidoptera: Tortricidae). Gene. 2006, 373: 52-57.
Liu Y, Li Y, Pan M, Dai F, Zhu X, LuC , Xiang Z: The complete mitochondrial genome of the Chinese oak silkmoth, Antheraea pernyi (Lepidoptera: Saturniidae). Acta Biochim Biophys Sin. 2008, 40: 693-703.
Kim SR, Kim MI, Hong MY, Kim KY, Kang PD, Hwang JS, Han YS, Jin BR, Kim I: The complete mitogenome sequence of the Japanese oak silkmoth, Antheraea yamamai (Lepidoptera: Saturniidae). Mol Biol Rep. 2009, 36: 1871-1880. 10.1007/s11033-008-9393-2.
Hong G, Jiang S, Yu M, Yang Y, Li F, Xue F, Wei Z: The complete nucleotide sequence of the mitochondrial genome of the cabbage butterfly, Artogeia melete (Lepidoptera: Pieridae). Acta Biochim Biophys Sin. 2009, 41: 446-455. 10.1093/abbs/gmp030.
Yukuhiro K, Sezutsu H, Itoh M, Shimizu K, Banno Y: Significant levels of sequence divergence and gene rearrangements have occurred between the mitochondrial genomes of the wild mulberry silkmoth, Bombyx mandarina and its close relative, the domesticated silkmoth, Bombyx mori. Mol Biol Evol. 2002, 19: 1385-1389.
Kim I, Lee EM, Seol KY, Yun EY, Lee YB, Hwang JS, Jin BR: The mitochondrial genome of the Korean hairstreak, Coreana raphaelis (Lepidoptera: Lycaenidae). Insect Mol Biol. 2006, 15: 217-225. 10.1111/j.1365-2583.2006.00630.x.
Jiang ST, Hong GY, Yu M, Li N, Yang Y, Liu YQ, Wei ZJ: Characterization of the complete mitochondrial genome of the giant silkworm moth, Eriogyna pyretorum (Lepidoptera: Saturniidae). Int J Biol Sci. 2009, 5: 351-365.
Yang L, Wei ZJ, Hong GY, Jiang ST, Wen LP: The complete nucleotide sequence of the mitochondrial genome of Phthonandria atrilineata (Lepidoptera: Geometridae). Mol Biol Rep. 2009, 36: 1441-1449. 10.1007/s11033-008-9334-0.
Hong MY, Lee EM, Jo YH, Park HC, Kim SR, Hwang JS, Jin BR, Kang PD, Kim KG, Han YS, Kim I: Complete nucleotide sequence and organization of the mitogenome of the silk moth Caligula boisduvalii (Lepidoptera: Saturniidae) and comparison with other lepidopteran insects. Gene. 2008, 413: 49-57. 10.1016/j.gene.2008.01.019.
Wiegmann BM, Trautwein MD, Kim JW, Cassel BK, Bertone MA, Winterton SL, Yeates DK: Single-copy nuclear genes resolve the phylogeny of the holometabolous insects. BMC Biology. 2009, 7 (34): 1-16.
Wei SJ, Shi M, Sharkey MJ, van Achterberg C, Chen XX: Comparative mitogenomics of Braconidae (Insecta: Hymenoptera) and the phylogenetic utility of mitochondrial genomes with special reference to Holometabolous insects. BMC Genomics. 2010, 11 (371): 1-16.
Aspöck U, Plant JD, Nemeschkal HL: Cladistic analysis of Neuroptera and their systematic position within Neuropterida (Insecta: Holometabola: Neuropterida: Neuroptera). Syst Entomol. 2001, 26: 73-86. 10.1046/j.1365-3113.2001.00136.x.
Haring E, Aspöck U: Phylogeny of the Neuropterida: a first molecular approach. Syst Entomol. 2004, 29: 415-430. 10.1111/j.0307-6970.2004.00263.x.
Winterton SL, Hardy NB, Wiegmann BM: On wings of lace: phylogeny and Bayesian divergence time estimates of Neuropterida (Insecta) based on morphological and molecular data. Syst Entomol. 2010, 35: 349-378. 10.1111/j.1365-3113.2010.00521.x.
Carapelli A, Vannini L, Nardi F, Boore JL, Beani L, Dallai R, Frati F: The mitochondrial genome of the entomophagous endorparasite Xenos vesparum (Insecta: Strepsiptera). Gene. 2006, 376: 248-259. 10.1016/j.gene.2006.04.005.
Castro LR, Dowton M: The position of the Hymenoptera within the Holometabola as inferred from the mitochondrial genome of Perga condei (Hymenoptera: Symphyta: Pergidae). Mol Phylogenet Evol. 2005, 34: 469-479. 10.1016/j.ympev.2004.11.005.
Castro LR, Ruberu K, Dowton M: Mitochondrial genomes of Vanhornia eucnemidarum (Apocrita: Vanhorniidae) and Primeuchroeus spp. (Aculeata: Chrysididae): Evidence of rearranged mitochondrial genomes within the Apocrita (Insecta: Hymenoptera). Genome. 2006, 49: 752-766. 10.1139/G06-030.
Dowton M: Relationships among the cyclostome braconid (Hymenoptera: Braconidae) subfamilies inferred from a mitochondrial tRNA gene rearrangement. Mol Phylogenet Evol. 1999, 11: 283-287. 10.1006/mpev.1998.0580.
Dowton M, Austin AD: Evolutionary dynamics of a mitochondrial rearrangement ''hotspot'' in the Hymenoptera. Mol Biol Evol. 1999, 16: 298-309.
Dowton M, Castro LR, Campbell SL, Bargon SD, Austin AD: Frequent mitochondrial gene rearrangements at the hymenopteran nad3-nad5 junction. J Mol Evol. 2003, 56: 517-526. 10.1007/s00239-002-2420-3.
Wolstenholme DR: Animal mitochondrial DNA: structure and evolution. Int Rev Cytol. 1992, 141: 173-216.
Montooth KL, Abt DN, Hofmann JW, Rand DM: Comparative genomics of Drosophila mtDNA: novel features of conservation and change across functional domains and lineages. J Mol Evol. 2009, 69: 94-114. 10.1007/s00239-009-9255-0.
Stewart JB, Beckenbach AT: Characterization of mature mitochondrial transcripts in Drosophila, and the mplications for the tRNA punctuation model in arthropods. Gene. 2009, 445: 49-57. 10.1016/j.gene.2009.06.006.
Abascal F, Posada D, Knight RD, Zardoya R: Parallel evolution of the genetic code in arthropod mitochondrial genomes. PLoS Biol. 2006, 4: e127-10.1371/journal.pbio.0040127. (1-8)
Abascal F, Zardoya R, Posada D: GenDecoder: genetic code prediction for metazoan mitochondria. Nucleic Acids Res. 2006, 34: W389-W393. 10.1093/nar/gkl044.
Mishmar D, Ruiz-Pesini E, Mondragon-Palomino M, Procaccio V, Gaut B, Wallace DC: Adaptive selection of mitochondrial complex I subunits during primate radiation. Gene. 2006, 378: 11-18.
Strimmer K, von Haeseler A: Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc Natl Acad Sci USA. 1997, 94: 6815-6819. 10.1073/pnas.94.13.6815.
Supek F, Vlahovičekl K: INCA: synonymous codon usage analysis and clustering by means of self-organizing map. Bioinformatics. 2004, 20: 2329-2330. 10.1093/bioinformatics/bth238.
Negrisolo E, Minelli A, Valle G: The mitochondrial genome of the house centipede Scutigera and myriapod monophyly vs. paraphyly. Mol Biol Evol. 2004, 21: 770-780. 10.1093/molbev/msh078.
Lavrov DV, Brown WM, Boore JL: A novel type of RNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus. Proc Natl Acad Sci USA. 2000, 97: 13738-13742. 10.1073/pnas.250402997.
Cannone JJ, Subramanian S, Schnare MN, Collett JR, D'Souza LM, Du Y, Feng B, Lin N, Madabusi LV, Müller KM, Pande N, Shang Z, Yu N, Gutell RR: The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics. 2002, 3 (2): 1-31.
Coleman AW: ITS2 is a double-edged tool for eukaryote evolutionary comparisons. Trends Genet. 2003, 19: 370-375. 10.1016/S0168-9525(03)00118-5.
Kim I, Cha SY, Yoon MH, Hwang JS, Lee SM, Sohn HD, Jin BR: The complete nucleotide sequence and gene organization of the mitochondrial genome of the oriental mole cricket, Gryllotalpa orientalis (Orthoptera: Gryllotalpidae). Gene. 2005, 353: 155-168. 10.1016/j.gene.2005.04.019.
Fenn JD, Song H, Cameron SL, Whiting MF: A preliminary mitochondrial genome phylogeny of Orthoptera (Insecta) and approaches to maximizing phylogenetic signal found within mitochondrial genome data. Mol Phylogenet Evol. 2008, 49: 59-68. 10.1016/j.ympev.2008.07.004.
Miya M, Satoh TP, Nishida M: The phylogenetic position of toadfishes (order Batrachoidiformes) in the higher ray-finned fish as inferred from partitioned Bayesian analysis of 102 whole mitochondrial genome sequences. Biol J Linn Soc Lond. 2005, 85: 289-306. 10.1111/j.1095-8312.2005.00483.x.
Gillespie JJ, Johnston JS, Cannone JJ, Gutell RR: Characteristics of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) rRNA genes of Apis mellifera (Insecta: Hymenoptera): structure, organization, and retrotransposable elements. Insect Mol Biol. 2006, 15: 657-686. 10.1111/j.1365-2583.2006.00689.x.
Roberti M, Polosa PL, Bruni F, Musicco C, Gadaleta MN, Cantatore P: DmTTF, a novel mitochondrial transcription termination factor that recognises two sequenze in Drosophila melanogaster mitochondrial DNA. Nucleic Acid Res. 2003, 31: 1597-1604. 10.1093/nar/gkg272.
Zhang DX, Hweitt GM: Insect mitochondrial control region: a review of its structure, evolution and usefulness in evolutionary studies. Biochem Syst Ecol. 1997, 25: 99-120. 10.1016/S0305-1978(96)00042-7.
Larizza A, Pesole G, Reyes A, Sbisà E, Saccone C: Lineage specificity of the evolutionary dynamics of the mtDNA D-Loop region in rodents. J Mol Evol. 2002, 54: 145-155. 10.1007/s00239-001-0063-4.
Larizza A, Makalowski W, Pesole G, Saccone C: Evolutionary dynamics of mammalian mRNA untranslated regions by comparative analysis of orthologous human, artiodactyl and rodent gene pairs. Comput Chem. 2002, 26: 479-490.
Simon C, Frati F, Beckenbach A, Crespi B, Liu H, Flook P: Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann Entomol Soc Am. 1994, 87: 651-704.
Simon C, Buckley TR, Frati F, Stewart JB, Beckenbach AT: Incorporating molecular evolution into phylogenetic analysis, and a new compilation of conserved polymerase chain reaction primers for animal mitochondrial DNA. Annu Rev Ecol Evol Syst. 2006, 37: 545-579. 10.1146/annurev.ecolsys.37.091305.110018.
Negrisolo E, Minelli A, Valle G: Extensive gene order rearrangement in the mitochondrial genome of the centipede Scutigera coleoptrata. J Mol Evol. 2004, 58: 413-423. 10.1007/s00239-003-2563-x.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
Tatusova TA, Madden TL: BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett. 1999, 174: 247-250. 10.1111/j.1574-6968.1999.tb13575.x.
Lavrov DV, Boore JL, Brown WM: The complete mitochondrial DNA sequence of the horseshoe crab Limulus polyphemus. Mol Biol Evol. 2000, 17: 813-824.
Thompson JD, Higgins DG, Gibson TJ: Clustal-W - improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673.
Smith AE, Marcker KA: N-formylmethionyl transfer RNA in mitochondria from yeast and rat liver. J Mol Biol. 1968, 38: 241-243. 10.1016/0022-2836(68)90409-9.
Fearnley IM, Walker JE: Initiation codons in mammalian mitochondria: differences in genetic code in the organelle. Biochemistry. 1987, 26: 8247-8251. 10.1021/bi00399a034.
Lowe TM, Eddy SR: tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997, 25: 955-964. 10.1093/nar/25.5.955.
Perna NT, Kocher TD: Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J Mol Evol. 1995, 41: 353-358. 10.1007/BF01215182.
Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) Software Version 4.0. Mol Biol Evol. 2007, 24: 1596-1599. 10.1093/molbev/msm092.
Wright F: The 'effective number of codons' used in a gene. Gene. 1990, 87: 23-29. 10.1016/0378-1119(90)90491-9.
Swofford DL: PAUP*, Phylogenetic Analysis using Parsimony (*and other methods). Version 4.10. 2002, Sunderland MA: Sinauer Associates
Sharma D, Issac B, Raghava GPS, Ramaswamy R: Spectral Repeat Finder (SRF): identification of repetitive sequences using Fourier transformation. Bioinformatics. 2004, 20: 1405-1412. 10.1093/bioinformatics/bth103.
Schmidt HA, Strimmer K, Vingron M, von Haeseler A: TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics. 2002, 18: 502-504. 10.1093/bioinformatics/18.3.502.
We express our sincere thanks to Filippo Calore (Albignasego, Padova, Italy), who painted the icon of L. macaronius included in Figure 2, using as reference a picture of an animal belonging to the same population of the specimen used in this work. We express our sincere thanks to three anonymous reviewers who provided very useful comments and criticisms to an early version of our manuscript. This project was supported by a grant provided to EN (animal mitogenomics, developed at the Department of Public Health, Comparative Pathology and Veterinary Hygiene). Support was also given by the Italian Ministry for Research though a grant to TP (PRIN n. 20078J9BYX).
EN and TP designed and coordinated all experiments. EN and MB conducted the molecular experiments. EN performed the genomic analyses and wrote the first draft of the manuscript. All authors contributed the final version of the manuscript and approved it.
Electronic supplementary material
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.