Intraspecific rearrangements
We obtained 290 samples from 28 localities for Quasipaa boulengeri (Fig. 2 and Additional file 1: Table S1). Sequences from a region encompassing nad2 to cox1, which includes the WANCY hotspot, revealed gene-organizations atypical of vertebrates. Stable secondary structures of tRNA genes and the absence of premature stop codons in nad2 and cox1 authenticated the sequencing of mtDNA. The WANCY region in this frog differed from the typical organization by having a long noncoding sequence that ranges in size from 473 bp to 925 bp. Further, gene annotation identified different positions for the gene and its copy, even within a single population. The details of gene organization of the WANCY region for each sample were listed in Additional file 2: Table S2.
Annotation identifies four kinds (types) of gene rearrangements (Fig. 3a). The typical gene order of the trnW–trnY block is trnW, trnA, trnN, OL, trnC, and trnY (WANCY). Unlike the other types, where the OL is located before trnN (Fig. 3a), in Type I, the OL occurs after trnN, separated by an intergenic spacer (IGS or noncoding region). In Type II, two trnA occur with an IGS located between trnA1 and OL, another IGS occurs between OL and trnA2, and another IGS between trnN and trnC. The gene organization of Type III and Type IV are similar to Type II, but Type III lacks trnA1 and Type IV lacks trnA2, respectively. Except for the reorganizations of trnA, trnN, and OL, all other tRNAs and protein-coding genes have positions and lengths typical of the vertebrate mitogenome (Fig. 3a; Additional file 2: Table S2).
Sequential process of TDRL
All four types of gene rearrangement in Q. boulengeri involve trnA, trnN, OL and trnC. The IGSs reveal tandem duplications in the WANCY region. These residues identify pseudogenes of trnA, trnN and trnC, whose sequences are similar to corresponding tRNAs. Additional file 3: Figure S1 shows the primary sequence of trnA and trnN for each variant. The secondary structures of these genes fold into typical stem-and-loop structures (Additional file 4: Figure S2). Each type of variant has only one trnA and trnN, except in Type II, which has two trnAs, both of which form stable clover-leaf structures. Thus, these genes are paralogs created by gene duplication. Residues are very similar to trnA and trnN, but have a loss of function owing to secondary structures or a mutation on the anticodon position.
TDRL [5, 16] best explains the gene rearrangement in Q. boulengeri, and the diversity of rearrangements rejects the alternative hypotheses of non-random loss (TDNL and DMNL), recombination, and DRRL. Genes of the same polarity do not cluster together, and the finding of alternative loss of duplicated genes, as seen in comparison of Types III and IV, contradicts non-random loss models (TDNL or DMNL). The absence of different tandem duplication junction points and no variation in the number of tandem repeats (VNTR) and this does not support the recombination model. We cannot reject the hypotheses that unequal crossing over of intermolecular recombinations were subtly inserted in front of trnA and behind trnC, but tandem duplication would essentially be a consequence of this recombination. Further, no concerted evolution rejects intramolecular recombination, because the two copies of trnA in type II differ (Additional file 3: Figure S1, and Additional file 4: Figure S2). Finally, analyses reject the hypothesis of DRRL due to the absence of two control regions in the mitogenome of Q. boulengeri [26].
The TDRL hypothesis remains the only viable explanation and our results conform to its predictions (Fig. 3b). The hypothesized duplicated region in the mitogenome of Q. boulengeri includes trnA, trnN, OL, and a partial fragment of trnC. Slipped-strand mispairing, imprecise termination or recombination have been proposed to explain mitogenomic duplications [5, 7, 15]. Regardless of the molecular mechanism, tandem duplication mutations will yield two copies each of trnA, trnN, OL, and trnC. Subsequent random loss appears to have occurred at least twice independently in the mitogenome of Q. boulengeri. First, rearrangement Type I involves the loss of OL1, trnC1, trnA2, and trnN2. Second, loss involves trnN1, trnC1, and OL2 in Type II. The retention of two copies of trnA in Type II is direct evidence for the randomness of loss because alternative losses occur in Type III and Type IV, which have the same gene order as Type II (trnA1 has been lost in Type III as compared to trnA2 in Type IV). Rather than a result of selection for one or other alternative, loss of one copy of trnA appears to have occurred by chance alone.
The sequencing cox1 and cob for 290 individuals (Additional file 1: Table S1) identifies the origin of the initial tandem duplication event and the stepwise process of random loss when viewed in terms of the phylogenetic relationships of all types of gene rearrangement. The concatenated alignment contains 1463 nucleotide positions (cox1: 626 bp; cob: 837 bp) without stop codons. Maximum likelihood (ML) and Bayesian inference (BI) reconstructions obtain similar tree topologies for the four types of rearrangement (Fig. 3c). All haplotypes cluster together by type and with moderate to strong levels of nodal support, except for Type III, which is paraphyletic. Analyses recover the group Type II + Type III + Type IV, and roots it as the sister-group of Type I. Type II forms the sister-group of Type III + Type IV and some samples of Type III unite with Type IV.
The phylogenetic analyses and gene-order strongly indicate a single tandem duplication event and stepwise random loss in Q. boulengeri. The clustering of Types II–IV suggests that they share the same primary TDRL process (Fig. 3b, c), and that Type I represents an independent random loss. The monophyly of types I, II and IV indicate a single origin of each rearrangement type. Paraphyly of Type III suggests two parallel random losses are responsible for the same gene rearrangement. The associations of Type IIIa, IIIb and IV indicate that they shared a recent common ancestor, but independently lost one duplicated copy of trnA. The possession of both copies of trnA indicates that Type II represents the intermediate state.
Although mitochondrial gene rearrangements are not uncommon among related taxa, recognized intermediate steps of gene-order rearrangements are rare, and their presence can suggest evolutionary mechanisms [7, 27]. Most intermediate states appear as pseudogenes or residues of tandemly duplicated sequences rather than as two functional gene copies [3, 8, 16]. The intermediate state of Type II, leading either to Type III or Type IV, is an example of the random loss of one trnA gene.
Alternative loss-types have been observed among closely related species, e.g. alternative losses of trnH in the anuran Babina [17, 28]. However, alternative losses have been reported rarely within a single species and even more so within a population. The finding of alternate losses in Q. boulengeri is the first observation of an intermediate state involving two functional gene copies, and, simultaneously, the loss of alternative types in a vertebrate mt genome.
The occurrence of TDRL in Q. boulengeri corresponds to the view that reorganization of the mitochondrial genome is nonadaptive [17, 29]. Our results indicate that the hotspot of gene rearrangement is adjacent to the origin of light-strand replication [6]. Homoplastic mitochondrial rearrangements are contiguous in the genome or they locate around the origin of replication [6, 30, 31]. Under these conditions, mitochondrial gene-orders appear susceptible to convergent or parallel evolution because of functional constraints or selective pressures [32–34]. However, such evolution does not occur in Q. boulengeri. The gene-order and phylogenetic analyses indicate a single tandem duplication event followed by independent losses (Fig. 3), which implies that random loss and gene-order are not involved in adaptive evolution [17].
Evolution of mitogenomic rearrangement
A time-calibrated phylogeny constructed using Bayesian inference estimates the recency of mitogenomic rearrangements in Q. boulengeri (Fig. 3c). The initial diversification (Type I) dates to about 0.8 Ma and divergence among the other three types ranges from 0.4 to 0.6 Ma, suggesting that the duplication and fixation of these rearrangements can occur quite quickly. Combing interspecific rearrangement data, we summarize the rates of mitogenomic duplication and loss (Additional file 5: Table S3). This suggests that post duplication, the alternative loss types can occur in 0.2–5 Ma.
To explore how many intraspecific rearrangements existed, we gather both in silico and experimental evidence to detect the gene order in highly rearranged groups. First, our in silico approach obtains mitochondrial gene-order information (mtGordV0.5.pl; Additional file 6: Software) from GenBank in Amphibia and Squamata, two groups with high diversities in mitogenome rearrangement. Analyses discover that intraspecific rearrangements are rare, but they exist. Two cases occur in Squamata, one in asexual squamates with multiple origins of duplication [7], and another in an amphisbaenian with alternative loss-types varying among populations [19]. No intraspecific rearrangements in Amphibia exist in data from GenBank.
Our sequences of the WANCY fragment in multiple populations of the frogs Odorrana schmackeri and Amolops mantzorum detected that this hotspot region differed from the typical vertebrate arrangement [17]. These species do not exhibit intraspecific variation in gene-order, yet evidence for variation in losses may be represented by the residues of pseudogenes (data not shown).
Intraspecific studies may provide new insights into the high incidence of rearranged mitochondrial genomes. Above the species level, rates of mitogenomic partial duplication have been found to be high, and multiple duplication events can facilitate gene-rearrangement [7, 16, 21]. However, duplications may not occur frequently within a species. The rarity of this may reflect selection or functional constraints that prevent fixation, and may shed light on the limits of intraspecific diversity of mitogenomes. It could also owe to the dearth of investigations of intraspecific mitogenomic reorganization. We predict that mitochondrial metagenomic skimming by next-generation sequencing [35, 36] will detect additional cases of intraspecific rearrangements.
Random loss within duplicated regions could occur repeatedly, and the rate of duplication excision may be relatively high. At the intraspecific level, random loss occurred independently many times both in Q. boulengeri and the lizard Bipes biporus (Fig. 3c, Additional file 5: Table S3). At the interspecific level, the sibling frog genera Babina and Odorrana share the same duplication of genes, but the pathways of deletion differed among species [17, 28].
The loss of a duplicated region and fixation could occur in short evolutionary time (0.2 Ma, Additional file 5: Table S3). Deletion of a redundant gene-copy may happen rapidly due to functional constraints and the compactness of the metazoan mitogenome, facilitating the formation of pseudogenes or the complete deletion of redundant genes [27, 37]. A functionally redundant duplicate gene copy may not persist long in a population because deleterious mutations can accumulate and cause the redundant gene to become nonfunctional [38].
Nonadaptive forces, such as genetic drift or bottlenecking, may drive the fixation and dispersal of mitogenomic reorganizations. The low effective population size of the mitogenome leaves it vulnerable to bottlenecks and genetic drift, which can drive the fixation of large-scale genomic modifications [16, 39–41]. Quasipaa boulengeri resides in localized montane areas, mainly in rocky streams [42]. Its highly specific habitat may limit gene flow and result in a pattern structured by genetic drift. The upper and midstream tributaries of the Yangtze River, including some areas in Chongqing, Guizhou and Hubei, have populations containing two or more sympatric types of rearrangements (Fig. 2). This area may be the original source of the gene rearrangements, or may represent areas of secondary contact. Both scenarios explain the distribution of types. Historical demographic analyses in this area point to dispersal events for Q. boulengeri [42]. TDRL may have first occurred in this area, followed by dispersal to other places. However, the single origin of each type suggests independent fixations of alternative arrangements, in which case secondary contact could also explain the pattern.