Structural and gene composition variation of the complete mitochondrial genome of Mammillaria huitzilopochtli (Cactaceae, Caryophyllales), revealed by de novo assembly
BMC Genomics volume 24, Article number: 509 (2023)
Structural descriptions of complete genomes have elucidated evolutionary processes in angiosperms. In Cactaceae (Caryophyllales), a high structural diversity of the chloroplast genome has been identified within and among genera. In this study, we assembled the first mitochondrial genome (mtDNA) for the short-globose cactus Mammillaria huitzilopochtli. For comparative purposes, we used the published genomes of 19 other angiosperms, and the gymnosperm Cycas taitungensis served as the outgroup in the phylogenetic analysis.
The mtDNA of M. huitzilopochtli was assembled into one linear chromosome of 2,052,004 bp, in which 65 genes were annotated. These genes account for 57,606 bp and include 34 protein-coding genes (PCGs), 28 tRNAs, and three rRNAs. In the non-coding sequences, repeats were abundant, with a total of 4,550 (179,215 bp). In addition, five complete genes (psaC and four tRNAs) of chloroplast origin were documented. Negative selection was estimated for most (23) of the PCGs. The phylogenetic tree showed a topology consistent with previous analyses based on the chloroplast genome.
The number and type of genes contained in the mtDNA of M. huitzilopochtli were similar to those reported in 19 other angiosperm species, regardless of their phylogenetic relationships. Although other Caryophyllids exhibit strong differences in structural arrangement and total size of mtDNA, these differences do not result in an increase in the typical number and types of genes found in M. huitzilopochtli. We concluded that the total size of mtDNA in angiosperms increases by the lengthening of the non-coding sequences rather than a significant gain of coding genes.
In plants, mitochondria play a crucial role in providing cellular energy through respiration [1, 2], and they are also involved in various metabolic processes, such as stress tolerance and programmed cell death. In addition, some mitochondrial mutations have been associated with male sterility; they have been identified in approximately 150 species, particularly in cultivated species such as Beta vulgaris, Capsicum annuum, Daucus carota and Zea mays.
We recently searched the NCBI website (April 20, 2022) for complete organelle genomes of angiosperm taxa and found approximately 450 mitochondrial (mtDNA) and ~8,000 plastidic (cpDNA) genomes. This disparity in the number of sequenced genomes has led to a poorer understanding of the biology and evolution of plant mtDNA. Genomic comparisons between mtDNA and cpDNA indicate that the former is larger and more structurally complex than the latter. Accordingly, mtDNA has been found to be organized either in a single molecule or in multiple molecules called chromosomes, which can be arranged in linear or circular forms. At present, the underlying factors and processes that determine the structural organization of plant mtDNA have not been fully elucidated. The available data suggest that in flowering plants, the number and length of mitochondrial chromosomes are not determined solely by the total size of the mtDNA. For example, the parasitic mistletoe Viscum scurruloideum (Santalaceae) has the shortest mitochondrial genome, of only 66 kbp, and it is organized in two chromosomes. In contrast, the larger mtDNAs of Zelkova schneideriana (154 kbp; Ulmaceae, MW717907) and Corchorus capsularis (2 Mbp; Malvaceae, KT894204) are each organized in a single chromosome. The largest mtDNA documented to date (11.3 Mbp) is that of Silene conica (Caryophyllaceae), which shows a complex organization in as many as 128 circular chromosomes.
Despite this wide variation in size and structural organization, angiosperm mtDNA contains a relatively small number of genes, ranging from 28 in Viscum scurruloideum (Santalaceae) to 69 in Sesuvium portulacastrum (Aizoaceae). In flowering plants, mtDNA is typically composed of three functional types of genes: protein-coding genes, tRNAs and rRNAs. As in other genomes, these functional genes are separated by non-coding DNA sequences called intergenic spacers. It has been proposed that the relatively small number of genes contained in mtDNA is due to the large-scale gene migration that occurred from mitochondria to the nuclear genome along the evolutionary history of plants. In fact, most of the ∼2,000 functional mitochondrial proteins currently identified are encoded in the nuclear genome, and only about 1% of them are encoded in mtDNA [1, 14]. In addition, gene transfer between the two cytoplasmic genomes is also common; thus, complete sequences of functional genes as well as fragments of non-coding sequences of mitochondrial origin have been identified in chloroplasts. This dynamic intergenomic gene transfer is not unusual, and it has been documented in various land plant taxa. For example, the mtDNA of melon Cucumis melo (Cucurbitaceae) has a total size of 2.7 Mbp, of which nearly 46.77% and 1.41% are of nuclear and plastidic origin, respectively. Accordingly, intergenomic gene transfer is a factor that has increased the total size of mtDNA in plants [15, 17]. Additionally, in the mtDNA of angiosperms, horizontal gene transfer has been documented from different taxonomic groups, such as viruses, bacteria, and fungi, as well as from distinct plant species [21, 22]. The mtDNA of land plants contains abundant repeated DNA sequences, most of them located in the non-coding sequences (intergenic spacers, IGS).
These abundant repeats also substantially increase the overall size of mtDNA, and they could play a role in homologous recombination and in the regulation of the complete replication of mtDNA.
Currently, the underlying factors that drive mutation have not been fully identified for plants. However, preliminary comparisons of coding genes showed lower mutation rates in mtDNA than those estimated in plastidic (3× higher) and nuclear (16× higher) genomes [24, 25]. Since mutations are more constrained in the coding sequences of mtDNA, these sequences do not represent an adequate source of molecular variation for phylogenetic studies. On the other hand, the widely abundant, large and continuous sequences of non-coding regions (i.e., introns and IGS) have not been explored as potential sources of molecular variation to address biological questions. Finally, plant mtDNA is likely to be imprinted with the evolutionary history of plants and may help to elucidate the enigmatic and not fully resolved evolutionary history of angiosperms.
At present, most phylogenetic studies in angiosperms have been carried out using plastidic loci. However, the plastid genome has not been effective for resolving entire groups of flowering plants, such as cacti. The nearly 1,500 members of Cactaceae are recognized as a monophyletic group; however, their internal phylogenetic relationships have not been fully resolved (e.g., [31, 32]). In this study, we de novo sequenced and assembled the mitochondrial genome of Mammillaria huitzilopochtli D. R. Hunt (Cactaceae, Caryophyllales). Recently, the whole cpDNA of this short-globose cactus was described, and its relative plastidic molecular variation was assessed. The objectives of the present study were (1) to describe the structural organization of the whole mitochondrial genome in this cactus, (2) to estimate the mutation rates of coding regions among 21 species, and (3) to compare our results with those reported for mtDNA from 20 other land plants, with emphasis on Caryophyllids.
Characterization of the mitochondrial genome of Mammillaria huitzilopochtli
The newly assembled mitochondrial genome of M. huitzilopochtli has a total size of 2.052 Mbp and is organized in a single linear molecule. This mtDNA had a higher proportion of A’s (28.6%) and T’s (28.4%), followed by G’s and C’s (21.5% each). This genome comprised genes from 12 families: 10 of these corresponded to different types of protein-coding genes (Fig. 1).
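As an illustration of how the base composition above can be derived from a sequence, the following minimal Python sketch counts unambiguous bases and reports percentages. The function names and the toy sequence are hypothetical and are not taken from the study; the real input would be the assembled mtDNA sequence.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(seq):
    """Return the GC content (percent) of a nucleotide sequence."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


# Toy sequence for illustration only (not real data).
toy = "ATGCATTAAT"
print(base_composition(toy))
print(gc_content(toy))
```

With the rounded percentages reported above, G + C gives 21.5 + 21.5 = 43.0%, which agrees with the GC content of 42.97% reported for this genome.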
A total of 65 distinct genes (PCGs, tRNAs and rRNAs) were annotated in the mtDNA of M. huitzilopochtli; six of these genes had one to four additional copies (Table 1). Thirty-four of them were protein-coding genes (PCGs), including 33 of mitochondrial origin and one (psaC) from the plastid. A total of 28 tRNAs were identified, four of which were of plastidic origin; lastly, three rRNAs were documented (Fig. 1). The 65 annotated genes represented only 2.8% (57,606 bp) of the total genome size; consequently, 97.2% of the genome corresponded to non-coding sequences, mostly located in the IGS (Fig. 1).
With respect to the 33 mitochondrial PCGs, 29 (87.9%) had the typical ATG start codon, and four had alternative start codons: ACG (nad1), TTG (rps4), ATA (mttB), and GTG (rpl16). Three types of stop codons were documented: TAA (13 PCGs), TGA (13), and TAG (6); only the gene atp9 ended in CGA. Introns were identified in eight genes, and they varied in number and length (Table 1): nad7 had four introns, followed by nad2 (3 introns), nad4 and nad5 (2 each), and ccmFc, cox2, nad1, and rps3 (1 each). The length of these introns ranged from 838 bp (nad5) to 2,350 bp (nad2). Moreover, three of these intron-containing genes were trans-spliced (nad1, nad2, and nad5), and the other five (ccmFc, cox2, nad4, nad7, and rps3) were cis-spliced.
With respect to the repeated sequences, a total of 1,219 microsatellites were recorded along the mtDNA of M. huitzilopochtli. The most abundant microsatellites were dinucleotide repeats (462), followed by mononucleotide (396), tetranucleotide (170), and trinucleotide (59) repeats. In addition, 109 microsatellites showed a compound motif (i.e., two types of repeated motifs separated by a non-microsatellite sequence). Lastly, only 23 complex microsatellites composed of five to six nucleotides were identified, and these were distributed along the IGS (Table 2); 20 of them were abundant on the IGS of trnD-GUC – cox2 (5 repeats) and nad1 – rps3 (4) (Table 2).
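The notion of a microsatellite as a short motif repeated in tandem can be sketched with a regular-expression scan. The Python example below is only an illustration and is not the MISA tool used in this study: the minimum repeat numbers in MIN_REPEATS, the function names, and the toy sequence are all hypothetical, and compound or interrupted motifs are not handled.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
                pos = match.end()
            else:
                # e.g. "AA" is a mononucleotide run; skip one base and retry.
                pos = match.start() + 1
    return sorted(hits)


# Toy sequence for illustration only (not real data).
toy = "GGC" + "A" * 12 + "CGC" + "AT" * 7 + "GCA" + "TTG" * 5 + "C"
for start, motif, copies in find_ssrs(toy):
    print(start, motif, copies)
```

Rejecting non-primitive motifs keeps a run such as AAAAAAAAAAAA from being counted both as a mononucleotide and as a dinucleotide repeat.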
On the other hand, direct and inverted repeats were widely and abundantly distributed across the mtDNA (Fig. 2). A total of 4,550 of these repeats were documented, representing 8.73% (179,215 bp) of the total length of the genome. The most abundant repeats were the shortest ones: 20–29 bp (2,470 repeats), followed by those of 30–59 bp (1,878), 60–99 bp (183), and 100–199 bp (44); finally, only 17 repeats > 200 bp were identified. Irrespective of length, the number of repeats in direct orientation was similar to the number in inverted orientation (Fig. 2).
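The genome fractions quoted in this section follow directly from the reported lengths. The short Python sketch below reproduces the arithmetic using only figures given in the text (genome size, total length of annotated genes, and total length of direct and inverted repeats); the constant and function names are illustrative.

```python
GENOME_BP = 2_052_004   # total size of the assembled mtDNA
GENES_BP = 57_606       # combined length of the 65 annotated genes
REPEATS_BP = 179_215    # combined length of direct and inverted repeats


def percent_of_genome(span_bp, genome_bp=GENOME_BP):
    """Express a sequence span as a percentage of the genome length."""
    return 100.0 * span_bp / genome_bp


coding_pct = percent_of_genome(GENES_BP)
repeat_pct = percent_of_genome(REPEATS_BP)
print(f"genes: {coding_pct:.1f}%  non-genic: {100 - coding_pct:.1f}%")
print(f"direct and inverted repeats: {repeat_pct:.2f}%")
```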
In the mtDNA of M. huitzilopochtli, a total of 34 DNA sequences of plastidic origin (10,184 bp) were identified (Table 3), which were represented by complete genes, gene fragments, or non-coding regions of the plastid. The complete genes were the protein-coding gene psaC (start and stop codons included) and three tRNAs present in four copies: trnD-GUC (two copies), trnN-GUU (one copy), and trnI-CAU (one copy). The remaining 31 DNA sequences were fragments of genes and of IGS (Table 3).
Comparison of mitochondrial DNA of Mammillaria huitzilopochtli to other land plants
The phylogenetic analysis showed a well-supported topology, in which the Caryophyllids were clearly grouped in a clade that had A. thaliana as its sister group (Fig. 3).
The comparisons showed that the mtDNA of M. huitzilopochtli has a GC content of 42.97%, which is similar to that reported for the other 15 Caryophyllid species (Fig. 4). The average GC content in the 16 studied Caryophyllids was 43.77 ± 0.99 (SD). In the 21 studied plant species, there was a negative correlation between the GC content and the total length of the mitochondrial genome (r = -0.68, p = 0.00073). However, when we excluded the atypical value of S. noctiflora, this correlation became non-significant (r = -0.37, p = 0.11). The lowest GC content was documented in the two Caryophyllaceae species: S. latifolia (42.56%) and S. noctiflora (40.82%), with genome sizes of 235 kbp and 7.1 Mbp, respectively. Among the 21 species examined, the mtDNAs of two Caryophyllids were the largest ones: M. huitzilopochtli (2,052,004 bp) has the second largest genome after that of Silene noctiflora (Fig. 4). The average number of genes across the 21 species was 59 ± 6.34 (SD), and there was no correlation between the total number of genes and the total genome length (N = 21, r = -0.14, p = 0.56). In fact, the lowest number of genes (41) was reported for the largest genome of the Caryophyllid, S. latifolia, whereas the gymnosperm C. taitungensis had the highest number of genes (70).
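The correlation tests reported above can be illustrated with a minimal Python sketch of the Pearson coefficient and its associated t statistic (n - 2 degrees of freedom). The function names and the toy data are hypothetical; the study's per-species values are not reproduced here, and the p-value lookup from the t distribution is left out because it requires a statistics library.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Toy data for illustration only: genome size (Mbp) vs. GC content (%).
size_mbp = [0.3, 0.4, 0.5, 2.0, 7.0]
gc_pct = [44.5, 44.0, 43.8, 43.0, 41.0]
r = pearson_r(size_mbp, gc_pct)
print(r, t_statistic(r, len(size_mbp)))

# t statistic implied by the reported r = -0.68 with N = 21 species.
print(t_statistic(-0.68, 21))
```

For the reported r = -0.68 and N = 21, the t statistic is about -4.04 with 19 degrees of freedom, which is in line with the small p-value given in the text.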
With regard to the identity of the genes that composed the mitochondrial genomes, we documented that the 21 species had the three typical ribosomal units (rrn5, rrn18 and rrn26) reported for land plants. However, among these species, a conspicuous variation in the identity of PCGs was identified. The gymnosperm (C. taitungensis) contained the largest number of PCGs (41 genes), and the majority of angiosperms had a complete set of 24 PCGs, which are considered core genes. However, a few PCGs were missing (white squares, Fig. 5) or were incomplete sequences (pseudogenes; grey squares, Fig. 5); for example, the genes cob and cox1 were absent in M. jalapa, whereas the genes nad4 and nad6 were not identified in S. glauca. In contrast, the set of 17 genes known as variable PCGs or non-core genes was more variable across the 20 studied angiosperms. In particular, we documented the complete absence and pseudogenization of subunits of the ribosomal proteins (rps) and the succinate dehydrogenase (sdh). The cactus M. huitzilopochtli lacks eight of these two types of genes, and for 12 other species we identified a total of 24 pseudogenes. With respect to tRNAs, the most frequent absences were documented in trnL-UAA (20 species), trnR-UCU (20), trnV-UAC (20), trnI-GAU (19) and trnL-CAA (17) (Fig. 5), and pseudogenization was documented in four tRNAs but only in two species (A. thaliana and S. noctiflora). Among Caryophyllids, S. noctiflora and S. latifolia had higher numbers of pseudogenes (6 and 5, respectively), whereas the cactus M. huitzilopochtli had only one pseudogene (Ψrps14; Fig. 5).
The comparison of substitution rates in 25 genes between M. huitzilopochtli and six other angiosperm species (Fig. 6) showed that 23 genes had values indicating negative selection (Ka/Ks < 1, below the red horizontal line, Fig. 6). Positive selection (Ka/Ks > 1) was estimated only in the comparison of the gene atp6 of C. quinoa and in ccmB of A. thaliana and N. tabacum (Fig. 6). No evidence for neutral selection was found.
This study pioneered the analysis of the complete mitochondrial genome of cactus species, and we consider that these results will open new perspectives for the phylogenetic analysis of these plants. Unfortunately, due to the lack of data, we were only able to compare our findings to other land plants that are not phylogenetically closely related; however, the comparisons focused on Caryophyllids (Amaranthaceae, Aizoaceae, Caryophyllaceae, Nepenthaceae, Nyctaginaceae, and Polygonaceae) showed similar gene content despite strong differences in size and structural arrangement. Our findings showed that M. huitzilopochtli possesses the third largest mitochondrial genome (2.05 Mbp), behind the other two Caryophyllids S. conica (11.3 Mbp) and S. noctiflora (7.1 Mbp). Our comparisons among 21 species suggest that total genome size does not determine (1) structural complexity (i.e., arrangement in multiple chromosomes), (2) GC content, (3) total number of genes, or (4) gene identity.
We found that the variation in the total size of mtDNA among the 21 species studied was caused by the expansion and contraction of non-coding sequences, primarily by the lengthening of IGS and secondarily by introns. Thus, the total size of mtDNA expands or contracts through changes in the non-coding sequences rather than through the gain or loss of coding genes. In addition, we found that the lengthening of IGS was associated with the abundance of repeated sequences of different types, such as microsatellites, as well as direct and inverted repeats. The abundance of repeats in the IGS of land plant mtDNA is a typically observed feature [19, 36, 37], and some studies [16, 37] have suggested that IGS may receive more DNA sequences from foreign genomes. Currently, the functional role of these repeats in mtDNA has not been clearly elucidated, but it has been postulated that they may participate in the replication of complete mtDNA and in repeat-mediated recombination [38, 39]; in fact, this latter process has been proposed to play an important role in the structural rearrangements of mtDNA [7, 39, 40].
Our results indicated that the mitochondrial genome of land plants tends to maintain a stable gene composition (i.e., number and types of genes), irrespective of the overall size, structural organization, and complexity in which a specific genome is arranged. We found that the three ribosomal units and the set of 24 PCGs tend to be maintained, suggesting a potential key role for these genes in plants. The results suggest that phylogeny, rather than the structural features of the mtDNA, influences the number and identity of genes. A conspicuous result was that the gymnosperm C. taitungensis had the highest number of distinct genes, which is consistent with previous findings in two other gymnosperms, the conifers Larix sibirica (77 genes) and Picea sitchensis (71), whereas the 20 studied angiosperms have a lower total gene number (56, this study). In these angiosperms, this drop in the number of genes was caused by the loss of different types of PCGs and tRNAs. However, we cannot confirm whether these missing genes are in the nuclear genome; we can only state that they are not in the plastidic genome (e.g., MW894644 and MK867773). Since the set of core genes was documented in most of the 20 angiosperms, we consider that basal common evolutionary steps constrained the current gene composition in the mtDNA of flowering plants; however, this needs further verification when more complete mitochondrial genomes are available. On the other hand, the results showed that natural selection restricts mutations in the coding genes of M. huitzilopochtli, as indicated by the Ka/Ks values < 1 (negative selection). Consequently, coding sequences are highly conserved in this cactus, as has been recognized for most angiosperm species (e.g., [2, 12]).
The migration of DNA sequences of plastidic origin (complete coding genes, fragmented gene sequences and IGS) into the mtDNA of the cactus M. huitzilopochtli has also been documented in other species [19, 37, 43]. However, the migration of complete coding genes from the chloroplast to mtDNA is not common in either angiosperms or gymnosperms. Currently, it has not been established whether these copies of plastidic origin are functional in mtDNA [17, 44]. The migration of tRNAs from chloroplasts to mitochondria is also common in land plants; in the case of M. huitzilopochtli, four plastidic tRNAs were documented, and a functional role in protein synthesis has been proposed for these genes. On the other hand, migration from the nuclear genome to mtDNA has not been extensively researched in plants, although it may occur; as was mentioned for Cucumis melo (Cucurbitaceae), nearly 46.47% of its mtDNA is of nuclear origin. In our study, we did not evaluate sequences of nuclear origin because a complete nuclear genome for M. huitzilopochtli has not yet been published.
It should be noted that, due to the scarcity of complete mtDNA data available, the primary goal of this study was not to establish the phylogenetic relationships of M. huitzilopochtli with other Caryophyllids; however, the phylogenetic tree obtained revealed a topology concordant with that from previous studies based on plastidic loci. Accordingly, the seven families of Caryophyllales studied here were organized according to the previously published phylogeny of 40 families belonging to this order, which was derived from 83 plastidic loci. In addition, the 16 Caryophyllid species examined in this study were grouped into a monophyletic ingroup. These phylogenetic results indicate that mtDNA harbors an evolutionary history, and in particular that the 29 mitochondrial loci used in this study have sufficient resolution to distinguish the families of the order Caryophyllales. We expect that in the future, as more complete mitochondrial genomes are published, the value of mtDNA for phylogenetic analysis will be reassessed. For instance, the recent study conducted by Rydin et al. analyzed 53 species of Rubiaceae (Gentianales) based on mitochondrial and chloroplast genomes. The resulting trees showed phylogenetic discordances, suggesting that future phylogenetic studies should aim to include loci from the mitochondrial, nuclear, and plastid genomes in order to study plant evolution in detail.
This newly assembled and annotated complete mitochondrial genome of the cactus M. huitzilopochtli provides insights that will allow further comparisons with other plants, including Cactaceae. We expect that our study will contribute to elucidating biological, phylogenetic, taxonomic, and systematic issues that have not been fully resolved in Cactaceae. For angiosperms as a whole, we consider that we are currently far from understanding the processes that drive the structural organization of mtDNA. Mutations in coding genes are constrained by natural selection, which tolerates synonymous substitutions in DNA sequences because they do not affect the amino acid chains. Lastly, we encourage the sequencing of complete mitochondrial genomes in order to unravel the evolutionary puzzle of plants.
Genomic DNA extraction and massive sequencing
Tissue samples of Mammillaria huitzilopochtli D.R. Hunt were collected in 2016 from a wild population near the municipality of San Juan Bautista Cuicatlán, Oaxaca. These tissue samples were immediately stored in liquid nitrogen until experimental processing in the laboratory, where tissue samples are maintained at -80 °C for long-term genetic research.
Frozen tissue samples of 70–100 mg from a single individual of Mammillaria huitzilopochtli were independently processed according to the manufacturer’s instructions for the DNeasy Plant Mini Kit (Qiagen, Germany) in order to obtain one microgram of high-molecular-weight gDNA with a 260/280 ratio ≥ 1.7. This total gDNA was sent to the sequencing service provider, who prepared PE libraries with an average insert size of ~600 bp and sequenced them in 2 × 150 cycles on TruSeq Nano DNA 350 (Illumina, USA).
Mitochondrial genome assembly and annotation
The quality of the raw reads was assessed using FastQC v0.11.9. Since 91.66% of the reads had Qphred ≥ 30 and no attached adapters were identified, the reads were not filtered. This whole set of reads contained sequences from three genomes (nuclear, plastid, and mitochondrial); thus, we proceeded to extract only the reads of mitochondrial origin. For this, the reads of plastidic origin were mapped with BWA 0.7.17, using as a reference the cpDNA published for M. huitzilopochtli, and the plastidic reads were then discarded using SAMtools 1.15. The remaining reads were assembled de novo with NOVOPlasty 4.3. The resulting assembly produced several large supercontigs (~10–290 kbp) that did not form a single continuous sequence. In these large supercontigs, the plant mitochondrial origin of the reads was confirmed using BLASTN. All verified mitochondrial reads were extracted directly from the raw data and newly assembled using the Unicycler v0.4.9 pipeline, which employs SPAdes 3.15 as the assembler. This assembler was able to recover several independent and large supercontigs of approximately 300 kbp, which were visualized in the program Bandage v0.8.1. Since only a few short gaps were identified in these large supercontigs, the original raw data were used to fill in the gaps. The program Bandage identified the pairs of supercontigs that shared flanking ends; thus, we used BBDuk to search the raw data for the reads that joined each pair of flanking sequences. Successive searches with Bandage enabled us to merge all supercontigs, resulting in a single continuous linear sequence. We found that most of the original mtDNA reads mapped to this single linear sequence; thus, we checked the uniformity of coverage with the program Integrative Genomics Viewer (IGV), which showed an average depth of coverage of 1,318X. Once the genome was completely assembled, it was fully annotated with Mitofy, and all identified genes were manually curated using BLASTN. The assembled, annotated, and manually curated complete mitochondrial genome of M. huitzilopochtli was plotted using OGDRAW. This newly assembled and curated genome was characterized in terms of total size, number of chromosomes, and gene composition based on three types of genes: protein-coding genes (PCGs), which were classified according to their functional role, tRNAs, and rRNAs. For each protein-coding gene, we identified its length, its start and stop codons, and the length of the translated amino acid chain. In addition, the abundant and diverse types of repeats were characterized: we identified microsatellite-type repeats (i.e., DNA sequences repeated in tandem) using MISA-web, as well as direct and inverted repeats of at least 20 bp with REPuter. Lastly, we searched for DNA sequences of plastid origin by comparing the mtDNA with the previously reported cpDNA accessed at NCBI (MN517612). This comparison was performed using BLASTN with the following parameters: matching rate ≥ 70%, E-value ≤ 1e-10, and length ≥ 40.
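The filtering criteria applied to the BLASTN comparison can be sketched as a small post-processing step. The Python example below is a hedged illustration rather than the authors' actual procedure: it assumes the hits were exported in the standard 12-column BLAST tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), it interprets "matching rate" as percent identity, and the function name and example rows are hypothetical.

```python
import csv
import io


def filter_hits(tabular_text, min_identity=70.0, max_evalue=1e-10, min_length=40):
    """Keep BLAST tabular hits that pass identity, E-value and length cutoffs."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        pident = float(row[2])   # column 3: percent identity
        length = int(row[3])     # column 4: alignment length
        evalue = float(row[10])  # column 11: E-value
        if pident >= min_identity and evalue <= max_evalue and length >= min_length:
            kept.append(
                {
                    "query": row[0],
                    "subject": row[1],
                    "pident": pident,
                    "length": length,
                    "evalue": evalue,
                }
            )
    return kept


# Hypothetical example rows (not real results).
EXAMPLE = "\n".join(
    [
        "mt\tcp\t98.5\t120\t2\t0\t1000\t1119\t500\t619\t1e-50\t210",   # passes
        "mt\tcp\t65.0\t200\t70\t0\t5000\t5199\t900\t1099\t1e-20\t90",  # low identity
        "mt\tcp\t99.0\t30\t0\t0\t7000\t7029\t100\t129\t1e-12\t50",     # too short
        "mt\tcp\t80.0\t60\t12\t0\t9000\t9059\t300\t359\t1e-5\t40",     # weak E-value
    ]
)
for hit in filter_hits(EXAMPLE):
    print(hit)
```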
Comparison of the mitochondrial genome of Mammillaria huitzilopochtli to other land plant species
The comparisons were carried out in detail with the other 15 Caryophyllids as well as four other angiosperms (Arabidopsis thaliana, Cucurbita pepo, Nicotiana tabacum, and Zea mays). The gymnosperm Cycas taitungensis was used as the outgroup in the phylogenetic analysis (the species evaluated are listed in Online Resource 1). The phylogenetic tree was obtained for these 21 species, and it was based on 29 orthologous loci (26,849 bp), which were identified using OrthoFinder 2.5.4. These loci comprised both coding and non-coding sequences, including IGS. Their DNA sequences were concatenated and aligned with MAFFT 7.471. The best substitution model identified by ModelFinder was IVM, and the tree was obtained by Maximum Likelihood analysis with 1000 bootstrap replicates in IQ-TREE 1.6.12. We used this phylogenetic tree to organize the order of taxa in the comparisons made. We compared the percentage of GC content, total size, number, and identity of genes among the 21 species. We described in detail the variation in the set of genes recognized as core genes, which includes PCGs (e.g., [2, 13]) and rRNAs. We tested the statistical correlation between GC content and the total length of the 21 genomes analyzed with the Pearson correlation, following the procedure described by Sokal and Rohlf. In order to evaluate the relevance of natural selection on 25 PCGs of M. huitzilopochtli, we estimated the rates of synonymous (Ks) and nonsynonymous (Ka) substitutions with the other six angiosperm species (A. thaliana, Bougainvillea spectabilis, Chenopodium quinoa, N. tabacum, and Z. mays). These 25 PCGs were extracted from the respective complete mtDNA of each of these seven species and then aligned using MAFFT 7.471. The Ka/Ks ratio was estimated with codeml, which was executed online on the PAL2NAL website.
Accordingly, the effect of natural selection was classified as negative selection if Ka/Ks < 1, positive selection if Ka/Ks > 1, and neutral selection if Ka/Ks = 1.
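The classification rule above can be written as a small helper. This Python sketch only applies the thresholds to Ka and Ks values that have already been estimated (for example by codeml); it does not estimate substitution rates itself. The function name, the optional tolerance argument, and the example values are hypothetical.

```python
def classify_selection(ka, ks, tol=0.0):
    """Return (Ka/Ks, label) using the rule: <1 negative, >1 positive, =1 neutral.

    `tol` is an optional, hypothetical tolerance around 1 for calling neutrality;
    the default of 0.0 applies the rule exactly as stated in the text.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    omega = ka / ks
    if abs(omega - 1.0) <= tol:
        label = "neutral"
    elif omega < 1.0:
        label = "negative"
    else:
        label = "positive"
    return omega, label


# Hypothetical Ka and Ks values for illustration only.
for gene, ka, ks in [("geneA", 0.02, 0.10), ("geneB", 0.30, 0.20), ("geneC", 0.10, 0.10)]:
    omega, label = classify_selection(ka, ks)
    print(gene, round(omega, 2), label)
```

In practice an estimated ratio is rarely exactly 1, so a small tolerance or a formal likelihood-ratio test would normally be needed before calling a gene neutral.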
A list of the species studied and their accession IDs is provided in Table S1. The genome generated and analyzed in the current study is provided as Additional file 2. The accession number in GenBank is OP081771.
- M. huitzilopochtli: Mammillaria huitzilopochtli
- PCG: Protein-coding gene
Roger AJ, Muñoz-Gómez SA, Kamikawa R. The origin and diversification of mitochondria. Curr Biol. 2017;27:R1177–92.
Kozik A, Rowan BA, Lavelle D, Berke L, Schranz ME, Michelmore RW, et al. The alternative reality of plant mitochondrial DNA: one ring does not rule them all. PLoS Genet. 2019;15:e1008373.
Jacoby RP, Li L, Huang S, Lee CP, Millar AH, Taylor NL. Mitochondrial composition, function and stress response in plants. J Integr Plant Biol. 2012;54:887–906.
Liberatore KL, Dukowic-Schulze S, Miller ME, Chen C, Kianian SF. The role of mitochondria in plant development and stress tolerance. Free Radic Biol Med. 2016;100:238–56.
Van Aken O, Van Breusegem F. Licensed to kill: mitochondria, chloroplasts, and cell death. Trends Plant Sci. 2015;20:754–66.
Kim Y-J, Zhang D. Molecular control of male fertility for crop hybrid breeding. Trends Plant Sci. 2018;23:53–65.
Mahapatra K, Banerjee S, De S, Mitra M, Roy P, Roy S. An insight into the mechanism of plant organelle genome maintenance and implications of organelle genome in crop improvement: an update. Front Cell Dev Biol. 2021;9:671698.
Chevigny N, Schatz-Daas D, Lotfi F, Gualberto JM. DNA repair and the stability of the plant mitochondrial genome. Int J Mol Sci. 2020;21:328.
Skippington E, Barkman TJ, Rice DW, Palmer JD. Miniaturized mitogenome of the parasitic plant Viscum scurruloideum is extremely divergent and dynamic and has lost all nad genes. Proc Natl Acad Sci USA. 2015;112:E3515–24.
Sloan DB, Alverson AJ, Chuckalovcak JP, Wu M, McCauley DE, Palmer JD, et al. Rapid evolution of enormous, multichromosomal genomes in flowering plant mitochondria with exceptionally high mutation rates. PLoS Biol. 2012;10.
Li R, Wei X, Wang Y, Zhang Y. The complete mitochondrial genome of a mangrove associated plant: Sesuvium portulacastrum and its phylogenetic implications. Mitochondrial DNA B Resour 5:3112–3.
Adams KL, Palmer JD. Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol. 2003;29:380–95.
Adams KL, Qiu Y-L, Stoutemyer M, Palmer JD. Punctuated evolution of mitochondrial gene content: high and variable rates of mitochondrial gene loss and transfer to the nucleus during angiosperm evolution. Proc Natl Acad Sci USA. 2002;99:9905–12.
Rao RSP, Salvato F, Thal B, Eubel H, Thelen JJ, Møller IM. The proteome of higher plant mitochondria. Mitochondrion. 2017;33:22–37.
Zhao N, Wang Y, Hua J. The roles of mitochondrion in intergenomic gene transfer in plants: a source and a pool. Int J Mol Sci. 2018;19:547.
Rodríguez-Moreno L, González VM, Benjak A, Martí MC, Puigdomènech P, Aranda MA, et al. Determination of the melon chloroplast and mitochondrial genome sequences reveals that the largest reported mitochondrial genome in plants contains a significant amount of DNA having a nuclear origin. BMC Genomics. 2011;12:424.
Alverson AJ, Wei X, Rice DW, Stern DB, Barry K, Palmer JD. Insights into the evolution of mitochondrial genome size from complete sequences of Citrullus lanatus and Cucurbita pepo (Cucurbitaceae). Mol Biol Evol. 2010;27:1436–48.
Goremykin VV, Salamini F, Velasco R, Viola R. Mitochondrial DNA of Vitis vinifera and the issue of rampant horizontal gene transfer. Mol Biol Evol. 2008;26:99–110.
Alverson AJ, Rice DW, Dickinson S, Barry K, Palmer JD. Origins and recombination of the bacterial-sized multichromosomal mitochondrial genome of cucumber. Plant Cell. 2011;23:2499–513.
Rice DW, Alverson AJ, Richardson AO, Young GJ, Sanchez-Puerta MV, Munzinger J, et al. Horizontal transfer of entire genomes via mitochondrial fusion in the angiosperm Amborella. Science. 2013;342:1468–73.
Bergthorsson U, Richardson AO, Young GJ, Goertzen LR, Palmer JD. Massive horizontal transfer of mitochondrial genes from diverse land plant donors to the basal angiosperm Amborella. Proc Natl Acad Sci USA. 2004;101:17747–52.
Sanchez-Puerta MV, Edera A, Gandini CL, Williams AV, Howell KA, Nevill PG, et al. Genome-scale transfer of mitochondrial DNA from legume hosts to the holoparasite Lophophytum mirabile (Balanophoraceae). Mol Phylogenet Evol. 2019;132:243–50.
Cupp JD, Nielsen BL. Minireview: DNA replication in plant mitochondria. Mitochondrion. 2014;19:231–7.
Drouin G, Daoud H, Xia J. Relative rates of synonymous substitutions in the mitochondrial, chloroplast and nuclear genomes of seed plants. Mol Phylogenet Evol. 2008;49:827–31.
Wolfe KH, Li WH, Sharp PM. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc Natl Acad Sci USA. 1987;84:9054–8.
Rydin C, Wikström N, Bremer B. Conflicting results from mitochondrial genomic data challenge current views of Rubiaceae phylogeny. Am J Bot. 2017;104:1522–32.
Moore MJ, Soltis PS, Bell CD, Burleigh JG, Soltis DE. Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots. Proc Natl Acad Sci USA. 2010;107:4623–8.
Gitzendanner MA, Soltis PS, Wong GK-S, Ruhfel BR, Soltis DE. Plastid phylogenomic analysis of green plants: a billion years of evolutionary history. Am J Bot. 2018;105:291–301.
Hunt DR. The new cactus lexicon, text and atlas. Milborne Port, UK: DH Books; 2006.
Hernández-Hernández T, Hernández HM, De-Nova JA, Puente R, Eguiarte LE, Magallón S. Phylogenetic relationships and evolution of growth form in Cactaceae (Caryophyllales, Eudicotyledoneae). Am J Bot. 2011;98:44–61.
Butterworth CA, Wallace RS. Phylogenetic studies of Mammillaria (Cactaceae): insights from chloroplast sequence variation and hypothesis testing using the parametric bootstrap. Am J Bot. 2004;91:1086–98.
Harpke D, Peterson A, Hoffmann M, Röser M. Phylogenetic evaluation of chloroplast trnL–trnF DNA sequence variation in the genus Mammillaria (Cactaceae). Schlechtendalia. 2006;14:7–16.
Solórzano S, Chincoya DA, Sanchez-Flores A, Estrada K, Díaz-Velásquez CE, González-Rodríguez A, et al. De novo assembly discovered novel structures in genome of plastids and revealed divergent inverted repeats in Mammillaria (Cactaceae, Caryophyllales). Plants. 2019;8:392.
Chincoya DA, Sanchez-Flores A, Estrada K, Díaz-Velásquez CE, González-Rodríguez A, Vaca-Paniagua F, et al. Identification of high molecular variation loci in complete chloroplast genomes of Mammillaria (Cactaceae, Caryophyllales). Genes. 2020;11:830.
Wu Z, Cuthbert JM, Taylor DR, Sloan DB. The massive mitochondrial genome of the angiosperm Silene noctiflora is evolving by gain or loss of entire chromosomes. Proc Natl Acad Sci USA. 2015;112:10185–91.
Cheng Y, He X, Priyadarshani SVGN, Wang Y, Ye L, Shi C, et al. Assembly and comparative analysis of the complete mitochondrial genome of Suaeda glauca. BMC Genomics. 2021;22:167.
Cui H, Ding Z, Zhu Q, Wu Y, Qiu B, Gao P. Comparative analysis of nuclear, chloroplast, and mitochondrial genomes of watermelon and melon provides evidence of gene transfer. Sci Rep. 2021;11:1595.
Stern DB, Palmer JD. Recombination sequences in plant mitochondrial genomes: diversity and homologies to known mitochondrial genes. Nucleic Acids Res. 1984;12:6141–57.
Cole LW, Guo W, Mower JP, Palmer JD. High and variable rates of repeat-mediated mitochondrial genome rearrangement in a genus of plants. Mol Biol Evol. 2018. https://doi.org/10.1093/molbev/msy176.
Gualberto JM, Mileshina D, Wallet C, Niazi AK, Weber-Lotfi F, Dietrich A. The plant mitochondrial genome: Dynamics and maintenance. Biochimie. 2014;100:107–20.
Putintseva YA, Bondar EI, Simonov EP, Sharov VV, Oreshkova NV, Kuzmin DA, et al. Siberian larch (Larix sibirica Ledeb.) mitochondrial genome assembled using both short and long nucleotide sequence reads is currently the largest known mitogenome. BMC Genomics. 2020;21:654.
Jackman SD, Coombe L, Warren RL, Kirk H, Trinh E, MacLeod T, et al. Complete mitochondrial genome of a gymnosperm, Sitka spruce (Picea sitchensis), indicates a complex physical structure. Genome Biol Evol. 2020;12:1174–9.
Warren JM, Salinas-Giegé T, Triant DA, Taylor DR, Drouard L, Sloan DB. Rapid shifts in mitochondrial tRNA import in a plant lineage with extensive mitochondrial tRNA gene loss. Mol Biol Evol. 2021;38:5735–51.
Wang D, Wu Y-W, Shih AC-C, Wu C-S, Wang Y-N, Chaw S-M. Transfer of chloroplast genomic DNA to mitochondrial genome occurred at least 300 MYA. Mol Biol Evol. 2007;24:2040–8.
Chaw S-M, Shih AC-C, Wang D, Wu Y-W, Liu S-M, Chou T-Y. The mitochondrial genome of the gymnosperm Cycas taitungensis contains a novel family of short interspersed elements, Bpu sequences, and abundant RNA editing sites. Mol Biol Evol. 2008;25:603–15.
Brockington SF, Alexandre R, Ramdial J, Moore MJ, Crawley S, Dhingra A, et al. Phylogeny of the Caryophyllales sensu lato: revisiting hypotheses on pollination biology and perianth differentiation in the core Caryophyllales. Int J Plant Sci. 2009;170:627–43.
Yao G, Jin J-J, Li H-T, Yang J-B, Mandala VS, Croley M, et al. Plastid phylogenomic insights into the evolution of Caryophyllales. Mol Phylogenet Evol. 2019;134:74–86.
Andrews S. FastQC: a quality control tool for high throughput sequence data. 2010. http://www.bioinformatics.babraham.ac.uk/projects/fastqc/. Accessed 4 Feb 2020.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.
Dierckxsens N, Mardulyn P, Smits G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2017;45:e18.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
Wick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017;13:e1005595.
Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics. 2020;70.
Wick RR, Schultz MB, Zobel J, Holt KE. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics. 2015;31:3350–2.
Bushnell B, Rood J, Singer E. BBMerge – accurate paired shotgun read merging via overlap. PLoS ONE. 2017;12:e0185056.
Greiner S, Lehwark P, Bock R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Res. 2019;47:W59–64.
Beier S, Thiel T, Münch T, Scholz U, Mascher M. MISA-web: a web server for microsatellite prediction. Bioinformatics. 2017;33:2583–5.
Kurtz S. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001;29:4633–42.
Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238.
Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30:3059–66.
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14:587–9.
Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32:268–74.
Sokal RR, Rohlf FJ. Introduction to biostatistics. 2nd ed. Dover ed. Mineola, NY: Dover Publications; 2009.
Goldman N, Yang Z. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 1994. https://doi.org/10.1093/oxfordjournals.molbev.a040153.
Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34(Web Server issue):W609–12.
Hurst LD. The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet. 2002;18:486–7.
D. Cruz Plancarte (414082920) is a Master's student at the Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México (UNAM); he is supported by a grant from the Consejo Nacional de Ciencia y Tecnología, CONACyT (1086093), and this paper is a requirement for obtaining his MSc degree at the Posgrado en Ciencias Biológicas, UNAM. Macrogen Inc. (Seoul, South Korea) provided the whole-genome sequencing service for M. huitzilopochtli.
This work was supported by the Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica de la UNAM (PAPIIT-DGAPA IN228619).
The authors declare no competing interests.
Ethics approval and consent to participate
The cactus species analyzed in this study is included in the Mexican Red List of Species (NOM-059-SEMARNAT-2010). Sampling was authorized to S.S. under collecting permit number SGPA/DGVS/06880/16, in accordance with the national regulations established for protected species sampled for research purposes. Dr. Salvador Arias, a specialist in cactus taxonomy, confirmed the taxonomic identity of the specimen.
Consent for publication
Cite this article
Plancarte, D.C., Solórzano, S. Structural and gene composition variation of the complete mitochondrial genome of Mammillaria huitzilopochtli (Cactaceae, Caryophyllales), revealed by de novo assembly. BMC Genomics 24, 509 (2023). https://doi.org/10.1186/s12864-023-09607-8
- Cactaceae, Mammillaria huitzilopochtli
- Mitochondrial genome