The complete mitochondrial genome of okra (Abelmoschus esculentus): using nanopore long reads to investigate gene transfer from chloroplast genomes and rearrangements of mitochondrial DNA molecules

Background: Okra (Abelmoschus esculentus L. Moench) is an economically important crop and is known for its slimy juice, which has significant scientific research value. The A. esculentus chloroplast genome has been reported; however, the sequence of its mitochondrial genome is still lacking.
Results: We sequenced the plastid and mitochondrial genomes of okra based on Illumina short reads and Nanopore long reads and conducted a comparative study between the two organelle genomes. The plastid genome of okra is highly structurally conserved, but the mitochondrial genome of okra has been confirmed to have abundant subgenomic configurations. The assembly results showed that okra’s mitochondrial genome existed mainly in the form of two independent molecules, which could be divided into four independent molecules through two pairs of long repeats. In addition, we found that four pairs of short repeats could mediate the integration of the two independent molecules into one complete molecule at a low frequency. Subsequently, we also found extensive sequence transfer between the two organelles of okra, where three plastid-derived genes (psaA, rps7 and psbJ) remained intact in the mitochondrial genome. Furthermore, psbJ, psbF, psbE and psbL were integrated into the mitochondrial genome as a conserved gene cluster and underwent pseudogenization as nonfunctional genes. Only psbJ retained a relatively complete sequence, but its expression was not detected in the transcriptome data, and we speculate that it is still nonfunctional. Finally, we characterized the RNA editing events of protein-coding genes located in the organelle genomes of okra.
Conclusions: In the current study, our results not only provide high-quality organelle genomes for okra but also advance our understanding of the gene dialogue between organelle genomes and provide information to breed okra cultivars efficiently.
Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08706-2.

and a medicinal source, okra has attracted much attention due to its high nutritional value and health benefits for human beings [2]. Its industrial applications mainly focus on the polysaccharides isolated from immature okra pods, which have been successfully used as emulsifiers, drug binders, edible coatings, and food packaging ingredients. Moreover, okra's potent pharmacological effects have been verified in clinical studies, including its antidiabetic, antiobesity, and anticancer activities [3,4]. However, low production limits the development of the okra industries. For a long time, few okra cultivars have been bred, which has contributed to yield stagnation [5]. Developing modern cultivars with significant heterosis based on cytoplasmic male sterility associated with various chimeric open reading frames in the plant mitochondrial genome (mtDNA) is common among crops. Unfortunately, no mitochondrial genome of okra has been reported thus far, which severely restricts follow-up research.
It is generally accepted that plant organelle genomes are derived from endosymbiotic bacteria [6,7]. They have a genetic system independent of the nuclear genome, and over long-term evolution they have established a stable regulatory mechanism with the nuclear genome. Plastid genomes (cpDNA) are usually structurally conserved; they are stable, double-stranded, circular genomes that contain the core genes for photosynthesis. The combination of its rapid evolution rate and conserved genome structure makes the plastid genome a good material for the phylogenomic study of plants [8][9][10]. cpDNA is widely used in studies of the origin of species, plant diversity and cytoplasmic evolution. In recent years, numerous plastid genomes, including that of okra, have been assembled based on Illumina short reads [11].
However, plant mtDNA is much larger than that of other eukaryotes and it varies in size even among related species. Although mtDNA is normally depicted as a circular molecule, different structures of mtDNA molecules have also been found, including linear conformations, branched structures, and numerous smaller circular molecules [12,13]. Thus, it is difficult for us to recover the conformation of plant mtDNA due to its redundant sequences and extensive genomic recombination [14,15]. It has also been reported that plant mtDNA may simultaneously exist in different genome configurations, which is puzzling. Moreover, there is widespread gene transfer between organelle genomes and between organelle and nuclear genomes. For example, the mtDNA had multiple losses of ribosomal and succinate dehydrogenase genes, caused by these genes being transferred to the host cell and becoming part of the nuclear genome during plant evolution [16,17]. Some chloroplast genes were also transferred to the nuclear genome during evolution, which is similar to the mitochondrial gene process [18,19].
The current study sequenced and assembled the complete mitochondrial genome of okra. Based on Illumina short reads and Nanopore long reads, we deciphered the structure of okra mtDNA, whose structure is variable. These results will contribute to understanding the organelle genome evolution of okra, especially for the dialogue between the two organelle genomes, and provide information to breed okra cultivars efficiently.

Characteristics of the mitochondrial genomes of A. esculentus
Initially, the Illumina-based assembly yielded a complex assembly graph with 12 pairs of short repeats (SRs) and 3 pairs of long repeats (LRs) that displayed multiple paths (Fig. S1A-D). We resolved these repeats by manually simulating four possible paths and judging them by the mapping results of long reads. As shown in Fig. S2, the structures we recovered here were supported by most long reads, and a total of 12 contigs were obtained by merging redundant nodes (Table 1). We numbered them according to their length. As shown in Fig. 1, we obtained two independent mtDNA molecules of okra. One had a complex multibranched conformation but was still a closed-loop structure (Fig. 1, above). The other was a typical circular molecule containing a pair of long forward repeats (LR11) (Fig. 1, below). We tried to describe molecule 1 of the mtDNA (mtDNA m1) with a single reasonable path, but it could not be reduced to a closed-loop molecule without branches.
For the convenience of description, we processed mtDNA m1 into a linear molecule in the order contig10 - LR12 - contig8 - LR9 - contig2 - LR9 - contig6 - contig4 - LR12 - contig7 - contig3 and processed mtDNA m2 into a circular molecule in the order contig1 - LR11 - contig5 - LR11 - contig1. We emphasize that this representation is not the only possible one, because the mitochondrial DNA configuration of plants is in dynamic transformation mediated by repeats; it was selected because it was convenient for subsequent analysis. We mapped the short reads and long reads to the two mtDNA molecules. The average depth was 351× for mtDNA m1 and 356× for mtDNA m2 with short reads, and 402× for mtDNA m1 and 405× for mtDNA m2 with long reads (Fig. S3). The depth statistics showed no gaps in coverage, indicating that our assembly was gap-free and of high quality.
Surprisingly, we also annotated many plastid genes in the mtDNA, but most of them were just fragments, such as ndhB, psbC, psbE, psbF, psbL, ycf2, psaB, psbM, rps12, and rpl14. However, we observed three intact plastid genes, psbJ, psaA, and rps7. This result suggested that there has been considerable sequence migration between okra cpDNA and mtDNA, accompanied by gene transfer, which will be discussed in detail below. Figure 2 shows the mtDNA genome map.

Homologous recombination mediated by repeats
We excluded false-positive repeat sequences based on Nanopore long reads (SR11, Fig. S2) and finally identified 14 pairs of repeats involved in mediating genome recombination (Fig. S2, Table 3), including the three pairs of long repeats described earlier. The remaining repeats were all short repeats, the longest being 322 bp. Their positions are shown in Fig. 3.
In okra, the three pairs of long repeats mediated recombination at high frequency. The proportions of the two isomers mediated by each pair were 48% vs. 52% (LR9), 69.49% vs. 30.51% (LR11), and 60% vs. 40% (LR12). Figure 1 shows the possible conformations mediated by the three long repeats. Both LR9 and LR11 served as mediators for further separation of the two independent molecules. In this case, four independent molecules could exist at the same time. For LR9, the alternative configuration was supported by slightly more reads than the main configuration: 13 long reads covered the LR9 repeats and supported contig2 forming an independent molecule with LR9, while 12 long reads supported it as part of mtDNA m1. However, because these repeats are so long, few long reads spanned them completely, and the read counts available for reference were statistically limited. For the long repeats, the true ratio was probably closer to equal. In contrast, for the remaining short repeats, the major conformation was clearly dominant in the mitochondria. The alternative conformations generated by the short repeats accounted for less than or close to 2% of reads, except for SR1 and SR3, for which they accounted for nearly 8% (Table 3). Because these repeats are shorter, we were able to map more long reads across them and obtained ratios closer to the actual situation.
Notably, the two repeated units of four pairs of short repeats (SR2, SR4, SR7, and SR8) were found to be located on the two molecules. They were able to participate in the recombination of the two molecules at a low frequency, giving them a chance to merge into one complete molecule.
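The isomer proportions reported in this section follow directly from counts of long reads that span each repeat. The minimal sketch below reproduces the LR9 figures from the 12 and 13 reads given in the text; the function names are hypothetical, and the normal-approximation standard error is added here only to illustrate why so few reads make the ratio uncertain. It is not part of the authors' analysis.

```python
import math


def isomer_proportions(major_reads: int, alternative_reads: int) -> tuple[float, float]:
    """Return (major %, alternative %) from counts of repeat-spanning long reads."""
    total = major_reads + alternative_reads
    if total == 0:
        raise ValueError("no spanning reads for this repeat")
    return 100 * major_reads / total, 100 * alternative_reads / total


def binomial_standard_error(successes: int, total: int) -> float:
    """Normal-approximation standard error of a proportion (illustrative assumption)."""
    p = successes / total
    return math.sqrt(p * (1 - p) / total)


# LR9 counts from the text: 12 reads support contig2 as part of mtDNA m1,
# 13 reads support contig2 forming an independent molecule with LR9.
major_pct, alt_pct = isomer_proportions(12, 13)
print(f"LR9: {major_pct:.0f}% vs. {alt_pct:.0f}%")

# With only 25 spanning reads, the uncertainty on the 52% estimate is large.
se = binomial_standard_error(13, 25)
print(f"standard error: {100 * se:.1f} percentage points")
```

With 25 reads the standard error is roughly 10 percentage points, so a 48% vs. 52% split is compatible with equal proportions.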

Intracellular gene transfer (IGT) of A. esculentus organelle genomes
The assembly and annotation of cpDNA revealed that the cpDNA obtained here was almost identical to that previously reported. Therefore, the cpDNA was extremely conserved for okra. In the previous annotation of the organelle genome, we found the presence of gene residues from plastids in the mitochondrial genome, meaning that there was much sequence migration between the two organelles. Here, we searched for homologous sequences among the two organelle genomes based on the BLASTn program to identify potential gene transfer events.
A total of 28 homologous sequences were identified (Fig. 4A and Table 4), among which 6 were over 1000 bp in length, and the longest was 5142 bp. The total length of these homologous sequences was 21,231 bp, including 13,340 bp in the repeat region of cpDNA and 311 bp in the repeat region of mtDNA. Therefore, a total of 34,571 bp were homologous with cpDNA, accounting for 21.19% of it, and a total of 21,542 bp were homologous with mtDNA, accounting for 4.07% of it. We then extracted and annotated these homologous sequences. Most of these fragments migrated from cpDNA to mtDNA, except that a few tRNA genes were highly similar in sequence and we could not determine the direction of migration. Thus, we called these mitochondrial plastid sequences (MTPTs). In addition to tRNAs and rRNAs, fragments homologous to plastid PCGs were identified on 8 MTPTs, including mtpt1 (ndhB-exon1; rps7; and rps12-exon2,3), mtpt2 (psaA; psaB), mtpt4 (rpl14), mtpt5 (psaB), mtpt6 (psaA), mtpt12 (ycf2), mtpt14 (psbJ; psbL; psbF; and psbE) and mtpt22 (psbC). We noted that three genes were still intact in the mtDNA sequences, including rps7, psaA, and psbJ. The first two genes were 100% similar in sequence.
Fig. 1 The assembly graph of the A. esculentus mitogenome. Each colored segment is labeled with its size and named contig/R 1-12 by rank of size. Only segments 9, 11 and 12 are inferred to be repeats. All segment adjacencies are supported by the long reads, indicating a complex branching genomic structure. The possible structures formed by high-frequency rearrangements mediated by the three long repeats are drawn
We noted that 7 of these MTPTs could not be distinguished from the homologous chloroplast sequences during assembly. Most of these fragments were highly similar to cpDNA sequences. For example, mtpt1, the longest homologous fragment, had only 5 mismatches to the corresponding cpDNA sequence (Table S1). With the help of long reads, it was confirmed that they migrated from chloroplasts and were integrated into the mtDNA (Fig. S4).
mtpt14 from cpDNA differs from its mtDNA sequence. On mtpt14, in addition to psbJ, we also found three gene fragments (psbL, psbF and psbE), which might have been transferred to the mitochondria together as a whole and showed varying degrees of pseudogenization during the evolution of the mitochondrial genome, but only the psbJ gene was relatively intact in sequence (Fig. S5). The results of phylogenetic analysis based on the mtpt14 homologous sequences showed that the mitochondrial sequences were clustered into a group (Fig. 4B). We looked closely at the sequence and found that some SNPs and Indels were shared only in mtDNA (Supplementary file 2). This indicated that this homologous sequence has undergone different evolutionary processes along with the two organellar genomes.

RNA editing sites in the PCGs of organelle genomes
RNA editing events are common in plant mitochondrial genomes [20]. These include single base substitutions and the addition of bases to complete the initiation or termination codon [20][21][22]. In this study, we focused on RNA editing events in the PCGs of okra organelle genomes. A total of 29 plastid PCGs (Fig. 5A) and 26 mitochondrial PCGs (Fig. 5B) were identified as having undergone RNA editing events. However, the total number of RNA editing events identified in plastid PCGs was only 85 (Table S2) compared with 281 in mitochondrial PCGs (Table S3; the raw data were uploaded to Figshare at https://doi.org/10.6084/m9.figshare.19608789). In plastid PCGs, rpoC2 had the most RNA editing sites, followed by ndhB and ycf2, with 16, 13 and 11 sites, respectively. In mitochondrial PCGs, rpl2 had the most RNA editing sites, with 76, followed by ndh4 and rps14, both with more than 30.
Furthermore, we identified a total of 12 different types of RNA editing, all of which were detected in mitochondrial PCGs. However, A to C and C to G editing types were not identified in plastid PCGs (Fig. 5C). Among them, C to U editing was the most common in both plastids and mitochondria (52 and 185 events, respectively). Most of the other types occurred fewer than 10 times. In terms of editing efficiency, most PCGs of plastids and mitochondria had an editing efficiency above 80% (Fig. 5D), and the number of low-frequency editing events was relatively low. A total of 46.62% (131) of editing events in mitochondria had an editing efficiency of more than 90%. However, it should be noted that the RNA editing sites identified here might be incomplete, because multiple mitochondrial PCGs had low gene expression, such as ccmB, ccmFN, mttB, nad4L, and nad9. These PCGs lacked adequate coverage, which might be due to their low expression levels or a small amount of sequencing data.

Discussion
Homologous recombination mediated by repeats is almost universal in plant mitochondrial genomes [23][24][25]. In addition to acting as a good mediator for genome recombination, these repeats also greatly increase the size of mtDNA [26,27]. In the assembly of the okra mitochondrial genome, we also found repeats with recombination activity. We confirmed that 14 of these repeats could mediate genome recombination based on long reads. However, it must be noted that some potential repeats involved in recombination have not been discovered. It was previously found in Nymphaea colorata [28] that the two units of repeats do not need to be 100% similar. Therefore, some sequences with low similarity might also mediate genome recombination.
It has been reported that the size of the repeats is closely related to the frequency of recombination [20]: recombination mediated by short repeats tends to be less frequent than that mediated by long repeats, whose two isomers occur in proportions closer to equal. For large repeats (e.g., typical inverted repeats observed in cpDNA), it was thought previously that they mediate SSC region recombination in equal proportions [29]. The long repeats we found in the okra mitochondrial genome also have a high frequency of recombination. The short repeats all had low recombination frequencies, which is consistent with previous reports [28,30].
The mtDNA m1 of okra has a branching structure. In terms of coverage, both contig4 and contig10 were single-copy, and each overlapped with LR12 at one end and contig6 at the other. However, the other end of contig6 overlapped only with LR9. Therefore, there were two different paths (contig6-contig10-LR12 and contig6-contig4-LR12). However, it was not a repeat region (Fig. 1). In our previous assembly based on Oxford Nanopore data, these two paths were also obtained. Another node in question was contig7, which overlapped both ends of contig3 on one side, but the other side only overlapped with LR12, thus creating an awkward structure. This result suggested that the mtDNA of okra most likely has a multibranched conformation or that there could be different mtDNA molecules in different copies of the mitochondrial genome, which explained why we could not assemble a circular molecule. Polymorphism in the conformation of plant mitochondrial genomes has long been puzzling. As a previous study on lettuce showed, plant mtDNA should be presented as multiple sequence units showing their variable and dynamic connection rather than as circles [12,13]. Our results also supported the view that mtDNA should be considered a dynamic genome. In okra's case, at least, this structure is a more complete description of an mtDNA.
Horizontal gene transfer (HGT) has been widely discussed, especially in parasitic plants. Adam [31] reported host-to-parasite horizontal gene transfer (hpHGT) events of several genes. These host-derived plastidial genes were found in the mitochondrial genome of the parasitic plant Aphyllon epigalium. However, in addition to hpHGT, intracellular gene transfers (IGTs) have also been widely reported and have been an interesting topic. Gene transfer between cpDNA, mtDNA and nuclear genomes had previously been identified. Many plastidial genes have been reported to be found in mitochondria. For example, the plastid-derived rpl32 gene has been transferred into the nucleus of the subfamily Thalictroideae [32]. The atpI gene in the Aeginetia indica mitogenome was acquired from another angiosperm's chloroplast genome [33], and IGT events of multiple ribosomal proteins were also found in Geranium [34]. Here, we found three complete genes in the mitogenome that migrated from the cpDNA of okra, including psaA, rps7 and psbJ, as well as several plastid-derived gene fragments. However, as previously reported, these genes transferred from plastids might not function in mitochondria, and they might undergo pseudogenization as the mitochondrial genome evolves [33,35]. In our study, a typical example was the psbJ gene, which has a total length of 123 bp, but the two copies we annotated in plastids and mitochondria had 12 mismatches, accounting for nearly 10% of the total length (Table S4). We mapped transcriptome data to these two psbJ genes; all mapped reads derived from transcripts of the plastid psbJ gene, and no transcriptional evidence was detected for the mitochondrial psbJ gene. Based on the phylogenetic analysis of mtpt14, we hypothesized that mtpt14 may be an ancient fragment of plastid migration, and this migration event was shared by many plant mitochondrial genomes. However, with the evolution of mitochondrial genomes, some plant lineages may have lost this gene cluster derived from plastids. Furthermore, considering the difference in the evolutionary rate between mtDNA and cpDNA, it is difficult to determine exactly when this sequence was transferred from plastids to mitochondria. More mtDNA sequencing should be performed in the future to address this question.

Conclusions
In this study, we completed the sequencing and assembly of okra organelle genomes and obtained high-quality organelle genomes. Although the chloroplast genome of okra has been previously published, we obtained the complete mitochondrial genome, which enabled us to make a comprehensive comparison between the organelle genomes of okra, thus providing a broader perspective for studying gene transfer between mitochondria and plastids. The use of a mixture of long reads and short reads made it possible to accurately assemble the plant mitochondrial genomes with limited homology. At the same time, the long reads also facilitated the structural analysis of these complex organelle genomes, which enabled us to describe the organelle genomes, especially the dynamic transformation of the plant mitochondrial genome, more intuitively than the previous limited description. Deciphering the organelle genome of okra can provide invaluable information for future investigations of the genome structure and mechanism of replication of Malvales organelle genomes.
Fig. 4 Schematic of homologous sequences identified between the two organelle genomes. A The blue arcs represent MTPTs with 100% similarity, the green arcs represent MTPTs with similarity between 90 and 100%, the red arcs represent MTPTs with similarity between 80 and 90%, and the orange arcs represent MTPTs with similarity less than 80%. B Phylogenetic tree based on the partial mtpt14 sequences identified in cpDNA and mtDNA. The purple branches represent sequences originating from mtDNA and the green branches represent sequences originating from cpDNA. The mtDNA mtpt14 and cpDNA mtpt14 were extracted from the okra organelle genomes. The other sequences were downloaded from NCBI; the accession number and position are shown in each label
Li et al. BMC Genomics (2022) 23:481

Materials and methods

Plant materials
The okra (A. esculentus) seeds were planted and germinated in small plastic pots and grown in a temperature incubator held at 25 °C with a 16-hr/8-hr light/dark cycle for 2 weeks. We collected well-grown young leaf tissue for DNA extraction. The remaining parts were preserved in the Herbarium of Southwest University, and the voucher number was SWU-QK01.

DNA extraction and sequencing
Total genomic DNA was extracted by using the CTAB method [36]. The same DNA sample was used for Illumina sequencing and Oxford Nanopore sequencing. For Illumina sequencing, the experimental procedures were carried out according to the standard protocol provided by Illumina: the DNA library with an insert size of 350 bp was constructed using the NEBNext® library building kit [37] and was sequenced by using the HiSeq Xten PE150 sequencing platform at BioMaker (Wuhan, China). Sequencing produced 15.62 Gb of clean data (52.29 Mb clean reads). Clean data were obtained by using Trimmomatic [38]. For Oxford Nanopore sequencing, gTube was used to break the genomic DNA into fragments of approximately 8 kb on average, and long-read sequencing followed the protocol in the SQK-LSK109 genomic sequencing kit (ONT, Oxford, UK).

Assembly and annotation of organelle genomes
First, we used GetOrganelle v1.7.5.1 [39] to complete the assembly of the plastid genome (cpDNA) by referring to the parameters recommended by the author. For the mitochondrial genome (mtDNA) assembly, the Oxford Nanopore long reads were assembled into contigs using Nextdenovo with default parameters. Mitochondrial contigs were identified in each draft assembly by the BLASTn program [40] using the mitochondrial genome sequences of Gossypium arboreum (accession number: NC_035073.1) as a reference. As a result, there were two self-loop and three linear contig candidates with abundant matched hits. We then assembled the long reads using Smartdenovo [41] with default parameters, obtaining three self-loop and three linear candidate contigs. During our assembly of the mitochondrial genomes, we found that several pairs of repeats might mediate genome recombination, since these repeats showed multiple connections in the SPAdes [42] assembly. This result puzzled us, and we thought there might be a complex configuration of the mtDNA that interfered with the assembly. However, given the large number of foreign DNA fragments inserted into the mtDNA of plants, these multiple connections might be false positives; they might not be real, just artificial structures. Subsequently, we performed a de novo assembly of Illumina short-read data using SPAdes and obtained a preliminary draft mtDNA, a complex multibranched and closed-loop conformation (Fig. S5A). We then manually simplified the graph using Bandage [43] software by removing the chloroplast- and nuclear-derived nodes (Fig. S5B). During this process, some chloroplast nodes were retained, as they might be mitochondrial plastid DNA (MTPT). Thereafter, previous long-read assembly results were used to eliminate the interference of the repeats to restore the true mtDNA structure as much as possible. Finally, with the help of long reads, we obtained two independent molecules, and they were the dominant configurations of okra mtDNA (Fig. S5C).
The cpDNA was annotated using CPGAVAS2 [44] with the reference of 2544 plastomes. The two molecules of mtDNA were annotated using GeSeq [45] with the reference mtDNA of G. arboreum (accession number: NC_035073.1). The protein-coding genes (PCGs) were manually checked using Apollo [46] and edited where problems were found. The genome map was drawn using OGDRAW [47]. All transfer RNA genes were confirmed by using tRNAscan-SE [48] with default settings.

Detection of genome recombination
In the mtDNA assembly described above, we found multiple repeats in the draft mitochondrial genome (LR9, LR11, LR12 and SR1-SR12). Although we obtained the mitochondrial genome using long-read data, our assembly might only represent the dominant configuration of okra mtDNA. Given the structural variability of mtDNA, these repeats may be involved in mediating genome recombination, resulting in nondominant configurations. We mapped long reads to these repeats to detect any evidence of genome recombination. Specifically, for each repeat, there were two paths representing the major conformation (m1 and m2) and two paths representing the secondary conformation (s1 and s2), and we mapped the long reads to all four paths. The flanking regions of each repeat were extended by an additional 1 kb to ensure that the mapped long reads completely spanned the repeat region, and only reads long enough to completely cover the repeat sequence were counted as supporting a configuration. For the two paths supporting the same conformation (m1 and m2, or s1 and s2), only the read count of the path with more supporting reads was used. For the nondominant configuration in particular, we carefully checked each long read using Tablet [49] to eliminate ambiguous reads.
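The counting rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: each path reference is assumed to be laid out as 1 kb flank + repeat + 1 kb flank, each read is assumed to have already been assigned to a single path by the mapper, and the `Alignment` class, the function names, and the `min_overhang` parameter are all hypothetical. The manual inspection in Tablet is not modeled.

```python
from dataclasses import dataclass

FLANK = 1000  # 1 kb of flanking sequence added to each side of the repeat (assumed layout)


@dataclass
class Alignment:
    path: str   # path the read mapped to: "m1", "m2", "s1" or "s2"
    start: int  # 0-based start of the alignment on the path reference
    end: int    # end of the alignment on the path reference (exclusive)


def spans_repeat(aln: Alignment, repeat_len: int, min_overhang: int = 0) -> bool:
    """True if the alignment covers the whole repeat on a flank+repeat+flank reference.

    min_overhang (hypothetical) additionally requires that many bases of
    flanking sequence on both sides of the repeat.
    """
    repeat_start = FLANK
    repeat_end = FLANK + repeat_len
    return aln.start <= repeat_start - min_overhang and aln.end >= repeat_end + min_overhang


def conformation_support(alignments: list[Alignment], repeat_len: int) -> tuple[int, int]:
    """Return (major, secondary) support: the larger count within each pair of paths."""
    counts = {"m1": 0, "m2": 0, "s1": 0, "s2": 0}
    for aln in alignments:
        if aln.path in counts and spans_repeat(aln, repeat_len):
            counts[aln.path] += 1
    major = max(counts["m1"], counts["m2"])
    secondary = max(counts["s1"], counts["s2"])
    return major, secondary


# Toy alignments for a 300 bp repeat (made-up coordinates, for illustration only).
toy = [
    Alignment("m1", 0, 2300),
    Alignment("m1", 500, 2500),
    Alignment("m2", 900, 2400),
    Alignment("s1", 1100, 2400),  # starts inside the repeat, so it is not counted
    Alignment("s2", 200, 1500),
]
major, secondary = conformation_support(toy, repeat_len=300)
print(major, secondary)
```

Setting `min_overhang` above zero would require each counted read to be anchored in unique flanking sequence, which is what actually distinguishes one conformation from another.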

Analysis of intracellular gene transfer (IGT)
Due to the lack of a published nuclear genome for okra, only the two organelle genomes could be used for the identification of intracellular sequence migration at present. To identify the homologous sequences that might be transferred between the organelles, we compared the cpDNA of okra with the mtDNA using the BLASTn program with the following parameters: -evalue 1e-5, -word_size 9, -gapopen 5, -gapextend 2, -reward 2, -penalty -3, and -dust no. The BLASTn results were visualized using TBtools [50]. The identified transferred DNA fragments were also extracted according to their genome position and then annotated using GeSeq. We noted that most of these homologous sequences in the mitochondrial and chloroplast genomes, known as MTPTs, were not 100% similar in sequence. The plastid-derived and mitochondria-derived sequences could therefore be distinguished in the k-mer-based assembly. However, 6 MTPTs were found during mitochondrial assembly (Fig. S5B) that could not be distinguished by k-mer-based assembly. We also used long reads to verify migration events for these MTPTs. When a long read supported an MTPT flanked by mtDNA, this indicated that the MTPT had been integrated into the mitochondrial genome.
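The BLASTn parameters listed above can be assembled into a command line as in the sketch below. The scoring and filtering flags are those given in the text; the choice of cpDNA as query and mtDNA as subject, the tabular output format (`-outfmt 6`), and all file names are assumptions made for illustration. The function only builds the argument list and does not run BLAST.

```python
import shlex


def blastn_command(query_fasta: str, subject_fasta: str, out_path: str) -> list[str]:
    """Build a blastn argument list using the parameters reported in the text."""
    return [
        "blastn",
        "-query", query_fasta,      # assumed: cpDNA as query
        "-subject", subject_fasta,  # assumed: mtDNA as subject
        "-evalue", "1e-5",
        "-word_size", "9",
        "-gapopen", "5",
        "-gapextend", "2",
        "-reward", "2",
        "-penalty", "-3",
        "-dust", "no",
        "-outfmt", "6",             # assumed: tabular output
        "-out", out_path,
    ]


# Hypothetical file names, for illustration only.
cmd = blastn_command("okra_cpDNA.fasta", "okra_mtDNA.fasta", "cp_vs_mt.tsv")
print(shlex.join(cmd))
# To execute, with BLAST+ installed: subprocess.run(cmd, check=True)
```

Passing the arguments as a list keeps the negative penalty value (`-3`) as a separate token, which avoids the `-penalty-3` ambiguity that can arise when the command is written as a single string.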

Identification of RNA editing sites
To identify RNA editing sites that occur at protein-coding genes (PCGs) in organelle genomes, we downloaded three sets of transcriptome data from NCBI (SRR15808319; SRR15808320; SRR15808321). In addition, to exclude the interference of natural variation, we also downloaded the WGS data (SRR5812498) to search for single nucleotide polymorphisms (SNPs) located in organelle PCGs. We mapped all of the downloaded data to protein-coding sequences extracted from the organelle genomes to identify RNA editing sites and SNPs. We calculated the base composition and coverage of each site of each PCG in the BAM files using a custom script. For high-copy chloroplast PCGs, a minimum of 20× coverage and 10% or more read support were required for a site to be considered an RNA editing site or SNP. For mitochondrial PCGs of low copy number and low expression, the coverage requirement was relaxed to 10×. Finally, after excluding SNPs, the remaining sites were considered high-quality RNA editing sites in the PCGs of the organelle genomes of okra.
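The filtering logic described above can be illustrated with the following sketch. The authors' custom script is not available in the text, so this is only one plausible reading of it: per-site base counts are assumed to have already been extracted from the BAM files, a site is called when the most frequent non-reference base meets the coverage and read-support thresholds, and SNP exclusion is done by position. All function names and the toy data are hypothetical.

```python
from collections import Counter


def call_sites(ref_seq: str, per_site_counts: list[Counter], min_cov: int,
               min_frac: float = 0.10) -> dict[int, tuple[str, str, float]]:
    """Return {position: (ref base, alt base, alt fraction)} for sites passing the thresholds."""
    calls = {}
    for pos, (ref, counts) in enumerate(zip(ref_seq, per_site_counts)):
        cov = sum(counts.values())
        if cov < min_cov:
            continue  # insufficient coverage at this site
        # Non-reference bases, most frequent first.
        alts = [(base, n) for base, n in counts.most_common() if base != ref]
        if alts and alts[0][1] / cov >= min_frac:
            alt, n = alts[0]
            calls[pos] = (ref, alt, n / cov)
    return calls


def editing_sites(ref_seq: str, rna_counts: list[Counter], dna_counts: list[Counter],
                  min_cov: int) -> dict[int, tuple[str, str, float]]:
    """Variant sites seen in RNA-seq data, minus positions that are SNPs in the WGS data."""
    rna_calls = call_sites(ref_seq, rna_counts, min_cov)
    snp_positions = set(call_sites(ref_seq, dna_counts, min_cov))
    return {pos: call for pos, call in rna_calls.items() if pos not in snp_positions}


# Toy 4-bp coding sequence with made-up counts (U appears as T in sequenced cDNA).
ref = "CCAC"
rna = [Counter(C=5, T=45), Counter(C=30, T=1), Counter(A=8), Counter(C=10, T=30)]
dna = [Counter(C=100), Counter(C=100), Counter(A=100), Counter(C=40, T=60)]

# min_cov=20 mirrors the plastid threshold in the text; 10 would be used for mitochondrial PCGs.
print(editing_sites(ref, rna, dna, min_cov=20))
```

In the toy data, position 0 is retained as a C to U candidate, position 1 fails the 10% support threshold, position 2 fails the coverage threshold, and position 3 is discarded because the same variant is present in the genomic reads.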

Phylogenetic inference
We conducted a BLAST search on the NCBI website for the regions homologous to mtpt14 in okra. We found that these plastid-derived homologous fragments were present in multiple mitochondrial genomes and were approximately 850 bp in length (Fig. S4). We downloaded these aligned sequences and added additional homologous sequences from the cpDNA of other plant lineages to construct a phylogenetic tree. The corresponding nucleotide sequences were aligned using MAFFT (v7.450) [51]. Bayesian inference (BI) analysis was performed using MrBayes (v3.2.6) [52] with the Markov chain Monte Carlo method for 200,000 generations, sampling trees every 100 generations. The first 20% of trees were discarded as burn-in, and the remaining trees were used to generate a consensus tree.
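The sampling settings above fix how many trees enter the consensus. The short sketch below works through that arithmetic for a single run; it ignores whether the starting tree is also written to the sample file and says nothing about the number of runs or chains, neither of which is stated in the text. The function name is hypothetical.

```python
def retained_samples(generations: int, sample_every: int,
                     burnin_fraction: float) -> tuple[int, int, int]:
    """Return (sampled, discarded as burn-in, retained) tree counts for one MCMC run."""
    sampled = generations // sample_every
    discarded = round(sampled * burnin_fraction)
    return sampled, discarded, sampled - discarded


# Settings from the text: 200,000 generations, one sample per 100 generations, 20% burn-in.
sampled, discarded, kept = retained_samples(200_000, 100, 0.20)
print(f"sampled={sampled}, burn-in={discarded}, retained={kept}")
```

Under these assumptions, about 2,000 trees are sampled per run, 400 are discarded, and 1,600 contribute to the consensus tree.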