Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing
© Jo et al.; licensee BioMed Central Ltd. 2014
Received: 11 March 2014
Accepted: 20 June 2014
Published: 4 July 2014
Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp.
We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis of the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation of non-coding DNA. Between the pepper lines, the FS4401 and Jeju mitochondrial DNAs contained the same complement of protein-coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from, or found at different locations in, the Jeju mitochondrial genome were combined. The incorporation of repeats and the overlap of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes.
Although a large portion of the sequence context was shared by the mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes were located on the edges of highly rearranged CMS-specific DNA regions and near repeat sequences. These characteristics were also detected among CMS-associated genes in other species, implying that a common mechanism might be involved in the evolution of CMS-associated genes.
Mitochondrial genomes of higher plants are clearly different from their animal counterparts and from plastid genomes in terms of evolutionary dynamics of genome structure [1, 2]. Although the rate of synonymous substitution in plant mitochondrial DNA (mtDNA) is 50–100 times and three times lower than in vertebrate mtDNA and plant plastid DNA, respectively, structural variations including changes in gene order, rearrangement, genome expansion and shrinkage, and incorporation of foreign DNAs are more common in plant mitochondria compared to the others [3–5]. The complexity of the plant mtDNA structure has been attributed to existence of a reservoir of low-copy-number subgenomic mtDNA molecules suppressed via nuclear control as well as the presence of repeat sequences dispersed throughout the genome that can mediate recombination [6–8].
Structural variations in mtDNA are associated with several mutant phenotypes such as cytoplasmic male sterility (CMS) and variegated-leaf phenotypes [9–11]. CMS has been studied in many crop species because of its economic importance for hybrid seed production. Most identified CMS-associated genes are novel chimeric open reading frames (ORFs) generated by fusion of several sequence segments due to rearrangement of the mitochondrial genome [9, 12]. Although CMS-associated genes in different crop species do not show significant similarity in their sequences, most share several features such as possession of a transmembrane domain and co-transcription with normal mitochondrial genes, which often encode ATP-synthase or cytochrome C oxidase subunits [9, 13–15]. The detailed mechanisms of how these genes originated remain unknown, however.
Complete sequencing and comparative analysis of mtDNA of normal and CMS lines have been performed in several crop species including sugar beet (Owen-type CMS), maize (CMS-T, CMS-S, CMS-C), wheat (K-type CMS), rice (CW- and LD-type CMS), rapeseed (pol- and nap-type CMS), and radish (Ogura- and DCGMS-type CMS), revealing structural variation in mtDNA and identifying genes responsible for CMS [16–24]. These studies showed that mitochondrial genome structures in lines exhibiting CMS are extensively rearranged compared to those of fertile lines, whereas gene contents are mostly conserved. For example, in sugar beet, normal and CMS lines have mitochondrial genomes composed of different arrays of fourteen sequence blocks that are syntenic between the two genomes. Recently, Tanaka et al. showed that the mitochondrial genome of a radish with CMS, Ogura cytoplasm, has a large CMS-specific DNA region in addition to syntenic block sequences. This region, containing the CMS-associated orf138, was postulated to be inserted into the fertile mitochondrial genome by recombination via inverted repeat sequences located on its borders. However, the origin of the CMS-associated gene and other CMS-specific sequences in this region is still unknown.
CMS has been widely used in hybrid seed production in chili peppers. Only a single source of cytoplasm has been reported to be responsible for CMS. Kim et al. [14, 26] found two candidate CMS-associated genes, orf456 and ψatp6-2. The orf456 gene fused with a mitochondrial target sequence induced male sterility in transgenic Arabidopsis. Later studies showed that the orf456 gene exists as a longer ORF named orf507. Further analysis of the function of the ORF507 protein showed that the interaction of ORF507 with an ATP-synthase subunit (ATP synthase 6 kDa subunit) may cause impaired ATP synthase activity in CMS cytoplasm. The ψatp6-2 gene is the truncated form of atp6-2 that was generated by rearrangement in the 3′ region of atp6-2. Differences in the transcription pattern of ψatp6-2 between male-sterile and restorer lines demonstrated a possible association of this gene with CMS. However, complete sequence analyses of mitochondrial genomes to elucidate CMS-specific mtDNA structures and their evolutionary history have not been performed in Capsicum.
In this study, we report the first complete mitochondrial genome sequences for Capsicum. Comparative analysis between CMS and normal mitochondrial genomes provides insights into the evolution of mitochondrial genome structure associated with CMS in plants.
A pepper CMS line, ‘FS4401’ (S/rfrf), and a restorer line, ‘Jeju’ (N/RfRf), which are known to contain CMS and normal cytoplasm, respectively, were provided by Monsanto Korea. ‘FS4401’ is a breeding line containing a natural CMS cytoplasm that has been used as the stable CMS source in seed companies in Korea and ‘Jeju’ is a landrace in Korea. For each pepper line, approximately 3,000 seedlings were grown in dark conditions for twenty days and harvested to isolate mitochondria.
Mitochondrial DNA extraction
The method described by Millar et al. and modified by Kim was used for mitochondrial DNA extraction. Seedlings were homogenized using a mortar with isolation buffer consisting of 0.3 M mannitol, 50 mM Tris–HCl, 3 mM EDTA, 1 mM 2-mercaptoethanol, 0.1% BSA, 1% PVP, and protease inhibitor cocktail (Roche Applied Science, Indianapolis, USA) and adjusted to pH 7.5 with KOH. Homogenized tissue was filtered with one layer of Miracloth and four layers of cheesecloth. Two rounds of centrifugation at 2,000 g for 10 min were then performed to remove cell debris and larger organelles in cells. The supernatant subsequently was centrifuged at 15,000 g for 10 min to obtain a crude mitochondrial pellet. The pellet was gently resuspended with a painter’s brush in isolation buffer without PVP, adjusted to 10 mM MgCl2 and treated with DNase I (50 μg/ml) for one hour to degrade nuclear DNA. The sample was adjusted to 20 mM EDTA and centrifuged at 15,000 g for 10 min. The pellet was gently resuspended in 500 μL buffer II (0.3 M sucrose, 0.05 M Tris–HCl, 0.02 M EDTA, 0.1% BSA, pH 7.5) with a painter’s brush. After resuspension, the sample was layered on a 30-mL Percoll cushion (28% Percoll, 0.3 M sucrose, 0.05 M Tris–HCl, 0.02 M EDTA, pH 7.5) and centrifuged at 40,000 g for 90 min. The yellowish mitochondrial ring in the middle of the cushion was collected. The mitochondrial fraction was rinsed with washing buffer (0.3 M mannitol, 50 mM Tris–HCl, 1 mM EDTA, pH 7.5) using three rounds of centrifugation at 15,000 g for 10 min. Mitochondrial DNA was extracted following the method described by Kim.
mtDNA sequencing was performed using the GS-FLX system (Roche Applied Science, Indianapolis, USA) and the resultant single-read sequences were assembled by Newbler Assembler Software Version 2.0 (454 Life Sciences, Branford, USA) in the National Instrumentation Center for Environmental Management (NICEM, Seoul, Republic of Korea) (Additional file 1).
Contigs were further assembled using the following strategies. First, analysis using the Basic Local Alignment Search Tool (BLAST) was performed against the Genbank nucleotide database (http://www.ncbi.nlm.nih.gov) to obtain mitochondrial contig sequences longer than 1 kb. The contig sequences that contained significant matches to known mtDNA sequences from other species were used for the next analyses. In the second step, a DNA library in which the average size of inserts was about 3 kb in length was constructed and the end sequences of inserts were analyzed using the ABI3700 sequencing system (Applied Biosystems, Foster City, USA) in NICEM (Seoul, Republic of Korea; Additional file 1). The mate-pair information of insert end sequences obtained by ABI sequencing was used to order contig sequences. In the third step, primers were designed from end sequences of each contig and all possible combinations of primers were tested by PCR analysis to identify connected contigs. If a gap sequence obtained from PCRs contained only plastid-derived sequence, the gap was considered to be obtained due to contamination with plastids during mitochondrial DNA preparation, not because of the existence of subgenomic mtDNA molecules. Through this step, gaps could be closed and the number of contigs was reduced. In the fourth step, genome walking from the ends of each assembled scaffold sequence was conducted using the GenomeWalker™ universal kit (Clontech, Mountain View, USA) according to the manufacturer’s instructions. In the fifth step, LA-PCRs were performed to fill remaining gaps between scaffolds using TaKaRa LA Taq™ (TaKaRa, Shiga, Japan). Finally, all information obtained by the stepwise approach was used to construct a master circle model that contained at least one copy of every mtDNA contig. The contig sequences that could be connected to multiple other contig sequences were regarded as repeated sequences. 
These sequences could be contained in a master circle two or more times. Insertions of repeated sequences were validated by PCR with primers designed from the flanking regions of the repeated sequences.
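The rule for flagging repeated contigs (a contig that can be joined to more than one other contig) can be sketched as a small graph check. This is only an illustration under stated assumptions, not the authors' pipeline: the contig names, the `(contig, end)` data layout, and the example links are hypothetical stand-ins for PCR-confirmed junctions.

```python
from collections import defaultdict

def find_repeated_contigs(links):
    """Return contigs that have at least one end joined to several partner ends.

    `links` is an iterable of ((contig, end), (contig, end)) pairs, where
    `end` is "L" or "R". This layout is an assumption made for illustration.
    """
    partners = defaultdict(set)
    for a, b in links:
        partners[a].add(b)
        partners[b].add(a)
    # A contig end with more than one confirmed partner suggests a repeat.
    repeated = {contig for (contig, _end), p in partners.items() if len(p) > 1}
    return sorted(repeated)

# Hypothetical junctions: contig c2 is entered from c1 or c3 and exits to c4 or c5.
example_links = [
    (("c1", "R"), ("c2", "L")),
    (("c3", "R"), ("c2", "L")),
    (("c2", "R"), ("c4", "L")),
    (("c2", "R"), ("c5", "L")),
]
print(find_repeated_contigs(example_links))
```

In this toy case only `c2` is flagged, which mirrors the idea that such a contig would appear two or more times in a master circle.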
Accession numbers of mitochondrial genome sequences
Complete mitochondrial genome sequences of FS4401 (CMS) and Jeju (male-fertile) have been deposited in the GenBank nucleotide sequence database under the accession numbers KJ865409 and KJ865410, respectively.
Screening a CM334 bacterial artificial chromosome library
BAC clones containing cox2 or atp6 were screened from a 12× BAC library of CM334, which is a Mexican landrace of chili pepper (C. annuum L.) containing normal cytoplasm. The cox2 and atp6 genes of CM334 were amplified from total DNA of CM334 using primer sets designed from sugar beet mitochondrial DNA (GenBank accession number: BA000009). The amplicons were labeled and used as probes for hybridization-based BAC library screening as described by Yoo et al. The sequence of the selected BAC clone containing both cox2 and atp6 was analyzed by 12× shotgun sequencing, which was carried out in NICEM (Seoul, Republic of Korea).
Gene annotation and characterization of ORFs
The protein and rRNA genes on mtDNA sequences were identified using BLAST with the nucleotide and protein databases in GenBank (http://www.ncbi.nlm.nih.gov). The tRNA genes were identified using the tRNAscan-SE program (http://lowelab.ucsc.edu/tRNAscan-SE/). ORFs that were predicted to encode hypothetical proteins longer than 100 amino acids were screened using custom-made Perl scripts. The presence of a transmembrane domain in each hypothetical protein was predicted using TMHMM server v.2.0 (http://www.cbs.dtu.dk/services/TMHMM/).
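The ORF screen described above can be illustrated with a short six-frame scan. The authors' Perl scripts are not available here, so this Python sketch rests on assumptions: an ORF is taken to run from the first ATG in a frame to the next in-frame stop codon (TAA, TAG, TGA), and RNA editing is ignored.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_orfs(seq, min_aa=100):
    """List ORFs encoding more than `min_aa` amino acids in all six frames.

    Each hit is (strand, frame, start, end, aa_length). Coordinates are
    0-based, end-exclusive, and refer to the strand that was scanned
    (for "-" hits, the reverse-complemented sequence).
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the ORF
                elif start is not None and codon in STOPS:
                    aa_len = (i - start) // 3  # codons before the stop
                    if aa_len > min_aa:
                        hits.append((strand, frame, start, i + 3, aa_len))
                    start = None
    return hits

# Toy sequence: ATG + 100 alanine codons + stop encodes 101 residues.
toy = "ATG" + "GCT" * 100 + "TAA"
print(find_orfs(toy))
```

Predicted proteins from such a scan would then be passed to a transmembrane-domain predictor; that step is not reproduced here.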
Sequence comparison and repeat sequence analysis
Alignment between target sequences was performed using the BLASTN algorithm (http://blast.ncbi.nlm.nih.gov/). Pools of repeated sequences were obtained by analysis using BLASTN in which a given target sequence was used as both query and subject sequence. The alignments that met the criteria to be syntenic sequence blocks or repeated sequences in terms of length and similarity were isolated and visualized as Scalable Vector Graphics using custom-made Perl scripts followed by manual modification.
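As a rough illustration of the self-alignment approach, the sketch below filters BLASTN hits of a genome against itself to collect repeat pairs. It assumes the standard 12-column tabular output (`-outfmt 6`), uses the repeat thresholds quoted later in the text (>100 bp, >95% similarity) as defaults, and is not the authors' Perl code; the example hit lines are invented.

```python
def parse_repeats(blast_tab_lines, min_len=100, min_ident=95.0):
    """Collect repeat pairs from self-vs-self BLASTN tabular hits.

    Columns assumed: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    repeats = set()
    for line in blast_tab_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident, length = float(fields[2]), int(fields[3])
        qstart, qend, sstart, send = (int(x) for x in fields[6:10])
        if (qstart, qend) == (sstart, send):
            continue  # the sequence aligned to itself, not a repeat
        if length <= min_len or pident <= min_ident:
            continue
        orientation = "direct" if sstart <= send else "inverted"
        copy_a = (qstart, qend)
        copy_b = (min(sstart, send), max(sstart, send))
        # Sorting the two copies removes the mirrored duplicate hit.
        repeats.add(tuple(sorted((copy_a, copy_b))) + (orientation,))
    return sorted(repeats)

# Invented example hits for a genome named "mt".
example_hits = [
    "mt\tmt\t100.00\t5000\t0\t0\t1\t5000\t1\t5000\t0.0\t9000",
    "mt\tmt\t98.50\t250\t3\t1\t100\t349\t2000\t2249\t1e-100\t400",
    "mt\tmt\t98.50\t250\t3\t1\t2000\t2249\t100\t349\t1e-100\t400",
    "mt\tmt\t99.00\t80\t1\t0\t700\t779\t4000\t4079\t1e-30\t140",
    "mt\tmt\t97.00\t150\t4\t0\t500\t649\t3149\t3000\t1e-50\t200",
]
for repeat in parse_repeats(example_hits):
    print(repeat)
```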
Assembly of complete mitochondrial genome sequences
We obtained contigs containing most mitochondrial genome information for a pepper CMS line (FS4401) and a fertile line (Jeju) by sequencing their mitochondrial genomes using the 454 GS-FLX system. However, it appeared that too many mtDNA contigs (>1 kb) were obtained for each line when the high coverage of sequencing (>100×) was considered (Additional file 1). The reasons for this were revealed in the process of further assembly of contigs and analysis of gap sequences. Firstly, contamination with plastid DNA hampered sequence assembly at the positions of mitochondrial genomes where plastid-derived sequences were located. Secondly, ends of large repeated sequences remained unconnected due to the short length of individual reads in the 454 GS-FLX system (Additional file 1). Therefore, we constructed a DNA library containing inserts averaging 3 kb in length and analyzed the end sequences of inserts using ABI3700 for ordering of contigs (Additional file 1). In addition, we performed PCR analysis and genome walking from contig ends to test all of the possible combinations of large repeated sequences with other contig sequences. The final circular molecules for complete mitochondrial genomes included every mtDNA contig longer than 1 kb at least one time and were consistent with all of the results produced during sequence assembly process.
Meanwhile, screening of a bacterial artificial chromosome (BAC) library of pepper line ‘CM334’ with normal (fertile) cytoplasm using cox2 and atp6 as probes resulted in the isolation of one BAC clone containing both cox2 and atp6. The complete sequence of this BAC clone (74,615 bp) was obtained by shotgun sequencing and assembly.
Comparative analysis of general features and sequence contents between mitochondrial genomes
Table: General features of mitochondrial genomes of two pepper lines and one tobacco line. Rows: Genome size (bp); GC content (%); Coding sequences (bp); Plastid-derived sequences (bp); Repeated sequences (bp); Gene content (number); Protein coding genes.
Sequence alignment analysis showed that most sequence content was shared between two mtDNAs of pepper. 95.1% of the sequence of FS4401 could be aligned with Jeju and 98.0% of Jeju with FS4401. In comparative analysis with tobacco sequence, only about 46.3% and 45.4% of each genome could be aligned with tobacco mtDNA, respectively (Additional file 2).
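How the alignable percentages were computed is not stated; one plausible way is to merge the aligned intervals on a genome and divide the covered length by the genome size. The sketch below shows that calculation with invented coordinates, so the function name and inputs are illustrative assumptions only.

```python
def aligned_percentage(intervals, genome_length):
    """Percent of a genome covered by at least one aligned interval.

    `intervals` holds 1-based inclusive (start, end) pairs; reversed
    pairs (start > end) are accepted and normalised.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((min(a, b), max(a, b)) for a, b in intervals):
        if cur_end is None or start > cur_end + 1:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)  # overlapping or adjacent: extend
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / genome_length

# Invented example: two overlapping hits plus one reverse-strand hit.
print(aligned_percentage([(1, 100), (50, 150), (300, 201)], 1000))
```

Merging before summing matters because overlapping alignments would otherwise be counted twice and inflate the percentage.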
Distribution of sequence blocks syntenic between pepper and tobacco mitochondrial genomes
Gene contents and localization on mitochondrial genomes
We next annotated and localized the genes with known functions on the FS4401 and Jeju mitochondrial genomes (Figure 1). The protein-coding genes were classified according to the functions of proteins; the 37 protein-coding genes shared by FS4401 and Jeju included nine genes for complex I proteins (nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad7, nad9), two for complex II (sdh3, sdh4), one for complex III (cob), three for complex IV (cox1, cox2, cox3), five for ATP synthase subunits (atp1, atp4, atp6, atp8, atp9), ten for ribosomal proteins (rpl2, rpl5, rpl10, rpl16, rps3, rps4, rps10, rps12, rps13, rps19), four for proteins involved in cytochrome c biogenesis (ccmB, ccmC, ccmFc, ccmFN), one for maturase (matR), and one for a protein translocation system subunit (mttB). In addition to these genes, FS4401 had another copy of the atp6 gene (ψatp6-2). Both FS4401 and Jeju contained three rRNA genes (rrn5, rrn18, rrn26), while FS4401 contained one additional tRNA gene compared to Jeju.
Differences between Jeju and FS4401 in sequences of known genes
Columns: Genes in Jeju; Gene length in Jeju; Polymorphism in FS4401; Corresponding sequence in tobacco
Polymorphism in FS4401: 904 gcA (A) → 904 gcG (A); tobacco: 904 gcG (A)
Polymorphism in FS4401: 16 acGAATATGCAg (TNMQ) → 16 acg (T); tobacco: 16 acGAATATGCAg (TNMQ)
Polymorphism in FS4401: 178 ccCAACAGTTTg (PNSL) → 178 ccg (P); tobacco: 178 ccCAACTGTTTg (PNCL)
Polymorphism in FS4401: 337 ccCGGGAAGGGggat (PGKGD) → 337 ccggat (PD); tobacco: 337 ccCGGGAAGGGggat (PGKGD)
Polymorphism in FS4401: 178 tTCttc (FF) → 178 tCTttc (SF); tobacco: 178 tCTttc (SF)
Higher similarity with atp6-1 in FS4401
283 gGt (G) → gCt (A)
316 ACa (T) → CAa (Q)
454 aaAGaa (KE) → aaCCaa (NQ)
no similarity downstream of the 931st bp due to DNA rearrangement
no similarity upstream of the 497th bp due to DNA rearrangement
A large number of genes were located close to each other, forming gene clusters that might be co-transcriptional units. In total, sixteen clusters were detected in FS4401 and Jeju, including rps10ab-nad1a, rrn18-rrn5, trnD-trnS, trnM-rrn26-ccmC-trnL, rps4-nad5c, rps13-nad1bc-nad4L-atp4, rpl2ab-rpl10, rpl5-rps14-cob-trnC, rps19-rps3ab-rpl16-cox2ab, nad1d-matR, sdh3-nad2ab, atp8-cox3-sdh4, nad3-rps12, trnC-trnN-trnY-nad2cde, nad9-trnP-trnW, trnS-trnF-trnP-nad1e (Figure 1, Additional file 3). Although numerous rearrangements between the two pepper mitochondrial genomes were detected, the clustering pattern of these genes was highly conserved.
Rearrangements of genome structure between CMS and normal mtDNA
End points of several sequence blocks syntenic between FS4401 and Jeju were located very close (<2 kb) to the edges of blocks shared by each pepper line and tobacco, analyzed using the same criteria (>2 kb, >95%; Additional file 3). There were eight and seven block end regions in this category in FS4401 and Jeju, respectively. At least two rearrangement events, including one during speciation between tobacco and pepper and another during evolution of pepper mitochondrial genomes, were expected in these regions. No single sequence block that could cover adjacent ends of blocks syntenic between FS4401 and Jeju was detected in the alignment with tobacco mtDNA, except for one syntenic sequence block that was specific to the alignment between Jeju and tobacco. This syntenic block connected the 3′ junction of block 16, which was downstream of cox2, and the 5′ junction of block 1 (Additional file 3).
ORFs unique to each mitochondrial genome
We next identified novel open reading frames predicted to encode proteins longer than 100 amino acids in the mitochondrial genomes. A total of 155 and 142 ORFs (excluding ORFs for known genes) were detected in FS4401 and Jeju, respectively. Comparative analysis of these ORFs showed that 45 and 30 ORFs had polymorphisms with ORF counterparts or were specific to one genome in FS4401 and Jeju, respectively (Additional file 6). FS4401 mtDNA contained 33 ORFs with SNPs or length polymorphism and 12 ORFs that were chimeric, whereas Jeju mtDNA contained 22 polymorphic and 8 chimeric ORFs. When the localization of ORFs that were chimeric or unique to FS4401 was investigated to search for candidate CMS-associated genes, seven ORFs (orf100d, orf102l, orf108a, orf119c, orf141, orf300 and orf507) were found to be close (<2 kb) to the edge of sequence blocks syntenic between FS4401 and Jeju (>2 kb; >95% similarity). When we searched for ORFs located near (<2 kb) repeat sequences (>100 bp; >95% similarity between copies) and for ORFs containing putative transmembrane domains, six (orf102l, orf119c, orf262, orf300, orf338, orf507) and three (orf262, orf300, orf507) ORFs were identified, respectively (Figure 3). The orf507 gene, a strong candidate CMS-associated gene reported in previous research, met the conditions for candidate genes for CMS. Although the other previously reported candidate, ψatp6-2, was not classified as a novel ORF because it was defined as a gene, it also satisfied all of the conditions described above (proximity to edges of syntenic sequence blocks and repeated sequences, and presence of putative transmembrane domains). Other than these two, only orf300 showed the same characteristics.
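The combined screen can be reproduced as a set intersection over the ORF lists given in the text. The sketch assumes that the list of six ORFs corresponds to proximity to repeats and the list of three to predicted transmembrane domains; the variable and function names are illustrative.

```python
# ORF names as listed in the text for each screening criterion.
edge_proximal = {"orf100d", "orf102l", "orf108a", "orf119c",
                 "orf141", "orf300", "orf507"}
near_repeats = {"orf102l", "orf119c", "orf262", "orf300", "orf338", "orf507"}
with_tm_domain = {"orf262", "orf300", "orf507"}

def screen_candidates(*criteria):
    """Return the ORFs that satisfy every criterion (set intersection)."""
    return sorted(set.intersection(*criteria))

candidates = screen_candidates(edge_proximal, near_repeats, with_tm_domain)
print(candidates)
```

The two ORFs that survive all three filters are orf300 and orf507, in line with the statement that only orf300 shares these characteristics with the known candidates.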
Structure of sequences around orf507 and ψatp6-2
The short sequence elements that were detected on orf507 and ψatp6-2 in FS4401 were present as repeat sequences only in FS4401, implying that these sequences were duplicated during rearrangement (Additional file 7). In particular, R21 in Jeju was duplicated downstream of ψatp6-2 in FS4401, resulting in generation of a repeat pair around the ψatp6-2 gene. The orf507 gene and other related sequence elements seemed to be inserted between cox2 and R21 via multiple DNA rearrangements.
Comparison of mtDNA around cox2 and atp6 in Jeju with corresponding regions in CM334, which is a C. annuum landrace introduced from a distant area (Mexico) and contains normal cytoplasm, showed that the DNA rearrangement involving R17 resulted in close proximity of cox2 and atp6 to each other, although sequence contents flanking the two genes were highly conserved. In addition, the gene order of cox2 and atp6 was opposite in CM334 compared to FS4401 (Figure 4).
DNA rearrangement pattern and localization of CMS-associated genes in mitochondrial genomes of other crop species
Here, we report the complete mitochondrial genome sequence of pepper (C. annuum L.). This represents only the second mitochondrial genome from the Solanaceae to be fully sequenced, following that of tobacco. Therefore, the mtDNA sequence of pepper described herein is a valuable resource for studying the evolution of mitochondrial genomes in the Solanaceae. The contents and sequences of protein-coding genes were mostly conserved between tobacco and pepper mtDNA, although a small number of SNPs and indels were found in two and three genes, respectively. Notably, the frequency of in-frame indel polymorphisms was higher in pepper than in other crops such as rice and radish. However, the overall structure and the non-coding sequences were extensively changed, resulting in less than 50% coverage of the pepper mitochondrial genome sequence by tobacco mtDNA. Similar widespread rearrangements within a plant family have been reported from comparative analysis of Arabidopsis and rapeseed mitochondrial genomes, in which only one-third of Arabidopsis and two-thirds of rapeseed mtDNA could be aligned to each other. By contrast, small DNA regions containing clusters of gene sequences were mostly protected from rearrangement events. The conservation of gene sequence clusters might reflect a requirement for co-regulation of gene expression in each cluster. However, rearrangements were detected even in a small number of gene clusters, including atp9-rps13-nad1bc, nad4-rps1-nad5ab, nad3-nad1a, and rps4-nad6, that have been reported to be putative co-transcribed units in tobacco. In particular, co-transcription of gene cluster nad3-nad1a has been confirmed experimentally in tobacco [36, 37], whereas we found that nad3 and nad1a were incorporated into different clusters in pepper mtDNA. Therefore, a change in co-transcription units has resulted from DNA rearrangement during speciation or independent evolution of tobacco and pepper mitochondrial genomes after speciation.
Numerous rearrangements of mitochondrial DNA were also detected in the comparison of CMS-associated and normal mtDNA within C. annuum. Conservation of gene coding sequences and clustering patterns indicated that maintenance of clusters may be essential for normal expression of genes, or that sequences in transcribed regions have characteristics that efficiently suppress rearrangement. However, multiple rearrangements occurring outside of gene clusters resulted in the fragmentation of alignment units between genomes. Several sequence blocks that were syntenic between genomes contained overlapping repeat sequences that might mediate homologous recombination and result in genome rearrangement. However, many sequence blocks were connected with sequences unique to each genome or had overlapping sequences that were shorter than 50 bp, which is known as the lower limit of homologous sequence length required for recombination that mediates double-strand break repair [7, 38, 39]. This might be explained by loss of larger repeats during evolution after rearrangements occurred, or by involvement of nonhomologous end-joining (NHEJ) and/or microhomology-mediated recombination. Comparative analyses of mitochondrial genomes of Arabidopsis ecotypes and mutants revealed that DNA rearrangement results from NHEJ and asymmetric recombination via intermediate-sized repeats following randomly occurring double-strand breaks of DNA [7, 40]. Sequences lacking homology were joined by NHEJ, while asymmetric recombination is accompanied by repeat sequences longer than 50 bp. Recombination via microhomologous repeats (ranging from 6 to 31 bp) has also been reported in pearl millet and in maize mutants showing the nonchromosomal stripe (NCS) phenotype [41–43]. Microhomology-mediated break-induced replication (MMBIR) has been proposed as one of the mechanisms for microhomology-mediated rearrangements in plastids and mitochondria [44, 45].
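The distinction drawn above between junctions with no homology, microhomology, and repeats of at least 50 bp can be made concrete with a toy classifier. This is an illustrative sketch, not an analysis performed in the study: the function names are hypothetical, the overlap is measured as the longest exact match between the end of one segment and the start of the next, and the 50 bp threshold is taken from the text.

```python
def junction_overlap(left, right, max_len=200):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    limit = min(len(left), len(right), max_len)
    for k in range(limit, 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0

def classify_junction(overlap_bp, hr_min=50):
    """Label a junction by the amount of shared sequence at the join."""
    if overlap_bp == 0:
        return "no homology (NHEJ-like)"
    if overlap_bp < hr_min:
        return "microhomology"
    return "repeat long enough for homologous recombination"

# Toy segments sharing the 4-bp sequence GTAC at their junction.
overlap = junction_overlap("AAACGTAC", "GTACTTT")
print(overlap, classify_junction(overlap))
```

Real junctions would need to tolerate mismatches and be checked on both strands, which this exact-match sketch does not attempt.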
A significant number of ends of sequence blocks syntenic between FS4401 and Jeju were located close to the junctions of sequence blocks syntenic between FS4401 and tobacco or Jeju and tobacco. Therefore, those regions experienced at least two rearrangement events within a very short distance: one between pepper and tobacco, and the other between CMS and normal pepper lines. Why recombination frequently occurs in specific regions is still unknown, although localization of DNA cruciforms, localized melting due to high transcriptional activity, and stalled replication forks have been suggested as possible explanations [40, 46]. Further investigation of a large number of mitochondrial genome sequences in diverse plant families is required to identify and characterize recombination hotspots.
The orf507 and ψatp6-2 genes are known to be associated with CMS in pepper, based on genetic and functional analyses [14, 26, 28]. Comparison between complete mtDNA sequences from CMS-associated and normal cytoplasm in this study reinforced these genes as CMS candidate genes. Although a large number of ORFs were specific to the CMS-associated mitochondrial genome, only one ORF (orf300; Figure 3) in addition to orf507 and ψatp6-2 had the typical characteristics shared by most CMS-associated genes in other species: formation of the ORF by novel DNA rearrangement, presence of a transmembrane domain, and localization close to a junction of syntenic sequence blocks and repeat sequences (discussed below). However, the potential association of other screened ORFs, including orf300, with CMS also needs to be investigated because discrepancies in cytoplasm types and haplotypes of markers based on orf507 and ψatp6-2 have been reported in several germplasms [47, 48], which may be due to incorrect identification of candidate CMS genes or to the existence of a different CMS source.
The genomic regions around orf507 and ψatp6-2 were clearly distinguished from other DNA regions because they were included among the largest sequence fragments highly specific to the CMS line and matched no other known sequences. Recently, similar results were reported for radish Ogura-type cytoplasm, in which the CMS-associated gene orf138 was located on an edge of the largest genomic region unique to the CMS line. Although the insertion of the DNA region containing orf138 could be demonstrated to result from homologous recombination using a pair of inverted repeat sequences on the ends, the region around orf507 and ψatp6-2 in pepper had a more complicated structure, hampering the elucidation of the mechanism by which the genomic structure arose. The origin of the 3′ part of orf507 and a large portion of the region around orf507 and ψatp6-2 remains unknown. One possible explanation for how these sequences came to be specifically present in the CMS line might be substoichiometric shift (SSS), which has been reported in several plant species [6, 41, 49, 50]. According to the SSS model, subgenomic molecules of mtDNA are present at very low copy number under normal conditions, in which recombination of intermediate-sized repeat sequences is suppressed, and if this recombination is activated (e.g., under certain conditions), these molecules can be efficiently amplified by recombination-dependent replication and maintained as the predominant form of subgenomic molecules even in subsequent generations. In fact, small amounts of orf507 and ψatp6-2 were detected by PCR even in fertile pepper lines. Therefore, the subgenomic molecule containing CMS candidate genes that had been generated by rearrangements via microhomology-mediated recombination using short sequences overlapped between sequence elements (ranging from 5 to 40 bp; Figure 4) and/or NHEJ of sequence elements from diverse sources might be maintained at low copy number even in normal pepper lines.
If the suppression of ectopic recombination is released under certain conditions, amplification of the CMS-specific DNA structure containing orf507 and ψatp6-2 might occur by recombination-dependent replication via intermediate-sized repeat sequences around these region. A pair of intermediate-sized repeats (R21) of which one copy is located downstream of orf507 and the other downstream of ψatp6-2 might be candidates for mediating this procedure. However, prediction of the precise mechanism of the rearrangement is limited by the lack of information on the evolutionary relationship between the CMS cytoplasm and a normal type cytoplasm in Jeju. In fact, the corresponding DNA region found in another type of normal cytoplasm from CM334 showed structural differences when compared to Jeju implying the presence of multiple cytoplasm types that have undergone different levels of rearrangement (Figure 4). Analyses of the mtDNA region containing orf507-ψatp6-2 from different cytoplasms of pepper may provide clues to the detailed steps of the rearrangements. In addition, further analysis using high-coverage paired-end sequencing may facilitate identification of possible structures of subgenomic molecules to elucidate dynamic processes related to origin of CMS in Capsicum.
Considering the specific characteristics of the CMS-associated region in pepper, we performed analysis of the organization of syntenic sequence blocks, repeat distribution, and localization of CMS-associated genes in six additional CMS mitochondrial genomes from other species. In all of the cases, CMS genes were located at the edge of considerably long CMS-specific sequences between syntenic blocks and close to intermediate-sized repeat sequences or on the repeat sequence itself (e.g., CMS-S in maize). None of CMS genes originated by the fusion of sequences that are exist on predominant subgenomic molecules of male-fertile lines nor by small insertions or deletions on pre-existing sequences. These findings fit well with the notion that subgenomic structure containing CMS genes might originate from multiple rearrangements mediated by microhomology-mediated recombination or NHEJ to create novel DNA sequence regions and copy number increases due to recombination via adjacent repeat sequences. Proximity of pre-existing low copy-number CMS genes to intermediate-sized repeat sequences might be the prerequisite to ensure amplification of these sequences required for the induction of CMS as discussed by Davila et al. . The close localization of CMS genes to syntenic sequence blocks might be due to the need for sequence elements required for transcription of a chimeric orf. In Arabidopsis, the majority of chimeric orfs were shown not to be transcribed . This implies that utilization of promoters in conserved regions might be requisite for the transcription of orfs. These common features of genomic environment of CMS-associated genes can be clues to understand the evolution of CMS as well as provide a strategy to screen for unknown CMS-gene candidates by comparative genomics approaches.
The complete mitochondrial genome sequences of pepper were obtained for a CMS line and a male-fertile line. A large portion of the intergenic sequences in the pepper lines could not be aligned with the mitochondrial genome of Nicotiana tabacum, which is a member of the same family (Solanaceae), whereas the sequences and clustering patterns of genes were largely conserved. In the comparison between the mitochondrial genomes of the CMS and male-fertile pepper lines, by contrast, most genome sequences could be aligned, although the syntenic sequences were divided into eighteen sequence blocks that were generated by rearrangements in intergenic regions. The CMS candidate genes orf507 and ψatp6-2 were located on the edges of CMS-specific sequence segments that lay between syntenic sequence blocks. The presence of many repeat sequences, and the fact that connected sequence segments overlapped each other by a few nucleotides, implied that extensive rearrangements by homologous recombination and/or NHEJ might be involved in the evolution and substoichiometric shift of this region. Extended investigation using CMS-associated genes identified in other species revealed that these genes are specifically localized near the edges of CMS-specific DNA regions and near intermediate- or large-sized repeat sequences, indicating that the evolution of CMS-associated genes might involve a common mechanism.
This research was supported by the Golden Seed Project, Ministry of Agriculture, Food and Rural Affairs (MAFRA), Ministry of Oceans and Fisheries (MOF), Rural Development Administration (RDA) and Korea Forest Service (KFS), and by a grant (Project No. 710001–07) from the Vegetable Breeding Research Center through the R&D Convergence Center Support Program, Ministry of Agriculture, Food and Rural Affairs (MAFRA), Republic of Korea.
- Andre C, Levy A, Walbot V: Small repeated sequences and the structure of plant mitochondrial genomes. Trends Genet. 1992, 8: 128-132.
- Palmer JD, Adams KL, Cho Y, Parkinson CL, Qiu YL, Song K: Dynamic evolution of plant mitochondrial genomes: mobile genes and introns and highly variable mutation rates. Proc Natl Acad Sci U S A. 2000, 97: 6960-6966.
- Palmer JD, Herbon LA: Plant mitochondrial DNA evolves rapidly in structure, but slowly in sequence. J Mol Evol. 1988, 28: 87-97.
- Palmer JD: Contrasting modes and tempos of genome evolution in land plant organelles. Trends Genet. 1990, 6: 115-120.
- Wolfe KH, Li WH, Sharp PM: Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc Natl Acad Sci U S A. 1987, 84: 9054-9058.
- Arrieta-Montiel M, Lyznik A, Woloszynska M, Janska H, Tohme J, Mackenzie S: Tracing evolutionary and developmental implications of mitochondrial stoichiometric shifting in the common bean. Genetics. 2001, 158: 851-864.
- Davila JI, Arrieta-Montiel MP, Wamboldt Y, Cao J, Hagmann J, Shedge V, Xu YZ, Weigel D, Mackenzie SA: Double-strand break repair processes drive evolution of the mitochondrial genome in Arabidopsis. BMC Biol. 2011, 9: 64.
- Small I, Suffolk R, Leaver CJ: Evolution of plant mitochondrial genomes via substoichiometric intermediates. Cell. 1989, 58: 69-76.
- Hanson MR, Bentolila S: Interactions of mitochondrial and nuclear genes that affect male gametophyte development. Plant Cell. 2004, 16 (Suppl): S154-S169.
- Abdelnoor RV, Yule R, Elo A, Christensen AC, Meyer-Gauen G, Mackenzie SA: Substoichiometric shifting in the plant mitochondrial genome is influenced by a gene homologous to MutS. Proc Natl Acad Sci U S A. 2003, 100: 5968-5973.
- Zaegel V, Guermann B, Le Ret M, Andres C, Meyer D, Erhardt M, Canaday J, Gualberto JM, Imbault P: The plant-specific ssDNA binding protein OSB1 is involved in the stoichiometric transmission of mitochondrial DNA in Arabidopsis. Plant Cell. 2006, 18: 3548-3563.
- Kubo T, Kitazaki K, Matsunaga M, Kagami H, Mikami T: Male sterility-inducing mitochondrial genomes: how do they differ? Crit Rev in Plant Sci. 2011, 30: 378-400.
- Ashutosh, Kumar P, Dinesh Kumar V, Sharma PC, Prakash S, Bhat SR: A novel orf108 co-transcribed with the atpA gene is associated with cytoplasmic male sterility in Brassica juncea carrying Moricandia arvensis cytoplasm. Plant Cell Physiol. 2008, 49: 284-289.
- Kim DH, Kang JG, Kim BD: Isolation and characterization of the cytoplasmic male sterility-associated orf456 gene of chili pepper (Capsicum annuum L.). Plant Mol Biol. 2007, 63: 519-532.
- Wang Z, Zou Y, Li X, Zhang Q, Chen L, Wu H, Su D, Chen Y, Guo J, Luo D, Long Y, Zhong Y, Liu YG: Cytoplasmic male sterility of rice with boro II cytoplasm is caused by a cytotoxic peptide and is restored by two related PPR motif genes via distinct modes of mRNA silencing. Plant Cell. 2006, 18: 676-687.
- Allen JO, Fauron CM, Minx P, Roark L, Oddiraju S, Lin GN, Meyer L, Sun H, Kim K, Wang C, Du F, Xu D, Gibson M, Cifrese J, Clifton SW, Newton KJ: Comparisons among two fertile and three male-sterile mitochondrial genomes of maize. Genetics. 2007, 177: 1173-1192.
- Bentolila S, Stefanov S: A reevaluation of rice mitochondrial evolution based on the complete sequence of male-fertile and male-sterile mitochondrial genomes. Plant Physiol. 2012, 158 (2): 996-1017.
- Chen J, Guan R, Chang S, Du T, Zhang H, Xing H: Substoichiometrically different mitotypes coexist in mitochondrial genomes of Brassica napus L. PLoS ONE. 2011, 6: e17662.
- Handa H: The complete nucleotide sequence and RNA editing content of the mitochondrial genome of rapeseed (Brassica napus L.): comparative analysis of the mitochondrial genomes of rapeseed and Arabidopsis thaliana. Nucleic Acids Res. 2003, 31: 5907-5916.
- Fujii S, Kazama T, Yamada M, Toriyama K: Discovery of global genomic re-organization based on comparison of two newly sequenced rice mitochondrial genomes with cytoplasmic male sterility-related genes. BMC Genomics. 2010, 11: 209.
- Liu H, Cui P, Zhan K, Lin Q, Zhuo G, Guo X, Ding F, Yang W, Liu D, Hu S, Yu J, Zhang A: Comparative analysis of mitochondrial genomes between a wheat K-type cytoplasmic male sterility (CMS) line and its maintainer line. BMC Genomics. 2011, 12: 163.
- Park JY, Lee YP, Lee J, Choi BS, Kim S, Yang TJ: Complete mitochondrial genome sequence and identification of a candidate gene responsible for cytoplasmic male sterility in radish (Raphanus sativus L.) containing DCGMS cytoplasm. Theor Appl Genet. 2013, 126: 1763-1774.
- Satoh M, Kubo T, Nishizawa S, Estiati A, Itchoda N, Mikami T: The cytoplasmic male-sterile type and normal type mitochondrial genomes of sugar beet share the same complement of genes of known function but differ in the content of expressed ORFs. Mol Genet Genomics. 2004, 272: 247-256.
- Tanaka Y, Tsuda M, Yasumoto K, Yamagishi H, Terachi T: A complete mitochondrial genome sequence of Ogura-type male-sterile cytoplasm and its comparative analysis with that of normal cytoplasm in radish (Raphanus sativus L.). BMC Genomics. 2012, 13: 352.
- Peterson PA: Cytoplasmically inherited male sterility in Capsicum. Amer Nat. 1958, 92: 111-119.
- Kim DH, Kim BD: The organization of mitochondrial atp6 gene region in male fertile and CMS lines of pepper (Capsicum annuum L.). Curr Genet. 2006, 49: 59-67.
- Gulyas G, Shin Y, Kim H, Lee JS, Hirata Y: Altered transcript reveals an orf507 sterility-related gene in chili pepper (Capsicum annuum L.). Plant Mol Bio Rep. 2010, 28: 605-612.
- Li J, Pandeya D, Jo YD, Liu WY, Kang BC: Reduced activity of ATP synthase in mitochondria causes cytoplasmic male sterility in chili pepper. Planta. 2013, 237 (4): 1097-1109.
- Millar AH, Sweetlove LJ, Giege P, Leaver CJ: Analysis of the Arabidopsis mitochondrial proteome. Plant Physiol. 2001, 127: 1711-1727.
- Kim DH: PhD thesis. Isolation of Cytoplasmic Male Sterility (CMS)-Associated Gene and Development of CMS-Specific SCAR Markers in Chili Pepper. 2004, Seoul, Republic of Korea: Seoul National University, Plant Science Department
- Yoo EY, Kim S, Kim JY, Kim BD: Construction and characterization of a bacterial artificial chromosome library from chili pepper. Mol Cell. 2001, 12 (1): 117-120.
- Kubo N, Arimura S: Discovery of the rpl10 gene in diverse plant mitochondrial genomes and its probable replacement by the nuclear gene for chloroplast RPL10 in two lineages of angiosperms. DNA Res. 2010, 17 (1): 1-9.
- Jo YD, Park J, Kim J, Song W, Hur CG, Lee YH, Kang BC: Complete sequencing and comparative analyses of the pepper (Capsicum annuum L.) plastome revealed high frequency of tandem repeats and large insertion/deletions on pepper plastome. Plant Cell Rep. 2011, 30: 217-229.
- Shinozaki K, Ohme M, Tanaka M, Wakasugi T, Hayashida N, Matsubayashi T, Zaita N, Chunwongse J, Obokata J, Yamaguchi-Shinozaki K, Ohto C, Torazawa K, Meng BY, Sugita M, Deno H, Kamogashira T, Yamada K, Kusuda J, Takaiwa F, Kato A, Tohdoh N, Shimada H, Sugiura M: The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J. 1986, 5: 2043-2049.
- Sugiyama Y, Watase Y, Nagase M, Makita N, Yagura S, Hirai A, Sugiura M: The complete nucleotide sequence and multipartite organization of the tobacco mitochondrial genome: comparative analysis of mitochondrial genomes in higher plants. Mol Genet Genomics. 2005, 272: 603-615.
- Lelandais C, Gutierres S, Mathieu C, Vedel F, Remacle C, Marechal-Drouard L, Brennicke A, Binder S, Chetrit P: A promoter element active in run-off transcription controls the expression of two cistrons of nad and rps genes in Nicotiana sylvestris mitochondria. Nucleic Acids Res. 1996, 24: 4798-4804.
- Gutierres S, Lelandais C, Paepe RD, Vedel F, Chetrit P: A mitochondrial sub-stoichiometric orf87-nad3-nad1 exonA co-transcription unit present in solanaceae was amplified in the genus Nicotiana. Curr Genet. 1997, 31: 55-62.
- Singer BS, Gold L, Gauss P, Doherty DH: Determination of the amount of homology required for recombination in bacteriophage T4. Cell. 1982, 31 (1): 25-33.
- Watt VM, Ingles CJ, Urdea MS, Rutter WJ: Homology requirements for recombination in Escherichia coli. Proc Natl Acad Sci U S A. 1985, 82 (14): 4768-4772.
- Shedge V, Arrieta-Montiel M, Christensen AC, Mackenzie SA: Plant mitochondrial recombination surveillance requires unusual RecA and MutS homologs. Plant Cell. 2007, 19: 1251-1264.
- Feng X, Kaur AP, Mackenzie SA, Dweikat IM: Substoichiometric shifting in the fertility reversion of cytoplasmic male sterile pearl millet. Theor Appl Genet. 2009, 118 (7): 1361-1370.
- Newton KJ, Knudsen C, Gabay-Laughnan S, Laughnan JR: An abnormal growth mutant in maize has a defective mitochondrial cytochrome oxidase gene. Plant Cell. 1990, 2 (2): 107-113.
- Hunt MD, Newton KJ: The NCS3 mutation: genetic evidence for the expression of ribosomal protein genes in Zea mays mitochondria. EMBO J. 1991, 10 (5): 1045-1052.
- Marechal A, Brisson N: Recombination and the maintenance of plant organelle genome stability. New Phytologist. 2010, 186 (2): 299-317.
- Marechal A, Parent JS, Veronneau-Lafortune F, Joyeux A, Lang BF, Brisson N: Whirly proteins maintain plastid genome stability in Arabidopsis. Proc Natl Acad Sci U S A. 2009, 106 (34): 14693-14698.
- Stohr BA, Kreuzer KN: Coordination of DNA ends during double-strand-break repair in bacteriophage T4. Genetics. 2002, 162: 1019-1030.
- Jo YD, Jeong HJ, Kang BC: Development of a CMS specific marker based on chloroplast-derived mitochondrial sequence in pepper. Plant Biotechnol Rep. 2009, 3: 309-315.
- Jo YD, Jeong HJ, Kang BC: Origin of the Capsicum CMS cytoplasm revealed by cytoplasmic DNA derived marker analysis. Sci Horti. 2011, 131: 74-81.
- Janska H, Sarria R, Woloszynska M, Arrieta-Montiel M, Mackenzie SA: Stoichiometric shifts in the common bean mitochondrial genome leading to male sterility and spontaneous reversion to fertility. Plant Cell. 1998, 10: 1163-1180.
- Kim S, Lim H, Park S, Cho KH, Sung SK, Oh DG, Kim KT: Identification of a novel mitochondrial genome type and development of molecular markers for cytoplasm classification in radish (Raphanus sativus L.). Theor Appl Genet. 2007, 115: 1137-1145.
- Giege P, Konthur Z, Walter G, Brennicke A: An ordered Arabidopsis thaliana mitochondrial cDNA library on high-density filters allows rapid systematic analysis of plant gene expression: a pilot study. Plant J. 1998, 15: 721-726.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.