Complete mitochondrial genome assembly of Juglans regia unveiled its molecular characteristics, genome evolution, and phylogenetic implications

Abstract

Background

The Persian walnut (Juglans regia), an economically vital species within the Juglandaceae family, has seen its mitochondrial genome sequenced and assembled in the current study using advanced Illumina and Nanopore sequencing technology.

Results

The 1,007,576 bp mitogenome of J. regia consisted of three circular chromosomes with a 44.52% GC content, encoding 39 protein-coding genes (PCGs), 47 tRNA genes, and five rRNA genes. Extensive repetitive sequences, including 320 SSRs, 512 interspersed repeats, and 83 tandem repeats, were identified, contributing to genomic complexity. The PCGs favored A/T-ending codons, and the codon usage bias was primarily shaped by selective pressure. Intracellular gene transfer occurred among the mitochondrial, chloroplast, and nuclear genomes. Comparative genomic analysis unveiled abundant structure and sequence variation among J. regia and related species. The results of selective pressure analysis indicated that most PCGs underwent purifying selection, whereas the atp4 and ccmB genes had experienced positive selection between many species pairs. In addition, the phylogenetic examination, grounded in mitochondrial genome data, precisely delineated the evolutionary and taxonomic relationships of J. regia and its relatives. We identified a total of 539 RNA editing sites, among which 288 were corroborated by transcriptome sequencing data. Furthermore, expression profiling under temperature stress highlighted the complex regulation pattern of 28 differentially expressed PCGs, wherein NADH dehydrogenase and ATP synthase genes might be critical in the mitochondrial response to cold stress.

Conclusions

Our results provided valuable molecular resources for understanding the genetic characteristics of J. regia and offered novel perspectives for population genetics and evolutionary studies in Juglans and related woody species.

Introduction

Relict trees are not only repositories of information about Earth’s changes over hundreds of millions of years, but they may also improve our understanding of how continuous environmental changes have impacted the world’s biotic communities and the services provided by ecosystems [1, 2]. Previous studies showed that temperate trees, including walnuts and butternuts, serve as a superior framework for deciphering the ecological adaptation and domestication of woody plants under global climate change [1, 3,4,5,6,7,8,9]. The genus Juglans (Juglandaceae), namely the walnuts or butternuts, contains about 22 species widely distributed in southeastern Europe, eastern Asia, the Americas, and the West Indies [10,11,12]. Being temperate deciduous trees, all members of Juglans are characterized by their monoecious nature, anemophilous pollination, and diploid status, featuring a consistent karyotype of 2n = 32 chromosomes. Although taxonomic studies recognized four sections of Juglans (Dioscaryon, Cardiocaryon, Rhysocaryon, and Trachycaryon) according to their morphological characteristics, many species are known for their propensity to interbreed both in their natural habitats and under cultivated conditions [10,11,12,13,14,15].

The Persian walnut, known scientifically as J. regia, ranks among the world’s most valuable Juglans species, prized for nuts abundant in oil and timber of exceptional quality [9, 16,17,18,19]. J. regia is not only a delicious food ingredient but can also be used to extract walnut oil for cooking and medicinal purposes. Its timber is also commonly used in furniture making and woodworking. J. regia is mainly distributed in Europe, North and South America, South Africa, Asia, Australia, and New Zealand [18, 20]. It was proposed that J. regia originated from a hybridization of Rhysocaryon and Cardiocaryon [11]. Furthermore, the domestication history of J. regia could be traced back to 6800 years ago, with the cradles of its cultivation considered to be in Central Asia, namely the Irano-Anatolian region [21, 22]. The native distribution of J. regia is unclear, but wild populations grow in relatively isolated favorable habitats spanning a broad geographic range from China to the Iberian Peninsula [12, 23, 24]. Due to this wide distribution, the populations of J. regia became genetically differentiated through geographical isolation or environmental heterogeneity. In China, J. regia was divided into four ecotypes that occupied different niches, and the southwest population generally contained relatively high genetic diversity [9, 15]. A recent study found some genomic variations related to temperature, precipitation, and altitude [16].

To date, China is recognized as a prominent center of J. regia diversity, offering extensive germplasm reservoirs for its cultivation and breeding [18]. According to FAO statistics (https://www.fao.org/faostat/zh/#data/QCL/visualize), China led world production with 826,012 tons from 2000 to 2022, followed by America (453,310), Iran (289,214), and Turkey (189,027). The rapid development of genomics has transformed walnut cultivation from traditional phenotype observation to modern molecular approaches. The release of the high-quality reference genome [23, 25,26,27] and the 700K SNP array [28] of J. regia made it possible to investigate in depth its domestication history, ecological adaptation, and the underlying genetic basis of agronomic traits [9, 16, 18, 20, 29,30,31,32,33,34,35,36]. The publication of the genomes of closely related species such as J. sigillata [37], J. nigra [38, 39], J. mandshurica [40, 41], J. californica [42] and J. cinerea [43] has also provided raw materials for the comparative genomics study of Juglans.

Even though the nuclear genome holds the vast majority of genetic information, organelle genomes are still essential for eukaryotic organisms [44, 45]. Mitochondria and chloroplasts are cellular organelles that possess a semiautonomous genetic system within higher plant cells and harbor important genetic material [46, 47]. In recent decades, chloroplast DNA has been extensively utilized in phytogeography, evolutionary analysis, and DNA barcoding due to its small genome size, comparatively slow substitution rates, and cytoplasmic inheritance [48, 49]. The first complete chloroplast genome of J. regia was published in 2016 [50], and a growing number of chloroplast studies on Juglans have emerged recently [14, 51,52,53,54,55]. However, inconsistencies were discovered in the phylogenetic position of Juglans species on the trees reconstructed using nuclear and chloroplast DNA. One illustration is the ambiguous phylogenetic placement of J. cinerea, which was resolved as a member of the butternut clade in nuclear trees but as a member of the black walnut clade in plastid trees [11, 51, 54, 55].

Mitochondria are integral to plant development, ecological adaptation, and reproductive processes [56,57,58]. The analysis of mitochondrial genome sequences is essential for comprehending the evolution of various plant species [59]. Compared with the chloroplast genome, the plant mitochondrial genome is more variable in structure and gene order, although its mutation rates are relatively lower [60]. Since the ancient endosymbiotic event, plant mitogenomes have experienced swift and profound alterations in their structure, leading to dramatically different genome lengths spanning from 66 kb (Viscum scurruloideum) [61] to 11.7 Mb (Larix sibirica) [62]. Additionally, the extensive recombination of lengthy repetitive sequences comprising non-coding DNA, and the integration of exogenous sequences through intracellular or horizontal transfer, contribute to the diverse subgenomic molecules or isomeric forms of plant mitochondria, encompassing circular, linear, and reticulate configurations [45, 58, 59]. Furthermore, although the functional genes show significant conservation, the quantity of genes in plant mitogenomes displays considerable variation, typically ranging from 32 to 67, accounting for approximately 10% of the whole genome [45, 58]. These distinctions in mitogenomes are evident not only among different plant species but also within the same genus or species [63, 64].

The intricate physical structures inherent in plant mitogenomes pose a significant challenge to assembling comprehensive sequences. To date, over 13,000 plastid genomes have been sequenced and deposited into the NCBI database, in contrast to approximately 673 available plant mitogenomes [45, 63, 65]. Regarding Juglans, She [66] and Zhou et al. [55] used mitochondrial fragments to study its phylogeography and phylogeny. Su et al. [67] conducted an assembly and molecular characterization of the J. mandshurica mitogenome, which contained two complete circular genomic molecules (558,032 bp and 161,386 bp), coding for 61 total genes. Assembling and unraveling a species’ mitogenome are crucial for gaining a comprehensive understanding of species’ genetic attributes and essential cellular processes, as well as for breeding research. However, the complete mitogenome of J. regia has not yet been reported, which has limited the in-depth investigation.

Thanks to the rapid development of long-read sequencing technologies and advanced assembly strategies, long repeat regions and genomic rearrangements in plant mitogenomes can now be effectively resolved, facilitating the study of plant mitogenomes [45, 47, 59, 60, 67]. Therefore, in this study, we assembled and characterized the complete mitogenome of J. regia for the first time using Illumina and Nanopore technologies. The aims of our study are the following: (1) dissect the genome features, including genome structure, length, gene contents, repetitive sequences, and codon preference; (2) conduct comparative genome and phylogenetic analyses between J. regia and related species to reveal the mitogenome evolution; (3) assess the intracellular gene transfer among the chloroplast genome, mitogenome and nuclear genome of J. regia; and (4) identify the RNA editing sites and explore the transcriptional profile of mitochondrial gene expression in J. regia under cold stress. These results will enhance our understanding of the structure and function of the J. regia mitogenome, and they will provide valuable molecular resources for population genetics and evolutionary studies on Juglans and related woody species.

Materials and methods

Plant materials sampling, DNA extraction, and sequencing

In this study, young and healthy leaves were collected from a single specimen of J. regia cultivar ‘Qingxiang’ cultivated in the orchard of Northwest A&F University, located in Yangling, Shaanxi, China (108°05’ E, 34°17’ N). The leaves were promptly frozen in liquid nitrogen and stored at -80 ℃ until DNA isolation. The voucher specimen was stored in the university herbarium under the accession number NAFU20241025. Genomic DNA extraction was performed using the CTAB method [68], followed by DNA quality assessment with a NanoDrop One Microvolume UV-Vis Spectrophotometer (Thermo Fisher Scientific, Massachusetts, USA). The mitogenome of J. regia was sequenced using both Illumina and Nanopore methodologies. For Illumina sequencing, 150 bp paired-end reads were generated using the NovaSeq 6000 platform with an average insert size of 350 bp. For Nanopore sequencing, libraries were prepared with the SQK-LSK109 kit following standard procedures. The purified libraries were then sequenced on primed R9.4 Spot-on Flow Cells by the PromethION system (Oxford Nanopore).

Mitogenome assembly and annotation

After sequencing, FASTP [69] and NANOFILT [70] were used to filter the short and long reads, respectively, with default parameters. Then, MINIMAP2 [71] was used to align the clean long reads to the reference mitochondrial genes from the plant mitogenome database (ftp://ftp.ncbi.nlm.nih.gov/refseq/release/mitochondrion/). Sequences with an overlap of more than 1000 bp and more than 70% similarity were exported as candidate mitogenome sequences and corrected by CANU [72] for subsequent assembly. The corrected long reads were assembled using FLYE with three rounds of polishing [73] to obtain the mitogenome contigs. Subsequently, we employed BWA [74] to map the clean short reads onto the assembled contigs with the parameters ‘bwa mem -t 40 -M’, and polished the contigs with PILON [75] for five rounds. Finally, MUMMER [76] was used to test whether these contigs were circular.
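The read-selection step (overlap over 1000 bp, similarity over 70%) can be illustrated with a minimal Python sketch that filters alignments in PAF format, the tab-separated output format of MINIMAP2. It is not the authors' script: it assumes that "overlap" corresponds to the PAF alignment block length (column 11) and "similarity" to matching bases divided by block length (column 10 / column 11), and the function names and example records are hypothetical.

```python
# Sketch only: "overlap" is taken as the PAF alignment block length and
# "similarity" as matching bases / block length (both are assumptions).

def parse_paf_line(line):
    """Return (read_name, matching_bases, block_length) from one PAF record."""
    fields = line.rstrip("\n").split("\t")
    # PAF columns (1-based): 1 = query name, 10 = matching bases,
    # 11 = alignment block length.
    return fields[0], int(fields[9]), int(fields[10])


def candidate_reads(paf_lines, min_overlap=1000, min_identity=0.70):
    """Collect names of reads with at least one alignment passing both cut-offs."""
    keep = set()
    for line in paf_lines:
        if not line.strip():
            continue  # skip blank lines
        name, matches, block_len = parse_paf_line(line)
        if block_len > min_overlap and matches / block_len > min_identity:
            keep.add(name)
    return keep


# Hypothetical records: only read1 is both long enough and similar enough.
example = [
    "read1\t5000\t0\t3000\t+\tnad5\t3000\t0\t3000\t2700\t3000\t60",
    "read2\t4000\t0\t800\t+\tcox1\t1600\t0\t800\t790\t800\t60",
    "read3\t6000\t0\t2000\t-\tatp1\t2000\t0\t2000\t1200\t2000\t10",
]
print(sorted(candidate_reads(example)))
```

Reads passing the filter would then go on to correction and assembly as described above.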

In terms of the annotation for mitochondrial protein-coding genes (PCGs), Arabidopsis thaliana (NC_037304) and J. mandshurica (MZ900993 and MZ900994) were employed as references. The mitochondrial genome was annotated using the online tool GESEQ with default parameters [77]. The tRNA and rRNA were annotated using TRNASCAN-SE [78] and BLAST software with an E-value of 1E-5 [79], respectively. Any annotation errors in mtDNA were manually corrected using APOLLO software [80].

Repeat elements identification and codon usage bias inference

The SSRs were identified using the online tool MISA [81], with a minimum repeat number of 10, 5, 4, 3, 3, and 3 for mono-, di-, tri-, tetra-, penta-, and hexa-repeat units, respectively. Interspersed repeat sequences (forward, reverse, complement, and palindromic repeats) were detected using the online tool REPUTER [82] with minimum repeat size over 30 bp and Hamming distance of 3. Tandem repeat sequences were detected using the online tool TRF [83] with a minimum period size of over 30 bp.
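To make the SSR thresholds concrete, the following Python sketch scans a sequence for perfect repeats using the same minimum repeat numbers (10, 5, 4, 3, 3, and 3 for unit lengths 1 to 6). It is a simplified stand-in for illustration, not the MISA implementation: the function names are hypothetical, compound SSRs are not handled, and motifs that are themselves repeats of a shorter unit (for example "AA") are skipped so that each locus is reported once.

```python
import re

# Minimum repeat numbers per unit length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: an A(10) mononucleotide repeat and an AT(5) dinucleotide repeat.
toy = "GG" + "A" * 10 + "CC" + "AT" * 5 + "G"
print(find_ssrs(toy))
```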

The PCGs of the J. regia mitogenome were extracted using PHYLOSUITE [84]. The online tool CUSP and CODONW software [85] were used to calculate the GC content, relative synonymous codon usage (RSCU), and effective number of codons (ENC) of the PCGs. Using the R package ‘ggplot2’, a scatter plot was constructed with the ENC value on the vertical axis and the GC3 value on the horizontal axis. The standard curve of the expected ENC value was drawn using the formula ENC = 2 + GC3 + 29/(GC3² + (1 − GC3)²) [86]. If codon usage bias is predominantly driven by mutational pressure, the data points will lie above or slightly below this standard curve, whereas their placement below it indicates that codon preference is primarily influenced by natural selection [87]. The neutrality plot was also generated with the R package ‘ggplot2’, with the average value of GC1 and GC2, known as GC12, on the vertical axis and GC3 on the horizontal axis.
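The expected-ENC standard curve can be evaluated directly. The sketch below is a minimal Python illustration (the analysis itself was plotted in R); it assumes GC3 is supplied as a fraction between 0 and 1, and the helper names are hypothetical.

```python
def expected_enc(gc3):
    """Expected ENC on the standard curve; gc3 is a fraction in [0, 1]."""
    return 2 + gc3 + 29 / (gc3 ** 2 + (1 - gc3) ** 2)


def lies_below_curve(observed_enc, gc3):
    """True if a gene's observed ENC falls below the standard curve."""
    return observed_enc < expected_enc(gc3)


# The curve peaks at GC3 = 0.5 and is symmetric in its last term.
for gc3 in (0.0, 0.25, 0.5):
    print(gc3, round(expected_enc(gc3), 2))
```

A gene plotted at (GC3, ENC) is then compared with `expected_enc(GC3)` to decide on which side of the curve it lies.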

Homologous fragment analysis

The GETORGANELLE software [88] was used to assemble the chloroplast genome of J. regia cultivar ‘Qingxiang’ with the default parameters based on short reads, followed by genome annotation using CPGAVAS2 [89]. The nuclear genome of J. regia ‘Chandler 2.0’ was obtained from NCBI (accession number: GCF_001411555.2). The BLAST program was employed to identify homologous fragments between the mitogenome and the chloroplast genome, and between the mitogenome and nuclear genome of J. regia with an e-value of 1E-5. The BEDTOOLS [90] was used to annotate the putative homologous fragments.

Comparative genomic analysis of seven Fagales species

The complete mitogenomes of six Fagales species including J. mandshurica (MZ900993, MZ900994), Betula pendula var. carelica (OR496170), Quercus variabilis (MN199236), Q. acutissima (MZ636519), Lithocarpus litseifolius (NC_065018) and Fagus sylvatica (NC_050960) were downloaded from NCBI. Cucurbita pepo (NC_014050) was employed as the outgroup. Including J. regia, a total of 24 shared PCGs among these genomes were extracted using PHYLOSUITE, followed by sequence alignment using MAFFT [91]. Subsequently, an ML tree was constructed using IQTREE software [92], in which GTR + F + I was adopted as the best substitution model according to the BIC score. The collinearity analysis among these genomes was implemented using BLAST with an e-value of 1E-5 and visualized by LINKVIEW2 (https://github.com/YangJianshun/LINKVIEW2). The DNASP software [93] was employed to estimate the haplotype diversity and nucleotide diversity of these PCGs (considering the gap model), and POPART [94] was used to construct the haplotype network based on the median-joining method. To estimate the selective pressure, we employed the KAKS_CALCULATOR [95] to calculate the synonymous (Ks) and nonsynonymous (Ka) substitution rates and the Ka/Ks ratio based on the YN method [96], and Fisher’s exact test was used to assess significance.

RNA editing sites prediction and mitochondrial gene expression analysis

The RNA editing sites in the J. regia mitogenome were predicted using PREPACT [97], with the Arabidopsis thaliana and Cucurbita pepo mitochondrial proteins used as references. To validate the prediction results, we employed 204 transcriptome accessions (Table S1) of J. regia from NCBI for RNA editing site calling. The transcriptome data were trimmed using FASTP software. The clean data were mapped to the mitochondrial DNA sequences using HISAT2 [98], followed by sorting and merging using SAMTOOLS [99]. To identify the RNA editing sites, DNA and RNA sites were compared using REDITOOLS [100] with a coverage depth ≥ 100× and an editing frequency ≥ 0.10.
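The two filtering thresholds (coverage depth ≥ 100× and editing frequency ≥ 0.10) can be expressed as a short Python sketch. It is not REDITOOLS code: it assumes that depth is counted from reads carrying the reference base (C) or the edited base (T) at a site, and the function names are hypothetical.

```python
def editing_frequency(ref_count, alt_count):
    """Fraction of reads carrying the edited base at a site."""
    depth = ref_count + alt_count
    return alt_count / depth if depth else 0.0


def passes_filter(ref_count, alt_count, min_depth=100, min_freq=0.10):
    """Apply the depth and frequency cut-offs described in the text."""
    depth = ref_count + alt_count
    return depth >= min_depth and editing_frequency(ref_count, alt_count) >= min_freq


# Hypothetical read counts (reference C reads, edited T reads) for three sites.
sites = {"site_a": (20, 180), "site_b": (95, 5), "site_c": (10, 40)}
for name, (ref, alt) in sites.items():
    print(name, round(editing_frequency(ref, alt), 2), passes_filter(ref, alt))
```

In this toy input only `site_a` passes: `site_b` has sufficient depth but too low a frequency, and `site_c` has too few reads.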

To explore the expression patterns of mitochondrial genes in different tissues, we downloaded 22 transcriptome datasets of J. regia from 15 different tissues (vegetative bud, leaf, root, callus interior, callus exterior, catkins, somatic embryo, fruit, hull, packing tissue, hull peel, hull cortex, pellicle, embryo, and pistillate flower) from NCBI under the accession number PRJNA232394 [101]. The cold treatment transcriptome data of J. regia were downloaded from NCBI under the accession number PRJNA942426 (leaf tissues, 4 ℃ treatment for 0 h, 3 h, 6 h, 12 h and 24 h with three replicates) [26]. Additionally, in 2023, we collected fresh leaves from five J. regia cultivars (‘Daguanmao’, ‘Zijing’, ‘Lvling’, ‘Hongren’ and ‘Liao1’) under three temperature conditions during a cold late spring: 8 °C (4 April), 22 °C (9 April), and 5 °C (23 April) (PRJNA1051792) [102]. All trees were grown in Xi’an Botanical Garden (109°1′ E, 34°12′ N), Shaanxi Province, China. For each temperature condition and cultivar, we gathered three biological replicates. After data filtering, HISAT2 [98], FEATURECOUNTS [103], and the R package DESEQ2 [104] were employed for transcriptome mapping, quantification, and differential expression analysis, respectively.

Phylogenetic tree construction

The complete mitogenomes of 37 species from six orders (Rosales, Fagales, Cucurbitales, Fabales, Malpighiales, and Zygophyllales) related to J. regia were downloaded from NCBI (Table S2), with Zygophyllum fabago and Tetraena mongolica employed as outgroups. Together with J. regia, the PHYLOSUITE software [84] was used to extract 20 shared PCGs among these species. After alignment using MAFFT [91], IQTREE [92] was employed to construct an ML tree under the best-fit substitution model GTR + F + I + G4, with 1000 ultrafast bootstrap replicates. Finally, the consensus tree was visualized and edited using the online tool ITOL (https://itol.embl.de/).

Results

Assembly of J. regia mitogenome and its molecular characteristics

After sequencing, the J. regia mitogenome was assembled utilizing 8.8 Gb Illumina data and 5.8 Gb Nanopore data using a joint assembly strategy. The genome sketch was relatively complex, with multi-bifurcating structures (Fig. S1a). We manually obtained a simplified molecular structure of J. regia mitogenome by dropping duplicated regions from Nanopore data (Fig. S1b). It consisted of three circular contigs (Fig. 1) with 1,007,576 bp total length and 44.52% GC content. The lengths of circular molecule chromosome 1 (Fig. 1a), 2 (Fig. 1b), and 3 (Fig. 1c) were 538,986 bp, 294,127 bp, and 174,463 bp, respectively, and the corresponding GC contents were 45.37%, 45.46%, and 40.35%, respectively.
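As a quick consistency check on the reported figures, the three chromosome lengths should sum to the total length, and the length-weighted mean of the per-chromosome GC contents should approximate the overall GC content. The Python sketch below performs this arithmetic using only the values stated above; because the per-chromosome percentages are rounded to two decimals, the weighted mean can only be expected to agree with the reported 44.52% to within about 0.01 percentage points.

```python
# (length in bp, GC content in %) for each circular chromosome, from the text.
chromosomes = {
    "chr1": (538_986, 45.37),
    "chr2": (294_127, 45.46),
    "chr3": (174_463, 40.35),
}

total_length = sum(length for length, _ in chromosomes.values())
# Length-weighted mean of the per-chromosome GC percentages.
weighted_gc = sum(length * gc for length, gc in chromosomes.values()) / total_length

print(total_length)            # total mitogenome length in bp
print(round(weighted_gc, 3))   # approximate overall GC content in %
```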

The mitochondrial genome was then annotated, revealing a total of 39 PCGs, 47 tRNA genes, and five rRNA genes. The total length of all putative PCGs was 35,094 bp, accounting for only 3.48% of the mitogenome. Furthermore, the 37 putatively unique genes (nad5 and nad7 each have two copies) could be categorized into 24 core genes and 13 variable genes (Table 1). The 24 core genes consisted of nine NADH dehydrogenase genes, one cytochrome b oxidase gene, three cytochrome c oxidase genes, four cytochrome c biogenesis genes, five ATP synthase genes, a maturase gene (matR) and a membrane transport protein gene (mttB). The 13 variable genes included five ribosomal large subunit genes, six ribosomal small subunit genes, and two succinate dehydrogenase genes. A total of 21 introns were found across nine PCGs: ccmFc, rps3, cox2, and rps10 each contained one intron; nad4, nad1, and nad5 each had three introns; and nad2 and nad7 each had four introns.

Table 1 Gene composition in the J. regia mitogenome

In terms of the non-coding RNA genes, we identified three unique rRNAs and 27 unique tRNAs (Table 1). For the rRNAs, rrn18 and rrn5 each had two copies. For the tRNAs, trnD-GUC, trnF-GAA, trnG-GCC, trnN-GUU, trnQ-UUG, trnS-UGA, trnW-CCA, and trnY-GUA each had two copies, trnC-GCA and trnP-UGG each had three copies, trnE-UUC had four copies, and trnM-CAU had six copies.

Repeat sequence analysis

Many repetitive sequences were detected in the J. regia mitogenome (Fig. 2). A total of 320 SSRs were identified across the three chromosomes (Fig. 2a, b and c). Chromosome 1 contained the largest number of SSRs (160), followed by chromosome 2 (86) and chromosome 3 (74). Overall, tetramer repeats were the most abundant SSR type, making up 33.13% of all detected SSRs, followed by monomer and dimer SSRs, which represented 25% and 23.13% of the total SSR count, respectively. The counts for trimer (49) and pentamer repeats (10) were relatively lower. Among the monomer SSRs, those consisting of A/T bases constituted 92.5%, while among the dimer SSRs, those with AT/TA bases made up 51.35%. Unlike in chromosome 1 (Fig. 2a) and chromosome 2 (Fig. 2b), monomer SSRs rather than tetramer SSRs were the most abundant type in chromosome 3 (Fig. 2c). The only hexanucleotide repeat was also found on chromosome 3.

In addition to SSRs, we also identified the long repeat sequences in J. regia mitogenome (Fig. 2d, e and f). A total of 512 interspersed repeats were identified, with the palindromic repeats being the most abundant (244), followed by the forward repeats (234). Chromosome 1 possessed the largest number (364) of interspersed repeats (Fig. 2d), while chromosome 3 contained the minimum number (66) of interspersed repeats (Fig. 2f). Further, a total of 83 tandem repeats were found in J. regia mitogenome. The majority of tandem repeat lengths were concentrated between 1 and 20 bp, followed by 21 bp to 40 bp. Notably, no tandem repeats extended beyond 40 bp.

Analysis of relative synonymous codon usage

Codon usage bias was analyzed across the 37 unique PCGs. The guanine-cytosine content was determined for each codon position of the PCGs (GC1, GC2, and GC3), yielding average values of 47.60%, 42.39%, and 37.59%, respectively. These findings point to a marked preference for A/T bases and codons ending in A/T within the PCGs of J. regia. Additionally, the ENC was computed for these 37 PCGs, with values spanning from 36.37 to 57.87. The mean ENC value exceeded 35, suggesting a relatively low level of codon usage bias in the mitochondrial PCGs of J. regia.
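The position-specific GC values (GC1, GC2, and GC3) can be computed by reading a coding sequence in steps of three. The Python sketch below is a simplified illustration rather than the CUSP or CODONW implementation: it counts every codon in frame, including start and stop codons, whereas dedicated tools may exclude some codons; the function name and toy sequence are hypothetical.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as fractions for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing incomplete codon
    fractions = []
    for offset in range(3):
        bases = cds[offset:usable:3]  # every third base from this offset
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


# Toy CDS with codons ATG, GCC, TAA.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCTAA")
gc12 = (gc1 + gc2) / 2  # the GC12 value used on the neutrality plot
print(round(gc1, 3), round(gc2, 3), round(gc3, 3), round(gc12, 3))
```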

The ENC-plot (Fig. 3a) showed that the majority of PCGs in J. regia were positioned below the standard curve, with a single exception: the gene ccmFc, which was situated above it. This distribution pattern implied that selective pressures exert a predominant influence on the codon preferences within the mitochondrial genome of J. regia. In addition, the neutrality plot analysis (Fig. 3b) demonstrated a negligible correlation of 2 × 10⁻⁴ (p > 0.05) between GC12 and GC3, which was consistent with the relatively low regression coefficient (5.148 × 10⁻⁴) and R² (5.919 × 10⁻⁷) between GC12 and GC3 (Fig. 3b). These results collectively indicated that the bias in codon usage in J. regia mitochondrial DNA was predominantly driven by natural selection rather than by mutational processes.

Furthermore, we performed the RSCU analysis to evaluate the codon usage bias (Fig. 3c). The analysis identified 30 codons with RSCU values exceeding 1, signifying that these codons were used more frequently than their synonymous counterparts. All of these codons ended with A/U, except UUG and ACC, which coded for leucine and threonine, respectively. The start codon AUG and the tryptophan codon UGG have no synonymous alternatives, and their RSCU values were both 1.00. The termination codon (Ter) preferred UAA (RSCU = 1.38) or UGA (RSCU = 1.08) rather than UAG (RSCU = 0.54).
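RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so values above 1 mark preferred codons and a codon with no synonyms always scores 1.00. The Python sketch below illustrates the calculation for a single synonymous family; the function name is hypothetical and the counts are toy values, not data from this study.

```python
def rscu(codon_counts):
    """RSCU for one synonymous codon family given {codon: observed count}."""
    total = sum(codon_counts.values())
    family_size = len(codon_counts)
    if total == 0:
        return {codon: 0.0 for codon in codon_counts}
    # Observed count divided by the count expected under equal usage.
    return {codon: count * family_size / total
            for codon, count in codon_counts.items()}


# Toy counts for the three termination codons (illustrative only).
print(rscu({"UAA": 2, "UAG": 1, "UGA": 1}))
# A single-codon family such as tryptophan (UGG) always gives 1.0.
print(rscu({"UGG": 7}))
```

Within any family the RSCU values sum to the number of synonymous codons, which is why the three termination-codon values reported above add up to 3.00.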

DNA transfer among mitochondria, chloroplast and nucleus

To explore the intergenomic DNA transfers, we newly assembled and annotated the chloroplast genome of J. regia cultivar ‘Qingxiang’. The chloroplast genome was 160,537 bp in length with 84 unique PCGs (rpl2, rpl23, rps7, rps12, ndhB and ycf2 had two copies) (Fig. 4a). We identified 44 homologous fragments between the mitochondrial and chloroplast genomes, ranging in length from 33 to 7593 bp, with a total length of 36,051 bp, which accounted for 3.58% of the mitogenome (Fig. 4b). Mitochondrial chromosome 1 contained the largest number of fragments homologous to the chloroplast genome (24), followed by chromosome 2 (12) and chromosome 3 (8). These homologous fragments involved 39 transferred genes, including four PCGs, three rRNAs and 32 tRNAs. In addition, we identified 2960 homologous fragments between the mitogenome and the nuclear genome, of which 113 fragments had an identity value over 80% (Table S3). These high-identity homologs were mainly located between mitogenome chromosome 3 and nuclear genome chromosomes 7 and 10, and involved 12 PCGs, two rRNAs, and seven tRNAs.

Comparative mitogenome analysis among Fagales species

A total of seven Fagales species with complete mitochondrial genomes were employed in the comparative analysis, with the Cucurbitales species Cucurbita pepo as the outgroup. After scanning, a total of 24 shared PCGs (Fig. S2) were extracted to construct an ML tree (Fig. 5a). As expected, J. regia and J. mandshurica showed a close relationship with a high bootstrap value and clustered together with the Betulaceae species. The other four Fagaceae species were clustered in the same clade. Although significant inversions existed, species with closer relationships generally exhibited higher collinearity, for example, J. regia and J. mandshurica, and Quercus variabilis and Q. acutissima (Fig. 5b). Between J. regia and its relative J. mandshurica, we found 66 homologous fragments longer than 3000 bp, consisting of 28 collinear blocks and 38 inversion regions, accounting for 59.75% and 83.76% of the entire J. regia and J. mandshurica mitogenomes, respectively.

In addition to the genomic structure variation, we also calculated the genetic diversity of these shared PCGs (Table S4; Fig. S3). The haplotype number of these genes ranged from four to eight. The haplotype diversity (Hd) ranged from 0.75 to 1, with an average value of 0.942. The gene sdh3 had the highest nucleotide diversity (π) while the gene nad7 had the lowest. The haplotype network (Fig. S3) showed that these genes could distinguish these species well. Consistent with the ML tree (Fig. 5a), J. regia and J. mandshurica showed a close relationship in the haplotype network and even shared a common haplotype in several genes, such as atp4, atp6, ccmB, nad3, rpl5, rps12, and sdh4 (Fig. S3).

To assess the selection pressures exerted during the evolution of PCGs among these closely related species, the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions, known as Ka/Ks, was calculated (Fig. 6). Ka ranged from 0.0009 to 0.1425 while Ks ranged from 0.0020 to 0.1944 (Fig. 6a). Overall, although most gene pairs showed purifying selection, we found 63 gene pairs with a Ka/Ks value greater than 1, indicating that these genes were under positive selection among these species (Fig. 6a). We found that ccmB, atp4, nad7, and nad6 often had a Ka/Ks value greater than 1, indicating positive selection (Fig. 6b). For J. regia, we found 24 gene pairs under positive selection and 144 gene pairs under purifying selection. Compared with J. mandshurica, two important genes were under significantly positive selection, namely atp6 (Ka/Ks = 1.3071) and nad7 (Ka/Ks = 1.2578).
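The interpretation of Ka/Ks used here follows the usual convention: a ratio above 1 suggests positive selection, a ratio below 1 suggests purifying selection, and a ratio of 1 suggests neutral evolution. The Python sketch below encodes only that rule of thumb; it is not KAKS_CALCULATOR code, it ignores the significance testing applied in the study, and the function name and input values are hypothetical.

```python
def classify_selection(ka, ks):
    """Label a gene pair by its Ka/Ks ratio (no significance test applied)."""
    if ks == 0:
        return "undefined"  # the ratio cannot be computed without synonymous changes
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


# Hypothetical Ka and Ks values for three gene pairs.
for ka, ks in [(0.02, 0.01), (0.01, 0.05), (0.01, 0.0)]:
    print(ka, ks, classify_selection(ka, ks))
```

Under this rule, the reported ratios for atp6 (1.3071) and nad7 (1.2578) fall in the positive-selection category.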

RNA editing sites prediction and validation

A total of 539 RNA editing sites (C → T) were predicted by PREPACT, and they were spread throughout all PCGs of J. regia except rpl23 (Fig. 7a). The nad4 gene had the largest number of RNA editing sites (34), followed by ccmB (32) and mttB (32). The transcriptome data identified 629 RNA editing sites, 64.86% of which had high editing frequencies above 0.80 (Fig. 7b). There were 288 RNA editing sites supported by both PREPACT prediction and transcriptome validation (Fig. 7c), which could be regarded as high-confidence RNA editing sites. Among these sites, we found two sites in nad4 (Fig. 7d) and three sites in ccmB (Fig. 7e) with overwhelmingly high editing frequencies. The two nad4 sites were located at 1355 bp (editing frequency: 0.95) and 1373 bp (editing frequency: 0.98). These RNA editing events triggered non-synonymous codon changes from CCA (proline) to CTA (leucine), and from TCC (serine) to TTC (phenylalanine), respectively (Fig. 7c). In ccmB, three sites located at 566 bp (editing frequency: 0.85), 569 bp (editing frequency: 0.88) and 572 bp (editing frequency: 0.92) changed from C to T, leading to three consecutive non-synonymous codon changes: TCC (serine) to TTC (phenylalanine), TCT (serine) to TTT (phenylalanine), and CCG (proline) to CTG (leucine).
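The effect of a C → T edit on the encoded amino acid depends on which codon position is edited. The Python sketch below maps an editing position to its codon and applies the edit to the codons named above. It assumes that the reported positions are 1-based coordinates within the coding sequence, which is an interpretation rather than something stated explicitly; the function names are hypothetical, and only the eight codons mentioned in the text are included in the lookup table (standard genetic code).

```python
# Subset of the standard genetic code covering the codons named in the text.
CODON_TABLE_SUBSET = {
    "CCA": "Pro", "CTA": "Leu", "TCC": "Ser", "TTC": "Phe",
    "TCT": "Ser", "TTT": "Phe", "CCG": "Pro", "CTG": "Leu",
}


def codon_number_and_offset(cds_position):
    """Map a 1-based CDS position to (codon number, position within codon)."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3 + 1


def edit_codon(codon, offset):
    """Apply a C-to-T edit at a 1-based codon position; return codon and residues."""
    if codon[offset - 1] != "C":
        raise ValueError("no C at the requested codon position")
    edited = codon[:offset - 1] + "T" + codon[offset:]
    return edited, CODON_TABLE_SUBSET[codon], CODON_TABLE_SUBSET[edited]


# Under the 1-based assumption, each listed site falls on a second codon position.
for position in (1355, 1373, 566, 569, 572):
    print(position, codon_number_and_offset(position))

print(edit_codon("CCA", 2))  # proline to leucine
print(edit_codon("TCC", 2))  # serine to phenylalanine
```

Edits at the second codon position always change the amino acid in these codons, which is consistent with the non-synonymous changes described above.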

Transcriptome expressions profile under temperature stress

Additionally, leveraging the transcriptome data, we investigated the gene expression profiles of mitochondrial PCGs in J. regia across various tissues and under diverse temperature conditions (Fig. 8). Tissue-specific patterns were identified through analysis of the gene expression profiles (Fig. 8a). Notably, many genes were predominantly expressed in the pellicle, such as nad9, mttB and atp6. The rps7 gene was mainly expressed in the callus exterior, while atp9, atp10, rps10, rps12, ccmB and rpl2 were mainly expressed in the callus interior. We also observed that rpl23 tended to be expressed in the leaf, while atp1 and sdh4 had high expression levels in the vegetative bud. Some genes seemed not to be expressed in specific tissues, for example, nad9 in catkins, nad4L in catkins and hull cortex, and rps7 in embryo, hull, pellicle, and somatic embryo.

To investigate the dynamics of these genes under temperature stress, multiple comparisons of gene expression patterns among different treatments were conducted (Fig. 8b, c, d and e; Table S5). Overall, we found that rps7, ccmB, and ccmC had relatively low expression levels in the leaf (Fig. 8b and c), which was consistent with the tissue-specific expression analysis (Fig. 8a). A total of 28 genes (Table S5) were differentially expressed 86 times, comprising 37 upregulation and 49 downregulation events, among 25 comparison pairs (Fig. 8d and e; Table S5). We observed the largest number of differentially expressed genes (DEGs) in the comparison HR5_vs_HR22. In this comparison, three genes were significantly upregulated, including rpl23 (log2FC = 4.82; FDR = 1.82 × 10⁻⁵), ccmB (log2FC = 4.30; FDR = 0.03) and cox3 (log2FC = 1.39; FDR = 0.04), while six genes were significantly downregulated, including matR (log2FC = −2.29; FDR = 4 × 10⁻³), rps12 (log2FC = −2.07; FDR = 0.01), rps10 (log2FC = −1.55; FDR = 0.03), nad3 (log2FC = −1.41; FDR = 0.03), nad5 (log2FC = −2.35; FDR = 1.87 × 10⁻⁶) and nad5-copy2 (log2FC = −2.36; FDR = 5.45 × 10⁻⁶). However, no DEGs were detected in the comparisons between 12 h and 3 h, or between 24 h and 0 h, of the 4 ℃ treatment.

It was noteworthy that some genes showed relatively complex regulation patterns (Fig. 8e; Table S5), being upregulated in some comparisons and downregulated in others. For example, rps12 was upregulated in six comparisons and downregulated in three, while nad3 was upregulated in three comparisons and downregulated in five (Table S5). This indicated a relatively complex regulatory network underlying the response of mitochondrial PCGs to cold conditions. To reveal the regulatory relationships among these DEGs, we constructed a protein-protein interaction (PPI) network (Fig. S4). We noticed that rps12, rpl6, atp9, atp6 and nad7 clustered at the center of the network and had strong interactions with other genes, possibly playing key roles in the response to low-temperature stress in J. regia mitochondria.

Construction of a phylogenetic tree based on the PCGs

A maximum likelihood tree was constructed using 20 shared PCGs from 38 species belonging to six Orders (Rosales, Fagales, Cucurbitales, Fabales, Malpighiales, Zygophyllales) within the Fabids clade (Fig. 9). The phylogenetic tree, constructed using data from the mitogenome, closely reflected the taxonomic affinities among these species. All six Orders were well clustered. The Zygophyllales species Zygophyllum fabago and Tetraena mongolica, used as outgroups, clustered together at the base of the phylogenetic tree with a bootstrap value of 100. The two Juglans species (J. regia and J. mandshurica) clustered into the same branch, showing a closer relationship with Betula. In summary, the tree topology derived from the mitochondrial DNA phylogeny aligned well with the classification of the Angiosperm Phylogeny Group, suggesting that the mitogenome-based phylogeny was trustworthy.

Discussion

Molecular characterization of the J. regia mitogenome

Plant mitochondria are vital for energy metabolism and adapt to environmental stresses through genomic variation, contributing to plant evolution and diversity [66, 105]. Their complex genomes, characterized by structural changes, gene transfers, recombination, and interactions with other genomes, pose challenges for assembly and analysis [106,107,108]. However, recent advances in sequencing and assembly technologies have facilitated the study of plant mitogenomes, leading to an increase in high-quality assemblies [45].

Recent studies have uncovered diverse mitogenome structures across plant species. For instance, Salvia officinalis had two circular chromosomes [109], Paphiopedilum micranthum had 26 [106], Ombrophytum subterraneum had 54 [110], and Angelica biserrata possessed six [59], while Selenicereus monacanthus [58], Ilex metabaptista [45] and Asparagus officinalis [111] each contained a single major circular chromosome. Some species also exhibited linear mitogenomes [112]. Our study introduced the mitogenome of J. regia, featuring three circular chromosomes (Fig. 1 and S1), in contrast to its close relative J. mandshurica, which has only two [67]. Differentiation in mitogenomes has been observed not only between closely related species but also within the same species [45, 63, 64], owing to frequent recombination [113]. Compared with J. mandshurica, J. regia has been more extensively domesticated [8, 9, 11]. The difference in demographic history between these related species might be the primary reason for the differences in their mitochondrial genomes [114].

Despite genomic structure differences, J. regia and J. mandshurica had similar GC contents of 44.52% and 45.15%, respectively [67], which fall within the 23.9–50.5% range for terrestrial plants [58]. GC content significantly affects the amino acid composition of proteins throughout the evolution of land plants [59], and its diversity among plant mitogenomes reflects adaptive consequences [58].
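
For readers who wish to reproduce this kind of summary statistic, GC content can be computed directly from a sequence string. The sketch below is a minimal illustration on a toy sequence, not the pipeline used in this study; the function names are hypothetical, and the 23.9–50.5% bounds are those quoted above.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous bases such as N are ignored
    if total == 0:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * gc / total


def in_reported_range(gc_percent: float, low: float = 23.9, high: float = 50.5) -> bool:
    """Check a GC percentage against the range quoted for terrestrial plants."""
    return low <= gc_percent <= high


toy = "ATGCGCATTA"  # toy sequence: 4 of 10 bases are G or C
print(gc_content(toy))
print(in_reported_range(44.52), in_reported_range(45.15))
```

Applied to a full mitogenome, the same function would be called on the concatenated chromosome sequences read from a FASTA file.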

Our assembly annotated 39 PCGs, similar to J. mandshurica [67]. Typically, PCGs constitute about 10% of a mitogenome, but in our assembly they accounted for only 3.48%, comparable to J. mandshurica but lower than Liriodendron tulipifera (7.9%) [60] and Populus simonii (8.25%) [115]. L. tulipifera has a “fossilized” mitogenome that evolved very slowly, retaining 41 PCGs from the ancestral angiosperms [60]. Relative to each other, J. regia lost the rps19, rps1 and cob genes, while J. mandshurica lost cytB, nad1 and rps11, possibly due to mitogenome recombination and rearrangement. Additionally, nad5 and nad7 had two copies in the J. regia mitogenome, suggesting a potential for gene redundancy and functional divergence (Table 1; Fig. S2) [116].

Repeated sequences in the mitochondrial DNA of seed plants facilitate recombination and alter genome structure and size [112]. Studies on Monsonia ciliata [117] and Ipomoea batatas [118] highlighted the role of repeats in mitogenome dynamics and recombination. In our study, 320 SSRs, 83 tandem repeats, and 512 dispersed repeats were identified (Fig. 2), contributing to the complexity of the J. regia mitogenome. Beyond merely sculpting the genomic framework, these repetitive sequences can affect plant phenotypic traits by modulating the expression of regulatory genes. Furthermore, they offer numerous reference loci for enhancing species identification and unraveling the intricacies of genetic evolution [58].

Codons play a crucial role in the translation of genetic information, and their usage varies among species owing to evolutionary pressures [58, 59]. In J. regia, most PCGs initiated with the ATG start codon, and their amino acid composition paralleled that observed in other flowering plants [45, 58, 59]. Analysis of codon composition in the J. regia mitogenome revealed a pronounced preference for certain codons, with 30 exhibiting an RSCU value exceeding 1 (Fig. 3), and most of them ended with A/T bases, consistent with plant mitochondrial base composition [58, 59, 107, 112]. Both the neutrality plot and ENC-plot analyses (Fig. 3) indicated that the codon usage patterns in the J. regia mitogenome were predominantly driven by natural selection, aligning with findings in Juglandaceae [119].
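
To make the RSCU criterion concrete: for a codon in a synonymous family, RSCU is its observed count divided by the count expected if all codons of that family were used equally, so values above 1 indicate preferred codons. The following minimal sketch computes RSCU for two synonymous families on a toy coding sequence. The sequence and the restriction to two families are assumptions for illustration only, and the function names are hypothetical.

```python
from collections import Counter

# Two synonymous families only, to keep the sketch short.
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
}


def count_codons(cds: str) -> Counter:
    """Count in-frame codons, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(counts) -> dict:
    """RSCU = observed count / (family total / number of synonymous codons)."""
    values = {}
    for codons in SYNONYMOUS.values():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined for this family
        expected = total / len(codons)
        for c in codons:
            values[c] = counts.get(c, 0) / expected
    return values


# Toy coding sequence (not from the J. regia mitogenome).
toy_cds = "TTT" * 3 + "TTC" + "GCT" * 2 + "GCA" + "GCC"
values = rscu(count_codons(toy_cds))
preferred = sorted(c for c, v in values.items() if v > 1)
print(values)
print("preferred codons (RSCU > 1):", preferred)
```

In this toy case the two preferred codons both end in T, mirroring the A/T-ending preference described above.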

The comparative genomics analysis and gene transfer revealed the mitogenome evolution

Comparative analysis revealed that J. regia and its relatives experienced mitochondrial genome structural variations, especially inversions, leading to gene rearrangement and loss (Fig. 5 and S2), which were key in plant mitogenome evolution and diversification [45]. The number of mitochondrial genes in extant plants varies significantly, from as few as 14 in Viscum minimum to more than 50 in Marchantia polymorpha [65]. We observed that Fagales showed significant variation in the number of ribosomal protein genes (Fig. S2), similar to Alismatales. Additionally, prior research had indicated considerable variation in tRNA gene content among closely related species, such as in Silene and Selaginella [65]. In our study, the number of tRNA genes also differed markedly among the studied species: 47, 29, 21, 18, 21, and 19 in J. regia, J. mandshurica, Quercus variabilis, Q. acutissima, Lithocarpus litseifolius, and Fagus sylvatica, respectively.

In plants, gene transfer serves as an alternative mechanism for the acquisition and loss of mitochondrial genes, predominantly involving those that encode ribosomal proteins [58, 120, 121]. Horizontal gene transfer (HGT) and intracellular gene transfer (IGT) are the two major transfer phenomena in mitogenomes. IGTs in plant cells are dynamic and have practical application potential. They can replicate genes in new genomes or create new gene constructs through chimeric ORFs, diversifying plant genomes [65]. We identified 44 IGT fragments between the mitogenome and the chloroplast genome and 2960 between the mitogenome and the nuclear genome (Fig. 4; Table S3). Interestingly, these IGTs included not only RNA genes and ribosomal genes but also some NADH dehydrogenase and ATP synthase genes, indicating frequent IGT in J. regia. Future research is needed to understand the role of IGTs in shaping genomic diversity in Juglans.

Although the number of PCGs was variable in Fagales, we found 24 shared PCGs in Fagales and Cucurbita pepo (Figs. S2 and S3; Table S4). Heterogeneous substitution rates of mitogenomes have been reported in certain lineages; for example, the substitution rates of Silene noctiflora and S. latifolia differ by 180-fold [122]. In this study, we used Ka/Ks analysis to assess the substitution rates of shared mitochondrial PCGs (Fig. 6). The results showed that most PCGs underwent purifying selection during evolution, indicating that the PCGs in the mitogenome were relatively well conserved, which was consistent with previous studies [45, 59]. However, we noticed that ccmB was under positive selection in many species pairs (Fig. 6b), consistent with other studies [45, 123]. The ccmB gene is vital for cytochrome c biogenesis in plant mitochondria, impacting heme transport, assembly, and respiration efficiency, potentially driving positive selection in plant adaptation to diverse environments [124].
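
The interpretation of Ka/Ks used here (ratios below 1 taken as purifying selection, above 1 as positive selection) can be written as a small decision rule. The sketch below is illustrative only: the (Ka, Ks) values are hypothetical and are not taken from Fig. 6, and the function name is an assumption.

```python
def selection_regime(ka: float, ks: float) -> str:
    """Label a gene pair by its Ka/Ks ratio."""
    if ka < 0 or ks < 0:
        raise ValueError("substitution rates must be non-negative")
    if ks == 0:
        # The ratio cannot be computed without synonymous substitutions.
        return "undefined"
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


# HYPOTHETICAL (Ka, Ks) pairs, for illustration only.
toy_pairs = {
    "geneA": (0.004, 0.020),
    "geneB": (0.030, 0.012),
    "geneC": (0.010, 0.000),
}
for gene, (ka, ks) in toy_pairs.items():
    print(gene, selection_regime(ka, ks))
```

Pairs with Ks equal to zero are reported separately rather than forced into either class, because their ratio is undefined.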

The pronounced variability of these shared PCGs (Fig. S3; Table S4) made them valuable phylogenetic markers, revealing evolutionary relationships across both Fagales (Fig. 5a) and Fabids (Fig. 9). In the current study, the phylogenetic relationships among 38 Fabids species obtained from mitogenomes were consistent with their topology in the APG IV taxonomic system (https://www.mobot.org/MOBOT/research/APweb/), suggesting agreement between traditional and molecular taxonomic approaches and highlighting the feasibility of leveraging mitogenome-derived data in plant phylogeny. Additionally, these results lay a foundation for resolving more complex evolutionary relationships within Juglans, or even Juglandaceae, as more mitogenomes are published.

The RNA editing and expression profile of mitochondrial PCGs

RNA editing, a key post-transcriptional process in higher plant mitochondria, modifies mRNA by base substitution, primarily C to U (observed as C to T in sequencing data), as shown in previous studies [45, 58, 59]. Our study identified reliable RNA editing sites in the J. regia mitogenome using bioinformatics prediction and transcriptome validation, finding that all PCGs except rpl23 contained RNA editing sites (Fig. 7). The number of RNA editing sites in J. regia was similar to that in other angiosperms, including Ilex metabaptista (543) [45], Selenicereus monacanthus (398) [58], and Angelica biserrata (474) [59], but lower than in gymnosperms such as Taxus cuspidata (974) [125]. In addition to altering protein conformation, RNA editing in the J. regia mitogenome mostly produced nonsynonymous substitutions from hydrophilic to hydrophobic amino acids (80.43%), resulting in increased protein hydrophobicity, which was consistent with a previous study [59]. Earlier research highlighted the importance of hydrophilic amino acids in protein folding, promoting proper structure formation, while their reduced presence correlated with increased protein stability [59].
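
The effect of a single editing event on the encoded amino acid can be illustrated with a minimal sketch. The two codons below (TCA and TCT, both serine) are generic examples under the standard genetic code and are not specific sites from Fig. 7; the hydropathy classes assigned to Ser, Leu and Phe are a simplifying assumption, and the function names are hypothetical. The last lines compute the share of predicted sites supported by transcriptome data from the counts reported in this study (288 of 539).

```python
# Only the codons needed for this sketch; a full table has 64 entries.
CODON_TO_AA = {"TCA": "Ser", "TTA": "Leu", "TCT": "Ser", "TTT": "Phe"}
# ASSUMED hydropathy classes for the residues used here.
HYDROPHOBIC = {"Leu", "Phe"}
HYDROPHILIC = {"Ser"}


def edit_c_to_u(codon: str, position: int) -> str:
    """Apply one C-to-U edit (written as C-to-T in DNA letters) at a 0-based position."""
    codon = codon.upper()
    if codon[position] != "C":
        raise ValueError("no C at the requested position")
    return codon[:position] + "T" + codon[position + 1:]


def describe_edit(codon: str, position: int):
    """Return (edited codon, residue before, residue after, hydrophilic-to-hydrophobic?)."""
    edited = edit_c_to_u(codon, position)
    before, after = CODON_TO_AA[codon.upper()], CODON_TO_AA[edited]
    to_hydrophobic = before in HYDROPHILIC and after in HYDROPHOBIC
    return edited, before, after, to_hydrophobic


print(describe_edit("TCA", 1))
print(describe_edit("TCT", 1))

predicted, validated = 539, 288  # counts reported in this study
print(f"{100 * validated / predicted:.1f}% of predicted sites validated")
```

Editing at the second codon position, as in both examples, always changes the encoded residue, which is why such sites contribute to nonsynonymous substitutions.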

Our study elucidated the tissue-specific expression profiles of PCGs (Fig. 8), revealing the intricate interplay and functional divergence among mitochondrial genes. The transcriptome profile following temperature treatment revealed the intricate dynamics of gene responses to stress (Fig. 8). Across the various treatment comparisons, a total of 28 genes were identified as DEGs, with 12 showing both upregulation and downregulation (Fig. 8e; Table S5).

There is a strong correlation between plant mitochondrial function and key characteristics such as the capacity to withstand stress and the robustness of plant growth. The response of mitochondrial genes to temperature stress is a complex physiological and molecular process involving multifaceted regulation and interplay. Similar to our results, in wheat embryos subjected to cold stress, the expression levels of 13 mitochondrial genes underwent specific alterations, with the majority being downregulated initially, whereas the expression of specific genes, such as nad, atp and cob, was significantly upregulated after 2 to 3 days of cold stress exposure [126]. In Arabidopsis thaliana, the nad1 and nad2 genes exhibited opposite regulatory patterns under cold and heat stress [127]. In the PPI network of DEGs (Fig. S4), together with the ribosomal protein genes rps12 and rpl16, some NADH dehydrogenase genes (nad5, nad7 and nad4L) and ATP synthase genes (atp6, atp9, and atp4) were located in the core of the network and showed relatively strong interactions with other genes, implying an important role in the response to temperature stress. The functions of these genes will require in-depth investigation in the future, although atp6 has been reported to be associated with the response to cold stress in A. thaliana [128].

Conclusion

This was the first published assembly of the mitogenome of the economically important tree species J. regia, which was 1,007,576 bp in length with a GC content of 44.52%. It encoded 39 PCGs, 47 tRNA genes, and five rRNA genes. The repeat sequences, codon usage bias, and intracellular gene transfer between the organellar and nuclear genomes were then analyzed. Comparative genomic analysis highlighted abundant structural and sequence variation across Fagales species. Most PCGs showed signs of purifying selection, as indicated by Ka/Ks ratios below 1. The phylogeny based on mitochondrial PCGs could distinguish the studied species well. RNA editing sites were widespread in the PCGs of J. regia mitochondria. Tissue-specific expression patterns were identified in mitochondrial PCGs, indicating the important role of mitochondrial PCGs in J. regia development. Under temperature stress, PCG expression exhibited complex changes, suggesting that NADH dehydrogenase and ATP synthase genes might be key in the mitochondrial response to such stress. Our findings contribute to the field of population genetics and evolutionary studies, particularly for Juglans and related tree species.

Fig. 1

Circular map of the J. regia mitogenome, consisting of chromosome 1 (a), chromosome 2 (b) and chromosome 3 (c). Genes belonging to different functional groups are color-coded. Genes shown on the outside and inside of the circle are transcribed clockwise and counterclockwise, respectively. The dark orange region in the inner circle represents the GC content

Fig. 2

Analysis of repeat fragments from the J. regia mitogenome. (a), (b), and (c): The number of simple sequence repeats (SSRs) in three chromosomes of J. regia mitogenome. The “Mono”, “Di”, “Tri”, “Tetra”, “Penta”, and “Hexa” correspond to monomeric, dimeric, trimeric, tetrameric, pentameric and hexameric SSRs, respectively. (d), (e) and (f): The number of long repeats in three chromosomes of J. regia mitogenome. The count of long repeats is indicated by TR (tandem repeats), FR (forward repeats), RR (reverse repeats), PR (palindromic repeats), and CR (complementary repeats), respectively

Fig. 3

The landscape of codon usage bias in the J. regia mitogenome. (a) The ENC plot of protein coding genes in the J. regia mitogenome. (b) The analysis of neutrality plot of protein coding genes in the J. regia mitogenome. (c) Relative synonymous codon usage (RSCU) in the J. regia mitogenome

Fig. 4

The newly assembled complete chloroplast genome (a) and DNA transfer analysis (b) of J. regia. The orange, pink and green lines between the arcs correspond to the homologous genomic fragments between mitochondrial chromosome 1, 2, and 3 and the chloroplast genome, respectively

Fig. 5

Phylogenetic analysis (a) and collinearity analysis (b) between J. regia and related species. The order of the species from top to bottom in the collinearity plot is consistent with the phylogenetic tree. The pink lines represent collinear blocks, while the orange lines indicate inversion regions

Fig. 6

Selective pressure of shared protein coding genes in J. regia and related species. (a) The x-axis represents the Ks (synonymous substitutions) value, while the y-axis represents the Ka (non-synonymous substitutions) value. Pink dots represent gene pairs under positive selection (Ka/Ks > 1), and orange dots represent gene pairs under purifying selection (Ka/Ks < 1). (b) Overall distribution pattern of the Ka/Ks value of each shared gene

Fig. 7

The RNA editing distribution pattern in protein coding genes of J. regia. (a) The number of predicted RNA editing sites in each protein coding gene. (b) The editing frequency distribution obtained from transcriptome data. (c) Venn plot showing that a total of 539 RNA editing sites were predicted, with 288 supported by transcriptome validation. (d) An example of local changes in nucleotide sequence (1351 bp to 1377 bp) and amino acid sequence (451 aa to 459 aa) in nad4. (e) An example of local changes in nucleotide sequence (562 bp to 576 bp) and amino acid sequence (188 aa to 192 aa) in ccmB

Fig. 8

The expression profile of protein coding genes of J. regia. (a) Expression patterns of mitochondrial protein coding genes in different tissues. Expression patterns of mitochondrial protein coding genes under abiotic stress treatments, including (b) 5 ℃, 8 ℃, and 22 ℃ treatments for cultivars ‘DGM’, ‘HR’, ‘ZJ’, ‘LL’ and ‘LY’, and (c) 4 ℃ treatment of leaf tissues for 0 h, 3 h, 6 h, 12 h and 24 h. (d) The number of differently expressed genes in treatments (b) and (c). (e) The UpSet plot of differently expressed genes in different comparisons

Fig. 9

Construction of the maximum likelihood tree based on the 38 species

Data availability

The raw sequence data of the J. regia mitogenome were deposited in the SRA database under the accession number PRJNA1162256. The assembled mitogenome and genome annotation of J. regia were deposited in the figshare database (https://doi.org/10.6084/m9.figshare.27037318.v1). The transcriptome data used in this study can be obtained from the SRA database under the accession numbers PRJNA942426, PRJNA1051792, and PRJNA232394.

References

  1. Song YG, Fragnière Y, Meng HH, Li Y, Bétrisey S, Corrales A, Manchester S, Deng M, Jasińska AK, Văn Sâm H, et al. Global biogeographic synthesis and priority conservation regions of the relict tree family Juglandaceae. J Biogeogr. 2020;47(3):643–57.

  2. Shiono T, Kusumoto B, Yasuhara M, Kubota Y. Roles of climate niche conservatism and range dynamics in woody plant diversity patterns through the Cenozoic. Glob Ecol Biogeogr. 2018;27(7):865–74.

  3. Bai WN, Yan PC, Zhang BW, Woeste KE, Lin K, Zhang DY. Demographically idiosyncratic responses to climate change and rapid pleistocene diversification of the walnut genus Juglans (Juglandaceae) revealed by whole-genome sequences. New Phytol. 2018;217(4):1726–36.

  4. Bai WN, Wang WT, Zhang DY. Phylogeographic breaks within Asian butternuts indicate the existence of a phytogeographic divide in East Asia. New Phytol. 2016;209(4):1757–72.

  5. Bai WN, Wang WT, Zhang DY. Contrasts between the phylogeographic patterns of chloroplast and nuclear DNA highlight a role for pollen-mediated gene flow in preventing population divergence in an east Asian temperate tree. Mol Phylogenet Evol. 2014;81:37–48.

  6. Wang WT, Xu B, Zhang DY, Bai WN. Phylogeography of postglacial range expansion in Juglans mandshurica (Juglandaceae) reveals no evidence of bottleneck, loss of genetic diversity, or isolation by distance in the leading-edge populations. Mol Phylogenet Evol. 2016;102:255–64.

  7. Geng FD, Lei MF, Zhang NY, Fu YL, Ye H, Dang M, Zhang XD, Liu MQ, Li MD, Liu ZL, et al. Demographic complexity within walnut species provides insights into the heterogeneity of geological and climatic fluctuations in East Asia. J Syst Evol. 2024;62(5):1037–53.

  8. Li M, Ou M, He X, Ye H, Ma J, Liu H, Yang H, Zhao P. DNA methylation role in subgenome expression dominance of Juglans regia and its wild relative J. mandshurica. Plant Physiol. 2023;193(2):1313–29.

  9. Luo X, Zhou H, Cao D, Yan F, Chen P, Wang J, Woeste K, Chen X, Fei Z, An H, et al. Domestication and selection footprints in Persian walnuts (Juglans regia). PLoS Genet. 2022;18(12):e1010513.

  10. Aradhya MK, Potter D, Gao F, Simon CJ. Molecular phylogeny of Juglans (Juglandaceae): a biogeographic perspective. Tree Genet Genomes. 2007;3(4):363–78.

  11. Zhang BW, Xu LL, Li N, Yan PC, Jiang XH, Woeste KE, Lin K, Renner SS, Zhang DY, Bai WN. Phylogenomics reveals an ancient hybrid origin of the Persian walnut. Mol Biol Evol. 2019;36(11):2451–61.

  12. Zhao P, Zhou HJ, Potter D, Hu YH, Feng XJ, Dang M, Feng L, Zulfiqar S, Liu WZ, Zhao GF, et al. Population genetics, phylogenomics and hybrid speciation of Juglans in China determined from whole chloroplast genomes, transcriptomes, and genotyping-by-sequencing (GBS). Mol Phylogenet Evol. 2018;126:250–65.

  13. Manning WE. The classification within the Juglandaceae. Ann Mo Bot Gard. 1978;65(4):1058–87.

  14. Hu Y, Woeste KE, Zhao P. Completion of the chloroplast genomes of five Chinese Juglans and their contribution to chloroplast phylogeny. Front Plant Sci. 2017;7:1995.

  15. Feng X, Zhou H, Zulfiqar S, Luo X, Hu Y, Feng L, Malvolti ME, Woeste K, Zhao P. The phytogeographic history of common walnut in China. Front Plant Sci. 2018;9:1399.

  16. Ji F, Ma Q, Zhang W, Liu J, Feng Y, Zhao P, Song X, Chen J, Zhang J, Wei X, et al. A genome variation map provides insights into the genetics of walnut adaptation and agronomic traits. Genome Biol. 2021;22(1):300.

  17. Liu H, Zhou H, Ye H, Gen F, Lei M, Li J, Wei W, Liu Z, Hou N, Zhao P. Integrated metabolomic and transcriptomic dynamic profiles of endopleura coloration during fruit maturation in three walnut cultivars. BMC Plant Biol. 2024;24(1):109.

  18. Wang J, Ye H, Zhou H, Chen P, Liu H, Xi R, Wang G, Hou N, Zhao P. Genome-wide association analysis of 101 accessions dissects the genetic basis of shell thickness for genetic improvement in persian walnut (Juglans regia L). BMC Plant Biol. 2022;22(1):436.

  19. Jin F, Zhou Y, Zhang P, Huang R, Fan W, Li B, Li G, Song X, Pei D. Identification of key lipogenesis stages and proteins involved in walnut kernel development. J Agric Food Chem. 2023;71(10):4306–18.

  20. Bernard A, Marrano A, Donkpegan A, Brown PJ, Leslie CA, Neale DB, Lheureux F, Dirlewanger E. Association and linkage mapping to unravel genetic architecture of phenological traits and lateral bearing in persian walnut (Juglans regia L). BMC Genomics. 2020;21(1):203.

  21. Bernard A, Lheureux F, Dirlewanger E. Walnut: past and future of genetic improvement. Tree Genet Genomes. 2017;14(1):1.

  22. Ding YM, Cao Y, Zhang WP, Chen J, Liu J, Li P, Renner SS, Zhang DY, Bai WN. Population-genomic analyses reveal bottlenecks and asymmetric introgression from Persian into iron walnut during domestication. Genome Biol. 2022;23(1):145.

  23. Martinez-Garcia PJ, Crepeau MW, Puiu D, Gonzalez-Ibeas D, Whalen J, Stevens KA, Paul R, Butterfield TS, Britton MT, Reagan RL, et al. The walnut (Juglans regia) genome sequence reveals diversity in genes coding for the biosynthesis of non-structural polyphenols. Plant J. 2016;87(5):507–32.

  24. Pollegioni P, Woeste K, Chiocchini F, Del Lungo S, Ciolfi M, Olimpieri I, Tortolano V, Clark J, Hemery GE, Mapelli S, et al. Rethinking the history of common walnut (Juglans regia L.) in Europe: its origins and human interactions. PLoS ONE. 2017;12(3):e0172541.

  25. Zhang J, Zhang W, Ji F, Qiu J, Song X, Bu D, Pan G, Ma Q, Chen J, Huang R, et al. A high-quality walnut genome assembly reveals extensive gene expression divergences after whole-genome duplication. Plant Biotechnol J. 2020;18(9):1848–50.

  26. Han L, Luo X, Zhao Y, Li N, Xu Y, Ma K. A haplotype-resolved genome provides insight into allele-specific expression in wild walnut (Juglans regia L). Sci Data. 2024;11(1):278.

  27. Marrano A, Britton M, Zaini PA, Zimin AV, Workman RE, Puiu D, Bianco L, Pierro EAD, Allen BJ, Chakraborty S, et al. High-quality chromosome-scale assembly of the walnut (Juglans regia L.) reference genome. GigaScience. 2020;9(5).

  28. Marrano A, Martínez-García PJ, Bianco L, Sideli GM, Di Pierro EA, Leslie CA, Stevens KA, Crepeau MW, Troggio M, Langley CH, et al. A new genomic tool for walnut (Juglans regia L.): development and validation of the high-density Axiom™ J. regia 700K SNP genotyping array. Plant Biotechnol J. 2019;17(6):1027–36.

  29. Marrano A, Sideli GM, Leslie CA, Cheng H, Neale DB. Deciphering of the genetic control of phenology, yield, and pellicle color in persian walnut (Juglans regia L). Front Plant Sci. 2019;10:1140.

  30. Famula RA, Richards JH, Famula TR, Neale DB. Association genetics of carbon isotope discrimination and leaf morphology in a breeding population of Juglans regia L. Tree Genet Genomes. 2018;15(1).

  31. Bernard A, Crabier J, Donkpegan ASL, Marrano A, Lheureux F, Dirlewanger E. Genome-wide association study reveals candidate genes involved in fruit trait variation in persian walnut (Juglans regia L). Front Plant Sci. 2020;11:607213.

  32. Arab MM, Marrano A, Abdollahi-Arpanahi R, Leslie CA, Cheng H, Neale DB, Vahdati K. Combining phenotype, genotype, and environment to uncover genetic components underlying water use efficiency in Persian Walnut. J Exp Bot. 2020;71(3):1107–27.

  33. Sideli GM, Marrano A, Montanari S, Leslie CA, Allen BJ, Cheng H, Brown PJ, Neale DB. Quantitative phenotyping of shell suture strength in walnut (Juglans regia L.) enhances precision for detection of QTL and genome-wide association mapping. PLoS ONE. 2020;15(4):e0231144.

  34. Bukucu SB, Sutyemez M, Kefayati S, Paizila A, Jighly A, Kafkas S. Major QTL with pleiotropic effects controlling time of leaf budburst and flowering-related traits in walnut (Juglans regia L). Sci Rep. 2020;10(1):15207.

  35. Sideli GM, McAtee P, Marrano A, Allen BJ, Brown PJ, Butterfield TS, Dandekar AM, Leslie CA, Neale DB. Genetic analysis of walnut (Juglans regia L.) pellicle pigment variation through a novel, high-throughput phenotyping platform. G3-Genes Genom Genet. 2020;10(12):4411–24.

  36. Arab MM, Marrano A, Abdollahi-Arpanahi R, Leslie CA, Askari H, Neale DB, Vahdati K. Genome-wide patterns of population structure and association mapping of nut-related traits in persian walnut populations from Iran using the Axiom J. regia 700K SNP array. Sci Rep. 2019;9(1):6376.

  37. Ning DL, Wu T, Xiao LJ, Ma T, Fang WL, Dong RQ, Cao FL. Chromosomal-level assembly of Juglans sigillata genome using Nanopore, BioNano, and Hi-C analysis. GigaScience. 2020;9(2).

  38. Stevens KA, Woeste K, Chakraborty S, Crepeau MW, Leslie CA, Martínez-García PJ, Puiu D, Romero-Severson J, Coggeshall M, Dandekar AM, et al. Genomic variation among and within six Juglans species. G3-Genes Genom Genet. 2018;8(7):2153–65.

  39. Zhou H, Yan F, Hao F, Ye H, Yue M, Woeste K, Zhao P, Zhang S. Pan-genome and transcriptome analyses provide insights into genomic variation and differential gene expression profiles related to disease resistance and fatty acid biosynthesis in eastern black walnut (Juglans nigra). Hortic Res. 2023;10(3).

  40. Li X, Cai K, Zhang Q, Pei X, Chen S, Jiang L, Han Z, Zhao M, Li Y, Zhang X, et al. The manchurian walnut genome: insights into juglone and lipid biosynthesis. GigaScience. 2022;11:giac057.

  41. Yan F, Xi RM, She RX, Chen PP, Yan YJ, Yang G, Dang M, Yue M, Pei D, Woeste K, et al. Improved de novo chromosome-level genome assembly of the vulnerable walnut tree Juglans mandshurica reveals gene family evolution and possible genome basis of resistance to lesion nematode. Mol Ecol Resour. 2021;21(6):2063–76.

  42. Fitz-Gibbon S, Mead A, O’Donnell S, Li ZZ, Escalona M, Beraut E, Sacco S, Marimuthu MPA, Nguyen O, Sork VL. Reference genome of California walnut, Juglans californica, and resemblance with other genomes in the order Fagales. J Hered. 2023;114(5):570–9.

  43. Guzman-Torres CR, Trybulec E, LeVasseur H, Akella H, Amee M, Strickland E, Pauloski N, Williams M, Romero-Severson J, Hoban S, et al. Conserving a threatened north American walnut: a chromosome-scale reference genome for butternut (Juglans cinerea). G3-Genes Genom Genet. 2023;14(2):jkad189.

  44. Bi C, Paterson AH, Wang X, Xu Y, Wu D, Qu Y, Jiang A, Ye Q, Ye N. Analysis of the complete mitochondrial genome sequence of the diploid cotton Gossypium raimondii by comparative genomics approaches. Biomed Res Int. 2016;2016:5040598.

  45. Zhou P, Zhang Q, Li F, Huang J, Zhang M. Assembly and comparative analysis of the complete mitochondrial genome of Ilex metabaptista (Aquifoliaceae), a Chinese endemic species with a narrow distribution. BMC Plant Biol. 2023;23(1):393.

  46. Li Y, Gu M, Liu X, Lin J, Jiang H, Song H, Xiao X, Zhou W. Sequencing and analysis of the complete mitochondrial genomes of Toona sinensis and Toona ciliata reveal evolutionary features of Toona. BMC Genom. 2023;24(1):58.

  47. Niu Y, Lu Y, Song W, He X, Liu Z, Zheng C, Wang S, Shi C, Liu J. Assembly and comparative analysis of the complete mitochondrial genome of three Macadamia species (M. integrifolia, M. ternifolia and M. tetraphylla). PLoS One. 2022;17(5):e0263545.

  48. Ye H, Wang Y, Liu H, Lei D, Li H, Gao Z, Feng X, Han M, Qie Q, Zhou H. The phylogeography of deciduous tree Ulmus macrocarpa (Ulmaceae) in Northern China. Plants. 2024;13(10):1334.

  49. Liu H, Ye H, Zhang N, Ma J, Wang J, Hu G, Li M, Zhao P. Comparative analyses of chloroplast genomes provide comprehensive insights into the adaptive evolution of Paphiopedilum (Orchidaceae). Horticulturae. 2022;8(5):391.

  50. Hu Y, Woeste KE, Dang M, Zhou T, Feng X, Zhao G, Liu Z, Li Z, Zhao P. The complete chloroplast genome of common walnut (Juglans regia). Mitochondrial DNA B. 2016;1(1):189–90.

  51. Dong W, Xu C, Li W, Xie X, Lu Y, Liu Y, Jin X, Suo Z. Phylogenetic resolution in Juglans based on complete chloroplast genomes and nuclear DNA sequences. Front Plant Sci. 2017;8:1148.

  52. Yang Y, Forsythe ES, Ding YM, Zhang DY, Bai WN. Genomic analysis of plastid–nuclear interactions and differential evolution rates in coevolved genes across Juglandaceae species. Genome Biol Evol. 2023;15(8):evad145.

  53. Xu LL, Yu RM, Lin XR, Zhang BW, Li N, Lin K, Zhang DY, Bai WN. Different rates of pollen and seed gene flow cause branch-length and geographic cytonuclear discordance within Asian butternuts. New Phytol. 2021;232(1):388–403.

  54. Mu XY, Tong L, Sun M, Zhu YX, Wen J, Lin QW, Liu B. Phylogeny and divergence time estimation of the walnut family (Juglandaceae) based on nuclear RAD-Seq and chloroplast genome data. Mol Phylogenet Evol. 2020;147:106802.

  55. Zhou H, Hu Y, Ebrahimi A, Liu P, Woeste K, Zhao P, Zhang S. Whole genome based insights into the phylogeny and evolution of the Juglandaceae. BMC Ecol Evol. 2021;21(1):191.

  56. Møller IM, Rasmusson AG, Van Aken O. Plant mitochondria – past, present and future. Plant J. 2021;108(4):912–59.

  57. Barreto P, Koltun A, Nonato J, Yassitepe J, Maia IG, Arruda P. Metabolism and signaling of plant mitochondria in adaptation to environmental stresses. Int J Mol Sci. 2022;23(19):11176.

  58. Lu G, Wang W, Mao J, Li Q, Que Y. Complete mitogenome assembly of Selenicereus monacanthus revealed its molecular features, genome evolution, and phylogenetic implications. BMC Plant Biol. 2023;23(1):541.

  59. Wang L, Liu X, Xu Y, Zhang Z, Wei Y, Hu Y, Zheng C, Qu X. Assembly and comparative analysis of the first complete mitochondrial genome of a traditional Chinese medicine Angelica biserrata (Shan et Yuan) Yuan et Shan. Int J Biol Macromol. 2024;257:128571.

  60. Richardson AO, Rice DW, Young GJ, Alverson AJ, Palmer JD. The fossilized mitochondrial genome of Liriodendron tulipifera: ancestral gene content and order, ancestral editing sites, and extraordinarily low mutation rate. BMC Biol. 2013;11(1):29.

  61. Skippington E, Barkman TJ, Rice DW, Palmer JD. Miniaturized mitogenome of the parasitic plant Viscum scurruloideum is extremely divergent and dynamic and has lost all nad genes. Proc Natl Acad Sci USA. 2015;112(27):E3515–24.

  62. Putintseva YA, Bondar EI, Simonov EP, Sharov VV, Oreshkova NV, Kuzmin DA, Konstantinov YM, Shmakov VN, Belkov VI, Sadovsky MG, et al. Siberian larch (Larix sibirica Ledeb.) mitochondrial genome assembled using both short and long nucleotide sequence reads is currently the largest known mitogenome. BMC Genomics. 2020;21(1):654.

  63. Wang Y, Chen S, Chen J, Chen C, Lin X, Peng H, Zhao Q, Wang X. Characterization and phylogenetic analysis of the complete mitochondrial genome sequence of Photinia serratifolia. Sci Rep. 2023;13(1):770.

  64. Yu R, Chen X, Long L, Jost M, Zhao R, Liu L, Mower JP, dePamphilis CW, Wanke S, Jiao Y. De novo assembly and comparative analyses of mitochondrial genomes in Piperales. Genome Biol Evol. 2023;15(3):evad041.

  65. Wang J, Kan S, Liao X, Zhou J, Tembrock LR, Daniell H, Jin S, Wu Z. Plant organellar genomes: much done, much more to do. Trends Plant Sci. 2024;29(7):754–69.

  66. She RX. Population genomics and phylogeny of walnut (Juglans). Master’s thesis. Northwest University; 2021.

  67. Su X, Liu Q, Guo H, Hu D, Liu D, Wang Z, Zhang P. Deciphering the mitochondrial genome of Juglans mandshurica (Juglandaceae). Mitochondrial DNA B. 2023;8(2):249–54.

  68. Doyle J. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 1987;19:11–5.

  69. Chen S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta. 2023;2(2):e107.

  70. De Coster W, D’Hert S, Schultz DT, Cruts M, Van Broeckhoven C. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics. 2018;34(15):2666–9.

  71. Li H. New strategies to improve minimap2 alignment accuracy. Bioinformatics. 2021;37(23):4572–4.

  72. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27(5):722–36.

  73. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol. 2019;37(5):540–6.

  74. Li H, Durbin R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics. 2010;26(5):589–95.

  75. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9(11):e112963.

  76. Marçais G, Delcher AL, Phillippy AM, Coston R, Salzberg SL, Zimin A. MUMmer4: a fast and versatile genome alignment system. PLoS Comput Biol. 2018;14(1):e1005944.

  77. Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R, Greiner S. GeSeq–versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 2017;45(W1):W6–11.

  78. Chan PP, Lin BY, Mak AJ, Lowe TM. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 2021;49(16):9077–96.

  79. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinf. 2009;10(1):421.

  80. Lewis SE, Searle SMJ, Harris N, Gibson M, Iyer V, Richter J, Wiel C, Bayraktaroglu L, Birney E, Crosby MA, et al. Apollo: a sequence annotation editor. Genome Biol. 2002;3(12):research0082.

  81. Beier S, Thiel T, Münch T, Scholz U, Mascher M. MISA-web: a web server for microsatellite prediction. Bioinformatics. 2017;33(16):2583–5.

  82. Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001;29(22):4633–42.

  83. Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27(2):573–80.

  84. Zhang D, Gao F, Jakovlić I, Zou H, Zhang J, Li WX, Wang GT. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol Ecol Resour. 2020;20(1):348–55.

  85. Sharp PM, Li WH. Codon usage in regulatory genes in Escherichia coli does not reflect selection for ‘rare’ codons. Nucleic Acids Res. 1986;14(19):7737–49.

  86. Romero H, Zavala A, Musto H. Codon usage in Chlamydia trachomatis is the result of strand-specific mutational biases and a complex pattern of selective forces. Nucleic Acids Res. 2000;28(10):2084–90.

  87. Sueoka N. Directional mutation pressure and neutral molecular evolution. Proc Natl Acad Sci USA. 1988;85(8):2653–7.

  88. Jin JJ, Yu WB, Yang JB, Song Y, dePamphilis CW, Yi TS, Li DZ. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 2020;21(1):241.

  89. Shi L, Chen H, Jiang M, Wang L, Wu X, Huang L, Liu C. CPGAVAS2, an integrated plastome sequence annotator and analyzer. Nucleic Acids Res. 2019;47(W1):W65–73.

  90. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.

  91. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.

  92. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2014;32(1):268–74.

  93. Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, Sánchez-Gracia A. DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol Biol Evol. 2017;34(12):3299–302.

  94. Leigh JW, Bryant D. Popart: full-feature software for haplotype network construction. Methods Ecol Evol. 2015;6(9):1110–6.

  95. Wang D, Zhang Y, Zhang Z, Zhu J, Yu J. KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genom Proteom Bioinf. 2010;8(1):77–80.

  96. Yang Z, Nielsen R. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol. 2000;17(1):32–43.

  97. Lenz H, Knoop V. PREPACT 2.0: predicting C-to-U and U-to-C RNA editing in organelle genome sequences with multiple references and curated RNA editing annotation. Bioinform Biol Insights. 2013;7:1–19.

  98. Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37(8):907–15.

  99. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, et al. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10(2):giab008.

  100. Picardi E, Pesole G. REDItools: high-throughput RNA editing detection made easy. Bioinformatics. 2013;29(14):1813–4.

  101. Chakraborty S, Britton M, Martínez-García PJ, Dandekar AM. Deep RNA-Seq profile reveals biodiversity, plant–microbe interactions and a large family of NBS-LRR resistance genes in walnut (Juglans regia) tissues. AMB Express. 2016;6(1):12.

  102. Zhou H, Ma J, Liu H, Zhao P. Genome-wide identification of the CBF gene family and ICE transcription factors in walnuts and expression profiles under cold conditions. Int J Mol Sci. 2024;25(1):25.

  103. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2013;30(7):923–30.

  104. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.

  105. Wu ZQ, Liao XZ, Zhang XN, Tembrock LR, Broz A. Genomic architectural variation of plant mitochondria—A review of multichromosomal structuring. J Syst Evol. 2020;60(1):160–8.

  106. Yang JX, Dierckxsens N, Bai MZ, Guo YY. Multichromosomal mitochondrial genome of Paphiopedilum micranthum: compact and fragmented genome, and rampant intracellular gene transfer. Int J Mol Sci. 2023;24(4):3976.

  107. Tan H, Yu Y, Fu Y, Liu T, Wang Y, Peng W, Wang B, Chen J. Comparative analyses of Flammulina filiformis mitochondrial genomes reveal high length polymorphism in intergenic regions and multiple intron gain/loss in cox1. Int J Biol Macromol. 2022;221:1593–605.

  108. Yang H, Ni Y, Zhang X, Li J, Chen H, Liu C. The mitochondrial genomes of Panax notoginseng reveal recombination mediated by repeats associated with DNA replication. Int J Biol Macromol. 2023;252:126359.

  109. Yang H, Chen H, Ni Y, Li J, Cai Y, Wang J, Liu C. Mitochondrial genome sequence of Salvia officinalis (Lamiales: Lamiaceae) suggests diverse genome structures in cogeneric species and finds the stop gain of genes through RNA editing events. Int J Mol Sci. 2023;24(6):5372.

  110. Roulet ME, Garcia LE, Gandini CL, Sato H, Ponce G, Sanchez-Puerta MV. Multichromosomal structure and foreign tracts in the Ombrophytum subterraneum (Balanophoraceae) mitochondrial genome. Plant Mol Biol. 2020;103(6):623–38.

  111. Sheng W, Deng J, Wang C, Kuang Q. The garden asparagus (Asparagus officinalis L.) mitochondrial genome revealed rich sequence variation throughout whole sequencing data. Front Plant Sci. 2023;14:1140043.

  112. Li J, Li J, Ma Y, Kou L, Wei J, Wang W. The complete mitochondrial genome of okra (Abelmoschus esculentus): using nanopore long reads to investigate gene transfer from chloroplast genomes and rearrangements of mitochondrial DNA molecules. BMC Genomics. 2022;23(1):481.

  113. Fan W, Liu F, Jia Q, Du H, Chen W, Ruan J, Lei J, Li DZ, Mower JP, Zhu A. Fragaria mitogenomes evolve rapidly in structure but slowly in sequence and incur frequent multinucleotide mutations mediated by microinversions. New Phytol. 2022;236(2):745–59.

  114. Zhang S, Wang J, He W, Kan S, Liao X, Jordan DR, Mace ES, Tao Y, Cruickshank AW, Klein R, et al. Variation in mitogenome structural conformation in wild and cultivated lineages of sorghum corresponds with domestication history and plastome evolution. BMC Plant Biol. 2023;23(1):91.

  115. Bi C, Qu Y, Hou J, Wu K, Ye N, Yin T. Deciphering the multi-chromosomal mitochondrial genome of Populus simonii. Front Plant Sci. 2022;13.

  116. Kumazawa Y, Miura S, Yamada C, Hashiguchi Y. Gene rearrangements in gekkonid mitochondrial genomes with shuffling, loss, and reassignment of tRNA genes. BMC Genomics. 2014;15(1):930.

  117. Cole LW, Guo W, Mower JP, Palmer JD. High and variable rates of repeat-mediated mitochondrial genome rearrangement in a genus of plants. Mol Biol Evol. 2018;35(11):2773–85.

  118. Yang Z, Ni Y, Lin Z, Yang L, Chen G, Nijiati N, Hu Y, Chen X. De novo assembly of the complete mitochondrial genome of sweet potato (Ipomoea batatas [L.] Lam) revealed the existence of homologous conformations generated by the repeat-mediated recombination. BMC Plant Biol. 2022;22(1):285.

  119. Zeng Y, Shen L, Chen S, Qu S, Hou N. Codon usage profiling of chloroplast genome in Juglandaceae. Forests. 2023;14(2):378.

  120. Choi IS, Wojciechowski MF, Ruhlman TA, Jansen RK. In and out: evolution of viral sequences in the mitochondrial genomes of legumes (Fabaceae). Mol Phylogenet Evol. 2021;163:107236.

  121. Sinn BT, Barrett CF. Ancient mitochondrial gene transfer between fungi and the orchids. Mol Biol Evol. 2019;37(1):44–57.

  122. Mower JP, Touzet P, Gummow JS, Delph LF, Palmer JD. Extensive variation in synonymous substitution rates in mitochondrial genes of seed plants. BMC Evol Biol. 2007;7(1):135.

  123. Cheng Y, He X, Priyadarshani SVGN, Wang Y, Ye L, Shi C, Ye K, Zhou Q, Luo Z, Deng F, et al. Assembly and comparative analysis of the complete mitochondrial genome of Suaeda glauca. BMC Genomics. 2021;22(1):167.

  124. Sanders C, Turkarslan S, Lee D-W, Daldal F. Cytochrome c biogenesis: the Ccm system. Trends Microbiol. 2010;18(6):266–74.

  125. Kan SL, Shen TT, Gong P, Ran JH, Wang XQ. The complete mitochondrial genome of Taxus cuspidata (Taxaceae): eight protein-coding genes have transferred to the nuclear genome. BMC Evol Biol. 2020;20(1):10.

  126. Naydenov NG, Khanam S, Siniauskaya M, Nakamura C. Profiling of mitochondrial transcriptome in germinating wheat embryos and seedlings subjected to cold, salinity and osmotic stresses. Genes Genet Syst. 2010;85(1):31–42.

  127. Elhafez D, Murcha MW, Clifton R, Soole KL, Day DA, Whelan J. Characterization of mitochondrial alternative NAD(P)H dehydrogenases in Arabidopsis: intraorganelle location and expression. Plant Cell Physiol. 2006;47(1):43–54.

  128. Zhang X, Liu S, Takano T. Overexpression of a mitochondrial ATP synthase small subunit gene (AtMtATP6) confers tolerance to several abiotic stresses in Saccharomyces cerevisiae and Arabidopsis thaliana. Biotechnol Lett. 2008;30(7):1289–94.

Acknowledgements

Not applicable.

Funding

This work was supported by the National Natural Science Foundation of China (32370386 and 32070372), Science Foundation for Distinguished Young Scholars of Shaanxi Province (2023-JC-JQ-22), Basic Research Project of Shaanxi Academy of Fundamental Science (22JHZ005 and 23JHZ009), Shaanxi Key Research and Development Program (2024NC-YBXM-064), Science and Technology Program of Shaanxi Academy of Science (2023 K-49, 2023 K-26, and 2019 K-06), Shaanxi Forestry Science and Technology Innovation Key Project (SXLK2023-02-20), and Qinling Hundred Talents Project of Shaanxi Academy of Science (Y23Z619F17).

Author information

Contributions

Conceptualization, H.Z. and P.Z.; Methodology, H.Y.; Software, H.Y.; Validation, H.Y. and H.L. (Hengzhao Liu); Formal analysis, H.L. (Haochen Li) and D.L.; Investigation, Z.G.; Resources, H.Z.; Writing—original draft, H.Y. and H.L. (Hengzhao Liu); Writing—review and editing, H.Y. and P.Z.; Supervision, P.Z.; Project administration, P.Z.; Funding acquisition, H.Z. and P.Z. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Peng Zhao.

Ethics declarations

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Ethical approval and consent to participate

This study’s material collections and experimental research complied with relevant institutional, national, and international guidelines and legislation. No specific permissions or licenses were required.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Ye, H., Liu, H., Li, H. et al. Complete mitochondrial genome assembly of Juglans regia unveiled its molecular characteristics, genome evolution, and phylogenetic implications. BMC Genomics 25, 894 (2024). https://doi.org/10.1186/s12864-024-10818-w

  • DOI: https://doi.org/10.1186/s12864-024-10818-w

Keywords