Development of SNP and InDel markers by genome resequencing and transcriptome sequencing in radish (Raphanus sativus L.)
BMC Genomics volume 24, Article number: 445 (2023)
Single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) are the most abundant genetic variations and are widely distributed across plant genomes. SNP and InDel markers are valuable tools for genetic and genomic research in radish (Raphanus sativus L.).
In this study, a total of 366,679 SNPs and 97,973 InDels were identified by genome resequencing of ‘YZH’ and ‘XHT’. In addition, 53,343 SNPs and 4,257 InDels were detected between the two cultivars by transcriptome sequencing. Among the InDel variations, 85 genomic and 15 transcriptomic InDels were newly developed into markers and validated by PCR. The 100 polymorphic InDel markers generated 207 alleles among 200 Chinese radish germplasm accessions, with an average number of alleles (Na) of 2.07 and an average polymorphism information content (PIC) of 0.33. Population structure and phylogenetic analyses revealed that the radish cultivars from northern China clustered together and the cultivars from southwest China clustered together. RNA-Seq analysis identified 11,003 differentially expressed genes (DEGs) between the two cultivars, of which 5,020 were upregulated and 5,983 were downregulated. In total, 145 flowering time-related DEGs were detected, most of which were involved in the flowering time integrator, circadian clock/photoperiod, autonomous, and vernalization pathways. Within the flowering time-related DEGs, 150 transcriptomic SNPs and 9 InDels were obtained.
The large number of SNPs and InDels identified in this study provides a valuable marker resource for radish genetic and genomic studies. The SNPs and InDels within flowering time-related DEGs provide fundamental insight for dissecting the molecular mechanism of bolting and flowering in radish.
Molecular markers are valuable tools for genetics and breeding research in plants, with applications such as genotype fingerprinting, genetic map construction, QTL mapping, association analysis and marker-assisted selection (MAS) [1, 2]. In the past three decades, a large number of molecular markers have been developed in plants, including restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNA (RAPDs), amplified fragment length polymorphisms (AFLPs), simple sequence repeats (SSRs), insertions/deletions (InDels) and single-nucleotide polymorphisms (SNPs) [3, 4]. SSRs and InDels are PCR-based markers with the advantages of being bi-allelic, co-dominant, abundant and low-cost. SNPs and InDels are the most abundant genetic variations and are widely distributed across plant genomes. Because they share the common characteristics of SSRs, InDel markers have received more and more attention. With the development of next-generation sequencing (NGS), InDel markers have been extensively developed and applied in crop breeding [7, 8]. In soybean, a total of 17,613 InDels were detected in 56 accessions, and a genetic map with 300 InDel markers was constructed across 20 linkage groups. In cultivated tomato, a 3-bp InDel in the promoter of Sl-ALMT9 increased the expression of Sl-ALMT9 and the fruit malate content. In Cannabis, 47,558 InDels were identified between two accessions, and 14 InDels were used for genetic structure analysis. In mung bean, 129 InDel markers were developed by genome resequencing of two parents and used to construct a genetic linkage map, and the major-effect QTL qYSC4 for young stem color on chromosome 4 was narrowed to a 347-kb interval. In Medicago truncatula, 318 InDel markers were developed across the eight chromosomes by comparing the assembled genomic sequences of two varieties, and the gibberellin 3-β-dioxygenase 1 gene underlying the dwarf mutant crs was isolated.
Radish (Raphanus sativus L.) is an important root vegetable belonging to the cruciferous family. A large number of InDel markers have been characterized and applied in genetic and genomic studies in radish [3, 11]. A total of 9,436 InDels were detected in three radish transcriptomes, and 40 InDel markers were used for genetic diversity analysis of 32 radish accessions. In another study, 99 EST-SSR and InDel markers were used for linkage group construction and QTL mapping, and a QTL for the late-bolting trait was located in a 1.1-cM region between InDel520 and InDel535. Previous studies mapped the R locus between RsInDel4 and RsInDel11 and defined RsMYB90 as a candidate gene underlying the taproot skin color trait. However, the number of InDel markers is still far from sufficient for radish genetic studies.
The transition from the vegetative phase to the reproductive phase is one of the most important stages in the life cycle of flowering plants and is controlled by multiple environmental signals and genetic pathways. Numerous studies have found that more than 300 genes associated with flowering time are mainly involved in six key pathways: the photoperiod, vernalization, ambient temperature, age, autonomous, and gibberellin pathways [14, 15]. In Arabidopsis, FLOWERING LOCUS C (FLC), a floral repressor, plays a crucial role in the vernalization response. FLOWERING LOCUS T (FT) acts as a central floral integrator in the photoperiod pathway, and its loss of function delays flowering. Bolting and flowering times are two important agronomic traits that determine the yield and quality of radish. A previous study identified 142 bolting and flowering time-related genes by transcriptome sequencing. Another found 218 radish homologs of Arabidopsis flowering-time genes, of which 49 were differentially expressed between two radish accessions. A total of 254 flowering genes in R. sativus were characterized based on sequence similarity analysis. Two QTLs each for bolting and flowering times were identified on chromosome R06 by QTL mapping. A 1627-bp insertion near the 5′ end of the first intron of RsFLC2 was associated with the late-bolting trait in radish. Although a large number of bolting and flowering genes have been reported in radish, no study has yet identified the genetic variation within these genes.
In this study, SNP and InDel markers were developed by genome resequencing and transcriptome sequencing of early- and late-bolting cultivars. The InDel markers were validated by PCR. Population structure and clustering analyses were performed by genotyping 200 radish cultivars with the developed InDel markers. The differentially expressed genes (DEGs) at the vegetative growth stage were identified between the two cultivars. The radish homologs of flowering-time related (FTR) genes were characterized, and the transcriptomic SNPs and InDels within these genes were obtained. These results provide abundant markers for genetic and genomic studies in radish.
Materials and methods
Plant materials and DNA extraction
In total, 200 radish cultivars were collected from the Institute of Vegetable Industry Technology Research, Guizhou University. Detailed information on all cultivars is listed in Table S1. Seeds of two advanced-generation radish inbred lines, ‘YZH’ (early flowering, 65 d) and ‘XHT’ (late flowering, 185 d), were planted in plastic pots and grown in a chamber at 22 °C under a 14 h light/10 h dark photoperiod. After 25 days, young leaves of ‘YZH’ and ‘XHT’ were sampled in three biological replicates for genome resequencing and transcriptome sequencing. Leaves of all radish cultivars were collected, immediately frozen in liquid nitrogen, and stored at -80 °C until use. Total genomic DNA was extracted using the EasyPure Genomic DNA Kit (TransGen Biotech, Beijing, China) according to the manufacturer’s instructions. DNA at a final concentration of 10 ng/µL was used for PCR amplification.
Genome resequencing and identification of SNPs and InDels
‘YZH’ and ‘XHT’ plants at the vegetative growth stage were used for genome resequencing. Total genomic DNA was extracted, and its quality was assessed prior to library construction. Six sequencing libraries were generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA) according to the manufacturer's recommendations. The DNA libraries were sequenced on an Illumina platform to generate 150-bp paired-end reads. After removing low-quality reads, the clean reads were aligned to the radish reference genome using Burrows-Wheeler Aligner (BWA) 0.7.8 (parameters: mem -t 4 -k 32 -M) [20, 21]. SNPs and InDels were called using SAMtools 1.3.1 (mpileup -m 2 -F 0.002 -d 1000). The variants were filtered with the following criteria: (1) depth at the variant position > 4; (2) mapping quality > 20. Functional annotation of the variants was carried out with ANNOVAR (Annotate Variation).
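The two filtering criteria above can be illustrated with a minimal Python sketch. It assumes the depth and mapping quality of each variant are carried in the INFO column of a VCF file under the tags DP and MQ; the text does not state how the filter was actually applied, and the function names are illustrative only.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=12;MQ=40;INDEL' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info


def passes_filters(vcf_line, min_depth=4, min_mapq=20):
    """Keep a variant only if depth > 4 and mapping quality > 20.

    Assumption: DP and MQ are present in the INFO column (column 8).
    A missing tag is treated as 0, so the variant is rejected.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])
    depth = float(info.get("DP", 0))
    mapq = float(info.get("MQ", 0))
    return depth > min_depth and mapq > min_mapq


def filter_vcf(lines):
    """Yield the data lines of a VCF that pass both criteria, skipping headers."""
    for line in lines:
        if line.startswith("#"):
            continue
        if passes_filters(line):
            yield line
```

Both thresholds are strict inequalities, as written in the criteria, so a variant with a depth of exactly 4 or a mapping quality of exactly 20 is discarded.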
RNA isolation and transcriptome sequencing
Total RNA was extracted from the young leaves of ‘YZH’ and ‘XHT’ at the vegetative growth stage (25 days after sowing) using a Plant RNA Mini Kit (Tiangen, Inc., China). RNA integrity was assessed using the RNA Nano 6000 Assay Kit of the Bioanalyzer 2100 system (Agilent Technologies, CA, USA). A total of 1 µg RNA per sample was used as input material for library preparation. Sequencing libraries were constructed using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB, USA). All libraries were sequenced on an Illumina NovaSeq platform.
Analysis of differentially expressed genes (DEGs) and functional annotation
After removing adapter sequences and low-quality reads, clean reads were obtained and aligned to the radish reference genome (http://radish-genome.org) using HISAT2 (v2.0.5). The number of reads mapped to each gene was counted with featureCounts v1.5.0-p3. Gene expression levels were estimated as FPKM (fragments per kilobase of exon per million mapped fragments). DEGs were determined with padj ≤ 0.05 and |log2(fold change)| ≥ 1 using the R package DESeq2. GO enrichment analysis of the DEGs was performed with the clusterProfiler R package (3.8.1). GO terms with P ≤ 0.05 were regarded as significantly enriched. The Genome Analysis Toolkit (GATK, v3.8) was used to call SNPs and InDels. SNPs and InDels were filtered with quality score (QUAL) ≥ 20 and quality by depth (QD) ≥ 4. SnpEff (4.3q) was used to annotate the SNPs and InDels.
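As a hedged illustration of the two quantities used above, the following Python sketch computes FPKM from its definition and applies the stated DEG thresholds to a single gene. It is not the DESeq2 workflow itself: the adjusted P value and log2 fold change are assumed to have been computed already, and the function names are hypothetical.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments.

    FPKM = fragments * 1e9 / (exon length in bp * total mapped fragments)
    """
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def classify_deg(log2_fold_change, padj, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label a gene 'up', 'down' or 'not_significant'.

    Thresholds follow the text: padj <= 0.05 and |log2(fold change)| >= 1.
    A missing or NaN padj is treated as not significant (an assumption).
    """
    if padj is None or math.isnan(padj) or padj > padj_cutoff:
        return "not_significant"
    if log2_fold_change >= lfc_cutoff:
        return "up"
    if log2_fold_change <= -lfc_cutoff:
        return "down"
    return "not_significant"


# Example: 500 fragments on a 2,000-bp gene in a library of 10 million fragments
print(fpkm(500, 2000, 10_000_000))  # 25.0
print(classify_deg(-2.3, 0.001))    # down
```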
Validation of InDel marker polymorphism
Primer pairs for the InDels were designed with Primer Premier 5.0, with primer lengths of 18 to 24 bp and predicted product sizes of 100 to 200 bp. Polymerase chain reaction (PCR) reagents and amplification conditions were as described previously by Luo et al. The PCR products were separated by 8% polyacrylamide gel electrophoresis (PAGE). All validated InDel marker primers are listed in Table S3.
Genetic diversity, population structure and phylogenetic analysis
Popgene 32 was used to calculate the expected homozygosity and expected heterozygosity. PowerMarker version 3.2 was employed to calculate the number of alleles (Na), major allele frequency (MAF), and polymorphism information content (PIC). InDel markers were divided into three types: highly informative (PIC > 0.5), moderately informative (0.25 < PIC < 0.5) and slightly informative (PIC < 0.25). Genetic structure analysis of the radish accessions was conducted with Structure 2.3.4. Bayesian model-based clustering was used to assign individuals to a predetermined number of groups (K) so as to minimize Hardy–Weinberg and linkage disequilibrium within each group. K was set from 1 to 10 with ten independent runs, and 10,000 iterations were performed for estimation after a 10,000-iteration burn-in period. The subgroups and the best K were estimated according to a previous study. Phylogenetic analysis and dendrogram construction were carried out as described in previous studies.
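The diversity statistics named above can be sketched in Python as follows. This is an illustration based on the standard definitions (expected heterozygosity as 1 − Σp², and PIC in the form of Botstein et al.), not the exact estimators implemented in Popgene or PowerMarker, which may apply sample-size corrections. The handling of PIC values exactly equal to 0.25 or 0.5 is an assumption, since the text leaves those boundaries open.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Allele frequencies from a flat list of observed allele calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def major_allele_frequency(freqs):
    """Frequency of the most common allele."""
    return max(freqs.values())


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2); expected homozygosity is 1 - He."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    ps = list(freqs.values())
    homozygous = sum(p * p for p in ps)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(ps, 2))
    return 1.0 - homozygous - cross


def informativeness(pic_value):
    """Three classes from the text; boundary values fall in the lower class."""
    if pic_value > 0.5:
        return "highly informative"
    if pic_value > 0.25:
        return "moderately informative"
    return "slightly informative"


# A bi-allelic marker with two equally frequent alleles
freqs = allele_frequencies(["A"] * 10 + ["B"] * 10)
print(pic(freqs), informativeness(pic(freqs)))  # 0.375 moderately informative
```

Under this formula a bi-allelic marker cannot exceed PIC = 0.375, so only markers with three or more alleles can reach the highly informative class.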
Total RNA was extracted at eight time points: vegetative growth, six vernalization times (5 d, 10 d, 15 d, 20 d, 25 d and 30 d) and first flowering. The cDNA of each sample was synthesized using the SuperScript III First-Strand Synthesis System (Invitrogen). qRT-PCR was performed as described previously. The radish Actin gene was used as the internal control. Relative transcript levels were calculated with the 2^(-ΔΔCt) method. Three technical replicates were run for each sample. The primer sequences are listed in Table S2.
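The 2^(-ΔΔCt) calculation can be written out as a short Python sketch. It assumes roughly 100% amplification efficiency for both the target gene and the Actin reference, which is the usual premise of the method; the choice of calibrator sample and the Ct values in the example are hypothetical.

```python
def mean(values):
    """Arithmetic mean of technical-replicate Ct values."""
    return sum(values) / len(values)


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative transcript level by the 2^(-ΔΔCt) method.

    ΔCt  = Ct(target) - Ct(reference), computed per sample
    ΔΔCt = ΔCt(sample) - ΔCt(calibrator)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: three technical replicates averaged per sample
target = mean([24.0, 24.2, 23.8])
fold = relative_expression(target, 18.0, 26.0, 18.0)
print(round(fold, 2))  # 4.0
```

A ΔΔCt of -2 corresponds to a four-fold higher transcript level in the sample than in the calibrator.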
To develop molecular markers in radish, two cultivars with significant differences in flowering time, ‘YZH’ and ‘XHT’, were used for genome resequencing. A total of 12.0 Gb and 12.6 Gb of sequence were generated for ‘YZH’ and ‘XHT’, respectively. After filtering adapter sequences and low-quality reads, 79,949,206 and 84,185,286 clean reads were obtained for ‘YZH’ and ‘XHT’ (Table S4), of which 92.14% and 90.61% covered the radish reference genome at depths of 26× and 23.7×, respectively. In total, 366,679 SNPs and 97,973 InDels were identified between the two cultivars (Additional files 3 and 4). The average frequencies of SNPs and InDels in the radish genome were 1 SNP/1.22 kb and 1 InDel/4.6 kb. The largest share of the SNPs, 39.7% (145,691), was located in intergenic regions, and 23.72% (86,973) were located in exonic sequences. InDel lengths ranged from 1 to 21 bp. InDels of 1 and 2 bp were the two most abundant types, accounting for 66.98% (65,632) of the total (Fig. S1). InDel density differed among the nine radish chromosomes: chromosome 5 carried the most InDels (21,734) and chromosome 7 the fewest (6,416).
Transcriptome sequencing and DEG analysis
To gain insight into gene expression changes and to develop molecular markers at the transcriptional level, a total of six cDNA libraries were sequenced for ‘YZH’ and ‘XHT’. In total, 29.87 Gb and 21.29 Gb of clean reads were obtained (Table S7); 85.93% and 86.74% of the reads were mapped to the radish reference genome, and 78.67% and 79.06% were uniquely mapped in ‘YZH’ and ‘XHT’, respectively. The Pearson correlation coefficients among the three replicates of each sample were high (> 0.95) (Fig. 1a). Principal component analysis (PCA) indicated that PC1 and PC2 together explained 90.83% of the total variation (Fig. S2).
To explore genes associated with flowering time, genes with |log2(fold change)| ≥ 1 and padj ≤ 0.05 were defined as DEGs. In total, 11,003 DEGs were identified between the two cultivars, of which 5,020 were upregulated and 5,983 were downregulated (Fig. 1b and Additional file 5). GO annotation assigned the DEGs to 85 significantly enriched GO terms (Fig. 1c and Additional file 6). In the molecular function category (43 terms), the major subcategories were iron ion binding, oxidoreductase activity and catalytic activity. In the biological process category (34 terms), catabolic process, organic substance catabolic process and cofactor metabolic process were the dominant terms. In the cellular component category (8 terms), the ‘endoplasmic reticulum’ and ‘extracellular region’ terms were the most prominent. A total of 53,343 SNPs were identified between the two cultivars from the transcriptome data (Additional file 7), with an average frequency of 1 SNP/8.43 kb. InDels of 1 to 2 bp accounted for 72.23% (3,075) of all InDels (4,257) (Fig. S1 and Additional file 8).
Screening, validation and polymorphism of developed InDel markers
To validate polymorphic InDel markers, 260 genomic and 40 transcriptomic InDels were randomly selected from the genome resequencing and transcriptome sequencing data of the two cultivars, respectively. To facilitate visualization on PAGE gels, only InDels with lengths of at least 3 bp were selected for PCR validation. In total, 85 genomic and 15 transcriptomic InDels were amplified successfully and were polymorphic between the two cultivars, giving marker polymorphism rates of 32.7% and 37.5%, respectively. The polymorphic amplification results of the RsInDelR4-18 primer pair in the 200 radish cultivars are shown in Fig. 2. The 100 polymorphic InDel markers were used for further analysis.
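The polymorphism rates quoted above follow directly from the counts of tested and polymorphic InDels, as this short Python check shows. The combined rate across both marker sets is derived here for illustration and is not a figure reported in the text.

```python
def polymorphism_rate(polymorphic, tested):
    """Percentage of tested InDels that were polymorphic between the cultivars."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return 100.0 * polymorphic / tested


genomic = polymorphism_rate(85, 260)
transcriptomic = polymorphism_rate(15, 40)
combined = polymorphism_rate(85 + 15, 260 + 40)  # derived, not reported

print(f"genomic: {genomic:.1f}%")                # genomic: 32.7%
print(f"transcriptomic: {transcriptomic:.1f}%")  # transcriptomic: 37.5%
print(f"combined: {combined:.1f}%")              # combined: 33.3%
```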
Genetic diversity analysis and population structure
To assess the genetic diversity of the 200 Chinese radish accessions, the 100 polymorphic InDel markers were used for genotyping. The markers generated 207 alleles among the 200 accessions. The number of alleles (Na) per marker varied from 2 to 4, with an average of 2.07 (Additional file 2). The major allele frequency (MAF) ranged from 0.48 to 0.93, with an average of 0.67. The expected homozygosity (HO) of each InDel varied from 0.09 to 0.9, with an average of 0.62. The expected heterozygosity (HE) of each InDel ranged from 0.1 to 0.92, with an average of 0.38. The polymorphism information content (PIC) per locus ranged from 0.13 to 0.53, with an average of 0.33.
Population structure analysis of the 200 radish accessions was conducted with the 100 InDel markers using Structure 2.3.3. Delta K reached its maximum at K = 2, indicating that the 200 accessions could be divided into two groups (Fig. 3a and b). Six of the seven accessions from Korea, one of the two accessions from Japan, and the majority of the late-flowering accessions from northern China (Beijing, Shandong, Heilongjiang) were clustered into one group (Fig. 3c). The majority of accessions from southwest China (Guizhou, Yunnan, Sichuan) were clustered into the other group. Cluster analysis likewise divided the 200 cultivars into two groups at a genetic distance of 0.58 (Fig. 4). Group I contained 122 accessions; most of them (58.4%) were derived from northern China, and the seven accessions from Korea and two from Japan also fell into this group. Group II contained 78 accessions, most of which (91.75%) were collected from southwest China. The results of the cluster and population structure analyses were largely consistent, with slight differences. These results demonstrate that the phylogenetic relationships among the accessions are highly correlated with their geographical origins.
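The Delta K statistic mentioned above is commonly computed with the Evanno approach; the text does not name the exact procedure, so the following Python sketch is an assumption. It takes the log-likelihood L(K) of each Structure run, grouped by K, and uses the absolute second difference of the mean L(K) divided by the standard deviation of L(K) across runs. The log-likelihood values in the example are invented.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Delta K for every K that has both neighbours K-1 and K+1.

    log_likelihoods maps K -> list of L(K) values, one per independent run.
    Delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K))
    """
    delta = {}
    for k in sorted(log_likelihoods):
        if (k - 1) not in log_likelihoods or (k + 1) not in log_likelihoods:
            continue  # undefined at the smallest and largest K tested
        spread = stdev(log_likelihoods[k])
        if spread == 0:
            continue  # avoid division by zero when all runs agree exactly
        second_difference = abs(
            mean(log_likelihoods[k + 1])
            - 2 * mean(log_likelihoods[k])
            + mean(log_likelihoods[k - 1])
        )
        delta[k] = second_difference / spread
    return delta


def best_k(log_likelihoods):
    """K with the largest Delta K."""
    delta = delta_k(log_likelihoods)
    return max(delta, key=delta.get)


# Invented example with three runs per K
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4000.0, -4001.0, -3999.0],
    3: [-3950.0, -3960.0, -3940.0],
    4: [-3940.0, -3900.0, -3980.0],
}
print(best_k(runs))  # 2
```

Because the statistic needs both neighbours, a scan of K from 1 to 10 yields Delta K values only for K = 2 to 9.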
Identification of homologous genes for flowering time
To identify radish homologs of flowering-time related (FTR) genes, 306 Arabidopsis FTR genes were downloaded from the Flowering Interactive Database (http://www.phytosystems.ulg.ac.be/florid/) and aligned to the radish reference genome using BLASTN. Top hits with E-values ≤ 1E−20 and identity ≥ 80% were retained as the corresponding homologous genes. In total, 637 radish FTR genes were identified and assigned to nine flowering-related pathways (Additional file 9). The autonomous pathway contained the largest number of FTR genes (286) and the sugar pathway the smallest (31) (Additional file 5: Table S8). DEGs associated with the flowering pathways between the two cultivars were screened with |log2 fold change| > 1 and FDR < 0.05. A total of 145 differentially expressed flowering time genes were identified, of which 74 were upregulated and 71 were downregulated (Additional file 10). The differentially expressed FTR genes were mainly associated with the autonomous (50), photoperiod (39) and vernalization (12) pathways. Rs583930 and Rs094390, homologs of FT, play key roles as flowering time integrators. In the autonomous pathway, UBIQUITIN CARRIER PROTEIN 1 (UBC1), FLOWERING LOCUS Y (FY), AGAMOUS-LIKE 6 (AGL6), FLOWERING LOCUS VE (FVE), RELATIVE OF EARLY FLOWERING 6 (REF6) and EMBRYONIC FLOWER 2 (EMF2) were detected. Several key genes in the photoperiod pathway were also identified, such as PSEUDORESPONSE REGULATORS 3 (PRR3), PRR5, CYCLING DOF FACTOR 1 (CDF1), CDF4, CDF5, CRY2, PHYTOCHROME A (PHYA), CALCIUM-DEPENDENT PROTEIN KINASES 6 (CPK6), CPK33, PHYB, BBX19, CONSTANS (CO) and EARLY FLOWERING 4 (ELF4). Genes known to be involved in the vernalization pathway were identified, such as VERNALIZATION INSENSITIVE 3 (VIN3), INDUCER OF CBF EXPRESSION 1 (ICE1), FRIGIDA (FRI) and AGL19. LEAFY (LFY), AGL25, AGL27, TCP18 and APETALA 2 (AP2), which are involved in the flower development and meristem identity pathways, were also detected.
Many vital genes in the aging (SPL15), ambient temperature (AGL31, PHYTOCHROME INTERACTING FACTOR 4 (PIF4), AGL27), hormone (GA2ox1 and GA2ox6), sugar (SUCROSE TRANSPORTER 9, SUC9) and circadian clock (CIRCADIAN CLOCK–ASSOCIATED 1 (CCA1), PRR5) pathways, as well as the flowering time integrator genes SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) and FT, were also detected.
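The homolog screen described above (E-value ≤ 1E−20 and identity ≥ 80%) can be sketched in Python as a filter over BLASTN tabular output. The sketch assumes the standard 12-column tabular format (`-outfmt 6`), in which column 3 is percent identity and column 11 is the E-value; the text does not say which output format was used. It keeps every hit that passes both thresholds, because more radish genes (637) than Arabidopsis queries (306) were reported. The identifiers in the example are placeholders.

```python
def screen_homologs(blast_lines, max_evalue=1e-20, min_identity=80.0):
    """Map each query to the subjects passing both thresholds.

    Expects BLAST tabular lines with the standard 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore
    """
    homologs = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        evalue = float(fields[10])
        if evalue <= max_evalue and identity >= min_identity:
            subjects = homologs.setdefault(query, [])
            if subject not in subjects:
                subjects.append(subject)
    return homologs


def make_hit(query, subject, identity, evalue):
    """Build a placeholder tabular line for the example below."""
    cols = [query, subject, str(identity), "500", "10", "0",
            "1", "500", "1", "500", str(evalue), "400"]
    return "\t".join(cols)


hits = [
    make_hit("queryA", "RsPlaceholder1", 92.0, 1e-80),
    make_hit("queryA", "RsPlaceholder2", 85.0, 1e-30),
    make_hit("queryA", "RsPlaceholder3", 78.0, 1e-40),  # identity too low
    make_hit("queryB", "RsPlaceholder4", 90.0, 1e-10),  # E-value too high
]
print(screen_homologs(hits))
# {'queryA': ['RsPlaceholder1', 'RsPlaceholder2']}
```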
Many studies have reported that SNPs and InDels within or near coding sequences (CDS) are significantly associated with important agronomic traits in crops. To identify genetic variation in flowering-related genes, the SNPs and InDels located in flowering-related DEGs were examined. A total of 150 transcriptomic SNPs and 9 transcriptomic InDels were obtained within 30 and 6 differentially expressed FTR genes, respectively (Additional file 11). Twelve differentially expressed FTR genes carried at least 2 SNPs, including PHYA, TEM1, VIN3, PIF5, ADG1, FVE and ELF3. Three differentially expressed FTR genes carried at least 2 InDels: ELF3, VIL2 and TPS1. These results provide valuable information for explaining the significant difference in flowering time between ‘YZH’ and ‘XHT’.
To verify the expression patterns of DEGs identified by RNA-Seq, ten candidate DEGs involved in the flowering pathways were selected for qRT-PCR analysis in leaves of ‘YZH’ and ‘XHT’ (Fig. 5). Five DEGs (RsAGL25, RsVIN3, RsUBC1, CO and RsPIF4) were highly expressed in ‘YZH’ across the eight time points. RsSOC1, RsCCA1 and RsVIN3 were highly expressed at 30 d of vernalization. RsELF7, RsLFY and RsVIN3 were highly expressed at 15 d of vernalization in ‘XHT’. RsFT was highly expressed at first flowering. These results suggest that these DEGs may be involved in the phase transition from the vegetative stage to bolting and flowering in radish.
Identification of SNP and InDel markers in radish
SNPs and InDels are the most abundant variations in plant genomes and have been widely used in genetic diversity analysis, gene mapping, genome-wide association mapping and marker-assisted selection breeding [5, 28]. The release of radish genome sequences provides an important reference for resequencing different radish varieties. Although a large number of InDel markers have been reported in radish, their number is still insufficient. In this study, 366,679 genomic SNPs and 97,973 genomic InDels were identified between ‘YZH’ and ‘XHT’, with average frequencies of 1 SNP/1.22 kb and 1 InDel/4.6 kb. The frequency of genomic SNPs was markedly lower than previously described in radish (average occurrence of 3.9/kb). The frequencies of genomic SNPs and InDels in radish also differed markedly from those in other plant species, including Arabidopsis, Brassica rapa and tea plant. Differences in filtering criteria and in genetic structure among plant species are likely to cause these differences in SNP and InDel density. In a previous study, 22,412 SNPs and 9,436 InDels were identified in three radish transcriptomes. In this study, 53,343 SNPs and 4,257 InDels were identified from the transcriptome sequencing data of two cultivars. The large discrepancy in SNP and InDel numbers may be caused by the different materials and variant calling strategies. Short InDels (1–2 bp) were the most prevalent type in the radish genome and transcriptome, which is consistent with previous studies in tea plant and sesame. Previous studies indicated that an InDel within the RsRf3 locus plays a crucial role in the transition to fertility in cytoplasmic male sterile radish. SNPs and InDels in the RsTT8 promoter successfully distinguished red-fleshed from white-fleshed radish.
These studies demonstrate that SNPs and InDels within functional genes are involved in important agronomic traits of radish. The newly identified SNPs and InDels provide abundant data for genetics and functional genomics research in radish.
Development of InDel markers and application of germplasm resources
PCR-based InDel markers, which are co-dominant, inexpensive and highly polymorphic, are favored by more and more researchers for gel-based genotyping. In this study, 85 genomic and 15 transcriptomic polymorphic InDels were validated by PCR, corresponding to marker polymorphism rates of 32.7% and 37.5%, respectively, which were lower than the previously reported rate of 70%. The mean PIC value of the InDels across the 200 genotyped accessions was 0.33, lower than the PIC of radish SSR markers, which may be explained by the fact that most InDels are single-locus whereas SSRs are multi-locus [34, 35]. The alleles of single-locus markers can be assigned to particular genomic loci, which is convenient for rapid genetic and breeding studies.
Analysis of the genetic structure of different genotypes is beneficial for developing varieties with a wide range of genetic backgrounds. In the present study, the 200 radish accessions were partitioned into two groups by population structure analysis. The radish cultivars from northern China clustered together, and the cultivars from southwest China clustered together. UPGMA cluster analysis also grouped the 200 accessions into two major groups. Previous studies indicated that flowering time increases with latitude of origin. It is therefore important to introduce late-flowering genes from high-latitude germplasm into radish. These results indicate that the InDel markers developed in this study are useful for genetic diversity analysis in radish.
SNP and InDel markers within differentially expressed FTR genes
It is widely accepted that the transition from vegetative to reproductive growth plays an important role in the life cycle of angiosperms. Leaf tissues of the early- and late-flowering cultivars at the vegetative stage were used for transcriptome sequencing. A total of 145 flowering time-related DEGs were detected in radish by BLASTN analysis. Many studies have indicated that InDels play crucial roles in the regulation of flowering time in radish [11, 39]. Vernalization promotes flowering in late-flowering plants. FLC, which encodes a MADS-box protein, is a central floral repressor in Arabidopsis. Previous studies found that a 1627-bp insertion in the first intron of the RsFLC2 gene in “Ninengo” plants results in late bolting. One 9-bp deletion and two insertions (8 and 9 bp) were identified in the intron and promoter of the RsFLC3 gene. In this study, the RsFLC genes were expressed in both cultivars but were not differentially expressed, indicating that RsFLC expression did not differ between the cultivars at the vegetative phase. In wheat, VRN1 acts upstream of FT and induces the upregulation of FT expression under LD conditions. In radish, a 647-bp insertion in the promoter region of the RsVRN1 gene led to the late-bolting phenotype in NAU-LB. The florigen FT plays a distinctive role in regulating the transition to reproductive development and flowering in plants. The floral meristem identity gene LFY plays an important role in regulating Arabidopsis floral development. Previous studies found a 16-bp deletion and an 18-bp deletion in the introns of the FT and LFY genes in radish. In this study, the RsFT and RsLFY genes were differentially expressed between the two cultivars according to transcriptome sequencing, which was confirmed by qRT-PCR analysis.
In this study, SNPs and InDels within differentially expressed FTR genes were identified, such as eleven SNPs and two InDels in RsELF3, ten SNPs in RsTEM1, five SNPs in RsVIN3, two SNPs and two InDels in RsCDF5, and one SNP each in RsFVE, RsEMF2, RsAGL6 and RsCPK33. ELF3 regulates vegetative photomorphogenesis and the photoperiodic induction of flowering in Arabidopsis. TEM1, in combinatorial interaction with FT, represses the floral transition in Arabidopsis. The level of VIN3 expression is associated with the duration of cold exposure and the degree of FLC repression in Arabidopsis. Previous studies observed that the CDF5 protein delays flowering by directly repressing FT transcription. FVE, a component of the autonomous pathway, participates in a protein complex that represses FLC expression. Loss-of-function mutations of EMF2 lead to early flowering in Arabidopsis, confirming the important role of EMF2 in phase transitions by restraining reproductive development. AGL6 enhanced FT expression in the flc-3 background, and FLC expression was downregulated in the agl6-1D mutant. CPK33 is an important component of florigen complex formation through FD phosphorylation. These results provide marker data for research on genes involved in bolting and flowering in radish.
In this study, a large number of genomic and transcriptomic SNPs and InDels were identified by genome and transcriptome sequencing of ‘YZH’ and ‘XHT’. A total of 100 novel InDel markers were developed and used to genotype 200 radish accessions. Population structure and phylogenetic analyses revealed that the radish cultivars from northern China clustered together and the cultivars from southwest China clustered together. RNA-Seq analysis detected 145 flowering time-related DEGs, most of which were involved in the flowering time integrator, circadian clock/photoperiod, autonomous, and vernalization pathways. The transcriptomic SNPs and InDels within differentially expressed FTR genes were detected. These findings provide insight into the development and application of molecular markers for genetic diversity analysis and provide information on SNPs and InDels within flowering time-related DEGs in radish.
Availability of data and materials
Raw sequencing reads of RNA-seq in this paper have been deposited in the NCBI Sequence Read Archive (SRA) under BioProject accession number PRJNA874186.
RFLPs: Restriction fragment length polymorphisms
RAPDs: Random amplified polymorphic DNA
AFLPs: Amplified fragment length polymorphisms
SSRs: Simple sequence repeats
QTL: Quantitative trait locus
FLC: FLOWERING LOCUS C
FT: FLOWERING LOCUS T
FTR: Flowering-time related gene
DEGs: Differentially expressed genes
PCR: Polymerase chain reaction
Na: Number of alleles
PIC: Polymorphism information content
MAF: Major allele frequency
PCA: Principal component analysis
UBC1: UBIQUITIN CARRIER PROTEIN 1
FY: FLOWERING LOCUS Y
FVE: FLOWERING LOCUS VE
REF6: RELATIVE OF EARLY FLOWERING 6
EMF2: EMBRYONIC FLOWER 2
PRR3: PSEUDORESPONSE REGULATORS 3
CDF1: CYCLING DOF FACTOR 1
CPK6: CALCIUM-DEPENDENT PROTEIN KINASES 6
VIN3: VERNALIZATION INSENSITIVE 3
ICE1: INDUCER OF CBF EXPRESSION 1
PIF4: PHYTOCHROME INTERACTING FACTOR 4
SOC1: SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1
Li WP, Cheng JW, Wu ZM, Qin C, Tan S, Tang X, et al. An InDel-based linkage map of hot pepper (Capsicum annuum). Mol Breed. 2015;35(1):1–10. https://doi.org/10.1007/s11032-015-0219-3.
Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML, et al. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat Rev Genet. 2011;12(7):499–510. https://doi.org/10.1038/nrg3012.
Luo XB, Xu L, Liang DY, Wang Y, Zhang W, Zhu XW, et al. Comparative transcriptomics uncovers alternative splicing and molecular marker development in radish (Raphanus sativus L.). BMC Genomics. 2017;18(1):1–15. https://doi.org/10.1186/s12864-017-3874-4.
Liu S, An Y, Tong W, Qin XJ, Samarina L, Guo R, et al. Characterization of genome-wide genetic variations between two varieties of tea plant (Camellia sinensis) and development of InDel markers for genetic research. BMC Genomics. 2019;20(1):1–16. https://doi.org/10.1186/s12864-019-6347-0.
Ye WJ, Yang Y, Wang PR, Zhang Y, Zhang LY, Tian DF, et al. InDel marker development and QTL analysis of agronomic traits in mung bean [Vigna radiate (L.) Wilczek]. Mol Breed. 2021;41(10):1–14. https://doi.org/10.1007/s11032-021-01233-0.
Mahmood S, Li Z, Yue X, Wang B, Chen J, Liu K. Development of INDELs markers in oilseed rape (Brassica napus L.) using re-sequencing data. Mol Breed. 2016;36(6):1–13. https://doi.org/10.1007/s11032-016-0501-z.
Pan G, Li Z, Huang SQ, Tao J, Shi YL, Chen AG, et al. Genome-wide development of insertion-deletion (InDel) markers for Cannabis and its uses in genetic structure analysis of Chinese germplasm and sex-linked marker identification. BMC Genomics. 2021;22(1):1–12. https://doi.org/10.1186/s12864-021-07883-w.
Ye J, Wang X, Hu TX, Zhang FX, Wang B, Li CX, et al. An InDel in the promoter of Al-ACTIVATED MALATE TRANSPORTER9 selected during tomato domestication determines fruit malate contents and aluminum tolerance. Plant Cell. 2017;29(9):2249–68. https://doi.org/10.1105/tpc.17.00211.
Wang JL, Kong LP, Yu KC, Zhang FG, Shi XY, Wang YP, et al. Development and validation of InDel markers for identification of QTL underlying flowering time in soybean. Crop J. 2018;6(2):126–35. https://doi.org/10.1016/j.cj.2017.08.001.
Cheng XF, Xie HL, Zhang KH, Wen JQ. Enabling Medicago truncatula forward genetics: identification of genetic crossing partner for R108 and development of mapping resources for Tnt1 mutants. Plant J. 2022;111:608–16. https://doi.org/10.1111/tpj.15797.
Wang J, Qiu Y, Cheng F, Chen X, Zhang X, Wang H, et al. Genome-wide identification, characterization, and evolutionary analysis of flowering genes in radish (Raphanus sativus L.). BMC Genomics. 2017;18(1):1–10. https://doi.org/10.1186/s12864-017-4377-z.
Wang QB, Zhang YJ, Zhang L. A naturally occurring insertion in the RsFLC2 gene associated with late-bolting trait in radish (Raphanus sativus L.). Mol Breed. 2018;38(11):1–12. https://doi.org/10.1007/s11032-018-0897-8.
Luo XB, Xu L, Wang Y, Dong JH, Chen YL, Tang MJ, et al. An ultra‐high‐density genetic map provides insights into genome synteny, recombination landscape and taproot skin colour in radish (Raphanus sativus L.). Plant Biotechnol J. 2020;18(1):274–286. https://doi.org/10.1111/pbi.13195
Fornara F, de Montaigu A, Coupland G. SnapShot: control of flowering in Arabidopsis. Cell. 2010;141(3):550–550. https://doi.org/10.1016/j.cell.2010.04.024.
Bouche F, Lobet G, Tocquin P, Perilleux C. FLOR-ID: an interactive database of flowering-time gene networks in Arabidopsis thaliana. Nucleic Acids Res. 2016;44(D1):D1167–71. https://doi.org/10.1093/nar/gkv1054.
Corbesier L, Vincent C, Jang S, Fornara F, Fan Q, Searle I, et al. FT protein movement contributes to long-distance signaling in floral induction of Arabidopsis. Science. 2007;316(5827):1030–3. https://doi.org/10.1126/science.1141752.
Nie SS, Li C, Xu L, Wang Y, Huang DQ, Muleke EM, et al. De novo transcriptome analysis in radish (Raphanus sativus L.) and identification of critical genes involved in bolting and flowering. BMC Genomics. 2016;17(1):1–16. https://doi.org/10.1186/s12864-016-2633-2
Jung WY, Park HJ, Lee A, Lee SS, Kim YS, Cho HS. Identification of flowering-related genes responsible for differences in bolting time between two radish inbred lines. Front Plant Sci. 2016;7:1844. https://doi.org/10.3389/fpls.2016.01844.
Ma Y, Chhapekar SS, Rameneni JJ, Kim S, Gan TH, Choi SR, et al. Identification of qtls and candidate genes related to flower traits and bolting time in radish (Raphanus sativus L.). Agronomy. 2021;11(8):1623. https://doi.org/10.3390/agronomy11081623
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. https://doi.org/10.1093/bioinformatics/btp324.
Mun JH, Chung H, Chung WH, Oh M, Jeong YM, Kim N, et al. Construction of a reference genetic map of Raphanus sativus based on genotyping by whole-genome resequencing. Theor Appl Genet. 2015;128(2):259–72. https://doi.org/10.1007/s00122-014-2426-4.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. https://doi.org/10.1093/bioinformatics/btp352.
Zhang XY, Sun XZ, Zhang S, Yang JH, Liu FF, Fan J. Comprehensive transcriptome analysis of grafting onto Artemisia scoparia W. to affect the aphid resistance of chrysanthemum (Chrysanthemum morifolium T.). BMC Genomics. 2019;20(1):1–14. https://doi.org/10.21203/rs.2.10583/v5
Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37(8):907–15. https://doi.org/10.1038/s41587-019-0201-4.
Yu GC, Wang LG, Han YY, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7. https://doi.org/10.1089/omi.2011.0118.
Liu K, Muse SV. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics. 2005;21(9):2128–9. https://doi.org/10.1093/bioinformatics/bti282.
Zhang TF, Gu MF, Liu YH, Lv YD, Zhou L, Lu HY, et al. Development of novel InDel markers and genetic diversity in Chenopodium quinoa through whole-genome re-sequencing. BMC Genomics. 2017;18(1):1–15. https://doi.org/10.1186/s12864-017-4093-8.
Wei LB, Miao HM, Li C, Duan, YH, Niu JJ. Zhang TD, et al. Development of SNP and InDel markers via de novo transcriptome assembly in Sesamum indicum L. Mol Breed. 2014;34(4):2205–2217. https://doi.org/10.1007/s11032-014-0174-4
Jander G, Norris SR, Rounsley SD, Bush DF, Levin IM, Last RL. Arabidopsis Map-Based Cloning in the Post-Genome Era. Plant Physiol. 2002;129(2):440–50. https://doi.org/10.1104/pp.003533.
Liu B, Wang Y, Zhai W, Deng J, Wang H, Cui Y, et al. Development of InDel markers for Brassica rapa based on whole-genome re-sequencing. Theor Appl Genet. 2013;126(1):231–9. https://doi.org/10.1007/s00122-012-1976-6.
Wang ZW, Wang CD, Mei SY, Gao L, Zhou Y, Wang T. An insertion–deletion at a pentatricopeptide repeat locus linked to fertility transition to cytoplasmic male sterility in radish (Raphanus sativus L.). Mol Breed. 2015;35(4):1–5. https://doi.org/10.1007/s11032-015-0301-x
Kim S, Yun K, Park HY, Ahn JY, Yang JY, Song H, et al. Development of Molecular Markers for Predicting Radish (Raphanus sativus) Flesh Color Based on Polymorphisms in the RsTT8 Gene. Plants. 2021;10(7):1386. https://doi.org/10.3390/plants10071386.
Jain A, Roorkiwal M, Kale S, Garg V, Yadala R, Varshney RK. InDel markers: An extended marker resource for molecular breeding in chickpea. PLoS One. 2019;14(3):e0213999. https://doi.org/10.1371/journal.pone.0213999
Wang QB, Zhang L, Zheng PJ. Genetic diversity and evolutionary relationship analyses within and among Raphanus species using EST-SSR markers. Mol Breed. 2015;35(2):1–12. https://doi.org/10.1007/s11032-015-0261-1.
Lee ON, Park HY. Assessment of genetic diversity in cultivated radishes (Raphanus sativus) by agronomic traits and SSR markers. Sci Hortic. 2017;223:19–30. https://doi.org/10.1016/j.scienta.2017.05.025.
Zhou XJ, Dong Y, Zhao JJ, Huang L, Ren XP, Chen YN, et al. Genomic survey sequencing for development and validation of single-locus SSR markers in peanut (Arachis hypogaea L.). BMC Genomics. 2016;17(1):1–14. https://doi.org/10.1186/s12864-016-2743-x
Kang ES, Ha SM, Ko HC, Yu H J, Chae WB. Reproductive traits and molecular evidence related to the global distribution of cultivated radish (Raphanus sativus L.). Plant Syst Evol. 2016;302(10):1367–1380. https://doi.org/10.1007/s00606-016-1336-0
Wellmer F, Riechmann JL. Gene networks controlling the initiation of flower development. Trends Genet. 2010;26(12):519–27. https://doi.org/10.1016/j.tig.2010.09.001.
Xu L, Wang Y, Dong J, Zhang W, Tang M, Zhang W, et al. A chromosome‐level genome assembly of radish (Raphanus sativus L.) reveals insights into genome adaptation and differential bolting regulation. Plant Biotechnol J. 2023;21:990–1004. https://doi.org/10.1111/pbi.14011.
Sheldon CC, Rouse DT, Finnegan EJ, Peacock WJ, Dennis ES. The molecular basis of vernalization: the central role of FLOWERING LOCUS C (FLC). Proc Natl Acad Sci. 2000;97(7):3753–8. https://doi.org/10.1073/pnas.060023597.
Shimada S, Ogawa T, Kitagawa S, Suzuki T, Ikari C, Shitsukawa N, et al. A genetic network of flowering-time genes in wheat leaves, in which an APETALA1/FRUITFULL-like gene, VRN1, is upstream of FLOWERING LOCUS T. Plant J. 2009;58(4):668–81. https://doi.org/10.1111/j.1365-313X.2009.03806.x.
Wickland DP, Hanzawa Y. The FLOWERING LOCUS T/TERMINAL FLOWER 1 gene family: functional evolution and molecular mechanisms. Mol Plant. 2015;8(7):983–97. https://doi.org/10.1016/j.molp.2015.01.007.
Benlloch R, Kim MC, Sayou C, Thévenon E, Parcy F, Nilsson O. Integrating long-day flowering signals: a LEAFY binding site is essential for proper photoperiodic activation of APETALA1. Plant J. 2011;67(6):1094–102. https://doi.org/10.1111/j.1365-313X.2011.04660.x.
Lu SX, Webb CJ, Knowles SM, Kim SH, Wang Z, Tobin EM. CCA1 and ELF3 Interact in the control of hypocotyl length and flowering time in Arabidopsis. Plant Physiol. 2012;158(2):1079–88. https://doi.org/10.1104/pp.111.189670.
Hu H, Tian S, Xie G, Liu R, Wang N, Li S, et al. TEM1 combinatorially binds to FLOWERING LOCUS T and recruits a Polycomb factor to repress the floral transition in Arabidopsis. Proc Natl Acad Sci. 2021;118(35):e2103895118. https://doi.org/10.1073/pnas.2103895118/-/DCSupplemental.
Sung S, Amasino RM. Vernalization in Arabidopsis thaliana is mediated by the PHD finger protein VIN3. Nature. 2004;427(6970):159–64. https://doi.org/10.1038/nature02195.
Henriques R, Wang H, Liu J, Boix M, Huang LF, Chua NH. The antiphasic regulatory module comprising CDF5 and its antisense RNA FLORE links the circadian clock to photoperiodic flowering. New Phytol. 2017;216(3):854–67. https://doi.org/10.1111/nph.14703.
Ausín I, Alonso-Blanco C, Jarillo JA, Ruiz-García L, Martínez-Zapater JM. Regulation of flowering time by FVE, a retinoblastoma-associated protein. Nat Genet. 2004;36(2):162–6. https://doi.org/10.1038/ng1295.
Yoshida N, Yanai Y, Chen L, Kato Y, Hiratsuka J, Miwa T, et al. EMBRYONIC FLOWER2, a novel polycomb group protein homolog, mediates shoot development and flowering in Arabidopsis. Plant Cell. 2001;13(11):2471–81. https://doi.org/10.1105/tpc.010227.
Yoo SK, Wu X, Lee JS, Lee JS, Ahn JH. AGAMOUS-LIKE 6 is a floral promoter that negatively regulates the FLC/MAF clade genes and positively regulates FT in Arabidopsis. Plant J. 2011;65(1):62–76. https://doi.org/10.1111/j.1365-313x.2010.04402.x.
Kawamoto N, Sasabe M, Endo M, Machida Y, Araki T. Calcium-dependent protein kinases responsible for the phosphorylation of a bZIP transcription factor FD crucial for the florigen complex formation. Sci Rep. 2015;5(1):1–9. https://doi.org/10.1038/srep08341.
This work was supported by the National Natural Science Foundation of China (31960598) and Guizhou Provincial Science and Technology Projects (2256 and general 213).
Ethics approval and consent to participate
Not applicable. All radish materials collected in this study are available at the Institute of Vegetable Industry Technology Research, Guizhou University, with permission from the Competent Authority. Experimental research on plants (either cultivated or wild), including the collection of plant material, complies with relevant institutional, national, and international guidelines and legislation.
Consent for publication
Competing interests
The authors declare no conflict of interest.
Fig. S1. The length of genomic InDel markers between two radish cultivars.
Fig. S2. Principal component analysis of all samples based on the FPKM values of all transcripts.
Fig. S3. The length of transcriptomic InDel markers between two radish cultivars.
Fig. S4. The full-length gels of RsInDelR4-18 in 200 accessions.
Table S1. Radish materials used in this study.
Table S2. Primer information for qRT-PCR.
Table S4. Summary of genome resequencing data in two radish cultivars.
Table S7. Summary of transcriptome data in two radish cultivars.
The primers of InDels used in this study and genetic diversity analysis data.
The information of SNPs between two radish cultivars in resequencing data.
The information of InDels between two radish cultivars in resequencing data.
The information of detected DEGs between two radish cultivars.
The enriched GO terms of DEGs between two radish cultivars.
The information of SNPs between two radish cultivars in transcriptome data.
The information of InDels between two radish cultivars in transcriptome data.
Flowering time-related (FTR) genes in radish identified using Arabidopsis FTR genes as queries by BLASTN analysis.
Identification of differentially expressed FTR genes between two radish cultivars.
The information of SNPs and InDels detected in differentially expressed FTR genes.
Cite this article
Li, Y., Luo, X., Peng, X. et al. Development of SNP and InDel markers by genome resequencing and transcriptome sequencing in radish (Raphanus sativus L.). BMC Genomics 24, 445 (2023). https://doi.org/10.1186/s12864-023-09528-6