Development of PCR markers specific to Dasypyrum villosum genome based on transcriptome data and their application in breeding Triticum aestivum-D. villosum#4 alien chromosome lines

Background Dasypyrum villosum is an important wild relative of wheat (Triticum aestivum L.) and harbors many desirable genes that can be used to improve various wheat traits. Compared with other D. villosum accessions, D. villosum#4 remains less studied. In particular, no D. villosum#4 chromosome other than 6V#4 has been introduced into wheat by addition, substitution, or translocation, which is an essential step in identifying and applying desirable alien genes. RNA-seq technology can generate large amounts of transcriptome sequence and thereby accelerate the development of chromosome-specific molecular markers and the marker-assisted selection of alien chromosome lines. Results We obtained the transcriptome of D. villosum#4 by high-throughput sequencing and then developed 76 markers specific to the chromosome arms of D. villosum#4 based on bioinformatic analysis of the transcriptome data. The D. villosum#4 sequences containing the specific DNA markers were expected to correspond to different genes, most of which function in metabolic processes. Subsequently, we mapped these newly developed markers to the homologous chromosomes of barley to obtain their chromosomal locations in the barley genome, and then analyzed the collinearity of the markers among D. villosum, wheat, and barley. Finally, using selected specific markers, we identified six types of T. aestivum-D. villosum#4 alien chromosome lines carrying one or more D. villosum#4 chromosomes in the BC3F5 populations derived from the cross and backcross between the T. durum–D. villosum#4 amphidiploid TH3 and wheat cv. Wan7107; some of these lines were further confirmed to be translocation or addition lines by genomic in situ hybridization (GISH). Conclusion Seventy-six PCR markers specific to chromosomes of D. villosum#4 were developed from transcriptome data in the current study, and their collinearity among D. villosum, wheat, and barley was analyzed.
Six types of Triticum aestivum-D. villosum#4 alien chromosome lines were identified using 12 of the developed markers, and some of them were further confirmed by GISH. These novel T. aestivum-D. villosum#4 chromosome lines have great potential for introducing desirable genes from D. villosum#4 into wheat by chromosomal translocation to breed new wheat varieties. Electronic supplementary material The online version of this article (10.1186/s12864-019-5630-4) contains supplementary material, which is available to authorized users.

High-throughput RNA-seq technology can generate large amounts of transcriptome sequence and accelerate the development of chromosome-specific molecular markers. Recently, RNA-seq has frequently been used to develop molecular markers specific to chromosomes of wild relatives of cultivated wheat. For example, Thinopyrum intermedium genome-specific EST-SSR markers, Agropyron cristatum chromosome 6P-specific EST markers, Aegilops longissima chromosome arm-specific PCR markers, and D. villosum#4 chromosome 6V#4S-specific PCR markers were developed using transcriptome data [31,41,42,43]. RNA-seq is therefore a promising strategy for developing molecular markers specific to different chromosome arms of wild relatives of wheat.
In this study, we obtained transcriptome data for D. villosum#4 by RNA-seq and generated unigene sequences by transcriptome assembly after removing wheat transcripts by matching against the reference genome. Primers specific to D. villosum#4 chromosomes 1V to 7V were then designed based on the unigene sequences. Furthermore, six types of candidate T. aestivum-D. villosum#4 alien chromosome lines, derived from the cross between the T. durum-D. villosum#4 amphidiploid TH3 and wheat cv. Wan7107 followed by a backcross, were identified using a subset of the developed markers and genomic in situ hybridization (GISH). The results of this study will be useful for breeding T. aestivum-D. villosum#4 addition, substitution, and translocation lines, and subsequently wheat varieties, by marker-assisted selection (MAS).

Confirmation of markers specific to D. villosum#4 chromosomes 1V to 7V
To develop molecular markers specific to each chromosome of D. villosum#4, seven T. aestivum-D. villosum#3 addition lines (DA1V#3, DA2V#3, DA3V#3, DA4V#3, DA5V#3, DA6V#3, and DA7V#3), D. villosum#4 No. 1026, and CS were amplified with the selected candidate primers; CS served as the negative control and No. 1026 as the positive control. Some primers amplified a unique band in D. villosum#4 and in the corresponding addition line (one of DA1V#3 to DA7V#3) carrying the targeted D. villosum#3 chromosome, but not in CS (Fig. 4). Other primers amplified a unique band only in D. villosum#4 No. 1026 and not in CS or any of the seven addition lines (Additional file 1), indicating abundant polymorphism between the D. villosum#4 accession used for RNA-seq and the D. villosum#3 accession that donated the alien chromosomes of the seven addition lines. In addition, some primers amplified a unique alien band in an unexpected addition line rather than in the expected one. Finally, 76 D. villosum#4 chromosome-specific primers, most of which specifically amplify bands in the addition lines DA1V#3 to DA7V#3, were obtained: 1 for 1V#4, 9 for 2V#4, 27 for 3V#4, 10 for 4V#4, 15 for 5V#4, 13 for 6V#4, and 1 for 7V#4 (Additional file 2). These specific molecular markers can potentially be used to trace different chromosome arms of D. villosum#4 carrying useful genes for wheat breeding.
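The band-scoring logic described in this section can be sketched in Python. This is only an illustration under simplifying assumptions, not the procedure used in the study: the function name `classify_primer`, the sample labels, and the reduction of each gel lane to a present/absent call are all hypothetical.

```python
# Hypothetical sketch: classify one candidate primer pair from the
# presence/absence of its diagnostic band in each template (True = present).
ADDITION_LINES = [f"DA{i}V#3" for i in range(1, 8)]  # DA1V#3 ... DA7V#3


def classify_primer(bands, expected_line):
    """Return a label describing the amplification pattern of a primer pair.

    bands: dict mapping sample name -> bool (diagnostic band present).
    expected_line: addition line predicted from the wheat homoeologous group.
    """
    # The band must be absent from the negative control (CS) and present in
    # the positive control (D. villosum#4 No. 1026) to be considered at all.
    if bands.get("CS", False) or not bands.get("No.1026", False):
        return "not specific"
    positive = [line for line in ADDITION_LINES if bands.get(line, False)]
    if not positive:
        # Band only in D. villosum#4: polymorphic between accessions #4 and #3.
        return "D. villosum#4 only"
    if positive == [expected_line]:
        return "expected chromosome"
    return "unexpected addition line"


# Made-up example: band in No. 1026 and DA3V#3 only, for a primer expected on 3V.
example = {"CS": False, "No.1026": True, "DA3V#3": True}
print(classify_primer(example, "DA3V#3"))
```

The three non-rejected labels correspond to the three outcomes reported above: amplification in the expected addition line, amplification only in D. villosum#4 No. 1026, and amplification in an unexpected addition line.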

Comparative analysis of the developed markers specific to the D. villosum#4 genome with the wheat and barley genomes
As described above, we developed the markers specific to D. villosum#4 chromosomes 1V to 7V from unigenes with the highest similarity to sequences on the chromosomes of wheat homoeologous groups 1 to 7. However, the D. villosum#4 transcriptome data (Fig. 2) showed that the D. villosum#4 unigene sequences had the highest homology with Hordeum vulgare. To clarify the collinearity of the 76 newly developed specific PCR markers with the corresponding sequences among D. villosum, wheat, and barley, the chromosomal locations of these markers in the barley genome were determined by BLASTN (Additional file 2). For most markers specific to 1V to 6V, the corresponding sequences have the same chromosomal location in wheat and barley as in D. villosum#4: the single marker specific to 1V, 7 of 9 markers specific to 2V, 23 of 27 markers specific to 3V, 4 of 10 markers specific to 4V, 14 of 15 markers specific to 5V, and 6 of 13 markers specific to 6V (Additional file 2).

Functional annotation of D. villosum#4 genes harboring the specific molecular markers
The 76 specific molecular markers were developed from D. villosum#4 transcriptome sequences and were therefore expected to correspond to D. villosum#4 or D. villosum-associated genes. The sequences containing the 76 specific molecular markers were annotated in detail using the Nr, Nt, Pfam, KOG/COG, Swiss-Prot, KEGG, and GO databases (Table 3). The annotations revealed that these markers corresponded to different genes; most genes had functions in metabolic processes, and 22 genes had no annotation. For example, markers 6V-10, 6V-11, and 6V-12 were annotated to phyB activation tagged suppressor 1 (BAS1), a gene that regulates brassinosteroid levels and light responsiveness in Arabidopsis [44]. However, it is unknown whether a gene with a similar function exists in D. villosum#4. Additionally, marker 5V-9 was annotated to E3 ubiquitin ligase which can catalyze the protein.
(Fig. 7). Based on our marker and GISH detection results, these plants most likely contain D. villosum#4 chromosome 5VL or 5VS. According to a previous study, the powdery mildew (PM) resistance gene Pm55 is located on D. villosum#2 chromosome 5VS [19]. Therefore, we inferred that there might be a homologous gene of   (Fig. 6d).

Discussion
This study was based on transcriptome data from D. villosum#4. RNA-seq is now widely used in plant science for finding new transcripts, understanding gene expression patterns, mining single-nucleotide polymorphisms (SNPs), and exploring alternative RNA splicing and gene structural variation, owing to its high sequencing depth [45][46][47][48][49][50]. In particular, RNA-seq has been used extensively to reveal the global gene expression profile at a specific time point and in a specific tissue (e.g., 0, 24, and 48 h after inoculation of leaves with Bgt). For non-model plants with limited genomic sequence information, RNA-seq can be used to discover gene coding regions; because these regions contain fewer repetitive elements and have a higher GC content than the whole genome, transcriptome data are relatively easy to assemble [51]. RNA-seq is thus a promising technique for developing molecular markers in plant taxa, especially those with limited genomic sequence resources. Compared with general molecular markers, PCR markers developed from transcriptome data have particular advantages in plant genetics and breeding [52]. First, transcriptome data contain a large amount of transcript information, so a PCR marker based on these data may be related to a definite trait, and its corresponding transcript may be directly associated with the gene controlling that trait [53]. Second, transcript sequences are highly conserved among homologous genes, so PCR markers developed from transcriptome data are broadly applicable and useful for refining linkage maps or building comparative maps among closely related species [54,55].
The transfer of foreign chromosomes into wheat through alien chromosome lines confers desirable new traits but also brings in undesirable foreign genes. Nevertheless, wheat alien chromosome lines are very important foundational materials in chromosome engineering and wheat breeding programs. At present, research on the utilization of D. villosum focuses primarily on germplasm innovation. Distant hybridization with D. villosum allows desirable genes to be introduced into the common wheat background, and this process is likely to be more successful when wheat-D. villosum alien chromosome lines are used as a bridge. Wheat-D. villosum amphidiploid, addition, substitution, and translocation lines have already been bred. To efficiently identify D. villosum chromatin in the wheat background, including targeted chromosomes, arms, or fragments, it is necessary to develop a large number of markers specific to each chromosome arm or region of D. villosum. Current research on D. villosum, including marker development, is mostly focused on the accession D. villosum#2. Although some preliminary investigations of D. villosum#4 have been performed [11,22,31,37,39,56], this genetic resource has not yet been fully exploited. Developing markers specific to the D. villosum#4 genome will accelerate the application of potentially useful genes in D. villosum#4 and further broaden the genetic background of wheat.
To develop molecular markers specific to D. villosum#4 using RNA-seq without knowing the chromosomal locations of the unigene sequences obtained from the transcriptome data, we first submitted the unigene sequences to the URGI BLAST database (https://urgi.versailles.inra.fr/blast/) to screen for unigenes with high similarity to sequences on the chromosomes of wheat homoeologous groups 1 to 7. These unigene sequences were then assumed to be located on chromosomes 1V to 7V of D. villosum#4. This might be a convenient approach for developing markers specific to D. villosum#4 chromosomes in the wheat background. Following this strategy, we assigned the D. villosum#4 unigenes to the wheat genome and then designed specific molecular markers based on sequence differences between the D. villosum#4 unigenes and their wheat counterparts.
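The assignment step described above (taking the wheat chromosome of the most similar sequence as a proxy for the D. villosum#4 chromosome) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes BLAST hits exported as 12-column tab-separated text with the bit score in the last column, and subject names that contain a wheat chromosome label such as "chr3B". The function name `assign_groups` and the example hits are hypothetical.

```python
import re
from collections import defaultdict


def assign_groups(blast_lines):
    """Map each unigene to the homoeologous group (1V-7V) of its best wheat hit.

    blast_lines: iterable of tab-separated lines with at least 12 columns
    (query id, subject id, ..., bit score in column 12).
    """
    best = {}  # query id -> (bit score, subject id)
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or comment lines
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject)

    groups = defaultdict(list)
    for query, (_, subject) in best.items():
        # Assumed subject naming: a digit 1-7 followed by subgenome A, B, or D.
        match = re.search(r"([1-7])[ABD]", subject)
        if match:
            groups[f"{match.group(1)}V"].append(query)
    return dict(groups)


# Made-up hits for illustration only.
hits = [
    "unigene1\tchr3B\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-90\t330",
    "unigene1\tchr5D\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-70\t280",
    "unigene2\tchr6A\t97.0\t150\t4\t0\t1\t150\t1\t150\t1e-70\t260",
]
print(assign_groups(hits))
```

Because the assignment is only an inference from sequence similarity, each predicted location still has to be checked by PCR on the addition lines, as was done in this study.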
Finally, we developed 76 PCR markers specific to the chromosome arms of D. villosum#4. Among them, sufficient numbers of markers were obtained for chromosomes 2V, 3V, 4V, 5V, and 6V. Transcriptome analysis indicated that the unigene sequences had the highest homology with the corresponding sequences from Hordeum vulgare (Fig. 2). We therefore determined the locations of the developed markers in the barley H genome (Additional file 2), which provides a reference for localizing these D. villosum#4-specific markers in wild relatives of common wheat. Comparing the locations of the markers in wheat homoeologous groups 1 to 7 with their locations in the barley H genome, we found that 55 markers had chromosomal locations in D. villosum#4 that were consistent with those in both wheat and barley (Additional file 2). These results suggest some affinity among the three species. Eight markers (2V-1, 2V-2, 4V-1, 4V-2, 4V-3, 6V-10, 6V-11, and 6V-12) share the same chromosomal location in wheat and barley but have a different location in D. villosum#4 (Additional file 2). Five markers (4V-9, 4V-10, 5V-12, 6V-7, and 6V-9) have the same chromosomal location in D. villosum#4 and wheat but a different one in barley. Four markers (3V-2, 3V-3, 3V-4, and 4V-4) have the same chromosomal location in D. villosum#4 and barley but a different one in wheat (Additional file 2). Four markers (3V-1, 6V-4, 6V-13, and 7V-1) have different chromosomal locations in all three species (Additional file 2). A similar phenomenon was reported previously [29].
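The five location categories above can be expressed as a small classification rule, sketched here in Python. The function name `concordance` and the encoding of each location as a homoeologous group number are illustrative assumptions; the two example markers and the category sizes are those reported in the text.

```python
def concordance(dv, wheat, barley):
    """Classify a marker by its homoeologous group number in the three genomes."""
    if dv == wheat == barley:
        return "same in all three"
    if wheat == barley:
        return "wheat = barley, D. villosum differs"
    if dv == wheat:
        return "D. villosum = wheat, barley differs"
    if dv == barley:
        return "D. villosum = barley, wheat differs"
    return "different in all three"


# Examples from the text: marker 2V-1 (2V; wheat 5DS; barley 5H)
# and marker 3V-2 (3V; wheat 5DL; barley 3H).
print(concordance(2, 5, 5))
print(concordance(3, 5, 3))

# Category sizes as reported in the text; together they cover all 76 markers.
reported = {
    "same in all three": 55,
    "wheat = barley, D. villosum differs": 8,
    "D. villosum = wheat, barley differs": 5,
    "D. villosum = barley, wheat differs": 4,
    "different in all three": 4,
}
assert sum(reported.values()) == 76
```

A marker whose homologous sequences map to more than one wheat or barley group, as reported for markers 4V-1 to 4V-3, would need a set-based comparison rather than this single-value rule.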
According to our results, the sequence highly homologous to marker 2V-1 is located on chromosome 5DS in wheat and 5H in barley, yet the marker is specifically amplified in the 2V addition line rather than the 5V addition line; the sequences highly homologous to markers 3V-2, 3V-3, and 3V-4 are located on chromosome 5DL in wheat and 3H in barley, yet these markers are specifically amplified in the 3V addition line rather than the 5V addition line; and the sequences highly homologous to markers 4V-1, 4V-2, and 4V-3 are located on chromosomes 7DS, 7AS, and 5DL in wheat and 7H and 5H in barley, yet these markers are specifically amplified in the 4V addition line rather than the 7V or 5V addition lines (Additional file 2). These observations suggest some synteny among chromosome 2V of D. villosum#4, wheat homoeologous groups 2 and 5, and barley 2H and 5H; between chromosome 3V of D. villosum#4 and wheat homoeologous groups 3 and 5; and among chromosome 4V of D. villosum#4, wheat homoeologous groups 4, 5, and 7, and barley 4H, 5H, and 7H. Previously, it was reported that 59 4A ESTs detected loci on 5BL and 5DL and 72 4A ESTs detected loci on 7AS and 7DS, which confirmed the translocations involving chromosomes 4AL, 5BL, and 7AS in wheat [57]. This finding could partly explain the altered chromosomal locations described above. Taken together, these results indicate that the collinearity of the developed markers across the chromosomal regions of D. villosum#4, wheat, and barley is interrupted. These structural differences indicate that the three species have undergone different evolutionary processes.
To date, more than 300 D. villosum accessions have been collected from different native habitats, but only a few (D. villosum#1 to D. villosum#5) have been used to develop wheat-D. villosum chromosome addition, substitution, and translocation lines [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21]. There is some genetic divergence among D. villosum accessions [24,31,39]. In this study, we were able to develop only one marker each for 1V and 7V. This may be due to the relationship between the chromosomes of D. villosum#4 and wheat and to polymorphism among D. villosum accessions. While developing the markers specific to 1V and 7V of D. villosum#4, we found that most primers (86.67% for 1V and 70.59% for 7V) amplified bands in D. villosum#4 but no unique band in DA1V#3 to DA7V#3. This indicates that chromosomes 1V and 7V may be polymorphic between D. villosum#4 and D. villosum#3. The specific reasons for this phenomenon need further study. In the near future, we will also try to map the corresponding D. villosum#4 unigenes to the barley H genome to develop 1V- and 7V-specific markers.

Conclusions
Dasypyrum villosum is an important wild relative of wheat and contains many novel genes that can be used to improve the agronomic traits of wheat. Most studies on D. villosum have focused on D. villosum#1, D. villosum#2, and D. villosum#3, whereas relatively few have investigated D. villosum#4. We obtained transcriptome data for D. villosum#4 using RNA-seq and developed 76 molecular markers specific to D. villosum#4 chromosomes 1V to 7V from these data. Furthermore, six types of T. aestivum-D. villosum#4 alien chromosome lines were identified using 12 of the developed markers, and some of them were further confirmed by GISH.
These novel T. aestivum-D. villosum#4 chromosome lines have great potential to be used for the introduction of desirable genes from D. villosum#4 into wheat by chromosomal translocation to breed new wheat varieties.

RNA-seq and transcriptome assembly
Fresh leaves of D. villosum#4 accession No. 1026 were collected at the seedling stage, and RNA was extracted using TRIzol (Thermo Fisher Scientific Inc., Shanghai, China). RNA quality, purity, integrity, and concentration were assessed by the methods described in our previous publications [31,43]. RNA-seq was carried out at Beijing Novogene Company, Beijing, China (http://www.novogene.com/). Sequencing libraries were generated using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB, USA). The libraries were then sequenced on the Illumina HiSeq platform to generate paired-end reads. Raw data were processed with in-house Perl scripts, and clean data were obtained by removing low-quality reads. The high-quality clean data were used for further analyses. Transcriptome assembly was conducted using the Trinity software with the '--min_kmer_cov' parameter (minimum k-mer coverage) set to 2 and all other parameters set to their defaults [58].
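For orientation, an assembly call with the stated '--min_kmer_cov' setting might be put together as in the Python sketch below. Only the value 2 for '--min_kmer_cov' comes from the text; the file names, memory limit, CPU count, and output directory are placeholder assumptions, and the helper name `trinity_command` is hypothetical. The sketch only builds and prints the command line and does not run Trinity.

```python
import shlex


def trinity_command(left, right, out_dir, min_kmer_cov=2,
                    max_memory="50G", cpu=8):
    """Build a Trinity argument list for paired-end FASTQ input."""
    return [
        "Trinity",
        "--seqType", "fq",            # input reads are FASTQ
        "--left", left,               # read 1 of each pair
        "--right", right,             # read 2 of each pair
        "--min_kmer_cov", str(min_kmer_cov),
        "--max_memory", max_memory,   # placeholder resource limit
        "--CPU", str(cpu),            # placeholder thread count
        "--output", out_dir,
    ]


# Placeholder file names; the actual read files are not named in the text.
cmd = trinity_command("clean_1.fq.gz", "clean_2.fq.gz", "trinity_out")
print(shlex.join(cmd))
```

Raising the minimum k-mer coverage above 1 discards k-mers seen only once, which reduces the influence of sequencing errors at the cost of sensitivity for weakly expressed transcripts.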

Gene functional annotation
Gene function annotation was performed using Nr, Nt, Pfam, KOG/COG, Swiss-Prot, KO, and GO databases.

Development of markers specific to the genome of D. villosum#4 based on RNA-seq data
The strategy for screening candidate markers specific to the genome of D. villosum#4 was described previously by Li et al. [31]. Primers were designed using Primer Premier 6.0 (http://www.premierbiosoft.com/primerdesign/) based on the candidate unigene sequences with the highest similarity to sequences on the chromosomes of wheat homoeologous groups 1 to 7.

DNA extraction and PCR amplification
Leaves were sampled from two-month-old seedlings of wheat lines CS and Wan7107, D. villosum#4 accession No. 1026, the wheat-D. villosum#3 addition lines DA1V#3, DA2V#3, DA3V#3, DA4V#3, DA5V#3, DA6V#3, and DA7V#3, and the backcross population of TH3 and Wan7107, and stored at −80 °C. Genomic DNA for developing specific markers and identifying alien chromosome lines was isolated from all samples using a NuClean Plant Genomic DNA Kit (CW Bio Inc., China). Genomic DNA used as the GISH probe was extracted from D. villosum#4 using the CTAB method. Finally, the DNA was dissolved in 50 μL ddH2O.

Chromosomal localization of D. villosum#4-specific markers on the barley H genome
The corresponding sequences of the 76 developed molecular markers specific to D. villosum#4 chromosomes were submitted to the Ensembl Plants BLAST tool (http://plants.ensembl.org/Multi/Tools/Blast?db=core) to obtain their chromosomal locations by BLASTN searching.

GISH identification of new alien chromosome lines between T. aestivum and D. villosum#4
Wheat seeds were soaked in water containing 0.7% H2O2 for 24 h and placed on three layers of moist filter paper in petri dishes. When the root tips had grown to 2-3 cm in length, they were cut off and put into a 0.5 mL centrifuge tube (a hole was drilled in the lid of the tube before use). The root tips were then treated with nitrous oxide for 2 h in a special container, fixed in 90% acetic acid for 5 min, and rinsed twice with sterile water. The apical milky parts of the root tips were excised and digested with cellulase and pectinase for 55 min. Mitotic chromosome spreads were prepared as previously described [43,59,60]. D. villosum#4 genomic DNA was labeled with fluorescein-12-dUTP as a probe by nick translation according to the manufacturer's instructions. GISH was performed as previously described [43,59,60,61]. Hybridization signals were captured under an Olympus BX-51 fluorescence microscope equipped with a charge-coupled device system.