In the present study, we analyzed the frequency of microsatellites identified from 43,197 contig sequences of N. nucifera ‘Chinese Antique’. The 804 Mb of genomic sequence analyzed covers 86.5% of the lotus genome. On average, one SSR was detected per 9.33 kb of genome sequence. The frequency of SSRs in lotus is much lower than that reported in the genome sequences of Brassica [24, 29, 31, 32] and rice, but higher than the estimated frequency of SSRs in the genome of sorghum. In the contig sequences, dinucleotide repeat motifs (60.73%) were the most frequently detected, followed by tri- (31.66%), tetra- (5.77%) and penta-nucleotide (1.21%) motifs. The most abundant dinucleotide motif detected was the AG/GA type (31.08%), followed by AT/TA (26.90%) and AC/CA (2.75%). The most common trinucleotide motif detected was the AAG/AGA/GAA type, followed by AAT/ATA/TAA and ATG/GAT/TGA (Table
1). These motif types and their proportions in the lotus genome are in close agreement with the patterns observed in dicots such as Arabidopsis, Brassica [24, 31, 32] and papaya, and in the monocot sorghum, in which AT and AG dinucleotide motifs predominate. The distribution of SSRs in the lotus genome differs from that observed in humans and Drosophila, in which AC is the most frequent dinucleotide repeat motif, followed by AT and AG. The GC repeat motif is extremely rare in eukaryotic genomes, except in rice [33, 34], and was absent from the lotus genome. These data indicate that AG and AT motifs are abundant in the lotus genome.
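The motif classes reported above (for example AG/GA or AAG/AGA/GAA) group cyclic rotations of the same repeat unit. The following is a minimal sketch of how perfect SSRs could be detected and grouped in this way. It is not the pipeline used in the study: the minimum repeat thresholds in `MIN_REPEATS` and the function names are assumptions chosen only for illustration.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative values,
# not the thresholds used in the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4}


def motif_class(motif):
    """Group cyclic rotations of a motif, e.g. 'GA' and 'AG' both map to 'AG'."""
    rotations = [motif[i:] + motif[:i] for i in range(len(motif))]
    return min(rotations)


def is_compound_of_shorter_unit(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'GAGA', 'AAA')."""
    k = len(motif)
    return any(
        k % d == 0 and motif == motif[:d] * (k // d)
        for d in range(1, k)
    )


def find_ssrs(seq):
    """Return (start, motif_class, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_compound_of_shorter_unit(motif):
                continue  # counted under the shorter motif length instead
            repeat_count = len(match.group(0)) // k
            hits.append((match.start(), motif_class(motif), repeat_count))
    return hits
```

Applied to a contig sequence, `find_ssrs` returns one tuple per SSR; dividing the total sequence length by the number of hits would give a density comparable in form to the one-SSR-per-9.33-kb figure, although the value obtained depends entirely on the thresholds assumed.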
The pseudo-testcross strategy is suitable for construction of genetic maps using an F1 population and was first applied to genetic mapping in Eucalyptus. Given that lotus is protogynous and cross-pollinated, a high level of heterozygosity was predicted for the genomes of Nelumbo species. With this prediction in mind, we created an F1 mapping population to construct genetic maps of Nelumbo in the present study. The two parents, N. nucifera ‘Chinese Antique’ and N. lutea ‘AL1’, differ strongly in their geographical distributions and in important morphological traits, such as plant size and the shape and color of the leaves and petals. The two parents also show considerable genetic differences [3–7], which was confirmed here by the high polymorphism (61.91%) detected using the novel SSR markers (Additional file 1). Of the 609 markers identified in the mapping population, fewer originated from the female parent (40.56%) than from the male parent (59.44%).
A total of 234 markers (38.42%) showed distorted segregation at the P < 0.05 level (Table 2), a proportion higher than that reported for Dendrobium, strawberry and ryegrass. Segregation distortion is a common phenomenon in mapping studies with an F1 population [37–39]. Distortion of segregation ratios may result from biological factors, such as genetic isolation mechanisms, chromosome loss, locus duplication, and gamete selection [40, 41]. Nonbiological factors, such as scoring errors and sampling errors, can also lead to distortion in segregation ratios
[42, 43]. In the present study, both biological factors and technical problems may have caused the observed segregation distortion in the F1 population. The unexpectedly high level of homozygosity of the female parent ‘Chinese Antique’ contributed to the considerable segregation distortion in the F1 population. Consequently, 129 of the distorted markers derived from the female parent showed no segregation in the F1 progeny. Our recent analysis of the F1 population estimated heterozygosity to be 0.03% for ‘Chinese Antique’ and 0.37% for ‘AL1’ (unpublished data). Thus, the similarly low heterozygosity in the male parent aggravated the distorted segregation in the F1 population. We observed that 96 markers skewed to the male parent showed similar segregation ratios and were distributed in a group spanning 20.23 cM (data not shown).
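Segregation distortion at the P < 0.05 level is typically assessed with a chi-square goodness-of-fit test. The sketch below assumes a testcross-type marker (heterozygous in one parent only) with an expected 1:1 ratio in the F1 progeny, which is the usual expectation under a pseudo-testcross design; markers with other expected ratios are not handled. The function names and the genotype counts in the usage note are hypothetical and are not taken from the study.

```python
import math


def chi_square_1to1(n_class_a, n_class_b):
    """Chi-square test of an observed two-class count against a 1:1 ratio.

    Returns (chi2, p_value) with one degree of freedom.
    """
    total = n_class_a + n_class_b
    if total == 0:
        raise ValueError("no genotyped individuals")
    expected = total / 2
    chi2 = ((n_class_a - expected) ** 2 + (n_class_b - expected) ** 2) / expected
    # For one degree of freedom, the chi-square survival function is
    # erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def is_distorted(n_class_a, n_class_b, alpha=0.05):
    """Flag a marker whose segregation deviates from 1:1 at the given level."""
    _, p_value = chi_square_1to1(n_class_a, n_class_b)
    return p_value < alpha
```

With hypothetical counts of 60 and 40 individuals in the two genotype classes, the statistic is 4.0 and the marker would be flagged as distorted, whereas counts of 55 and 45 give a statistic of 1.0 and would not be flagged.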
Molecular marker selection is crucial for the credibility of linkage maps. Because of their inheritance characteristics, SSR markers have many advantages in genetic mapping. They can also serve as anchor markers during comparative mapping with related species. However, the high level of homozygosity of the female parent restricted the availability of SSR markers for the maternal map. Moreover, using a single type of DNA marker to construct a lotus linkage map could easily lead to uneven marker distribution and large intervals between adjacent markers. Therefore, to increase the coverage of the linkage map and reduce the gaps between markers, SRAP markers were also used to genotype the F1 population. Given that SRAP primers usually amplify the intron and exon regions of functional genes, they complement the use of SSR markers. When SRAP markers were added to the framework map constructed using SSR markers, the total length of the linkage groups increased from 857.89 to 890.16 cM, and the average interval between two adjacent markers decreased from 5.08 to 3.97 cM. Moreover, SSR markers in LG4-F and LG6-F of the maternal framework map were incorporated into one linkage group, LG1-F, in the integrated map. No large gaps (> 25 cM) were detected in these genetic linkage maps (Additional file 3 and Additional file 4). Thus, to some extent, the combined use of SSR and SRAP markers increased the length of the linkage map and reduced the distance between adjacent markers.
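The summary statistics discussed here (total map length, average marker interval, and large gaps) can be derived directly from marker positions. The sketch below is illustrative only. It assumes the average interval is the total map length divided by the number of markers; this convention is inferred, not stated in the text, from the observation that 890.16 cM divided by the 224 markers of the integrated maps (171 SSR and 53 SRAP) gives approximately 3.97 cM. The function name, the input layout, and the positions used in the test are hypothetical.

```python
def summarize_map(groups, gap_threshold=25.0):
    """Summarize a linkage map.

    groups: dict mapping a linkage-group name to a list of marker
            positions in cM (hypothetical input layout).
    gap_threshold: gaps larger than this (in cM) are reported; 25 cM
            mirrors the threshold mentioned in the text.
    """
    total_length = 0.0
    n_markers = 0
    large_gaps = []
    for name, positions in groups.items():
        ordered = sorted(positions)
        n_markers += len(ordered)
        if len(ordered) < 2:
            continue  # a single marker contributes no map length
        total_length += ordered[-1] - ordered[0]
        for left, right in zip(ordered, ordered[1:]):
            if right - left > gap_threshold:
                large_gaps.append((name, left, right))
    # Assumed convention: total length divided by number of markers.
    average_interval = total_length / n_markers if n_markers else 0.0
    return {
        "total_length": total_length,
        "n_markers": n_markers,
        "average_interval": average_interval,
        "large_gaps": large_gaps,
    }
```

An alternative convention divides the total length by the number of intervals (markers minus linkage groups), which yields a slightly larger average; the choice should be stated whenever maps are compared.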
Using normally segregating markers, genetic linkage maps for Nelumbo were constructed successfully (Figures
2). The integrated linkage maps comprised 171 SSR and 53 SRAP markers. These SSR markers were derived from 93 novel SSR primers and 20 previously published SSR primers, and were anchored to 97 contigs of ‘Chinese Antique’ (Table
3 and Additional file 4). The contig-based SSR markers can anchor the corresponding contigs onto a genetic map and thereby establish direct links between genetic, physical, and sequence-based maps [25, 29, 30]. These SSR-based linkage maps are important reference maps with which to anchor contigs and assemble the genome sequence of Nelumbo. As a result, these 97 contigs were merged into 60 scaffolds (Additional file 4). In subsequent genome assembly work, on the basis of the order of SSR markers in the linkage groups and single nucleotide polymorphism (SNP) markers identified by restriction-site associated DNA sequencing (RAD-seq), the 43,197 contigs of lotus were assembled into approximately 3,605 sequence scaffolds (unpublished data).
In theory, the number of linkage groups should be consistent with the haploid chromosome number. Nelumbo comprises only diploid species, each with eight pairs of chromosomes. In the present study, 7 and 11 linkage groups were detected for the female and male plants, respectively, based on SSR and SRAP markers (Table 3). Thus, the numbers of linkage groups did not match the haploid chromosome number. It may be inferred that some linkage groups in the paternal map correspond to the same chromosome but remained separate because of large intervals between them. No markers were detected for the eighth linkage group of the maternal map. This may be attributable to the low degree of heterozygosity in the maternal genome. Preliminary genetic maps for other plant species often contain more linkage groups than expected [37, 45]. Because the mapping population used in the present study was the F1 generation of the cross between ‘Chinese Antique’ and ‘AL1’, no recombination between homoeologous chromosomes was possible
. Therefore, the markers in the two parental linkage groups could not integrate into one group. The high quality of the DNA markers and the accuracy of genetic mapping were exemplified by the clear separation of linkage groups for ‘Chinese Antique’ and ‘AL1’ (Figures
To improve the linkage map, the numbers of male and female linkage groups should be brought into agreement with the haploid chromosome number, and the marker density of the linkage groups should be increased. The use of more advanced genotyping technology, such as high-throughput sequencing for SNP discovery at the whole-genome or whole-transcriptome scale, would aid the construction of a complete genetic linkage map. RAD-seq of multiple individuals using Illumina technology can identify and score thousands of SNP markers randomly distributed across the target genome [47, 48]. The platform can effectively generate dense linkage maps for QTL analysis [49–51]. Our recent work involving the RAD-seq approach identified 6,622 SNPs in the present F1 mapping population and enabled construction of a high-density genetic map together with the SSR markers developed in the present study. The resulting paternal map spanned 494.3 cM and comprised 4,031 markers (3,895 SNP and 136 SSR) in nine linkage groups (unpublished data). Using the shared SSR markers as anchors, comparative analysis revealed collinearity between these linkage maps. The genetic maps generated in the present study can serve as reference linkage maps of Nelumbo species for efficient studies of comparative genetics, QTL analysis of traits of economic importance, and molecular breeding with marker-assisted selection (MAS).