The construction of a high-density linkage map for identifying SNP markers that are tightly linked to a nuclear-recessive major gene for male sterility in Cryptomeria japonica D. Don

Background High-density linkage maps facilitate the mapping of target genes and the construction of partial linkage maps around target loci to develop markers for marker-assisted selection (MAS). MAS is quite challenging in conifers because of their large, complex, and poorly-characterized genomes. Our goal was to construct a high-density linkage map to facilitate the identification of markers that are tightly linked to a major recessive male-sterile gene (ms1) for MAS in C. japonica, a species that is important in Japanese afforestation but which causes serious social pollinosis problems. Results We constructed a high-density saturated genetic linkage map for C. japonica using expressed sequence-derived co-dominant single nucleotide polymorphism (SNP) markers, most of which were genotyped using the GoldenGate genotyping assay. A total of 1261 markers were assigned to 11 linkage groups with an observed map length of 1405.2 cM and a mean distance between two adjacent markers of 1.1 cM; the number of linkage groups matched the basic chromosome number in C. japonica. Using this map, we located ms1 on the 9th linkage group and constructed a partial linkage map around the ms1 locus. This enabled us to identify a marker (hrmSNP970_sf) that is closely linked to the ms1 gene, being separated from it by only 0.5 cM. Conclusions Using the high-density map, we located the ms1 gene on the 9th linkage group and constructed a partial linkage map around the ms1 locus. The map distance between the ms1 gene and the tightly linked marker was only 0.5 cM. The identification of markers that are tightly linked to the ms1 gene will facilitate the early selection of male-sterile trees, which should expedite C. japonica breeding programs aimed at alleviating pollinosis problems without harming productivity.


Background
High-density linkage maps facilitate not only the understanding of genome structure and comparative genomic studies, but also quantitative trait loci (QTL) mapping and the construction of partial linkage maps around target loci to develop markers for marker-assisted selection (MAS). MAS is an effective method for accelerating the screening of target genes using tightly-linked molecular markers. Effective MAS for QTLs or specific genes has been reported in crops such as barley [1], rice [2] and tomato [3]. However, it is difficult to employ MAS in conifers, primarily because their genomes are very big and complex and are largely uncharacterized. Indeed, very few DNA markers linked to major genes have been reported in conifers; the only ones of note are some dominant markers linked to a major dominant resistance gene in Pinus lambertiana [4,5], P. monticola [6], P. taeda [7], and P. thunbergii [8,9].
Sugi (Cryptomeria japonica D. Don) is an allogamous, diploid, wind-pollinated conifer species with a haploid chromosome number (n) of 11 (2n = 22). Using flow cytometry, the DNA content of its haploid cells was estimated to be 11.045 pg/C [10], which corresponds to a haploid genome size of 10.8 Gb as calculated using the following expression: genome size (bp) = 0.978 × 10 9 × DNA content (pg) [11]. C. japonica is frequently used for commercial afforestation in Japan, and about 45% of all the man-made forests of Japan are composed of this species. However, since the 1970s, the incidence of C. japonica pollinosis in Japan has risen in line with the increasing number of man-made C. japonica forests [12]. Today, C. japonica pollinosis is a serious social problem in Japan, affecting almost 20% of the population. In 1992, a genetically male-sterile C. japonica tree whose sterility is determined by a major recessive gene (ms1) was found in Toyama prefecture [13,14]; this gene is expected play an important role in breeding for reduced pollen dispersal. Since the discovery of this male-sterile individual, considerable effort has been expended on characterizing male sterility in C. japonica, identifying male-sterile and plus-trees, creating artificial crosses between male-sterile and plus-trees, and propagating male-sterile trees [15]. Male-sterile trees are currently identified by direct inspection of the male strobili using a magnifying glass or a microscope. Conversely, plus-trees that are heterozygous for ms1 have been identified by examining segregation data for the progeny arising from artificial crosses; such trees provide important breeding material for seed production while avoiding the problems associated with inbreeding depression. However, these methods are very time-, labour-, and space-intensive; as such, it would be extremely useful to develop a MAS-based method for selecting trees carrying the male-fertile gene.
A composite linkage map for C. japonica was constructed using data for two pedigrees, YI and KO [16]. A total of 438 markers were assigned to 11 large linkage groups and some small or non-integrated linkage groups; the total observed map length was 1372.2 cM, and the average marker interval was 3.0 cM. In general, the most efficient way to study the linkage of a given target gene in a mapping population for which no linkage map is available is to start with a high-density linkage map.
In a recent study on spruces, it was found that the Golden Gate single nucleotide polymorphism (SNP) assay system developed by Illumina greatly facilitates the genetic mapping of species whose genomes have not been extensively studied [17]. To design an SNP genotyping array, it is necessary to identify a large number of SNPs. In the case of C. japonica, a database of 55,543 expressed sequence tags (ESTs) has been constructed from cDNA libraries obtained from seedlings, inner bark, female strobili, male strobili, pollen, leaves, vegetative buds and heartwood; this database is freely-available on the internet (ForestGEN; http://forestgen.ffpri.affrc. go.jp/en/info_cj.html) [18][19][20][21][22][23]. A number of cDNAbased sequence-tagged site (STS) markers have been also identified using these ESTs [24][25][26]. A library of C. japonica SNPs suitable for use in a Golden Gate SNP array has been identified [27], and complements the existing C. japonica EST databases, which will also be useful in identifying candidate genes associated with male gametophyte development and male sterility on the basis of sequence similarity and microarray expression analysis.
This paper reports a study in which a high-density linkage map for C. japonica was constructed and the mapping population was subjected to the GoldenGate genotyping assay. Using information from the high-density linkage map, we mapped the ms1 gene and constructed a partial linkage map around the ms1 locus. This allowed us to identify SNP markers that are tightly linked to the ms1 gene for use in MAS. The paper also includes a discussion of the importance of high-density maps and MAS markers in tree breeding.

Plant materials and DNA extraction
A high density linkage map for the YI pedigree of C. japonica was constructed using 150 half-sib progenies from crosses between two F 1 plants (YI96 ('Yabukuguri × Iwao') × YI38 ('Yabukuguri × Kuji34')) ( Figure 1) [16]. In addition, to avoid the effects of inbreeding depression, a partial linkage map around the ms1 locus (ms1 is a recessive male-sterile gene) was constructed for 209 progenies of the TO-S pedigree, which was derived from a cross between a TO2 F 1 plant and the elite tree Suzu-2 ( Figure 1). The TO2 F 1 plant was a cross between 'Toyama1 [male-sterile, ms1/ms1]' and 'Ohara-2 [malefertile, Ms1/Ms1]' and is thus a male-fertile tree that is heterozygous for the ms1 gene (Ms1/ms1). Suzu-2 is also male-fertile and heterozygous for the ms1 gene [male-fertile, Ms1/ms1] [28]. Of the 209 TO-S progeny, 142 were male-fertile and 67 were male-sterile. The actual segregation ratio thus deviated significantly from the expected ratio of 3:1 (X 2 = 5.55, df = 1, P <0.05); the reason for this is as yet unknown.
Needle tissue was collected from the parents and progeny of both the YI and TO-S pedigrees. Genomic DNA was extracted from individual needles using a modification of the CTAB method [29]. For use in the Illumina GoldenGate assay, the extracted DNA was purified using a genomic DNA purification kit (Promega) and its concentration was standardized (100-200 ng/uL).

Genetic markers
For the YI pedigree, six kinds of genetic markers were used to construct the linkage map: cleaved amplified polymorphic sequences (CAPS) markers, restriction fragment length polymorphism (RFLP) markers, microsatellite (simple sequence repeat; SSR) markers, ESTderived microsatellite (EST-SSR) markers, amplicon length polymorphism (ALP) markers and single nucleotide polymorphisms (SNP) merkers. The segregation data for 121 CAPSs, 117 RFLPs, 34 SSRs, 1 ALP and 5 SNP markers were obtained in the previous study [16]. In this study, we added 16 EST-SSR markers and 968 SNP markers including 761 gSNPs, 159 hrmSNPs, 33 ssSNPs and 15 meaSNPs to the linkage map for the YI pedigree (refer to latter sections for these names of SNPs). Detailed information on these markers can be found at Sugi Genome Database website (http://www. ffpri.affrc.go.jp/labs/cjgenome/).

SNP genotyping for randomly-developed markers
A large set of SNP markers was genotyped using the GoldenGate assay; collectively, this set is referred to as "gSNPs". For the YI pedigree, multiplexed genotyping of the gSNP markers was carried out using the 1536-plex GoldenGate array, in accordance with the manufacturer's protocol. A detailed description of the procedures employed and results obtained in the course of discovering these SNPs can be found elsewhere [27]. A total of 0.5-1.0 μg of genomic DNA per sample (at a concentration of 100-200 ng/μl) was used in the GoldenGate assay. To screen the gSNP markers linked to the ms1 gene, 17 male-sterile progenies and the parents of TO-S pedigree were also genotyped using this assay, since the gSNP markers found to be monomorphic in the YI pedigree were excluded from linkage mapping in the TO-S pedigree. The GoldenGate assay employs highly multiplexed allele-specific extension methods and universal PCR amplification reactions. The PCR products, which were fluorescently labeled by the incorporation of 5'labeled primers P1 (Cy3) and P2 (Cy5), were hybridized to capture probes on the beads in the array. The ratio of the fluorescent signals from 2 allele-specific ligation products was used to determine the sample's genotype. Signal intensity data processing, clustering and genotype calling were performed using the genotyping module in the BeadStudio software (Illumina). Genotyping was conducted exclusively on the basis of SNPs with an Illumina GenTrain score in excess of 0.25; the GenTrain score provides a measure of the reliability of SNP detection based on the distribution of genotypic classes. For each analyzed SNP, individual genotypes with an Illumina GenCall score below 0.25 were excluded; the GenCall score provides a measure of the reliability of an individual SNP call relative to the distribution of genotypic classes.
High Resolution Melting (HRM) analysis was also used to obtain SNP genotyping data for linkage map construction; the SNPs identified in this way are henceforth referred to as "hrmSNPs". The development procedure and analyzed condition was reported elsewhere [30]. The linkage map also incorporated EST-SSR markers; details on the use of this data have been reported previously [31].

Screening and genotyping of candidate genes
Sequence information for genes related to male gametophyte development and sterility were collected by reviewing the literature on other plant species such as Arabidopsis [32][33][34][35][36][37][38][39][40][41][42], Brassica [43], Oryza [44][45][46], Nicotiana [47] and Petunia [48]. The sequences from these publications were then compared to the available C. japonica EST sequences using TBLASTN (Additional file 1) [49]. In addition, genes related to male-sterility were screened by comprehensive expression analysis using a microarray containing around 366,000 probes derived from 22,882 tentative consensus sequences obtained from ESTs. Genes that exhibited at least a four-fold difference in expression between male-sterile and male-fertile strobili during the end of September and mid-October were selected for further analysis (Futamura et. al, manuscript in preparation). These time points were chosen because they represent the periods immediately before and immediately after male gametogenesis was observed in male-sterile strobili.
To identify SNP markers, PCR primers for the selected candidate genes were designed using the Pri-mer3 software. PCR amplifications were carried out using a Model 9700 GeneAmp PCR system (Applied Biosystems) in reaction mixtures with a total volume of 15 μL containing 20 mmol/L Tris-HCl (pH 8.0), 50 mmol/L KCl, 2 mmol/L MgCl 2 , 0.2 mmol/L of each deoxynucleoside triphosphate, 0.2 μmol/L of each primer, 5 ng template DNA, and 1.0 units of Taq polymerase (Promega). The following thermal profile was used: 3 min denaturation at 94°C, followed by 30 cycles of 45 s denaturation at 94°C, 30 s annealing at 55-64°C, and 30 s extension at 72°C, with a final extension step of 72°C for 10 min. Amplification products were separated by electrophoresis in 2% (w/v) agarose gels run in 1 × TAE buffer. The gels were then stained with ethidium bromide and visualized under UV light. Each PCR fragment was sequenced using the BigDye Terminator kit (Applied Biosystems) and ABI Prism 3100 DNA sequencer (Applied Biosystems) to identify SNPs in the parents of the YI pedigree (YI96 and YI38). Primers for both the forward and the backward direction were used. SNP markers associated with genes having significant similarities to genes from other species that are known to be important in male gametophyte development and/or male-sterility were collectively referred to as ssSNPs, while SNP markers associated with genes exhibiting differential expression between male-fertile and male-infertile individuals were collectively referred to as meaSNPs. The ssSNP and meaSNP markers were genotyped by sequencing. For the mapped meaSNP markers, proteins with high similarities to their original ESTs were identified from the NCBI RefSeq database using BLASTX (Additional file 2) [49] Construction of the linkage map for the YI pedigree Linkage analyses were conducted for all hrmSNP (Additional file 3) [30], EST-SSR [31], gSNP [27], ssSNP (Additional file 1) and meaSNP (Additional file 2) markers that exhibited polymorphism in the parents of the YI pedigree. The segregation data for CAPS, RFLP, SSR, SNP, and ALP markers in the previous study [16] were used together with the genotype data obtained in this work to construct a linkage map.
Chi-squared tests were performed for each locus to assess its deviation from the expected Mendelian segregation ratio. Loci exhibiting extreme segregation distortion (P <0.001) were excluded from further linkage analysis. All linkage analyses were performed using the JoinMap v3.0 software with the parameter CP (crosspollination) [50]. During the construction of the maps, markers were assigned to tentative linkage groups using logarithm of odds ratio (LOD) thresholds of 3.0 to 9.0, with increments of 1.0; an LOD threshold of 8.0 was ultimately used when defining groups of markers. Map distances were calculated using the Kosambi mapping function [51]. For the other parameters such as recombination frequency threshold and a ripple value, default settings were used. Images of the linkage groups were drawn using the Mapchart v2.0 software [52].

Estimation of genome length and map coverage
The observed genome length (G o ) for the linkage map was calculated as the sum of the sizes of the linkage groups. The expected genome length, G e , was estimated using method 4 of Chakravarti et al. (1991) [53], in which the total length of the linkage groups is multiplied by the factor (m + 1)/(m-1), where m is the number of markers in the linkage groups. The observed map coverage, C o , is the ratio of the observed and the estimated genome lengths, i.e. G o /G e . The expected genome coverage, C e , was calculated using the equation of Bishop et al. (1983) [54]: Here, R is the haploid number of chromosomes, N is the number of positioned loci, X is the maximum observed map distance between two adjacent assigned markers in cM at or above a minimum LOD threshold value of 8.0, and G e is the estimated genome length.

Analysis of marker distribution
If the markers were randomly distributed and the genome is divided in N intervals, the number of markers per interval would follow a Poisson distribution with a mean of μ. To determine whether the markers were randomly distributed, all linkage groups were divided into 1, 2, 3, 4, 5, 10 and 15 cM intervals. The number of intervals that contained markers were counted and the average number of markers per interval (μ) was calculated. If the average number of random occurrences per interval is μ, then the probability that x markers will fall within a given interval is We compared the actual distribution of markers to that expected for a Poisson distribution using the chisquared test as described by Kang et al. (2010) [55].

Localization of ms1 gene
Microsatellite markers were used to identify the linkage group on which the target gene is located because of their high polymorphism and straightforward analysis. Thus, to identify the linkage group containing the ms1 gene, a total of 19 microsatellite markers on the linkage map (Additional file 4) that were polymorphic in the parents of the TO-S pedigree were genotyped for 48 of the male-sterile progeny of the TO-S pedigree. If markers were linked to ms1, they should be significantly deviated from segregation rates expected from parental genotypes. PCR amplifications were carried out using the Model 9700 GeneAmp PCR System (Applied Biosystems). A reaction mixture with a total volume of 8 μL was used, consisting of 1 × Multiplex PCR master mix (Qiagen), fluorescently-labeled forward primers (0.2 μM), reverse primers (0.2 μM), and 5 ng of genomic DNA. The following thermal profile was used: 15 min at 94°C, then 32 cycles of 30 sec at 94°C, 90 sec at 55-62°C , 60 sec at 72°C, followed by 30 min at 72°C; the results were analyzed using a 3100 genetic analyzer (Applied Biosystems). The independence of the segregation of the SSR markers and the ms1 gene was investigated using chi-square tests to identify markers linked to the ms1 gene.
The construction of a partial linkage map around the ms1 locus Linkage analysis using EST-SSR markers indicated that the ms1 gene was located in the 9 th linkage group, which is hereafter referred to as "LG9." The markers in LG9 that exhibited polymorphism in the parents of the TO-S pedigree were therefore used to construct a partial linkage map around the ms1 locus. In addition, gSNP markers that exhibited polymorphism in the parents of the TO-S pedigree and whose segregation with the ms1 gene deviated significantly from that expected in the absence of linkage (as judged by the chi-squared test, P <0.05) in 17 male-sterile progenies were also used in the construction of the partial linkage map.
The markers of LG-9 in the TO-S pedigree except two microsatellite markers and four CAPS markers were genotyped on a BioMark 48.48 Dynamic Array (Fluidigm) using KASPar assays. Primer pairs suitable for the KASPar assays were designed on the basis of the sequences of the relevant markers (gSNP, meaSNP, ssSNP, hrmSNP and CAPS); see Additional file 5. A total of 6.5 ng of genomic DNA per sample (at a concentration of 5 ng/μl) was used for specific target amplification (STA); the KASPar reactions were performed using the STA products after dilution by a factor of 100. The primers were designed and the assays were performed as specified by the manufacturer. The data obtained were analyzed using the Fluidigm SNP Genotyping Analysis software to obtain genotype calls.
Linkage analyses for the TO-S pedigree were performed using the same conditions as were used for the YI pedigree. Loci whose segregation patterns deviated significantly from Mendelian ratios were not excluded from the further linkage analysis in the TO-S pedigree because distortion of loci linked to the ms1 gene was expected in this case.

Results
Identifying SNP markers associated with candidate genes related to male gametophyte development and male sterility On the basis of the sequence similarity results, 238 primer pairs were designed for various candidate genes; of these, 141 generated PCR products that could be separated and identified after electrophoresis on a 2% agarose gel. Of these obtained STSs, 36 were polymorphic in the parents of the YI pedigree (Additional file 1).
In the microarray expression analysis, significant differences in expression were observed between trees with male-fertile and male-sterile strobili using probes derived from 32 different tentative consensus sequences. Primer pairs were designed on the basis of these sequences and the 17 meaSNP markers that exhibited polymorphism in the parents of the YI pedigree (Additional file 2).

Construction of a linkage map for the YI pedigree
Of the 1290 markers that exhibited no segregation distortion, 1279 could be assigned to specific linkage groups; in total, 1262 markers were mapped, including 761 gSNPs, 159 hrmSNPs, 121 CAPSs, 117 RFLPs, 34 SSRs, 33 ssSNPs, 15 meaSNPs, 16 EST-SSRs, 5 SNPs and 1 ALP (Figures 2, 3 and 4). The observed and estimated map lengths were 1405.2 cM (G o ) and 1430.6 cM (G e ), respectively; the observed and estimated map coverages were estimated to be 98.2% (C o ) and 100.0% (C e ), respectively. The maximum observed map distance between two adjacent assigned markers in cM (X) was 17.3 cM. Significant deviations from the Poisson distribution of markers were observed for marker intervals of 1 cM, 2 cM, 3 cM, 4 cM, 5 cM, 10 cM and 15 cM (P < 0.001), indicating that the 1262 markers used to construct the linkage map for the YI pedigree were not randomly distributed (Additional file 6). Linkage map for the YI pedigree is shown on the Tree-Genes database (http://dendrome.ucdavis.edu/cmap/, accession: TG122).
The construction of a partial linkage map around the ms1 locus 19 SSRs were screened to identify markers linked to the ms1 gene. Independence testing was conducted using the chi-squared method; two suitable SSRs were found, both of which are on LG9: CJG0101S and CJG0177_S (Additional file 4).
A partial linkage map around the ms1 locus was constructed for the TO-S pedigree using the markers assigned to LG9 in the YI pedigree. While gSNP markers were not used in the construction of the YI linkage map, those that appeared to be linked to the ms1 gene on the basis of the segregation patterns observed in 17 male-infertile TO-S progeny were also used in the construction of the partial map. In total, 42 markers were ultimately mapped in the vicinity of the ms1 locus (Figure 5). Three markers (ssSNP_005, meaSNP_007 and ssSNP_013) identified by the candidate gene approach were mapped to LG9 but were not closely linked to the ms1 gene (i.e. were separated from it by > 20 cM). The YI map indicated that most of the ssSNP and meaSNP markers considered were associated with other linkage groups. The partial linkage map around the ms1 locus was constructed for the TO-S pedigree is shown on the TreeGenes database (http://dendrome.ucdavis.edu/ cmap/, accession: TG123).
The marker most tightly linked to the ms1 gene was hrmSNP970_sf, which was separated from it by only 0.5 cM in the TO-S pedigree.

A high-density linkage map for the YI pedigree
The success rate for SNP genotyping (i.e. the SNP conversion rate) with the GoldenGate assay was 81.6% in Picea glauca [17], 82.0% in P. mariana [17] and 66.9% in Pinus taeda [56]; these values are similar to those we observed for C. japonica (84.9%), although the evaluation criteria used in this work differed slightly from those used in previous studies. These values are slightly lower than that obtained in crops, e.g. 91.3% in barley [57], 89.0% in soybean [58] and 92.0% in maize [59]. The lower conversion rate for conifers compared to food crops may be due to the complexity of conifer genomes, which would be expected to hamper the development of specific probes for the assay, as suggested by Pavy et al. (2008) [17]. Nonetheless, the high success rate achieved in this work suggests that Illumina's highthroughput GoldenGate SNP genotyping assay is an efficient method for mapping EST-based markers and enriching linkage maps for almost any species.
The GoldenGate assay was used to construct a highdensity linkage map of C. japonica featuring 1262 markers. The observed map length (G o ) was 1405.2 cM, and the average marker interval was only 1.1 cM. A total of 11 distinct linkage groups were identified, which corresponds to the haploid number of chromosomes in C. japonica. The number of mapped markers in C. japonica was higher and the average interval between markers was smaller than those reported for other conifers ( Table 2). While anonymous markers such as AFLPs were used extensively in previous conifer linkage maps, most of the mapped markers in C. japonica were highly informative co-dominant markers derived from EST. The incorporation of EST-based markers into the linkage map should facilitate comparative mapping between C. japonica and related species such as Chamaecyparis obtusa, which is the second most important forestry tree species in Japan; it has also been shown that around 30% of C. japonica EST-based STS markers are directly applicable to C. obtusa [60]. In addition, when ESTbased markers are identified in a target trait locus, they are generally more effective than other kinds of markers for identifying genes that affect the relevant traits [61,62].
The genome length in C. japonica estimated in our study is 1430.6 cM (Kosambi). The observed and expected genome coverages were 98.2 (C o ) and 100.0% (C e ), respectively. It thus appears that the C. japonica linkage map developed in this work is almost saturated.
As was shown to be the case in P. mariana [55], chisquared testing indicated that the distribution of markers in the C. japonica genome was non-random. This suggests that there are marker-rich and marker-poor  regions in the C. japonica linkage map. If markers were distributed equally over the genome, they would be expected to concentrate in regions of suppressed recombination such as centromeric regions, in which the map distance between markers becomes shorter than their physical separation, as reported for barley [63] and maize [64]. Feuillet and Keller (2002) [65] suggested that genes are not distributed randomly and there are gene-rich and gene-poor regions in species with large genomes. The non-random distribution of markers in the C. japonica linkage map created in this work might thus reflect the distribution of genes in this species, since most of the mapped markers were based on ESTs.
To fill in the gaps, it would be desirable to add data on markers for non-coding regions such as genomic microsatellite markers or random genetic markers such as AFLPs.
It seems unlikely that any of the ssSNP and meaSNP markers identified in this work correspond to the ms1 gene. It is possible that this is because the genes associated with male gametophyte development and male sterility in other species are not closely related to the ms1 gene in C. japonica. Alternatively, the putative homologs detected in this study might not be orthologous to those involved in male gametophyte development in other species; instead, they may be paralogous, with similar domains. While one would expect that genes involved in male gametogenesis would exhibit differential expression in male-sterile and male-fertile individuals, and that this difference would be detectable by analysing the microarray expression data, it is possible that the difference may be statistically insignificant owing to the limitations of the methodology. Future studies in this area should aim to address these issues. In addition to the problems related to the identification of a specific sequence corresponding to ms1, it should be noted that 217 (80.4%) of the 270 candidate genes could not be located on the linkage map due to a lack of polymorphism in the parents of the YI pedigree or because the sequence data was very complex. The efficiency of the mapping could potentially be improved to address these issues by designing primers that span exon/intron junctions. Once suitable candidate genes have been identified, they should be exploited in studies on the TO-S pedigree.
A partial linkage map around the ms1 locus Although we were unable to assign a specific sequence for ms1, we were able to determine that it is located on LG9 and to construct a partial linkage map around the ms1 locus ( Figure 5). To our knowledge, this is the first case in which a recessive major gene has been localized on a linkage map in conifers, although they have previously been identified using morphological and  biochemical mutants [66][67][68][69]. The closest marker to the ms1 locus was hrmSNP970_sf; the map distance between the locus and the marker was only 0.5 cM. By using the two closest markers to the ms1 locus (hrmSNP970_sf and gSNP01452), the male-sterility or -fertility of 96.6% of the 205 individuals in the TO-S pedigree could be accurately determined. In theory, this will facilitate the selection of individuals that are heterozygous for the ms1 gene without needing to create control crosses. The identification of these two adjacent markers will increase the viability of using MAS in the TO-S pedigree. This will make it possible to perform early selection of germinated seedlings, which could be useful in that it would reduce the expenditure of time, labour, and space on the growing of seedlings.
While the two markers will be very useful for TO-S pedigree, it will be necessary to identify the target gene itself to do MAS in other pedigrees. To this end, candidate gene approaches may become more useful as the amount of information on the species increases and techniques improve. An alternative approach for identifying target genes is genome walking using a BAC library. The likelihood of isolating a gene using genome walking depends on the physical distance (per cM) in the target species. On the basis of the estimated genome length calculated from the recombination rates observed in this study (G e ; 1430.6 cM), the average physical distance per cM would be roughly 7.5 Mb. This physical distance per cM in C. japonica suggests that the closest markers we identified may be around 3.8 Mb from ms1 locus. As such, genome walking would be impractical with current methods, but it would be feasible to isolate the ms1 gene by BAC walking if markers lying within 0.1 cM of the  target could be identified. We have constructed a BAC library for C. japonica that covers 4 times as much of the genome as the marker libraries employed in this work and relates to a mapping population with a large number of individuals (unpublished data). The construction of a more dense linkage map around the ms1   gene will greatly facilitate its isolation or that of more tightly-linked markers.

Conclusions
We have constructed a high-density linkage map for C. japonica using expressed sequence-derived co-dominant SNP markers that were primarily genotyped using the GoldenGate assay. A total of 1261 markers were assigned to 11 linkage groups with an observed map length of 1405.2 cM and a mean distance between adjacent markers of 1.1 cM. The number of linkage groups identified matches the basic chromosome number of C. japonica. While other conifer linkage maps have largely relied on anonymous markers such as AFLPs, most of the markers mapped for C. japonica were highly informative co-dominant markers derived from ESTs. The expected map coverage rate of this constructed linkage map was very high (100.0%), indicating that the linkage map developed in this work is almost saturated. The distribution of the mapped loci on the linkage map for the YI pedigree was not random, as demonstrated by a chi-squared test (c 2 = 3233.7, df = 13, P < 0.001).
Genetic male-sterility in C. japonica is known to be controlled by a major recessive gene (ms1). We mapped the ms1 gene to the 9 th linkage group and constructed a partial linkage map around the ms1 locus using information from the dense map constructed in this work. A total of 42 markers were located on this partial linkage map. A marker, hrmSNP970_sf that is tightly linked to the ms1 gene was identified; the two are separated by only 0.5 cM. The markers linked to ms1 identified in this work will facilitate the early selection of male-sterile trees, which should prove useful in C. japonica breeding programs.