Factors affecting the accuracy of genomic selection for growth and wood quality traits in an advanced-breeding population of black spruce (Picea mariana)

Background Genomic selection (GS) uses information from genomic signatures consisting of thousands of genetic markers to predict complex traits. As such, GS represents a promising approach to accelerate tree breeding, which is especially relevant for the genetic improvement of boreal conifers characterized by long breeding cycles. In the present study, we tested GS in an advanced-breeding population of the boreal black spruce (Picea mariana [Mill.] BSP) for growth and wood quality traits, and concurrently examined factors affecting GS model accuracy. Results The study relied on 734 25-year-old trees belonging to 34 full-sib families derived from 27 parents and that were established on two contrasting sites. Genomic profiles were obtained from 4993 Single Nucleotide Polymorphisms (SNPs) representative of as many gene loci distributed among the 12 linkage groups common to spruce. GS models were obtained for four growth and wood traits. Validation using independent sets of trees showed that GS model accuracy was high, related to trait heritability and equivalent to that of conventional pedigree-based models. In forward selection, gains per unit of time were three times higher with the GS approach than with conventional selection. In addition, models were also accurate across sites, indicating little genotype-by-environment interaction in the area investigated. Using information from half-sibs instead of full-sibs led to a significant reduction in model accuracy, indicating that the inclusion of relatedness in the model contributed to its higher accuracies. About 500 to 1000 markers were sufficient to obtain GS model accuracy almost equivalent to that obtained with all markers, whether they were well spread across the genome or from a single linkage group, further confirming the implication of relatedness and potential long-range linkage disequilibrium (LD) in the high accuracy estimates obtained. Only slightly higher model accuracy was obtained when using marker subsets that were identified to carry large effects, indicating a minor role for short-range LD in this population. Conclusions This study supports the integration of GS models in advanced-generation tree breeding programs, given that high genomic prediction accuracy was obtained with a relatively small number of markers due to high relatedness and family structure in the population. In boreal spruce breeding programs and similar ones with long breeding cycles, much larger gain per unit of time can be obtained from genomic selection at an early age than by the conventional approach. GS thus appears highly profitable, especially in the context of forward selection in species which are amenable to mass vegetative propagation of selected stock, such as spruces.


Results:
The study relied on 734 25-year-old trees belonging to 34 full-sib families derived from 27 parents and that were established on two contrasting sites. Genomic profiles were obtained from 4993 Single Nucleotide Polymorphisms (SNPs) representative of as many gene loci distributed among the 12 linkage groups common to spruce. GS models were obtained for four growth and wood traits. Validation using independent sets of trees showed that GS model accuracy was high, related to trait heritability and equivalent to that of conventional pedigree-based models. In forward selection, gains per unit of time were three times higher with the GS approach than with conventional selection. In addition, models were also accurate across sites, indicating little genotype-by-environment interaction in the area investigated. Using information from half-sibs instead of full-sibs led to a significant reduction in model accuracy, indicating that the inclusion of relatedness in the model contributed to its higher accuracies. About 500 to 1000 markers were sufficient to obtain GS model accuracy almost equivalent to that obtained with all markers, whether they were well spread across the genome or from a single linkage group, further confirming the implication of relatedness and potential longrange linkage disequilibrium (LD) in the high accuracy estimates obtained. Only slightly higher model accuracy was obtained when using marker subsets that were identified to carry large effects, indicating a minor role for short-range LD in this population.
(Continued on next page)

Background
Genomics of forest trees is rapidly gaining momentum as it promises to unravel the genetic control of adaptive and economically important traits, in part to satisfy the increasing demand for high quality wood fibre worldwide but also, to cope with the increasing challenges imposed by changing climates and environments [1,2]. Uses of genomic information for breeding are diverse and may rely on reconstruction of the pedigree, verification of co-ancestry in breeding populations for genetic diversity management purposes, or on the correlation of marker information with phenotypes for selection [3]. For example, marker-assisted selection (MAS) was among the first approaches suggested to accelerate tree improvement [4]. In conifers, with the use of current statistical methods and correction for multiple testing, a limited number of markers or genes have been reported to be genetically linked to economically important traits, such as wood quality and/or tree growth [5][6][7]. Individual Single Nucleotide Polymorphism (SNP) markers are largely constrained to explaining only a minor proportion of the variation, as they rarely explain more than 5% of quantitative trait variation [5][6][7][8][9]. Therefore, for most quantitative traits, the association approach does not appear useful enough in breeding selection where accurate predictions of genetic values are necessary. When using large progeny sets, the proportion of trait variance explained by individual quantitative trait loci (QTLs) has only been slightly higher, which is consistent with the multigenic control of these traits [9][10][11]. Additionally, the weak linkage disequilibrium (LD) between markers and QTLs across different genetic backgrounds, as well as the narrow range captured in typical QTL studies limits their use to within family selection.
To overcome the limitations of MAS, genomic selection (GS) uses dense genomic marker information (called genomic signatures or profiles) of individuals, as well as their parents when available, to predict their breeding value [12]. Contrary to MAS, GS models simultaneously estimate the effects of all available markers in a training population. Predictions or genomic-estimated breeding values (GEBVs) are then made for progeny of the same or future generations [12]. One of the basic assumptions is that the markers are scattered throughout the genome so that at least some of them are in direct linkage with causal loci [13].
Most economically important traits are of quantitative nature and controlled by many genes [2]. Consequently, it is crucial that an adequate number of markers are used to attain sufficient coverage of the genome, and that as many loci as possible are in LD with QTLs [14]. The combination of the rapidly decreasing costs of highthroughput genotyping as well as the significant advances in sequencing and subsequent computation has facilitated GS-based selection approaches to be considered in organisms even with very large genomes, such as trees and conifers, in particular.
Generally, tree improvement of boreal conifer species is characterized by breeding cycles of 30 years and longer, which includes production of crosses, field evaluation of progeny and performing selections, and the propagation of selected superior material through sexual or vegetative means [30]. Early selection methods that facilitate accurate prediction of mature phenotypes at a younger stage are therefore vital to shorten breeding cycles and ultimately improve the cost-efficiency of such breeding programs. Traditionally, these methods have relied on indirect methods of phenotypic selection, which may be less effective, especially when genetic correlations between the juvenile and mature traits are low [31]. Performing selection directly on genomic marker information would, in theory, avoid the loss of prediction accuracy due to imperfect correlation between different life stages, and would mostly be dependent on the heritability of the target trait at the mature stage [32].
Genomic selection promises to significantly reduce the time commitment for completing a breeding cycle and thus to increase genetic gain per time unit by avoiding the field testing stages, that often represent the largest time commitment [3,32]. Technically, the estimates of breeding values could be obtained at the seedling stage or even from seed or somatic tissue [1]. Hence, the time required to select elite genotypes of northern conifers, which is largely impacted by their slow inherent growth could be overcome, and mature phenotypes would only be needed for model construction and validation [26].
Previous GS studies in forest trees have reported moderate to high accuracy of selection models with correlations of between 0.6 and 0.8 when comparing GEBVs with conventional EBVs, when full-sib families were considered [20,25,33]. This suggests that under certain conditions of high relatedness where long-range linkage disequilibrium (LD) is likely more a potent factor of accuracy than short-range LD, the GS approach can easily compete with the conventional pedigree-based breeding approach or even outperform it due to the reduction in time needed to complete a breeding cycle [25,32,33]. In contrast, lower accuracy values (0.3 and 0.5) have been observed when half-sib families were considered [26,28], and GS models developed in unrelated trees had very low power to make prediction [25,26]. The marginal to low performance of these latter two cases of GS is linked to the larger effective population size and low LD, which is common to natural populations of undomesticated trees [34][35][36]. Large population sizes indeed results in more recombination and a greater diversity among available gametes within populations, from high outcrossing rates in wind-pollinated species such as black spruce [37]. Moreover, the genetic drift common to small populations with fewer parents generating non-random associations among alleles at different loci, i.e., gametic disequilibrium or LD [38], is not a significant factor in such conditions. Black spruce (Picea mariana [Mill.] B.S.P.) is one of the most abundantly distributed trees throughout the transcontinental boreal forest of Canada and Alaska. It is also one of the most reforested species in Canada, with 65 million seedlings being planted per year in the province of Québec alone, and has been the subject of advanced-generation breeding programs [30].
The objectives of this study were to (1) estimate accuracies and gains per unit of time derived from GS for wood quality and growth traits in an advanced-breeding population of the black spruce; (2) assess variation in accuracy from building site-specific or multiple-site models; (3) explore the roles of relatedness, short-and long-range linkage disequilibrium in genomic prediction accuracy by simulating different family structures and testing different subsets of markers; and (4) investigate the effect of sample sizes on model accuracy.
To meet these objectives, we used trait data obtained from 25-year-old black spruces grown in a genetic test replicated on two environmentally contrasted sites in Québec Canada. The genetic test consisted of full-sib families generated using a partial diallel mating design. Trees were genotyped for 4993 SNP markers representative of as many distinct gene loci, and originating from a black spruce high-confidence SNP catalogue previously assembled and validated [39].

Sampling of plant material
Phenotypic and genetic data were obtained for 734 25year-old progeny trees belonging to 34 controlledpollinated families derived from a partial diallel mating design that consisted of 27 parents originating from 9 provenances from the Canadian provinces of Ontario, New Brunswick and Manitoba, and one from Maine in the northeastern United States. The parental trees had previously been identified for their superior growth and stem form in a range-wide provenance trial established on 4 sites in Québec [40]. Progeny were raised as cuttings in the nursery for three years and then established in 1991 by the Ministère des Forêts, de la Faune et des Parcs du Québec (MFFPQ) on two forest sites using a randomized complete block design, with 4-tree row plots and a 2 m by 2 m spacing. The test sites are located in two contrasting environments in the eastern part of the province of Québec: 1) the Matapedia arboretum (latitude: 48°32'N, longitude: 67°25'W, elevation 216 m, 333 trees sampled), which is characterized by a warmer climate typical of the balsam firyellow birch forest, and 2) the Robidoux site (latitude: 48°18'N, longitude: 65°31'W, elevation 275 m, 401 trees sampled), which is characterized by a colder climate typical of the balsam firwhite birch forest domain and under the influence of the Chic-Choc mountains (1270 m) on the Gaspésie peninsula. Needle tissues for DNA extraction were sampled in the upper third of the crown, immediately placed on ice and stored at −10°C until further processing. DNA extraction protocols used for collected samples are described in detail in Pavy et al. [41].

Phenotypic trait determination
Tree height and diameter at breast height (DBH, 1.3 m above ground) were recorded in 2013 at the age of 25 years, and a wood increment core was extracted from the south facing side of each tree. Cores were stored in a freezer, conditioned to 7% moisture and cut to 1.68 mm thickness prior to X-ray densitometry analyses (Quintek Measurement Systems, TN). Wood density was calculated as a ring area weighted mean from recorded pith to bark wood density profiles. Microfibril angle (MFA) was estimated using X-Ray diffraction, on a Bruker D8 Discovery unit equipped with an area array detector, as per Ukrainetz et al. [42].

Genotyping
In recent years, significant genomic resources have been developed for white spruce and Norway spruce (Picea abies (L.) Karst.), including draft genome sequences [43][44][45][46], genetic and QTL maps [11,47,48], gene catalog and expression chips [49], large SNP registries and high throughput genotyping chips [40,50], but comparatively few genomic resources exist for black spruce. Before investigating GS in black spruce, exome capture and sequencing were used together with an in-house bioinformatic pipeline to produce a registry of 97,000 high-confidence SNPs pertaining to around 15,000 gene sequence clusters [39]. In addition, we compiled successfully genotyped gene SNPs from previous black spruce genomic studies [9,41,[51][52][53]. Altogether, this information was used to develop an Infinium iSelect SNP genotyping array (Illumina, San Diego, CA) containing 5,300 SNPs representing as many distinct black spruce gene contigs [39]. Based on control DNA replicates, the chip genotyping reproducibility rate obtained was 99.9%. For genomic selection modelling and analyses, genotyping information from a total of 4,993 SNPs representing as many distinct gene loci spread among the 12 spruce linkage groups was retained for each of the 734 progeny trees, resulting in a total of >3.6 million SNP calls. Given the estimated 1,850 centimorgans (cM) of the black spruce genome [51,54], this corresponds to a marker density of approximately 2.7 per cM. For all 4,993 SNPs retained for GS analyses, the following in-house quality criteria were met: biallelic SNPs matching with both parental genotypes, a GenTrain (Illumina, San Diego, California) quality score ≥0.25, a call rate ≥85%, a fixation index |F IS | ≤ 0.50, and a minor allele frequency (MAF) ≥0.0055. Furthermore, SNPs with minor frequency alleles that were present in less than 10 individuals were not considered. Of the retained markers, only 2% had a minimum allele frequency (MAF) < 5%, indicating a small number of rare allele markers.

Estimating "true" breeding values
For each trait, an individual tree (so called "animal") model was fit using the GS3 software [55] in order to estimate reference or "true" breeding values: where, β is a vector of fixed effects, including an overall mean and the fixed site effect, p is the permanent random block effect, u is the vector of random additive polygenic effects following a distribution~N(0, Aσ 2 u ) and e is the error term with~N(0, Iσ 2 e ). X, T and S are incidence matrices, and A is the numerator of the relationship matrix describing the additive relationship among individuals and I is the identity matrix. All the trees sampled were used to estimate these reference breeding values.

Basic genomic selection models with all information
In the first set of analyses, phenotypes from both sites were combined and GS models were fit using a simple Bayesian framework in the GS3 software [55,56], considering all SNPs in the models. Phenotypes were standardized by block-within-site effects and site standard deviation in order to account for differences between sites. The genomic-estimated breeding value (GEBV) of each tree was estimated for all phenotypic traits using SNP marker information whose effect was estimated with the linear mixed model: where, y is the vector of standardized phenotypes, β is a vector of fixed effects including an overall mean, α is the random marker effect, e the random error, and X and Z are incidence matrices. The latter was built from the number of alleles observed for each individual and SNP, and coded as 0, 1 and 2 representing the number of copies of the minor allele. Approximately 1.8% of the total number of genotypes was missing and imputed as the mean of the respective marker rounded to the next genotype value. Given that only biallelic markers were retained for the analyses, values of +0.5a j and −0.5a j were arbitrarily assigned to alleles 1 and 2 respectively, which follows conventional parametrization where the difference between the two homozygotes equals two a j [55]. Marker effects were assumed to follow a normal distribution~N(0, Iσ 2 a ), where I is the identity matrix. The model hence assumes common variance and all marker coefficients are minimized to the same extent, which is commonly called ridge regression. The approach was deemed appropriate for similar traits evaluated in white spruce and assumingly controlled by many genes of small effect [25], and was hence retained for all analyses in this study. GEBVs were estimated aŝ where, Z' ij is the indicator co-variate (−1, 0 or 1) for the i th tree at the j th locus and â j is the estimated effect at the j th locus. The GS3 software uses the Gauss-Seidel algorithm with residual update for best linear unbiased prediction (BLUP), and for best linear unbiased estimation of random and fixed effects respectively [55]. The algorithm is extended by Gibbs sampling for estimation of variance components. The Gibbs sampler was run for 100,000 iterations with a burn-in of 20,000 iterations. Every thousand iterations, a sample was retained and convergence of the posterior distribution was verified using trace plots. Flat priors were found to give the most stable results of convergence for the various models and subsamples tested.

Model validation and estimation of accuracy
Tenfold cross-validation was performed for pedigreebased and marker-based models in order to obtain predicted breeding values for both approaches. For each round of cross-validation, 10% of the trees were randomly drawn from each family and set aside for validation, the remainder being used for model training. Each individual was thus included in one validation set. Model quality was evaluated by the accuracy, r(GEBV, EBV), which is defined here as the correlation between the cross-validated genomic-predicted breeding values and the "true" or reference breeding values previously calculated using pedigree information and all available sampled trees. The predictive ability, r(y, ŷ), was similarly evaluated as the correlation between predicted and actual phenotypes.

Testing site specificity
Different GS scenarios were also tested to investigate the influence of site on model quality: the basic model contained all 734 trees from the two sites combined (see above); another set of models were trained and validated with only data from a single site, and the last models were trained with data from one site and validated with data from the second site.

Testing the effect of relatedness on accuracy estimates
In order to address questions related to the influence of relatedness and long-range linkage disequilibrium (LD) on genomic selection accuracy, we investigated the effect of family structure. Training and validation sets were genetically related in the basic models, i.e. different trees, but originating from the same full-sib families. For comparison purposes, a genomic selection scenario with half-sib families was also tested where the training and validation sets shared female progenitors, but form different families, hence implicating different males, or vice versa. Estimates of model accuracy were also obtained by using completely unrelated trees from families and provenances excluded from model training.

Testing subsets of markers
To investigate the effect of LD, and that of various subsets of markers on GS model accuracy, different subsets were delineated and used to build models. Hence, genomic selection models were constructed with subsets of 1,000, 500 and 250 markers randomly drawn from the 4,993 SNPs available, as well as with a set of 250 markers having the largest absolute effects and a set containing the 4,743 remaining markers. Accuracy and predictive ability of these reduced models were then compared with those of the basic model. GS models were also constructed separately with markers identified for each of the 12 black spruce linkage groups, in order to investigate the contribution of various genomic regions to the genetic control of the different traits and the influence of LD on GS model accuracy. Given the very high synteny and collinearity between the white spruce and black spruce genomes [51], we used the more complete linkage mapping information of white spruce gene homologs to approximate the genomic positions of black spruce genes carrying the SNPs considered herein. Of the 4,993 black spruce gene contigs of the present study (with gene nomenclature according to the white spruce gene catalogue of Rigault et al. [49] as described in Pavy et al. [39]), 2,928 genes had homologs mapped on the most recent white spruce reference genetic map containing nearly 9,000 genes [48]. These homologs were well distributed over the 12 linkage groups with, on average, 244 (+/-15) gene homologs represented per chromosome.
Testing the size of training data set Random subsets of individuals of various sizes were used to examine the minimum number of trees needed to build GS models without significantly loosing accuracy. Starting with the complete sample set of 4,993 SNPs and 734 trees, about one third of the trees was iteratively removed, creating subsets of 490, 330, 224, 147 and 106 trees that were subsequently analysed with all the 4,993 SNPs.

Genetic gain estimations
Estimates of genetic gain from conventional selection or from GS were obtained by using predicted breeding values for each trait and by considering a selection intensity of 5%. Gains per year were obtained by assuming a spruce breeding cycle of a minimum of 28 years for conventional selection, and a shortened cycle of 9 years for GS, with 4 years for crosses and production of seedlots that are full-siblings to the training population, followed by 1 year for selection of individuals using markers and genomic selection models, and 4 years for vegetative propagation of selected individuals for seedling production. This last scenario thus assumes GS under a forward selection scheme, which is possible in black spruce [1].

Results
Genotypic and phenotypic information was gathered from a total of 734 25-year-old black spruce trees planted in two different forest environments, and representing an advanced-generation breeding population of full-sib families from crosses involving parents from nine Canadian provenances and one from Maine in the U.S.A. Population structure was found to be weak, in congruence with previous genetic studies relying on molecular markers and implicating populations from eastern Canada [52,53,57]. Indeed, spectral decomposition of the genomic relationship matrix [15,25] according to the geographic distribution of origins revealed that about five percent of the variation was captured by the first eigenvector. We concluded that the genetic and genomic relationship matrices were sufficient to capture relatedness among individuals, and as such population structure was not considered in subsequent GS analyses.

High accuracy of GS models in combined-site analyses
When the two test sites were simultaneously considered, with all markers and independent trees for validation, the accuracy of the GS models was high for all traits assessed. Moreover, these accuracy estimates were similar to those of the polygenic models using pedigree and phenotypic information ( Table 2). Differences in accuracy between the different phenotypic traits for both the GS and polygenic models were also minor. Larger differences were found for predictive ability where the most accurate predictions were recorded for height growth. The accuracies largely mirrored the trends in individual trait heritability (Table 1), where high estimates of genetic control were observed for MFA (h 2 = 0.74) and height growth (h 2 = 0.68), and somewhat weaker values observed for DBH and wood density (h 2 = 0.57 and h 2 = 0.41, respectively).
In terms of genetic gains, the model containing trees from both sites resulted in the largest gain predictions ( Table 2). The gain ratio, or in other words, that from marker models versus conventional pedigree selection was high for all traits, with DBH showing the weakest ratio (below 90%). Gain ratios above 100% were obtained for wood traits, indicating that marker-based models would allow for better gains to be obtained than pedigree-based models. When considering the potential time saved for breeding based on GS, the annual gain ratios for marker-based models versus conventional selection increased considerably to being around 3. This is because GS is conducted without field testing and thus, avoids the delays associated with tree growing and assessing phenotypic trait variation from mature trees.

Between-site differences in genomic selection models in black spruce
In order to assess the extent of the genotype-byenvironment interaction, and to evaluate if GS model accuracy is affected by sites, we constructed and validated GS models for each of the two different sites ( Table 2). Accuracy estimates of models built using data from one site only were slightly inferior compared with accuracies obtained with the combined-site analyses, especially for site 1. Accuracy estimates of models developed with data collected on one site and validated with data collected from the second site were high, and only marginally lower than estimates for validations carried out within the same site. These results indicate that genotype-byenvironment interaction is low, and that GS models can most likely be applied over a large range of sites without the need for increased tree sampling for independent model construction, given that the two sites in the current study represented quite contrasting environmental conditions.
The relatively good overall model accuracies obtained by applying models developed on one site and applied to the second one does not mask important within-site environmental differences. Field records identified more trait variation and vegetation competition on site 1 (Matapedia, warmer site) compared with site 2 (Robidoux, colder site), which led to lower heritability estimates for site 1 (Table 1). Additionally, a smaller number of trees were available for site 1 compared to site 2 (333 and 401 trees analysed, respectively). Hence, GS models trained on site 2 and validated on site 1 or site 2 were marginally more accurate, which was especially true for wood density and diameter. The same trend was observed for prediction models based on pedigree and phenotypic information.

Relatedness
Models built with a half-sib structure led to a large decrease in accuracy, predictability and genetic gain ( Table 3). Although model validation was set up equally for all traits, accuracy varied much more among traits than when full-sib models were applied. The loss of model quality was most pronounced for MFA, where accuracy was only half. When cross-validation was conducted with completely unrelated progeny from different provenances and families (whether with fullsib or half-sib structure), model accuracy dropped virtually to zero and was associated with high error rates (results not shown).

Marker subsets inform on the nature of causative linkage disequilibrium
We also investigated the effect of reduced marker sets on the accuracy of GS models. The use of fewer markers would result in significant cost reductions for genotyping, which would have some impact on the overall cost of GS, but would also impact the extent of LD that is picked up by the prediction models. Decreasing the number of markers from almost 5,000 to 1,000 randomly sampled markers did not lead to a notable reduction in the accuracy of GS models (Fig. 1). However, using less than 500 markers led to an appreciable loss in accuracy for all four traits. Figure 1 shows the accuracy plots for several scenarios, where the loss of accuracy was mostly related to less scattered correlation plots due to a range reduction of genomic-estimated breeding values (GEBV).
The modelling of 250 markers that had previously been identified to contribute the largest absolute effects in the basic model, led to accuracy estimates comparable to those of the basic model including all 4,993 markers (Fig. 1). These models also perform substantially better than models based on 250 random markers, which is again related to an increase of range in GEBV estimates ( Fig. 1). However, the average MAF of largest-effect markers was also significantly higher (student t, P < 0.0001, for all traits) than the average MAF of remaining 4,743 markers of lower effects. Markers with the largest effects most likely retraced family linkages better, given their higher MAF and thus, higher information value. Modelling the remaining 4,743 lower-effect markers led to accuracies marginally lower than estimates for the full model.

Linkage groups have similar effects on GS model accuracies
In a further analysis, we investigated the effect of the genomic location of markers of GS models for the different traits. Therefore, we used markers of genes for which a set of 2928 homologs have been recently mapped to the white spruce genome [48], which has been previously shown to be highly syntenic et collinear to the  [51]. GS analyses considering sets of markers delimited according to linkage groups showed some differences in accuracy estimates from one chromosome to another, but these differences were not statistically significant (Fig. 2). At this marker resolution, we were thus unable to identify particular linkage groups that would control more for a specific trait. Overall, the accuracies obtained for the four traits were lower than those of models based on all markers, but they were slightly higher than estimates for models based on an equivalently reduced number of markers but sampled randomly across the entire genome (see Fig. 1, random 250 SNPs). These results suggest that relatedness is a likely strong contributing factor to the high prediction accuracy of GS models. Table 2 Genomic selection analyses based on all marker information (4,993 SNPs) and following different schemes of model building and application. "Pedigree" indicates that only pedigree information was used for prediction after model calibrations, which is also referred to as the conventional breeding approach in the body of the text; "Markers" indicates that SNP information was used for prediction. All results are from cross-validations using 10 replicates on randomly-selected trees not included in model fitting, but from the same families used to fit models. The first scheme considered all 734 trees from the two sites combined. Models in other schemes were trained on one site only and whether applied on the same or on the other site, respectively. For comparative purposes, the "Site_mean" scheme represents the mean of 5 models run on 359 trees corresponding to the mean number of trees per site. Model accuracy is the correlation between the crossvalidated estimated breeding value (using independent sets of trees) and the "true" reference breeding value. The predictive ability is the correlation between the predicted and the actual phenotypes. Genetic gains are given in absolute values and percentages are given in brackets. Gain estimates are based on predicted phenotypes and a selection intensity of 5%. Annual gain estimates were based on assumptions of a conventional breeding cycle length of 28 years for pedigree selection ("Pedigree"), and a shortened cycle length of 9 years for selection with markers ("Markers"), with 4 years for crosses and production of seedlots that are full-siblings to the training population, followed by 1 year for selection of individuals using markers and genomic selection models, and 4 years for vegetative propagation of selected individuals for seedling production

Sample sizes used to build genomic selection models
The effect of sample size used for GS model training was investigated. Starting with the full set of 734 sampled trees from both sites, we randomly removed trees and evaluated model quality for various subsets. As expected, accuracy of the GS models from the combined-site analysis generally decreased when fewer trees were used (Fig. 3). The reduction was much more important in marker-based models compared with pedigree-based models, leading to decay in the accuracy ratio. When considering less than a quarter of the trees initially present in the training set, model accuracy decreased by 50% on average. Sample sets with 330 or more trees showed comparable or only marginally inferior model accuracy compared with the full sample set, picking up well existing LD and relatedness. However, when training models with this number of trees, the accuracy of models was associated with larger errors for wood quality traits, especially for wood density where accuracy degraded faster and more irregularly in both marker-based and pedigree-based models. Similarly, errors of predicted genetic gain based on the use of GS models increased for sample sets of 330 trees or less. This was especially true for tree height and wood density where the average predicted genetic gain of markerbased models even increased slightly, when models were based on low numbers of individuals (Fig. 4). Together with the important loss of accuracy for both traits (Fig. 3), this result highlights the value of a large training set leading to precise accuracy estimates from GS models in order to make well-grounded selection decisions.

Accuracy of genomic selection models with complete information
This study clearly shows that medium-dense marker panels with several thousand markers well distributed over the genome can be effectively used in GS to predict additive breeding values in advanced-generation tree breeding programs, echoing previous results in similar settings [21,24,25,33]. The accuracy of the GS models obtained with the present black spruce breeding population was high for both growth and wood quality traits, reaching values of approximately 0.8 when using several hundred trees to build the GS models. Accuracy estimates from GS models were comparable with their pedigree-based counterparts, and were superior to results obtained in white spruce for a population of fullsib families [25]. These findings suggest that the current marker panel was marginally more efficient than the pedigree information in retracing family linkages (see below), especially if some errors affected pedigree information. These results lead to the conclusion that GS can efficiently be applied for this boreal conifer species in advanced-breeding programs and highly structured populations of full-sib families, resulting in much higher gains per year than conventional selection. Moreover, applying a forward GS scheme at an early stage appears possible in black spruce, given that it inherently displays a high propensity for vegetative propagation at an early age, as seen for other spruces [1].
The present accuracy estimates were somewhat higher than estimates previously obtained for similar traits in full-sib families of white spruce of similar age in a comparable study [25], and they were considerably higher than accuracy estimates obtained for loblolly pine [22]. Following the simulation results of Grattapaglia and Resende [32] and the parameters of the present study, one would expect accuracy estimates ranging between 0.7 and 0.8 for genomic models relying on 2 to 3 markers per cM, with an effective population size close to 30, as well as a training population set somewhat below 1,000 individuals. The present estimates are on Table 3 Accuracies of genomic selection models based on half-sib families using all 4,993 markers in a combined-site analysis, thus removing full-sib family linkage. Pedigree indicates that only pedigree information was used for predictions after model calibrations; markers indicates that SNP information was used. The predictive ability is the correlation between the predicted and the actual phenotypes. Genetic gains are given in absolute values and percentages are given in brackets. Gain estimates are based on predicted phenotypes and a selection intensity of 5%. Annual gain estimates were based on assumptions of a conventional breeding cycle length of 28 years for pedigree selection ("Pedigree"), and a shortened cycle length of 9 years for selection with markers ("Markers"), with 4 years for crosses and production of seedlots that are full-siblings to the training population, followed by 1 year for selection of individuals using markers and genomic selection models, and 4 years for vegetative propagation of selected individuals for seedling production     Accuracy of genomic selection models obtained using subsets of trees. Accuracy estimates for pedigree-based models (light grey line) and marker-based models (dark line), as well as their ratio (histograms). Estimates are given with their associated standard errors the upper limit or surpass these expectations, likely because of the high heritabilities observed in the present black spruce field trial for these traits.
Overall, the accuracy estimates only showed minor differences among traits. Model quality for tree diameter was somewhat lower than that for tree height, which is congruent with earlier reports [24,25]. This is most likely related to the higher heritability of tree height compared to diameter, as often noted in conifers [25,27,58].

Genotype-by-environment interaction
The genotype-by-environment interaction was low. Models calibrated on one site led to good predictions of GEBVs on the other site, indicating low genotype-by-environment interactions in this test despite the contrasting site conditions and the large geographic distribution of parental trees used to produce the full-sib families. Low genotype-byenvironment interactions were also previously reported from provenance-progeny tests replicated on multiple sites in Québec [59,60]. Similar observations of good transfer of model accuracy among distant sites from two large breeding zones in Québec were also reported for white spruce for both half-sib and full-sib GS models [25,26]. These results contrast with reports on hybrid spruce from western Canada [27] and loblolly pine from the southeastern United States [33], where the need to recalibrate GS models in each breeding zone was shown.
Boreal spruces in eastern Canada, such as black and white spruces, are reforested on a geographically restricted land base compared to the extent of their natural distribution, at the southern edge of their natural range where the commercial forest is mostly located. Based on provenance-progeny tests targeting these reforestation areas, reduced genotype-by-environment interaction was noted and the reforestation sites have been split into only a few large breeding zones [40,60]. Furthermore, little phylogeographic structure has been reported in the province of Québec and its vicinity for black spruce [52,53,61,62] or white spruce [63], indicating a homogenous historical genetic background. Little among-population differentiation has also been observed with various molecular markers, indicating limited population structuring and reflecting the recent post-glacial recolonization in eastern Canada [52,53,57,64]. Furthermore, the parental trees employed in the present black spruce advanced-breeding populations were first-generation superior trees selected in provenance tests assembling multiple provenances, well scattered geographically beyond single breeding zones and performing well on multiple sites. Therefore, their genetic background may have been indirectly selected toward generalists bearing a plastic adaptability to local conditions. This study shows that models can be applied to different sites or be built by pooling data from different sites without significantly compromising accuracy. From a Fig. 4 Predicted genetic gain using subsets of trees to build pedigree-based models (light grey line) and marker-based models (dark line), and the corresponding coefficient of variation (error bars). The ratio of marker-to pedigree-predicted genetic gain is presented by histograms. Gain estimates are based on predicted phenotypes and a selection intensity of 5% practical point of view, it also means that fewer trees are necessary to train and obtain models with good accuracy, thus reducing phenotyping and genotyping costs for GS model development.
The effect of size of model training set on genomic selection accuracy The size of the dataset employed to train models has previously been shown to have a large effect on the accuracy of GS models [32]. In advanced-breeding populations with small effective population size, we hypothesized that a smaller number of samples per progeny should be sufficient to obtain good model accuracy. Using resampling, Perron et al. [65] reported that optimal subgroups should include 6 to 8 trees per family and site in order to precisely estimate genetic parameters for wood density and growth in an open-pollinated black spruce test. Our current finds concur with those of Perron et al. [65], as a sizeable loss in accuracy was only observed when the training sets were less than 330 trees, corresponding to a minimum of about 4 trees per site and full-sib family in the present context. One interesting observation was that marker-based models had a tendency to loose accuracy more quickly than their pedigree-based counterparts. From a practical point of view, a smaller number of trees necessary to train and obtain models with good accuracy will help reduce the genotyping and phenotyping costs for GS model development.

Genomic selection model accuracy and level of family structure
Genomic selection models constructed using data from a subset of half-sibs from the same test resulted in much lower and more variable accuracies. However, models built with half-sibs had higher accuracies than those from previous reports similarly derived from half-sib families of eastern white spruce or western hybrid spruce in Canada [26,28]. The difference is likely due to the low effective population size of half-sibs in the present study, leading to a higher level of relatedness among trees in the training and validation sets compared to true open-pollinated families where a large number of mostly unrelated pollen donors intervene and greatly increase the effective population size [26].
The shift in accuracy and predicted genetic gain observed between using half-sib versus full-sib sets further highlights that GS is most efficiently applied in more structured populations where relatedness and LD are higher, which necessitates less genome coverage to attain high prediction accuracy. Similar observations were reported by Beaulieu et al. [25,26] and Zapata-Valenzula et al. [21]. Below, we discuss additional evidence to support this interpretation.

A limited role for short-range LD in genomic prediction
The main obstacles for the application of MAS in largely undomesticated populations of conifers are their large genome sizes often exceeding 20 Gbp [66] and their low LD [35,36], which in turn would require very high genome coverage in order to pick-up short-range marker-QTL LD and make accurate predictions; this, besides the multigenic control of most relevant traits was already identified as a drawback in association studies [5][6][7] where single markers explained only a low percent of variance. In this context, the role of short-range marker-QTL LD in the accuracy of GS models obtained with moderate genome coverage appears negligible. In the present study, a density of approximately 2.7 markers/cM was used, which resulted in GS model accuracy roughly equivalent to that from pedigree-based models. The same trend was also observed in previous studies [25,26]. GS model accuracy decreased significantly when using half-sibs instead of full-sibs, which is consistent with the trend seen from other studies displaying similar genome coverage [25][26][27]. Using only markers with large effects resulted in marginally better model accuracies compared to those obtained with same number of markers randomly picked. However, we showed that this aspect is entangled with higher average MAF values for these markers. Also, the non-significant differences among model accuracies obtained with markers from different chromosomes likely indicates that limited short-range marker-QTL LD could be traced. Altogether, these observations point to relatedness and an increasing size of un-recombined chromosome blocks as the main drivers of GS model accuracy. Relatedness and the ability to retrace family linkages should be seen as the key factors for the high accuracies obtained, given that restricting GS model building with markers from single chromosomes led to somewhat higher accuracies compared to models relying on an equivalent random sample of markers covering all 12 spruce chromosomes. This trend is further supported by the fact that increasing the number of markers used to build GS models tenfold (from 500 to 4,993 markers) only led to incremental, though useful, improvements in model accuracy.
These observations are further supported by the fact that when GS models were applied to trees from families not included during model building, little or no accuracy was obtained, further confirming the limited role of short-range LD in the high accuracies obtained when using full-sibs. These results are not surprising given that black spruce and most conifers are essentially undomesticated, outbreed, wind-pollinated organisms with large effective population sizes, hence lacking populationwide LD. For instance, LD decays rather rapidly in natural populations of spruces and pines, usually well within gene limits and in many cases, within a few hundred base pairs, which is a likely consequence of historically large effective population sizes [35,36]. Consequently, our results and those of others (e.g. [22,23]) suggest that genomic prediction may not be possible for unrelated individuals at current marker densities.
A corollary is that obtaining GS model accuracy beyond that of pedigree-based models would likely necessitate increasing genome coverage by a factor of at least 10X to 100X of that used herein, resulting in the use of a very large number of markers, likely in the hundreds of thousands. Simulations and historical data in cattle led to the conclusion that 50k markers would allow for the capture of causal loci within breeds, but 300k markers would be needed for accurate prediction across breeds [67]. A similar trend is emerging for crop plants, with the preparation of genotyping arrays containing over half a million SNPs (e.g. McCouch C. et al., International Rice Consortium, in preparation). Furthermore, based on simulations, Grattapaglia and Resende [32] showed that increasing genome coverage tenfold from 1-2 markers/cM to 10-20 markers/cM only asymptotically improves GS model accuracy.
The fact that slightly higher GS model accuracy was obtained with models using only markers with largest effects (top 250 out of 4,993 markers), compared to models estimated with all markers, could imply that part of the short-range LD can be picked-up by the GS models when a reduced numbers of large-effect markers are employed. However, there could be confounding factors, such as the a priori information value of markers. Indeed, a significantly higher average value of minimum allele frequency (MAF) was observed for markers with largest effects. Such markers could track family linkages more effectively than random markers, especially when small numbers of markers are used. Also, the marginally higher accuracy achieved with markers located on a specific chromosome compared to same number of markers but spread over the entire genome indicate that a better tracking of family linkages is achieved when a small number of markers are located on the same chromosome. Thus, this factor could also be potentially useful to reduce genotyping and GS costs in highly structured populations when the genome location of markers is known.

Conclusions
The results of the present study on black spruce indicate that, at least in the short term, GS holds substantial promises for efficient application in populations of small effective size, such as advanced-breeding populations. With the scale of marker densities usually employed (at a rate of a few markers per cM), the overall short-range LD between markers and causal loci is likely not sufficiently well captured to make GS efficient in largely unstructured natural populations of conifers or between unrelated breeding populations. In small and structured populations, we show that prediction accuracy is null when relatedness between training and testing data sets is absent. Marker densities of one or several orders of magnitude higher would likely be necessary to improve accuracy in such conditions. Thus, high relatedness between individuals appears to be a prerequisite to obtain highly accurate GEBVs.
From an applied point of view, lowering marker densities may be feasible without major loss in accuracy in order to reduce genotyping costs. Information relative to relatedness would be more precise when markers are located on a single chromosome instead of spreading an equivalent number of markers over the whole genome, and with using markers with highest MAF. These approaches may be considered when a reduced genotyping assay is needed.
Based on the present results, the GS application with highest potential for spruce breeders would be to select with high accuracy superior individuals within a group of full-sib families. This would effectively increase the relatedness as well as the size of unrecombined chromosome blocks generated by controlled crossing in a small breeding population. One obvious beneficial implementation of GS in such a context would be to repeat the controlled crosses that were used to build the GS models to generate much larger full-sib families and thus, apply higher selection intensities and obtain larger genetic gains. Future studies also need to evaluate to which extent genomic selection models developed for the present generation could be applied to the next generation of progeny, as recombination should break up some of the established linkage [23]. However, given the good accuracy of GS models that we obtained when considering individuals sharing only one parent, we hypothesize that predictions in a following generation may be relatively accurate when sharing the same parental material. Also, because family linkages could be efficiently traced with genomic profiles, polycross strategies could likely be used without losing significantly on prediction accuracy, where male pollen donors are mixed so to reduce the cost of crosses. Such screening of larger families from polycross would make it possible to increase selection intensity and the ensuing genetic gain in a cost-effective fashion, especially for traits such as wood quality parameters or pest resistance, which are expensive and cumbersome to assess on large test populations. Thus, candidate individuals for selection would only have to be genotyped at the early seedling stage and those having the highest GEBVs predicted using the available GS models would be selected.
For species amenable to vegetative propagation and for organizations having access to somatic embryogenesis and/or rooted cutting facilities, as it is the case in eastern Canada for spruces, individuals identified with GS at the early seedling stage could also be propagated and massproduced for reforestation programs within only a few years [1]. With such a forward selection scheme, the breeding and production cycles could be significantly reduced and gain per time unit would be multiplied by a factor of 3 (Table 2). Other schemes based on sexual reproduction could be deployed, such as top-grafting of selected progeny in previous generation seed orchards followed by polymix crossing, which would facilitate the production of genetically improved seeds with minimal delays. At the same time and by shortening quite drastically the breeding cycles for slow-growing species such as temperate and boreal conifers, the use of GS tools should result in more flexibility to tree breeders, which appears especially important in the context of rapid environmental changes and evolution of wood products markets.