The effect of tree (and cambium) age on genomic prediction for solid wood properties in Norway spruce

Genomic selection (GS) or genomic prediction is considered as a promising approach to accelerate tree breeding and increase genetic gain by shortening breeding cycle. We investigated the predictive ability (PA) of GS based on 484 progeny trees from 62 half-sib families in Norway spruce ( Picea abies (L.) Karst.) for wood density, modulus of elasticity (MOE) and microfibril angle (MFA) measured with SilviScan, as well as for measurements on standing trees by Pilodyn and Hitman instruments. GS predictive abilities (PA) were comparable with those based on pedigree-based selection. The highest PAs were reached with at least 80-90% of the dataset used as training set. Use of different statistical methods had no significant impact on the estimated PAs. We also compared the abilities to predict density, MFA and MOE of 19 year old trees with use of models trained on data from coring at different ages and to different depths into the stem. The comparison indicated that close to the maximal PAs can be reached at age 10-12 by drilling only half way (ringwise) towards the pith, thereby reducing the impact on the tree.


Introduction
Norway spruce is one of the most important conifer species in Europe in relation to economic and ecological aspects 1 . Breeding of Norway spruce started in the 1940s with phenotypic selection of plus-trees, first in natural populations and later in even-aged plantations 2 . Norway spruce breeding cycle is approximately 25-30 years long, of which the production of seeds and the evaluation of the trees take roughly one-half of that time 3 .
Genomic prediction using genome-wide dense markers or genomic selection (GS) was first introduced by Meuwissen 4 . The method modelling the effect of large numbers of DNA markers covering the entire genome and subsequently predict the genomic value of 3 individuals that have been genotyped, but not phenotyped. As compared to the phenotypic mass selection based on a pedigree-based relationship matrix (A matrix), genomic prediction relies on constructing a marker-based relationship matrix (G matrix). The superiority of the G-matrix is the result of a more precise estimation of genetic similarity based on Mendelian segregation that not only captures recently pedigree but also the historical pedigree 5− 7 , and corrects possible errors in the pedigree 8, 9 . There are multiple factors affecting genomic prediction accuracy such as the extent of linkage disequilibrium (LD) between the marker loci and the quantitative trait loci (QTL), which is determined by the density of markers and the effective population size (N e ). An increase in accuracy with dense marker has been reported in simulation 10  In forest tree species the accuracy of the genomic prediction model has been mainly tested in cross-validation designs where full-sibs and/or half-sibs progenies within a single generation are subdivided into training and validation sets 10, 19− 22 . Model accuracy was reported to increase with larger training to validation set ratios 11,17,23 , while the level of relatedness between the two sets is considered as a major factor 10, 15− 17, 19, 24 . When genomic prediction is conducted across environments, the level of genotype by environment interaction (GxE) of the trait determines its efficiency 11,20,21,25 . The number of families and progeny size have also been shown to affect model accuracy 11,15 .
As compared to the previously described factors, trait heritability and specially the trait genetic architecture are intrinsic characteristics to the studied trait in a given population.
Those two factors can also be addressed by choosing an adequate statistical model depending on the expected distribution of the marker effects 26 . Despite theory and some results indicate that complex genetic structures obtain better fit with models that assume equal contribution of all markers to the observed variation, traits like disease-resistance are better predicted with methods where markers are assumed to have different variances 13,20,22,27,28 . However, results in forestry so far indicate that statistical models have little impact on the GS efficiency 12,17,29 .
In this study, we conducted a genomic prediction study for solid wood properties based on data from 23-year old trees from open-pollinated (OP) families of Norway spruce. We focused on wood density, microfibril angle (MFA) and modulus of elasticity/wood stiffness (MOE) analyzed with SilviScan in the lab, and on estimates of these 3 traits based on data from measurements on standing trees of Pilodyn penetration depth and Hitman velocity of sound, estimates here named Pilodyn, Velocity and MOE ind , respectively.
The specific aims of the study were: (i) to compare narrow-sense heritability (h 2 ) estimation, predictive ability (PA) and prediction accuracy (PC) of the pedigree-based (ABLUP) models with marker-based models based on data from measurements with SilviScan on increment cores and from Pilodyn and Hitman measurements on standing trees, (ii) to examine the effects on model PA and PC of different training-to-validation set ratios and different statistical methods, (iii) to compare some practical alternatives to implement early training of genomic prediction model into operational breeding.
Both trials were established in 1990 with a spacing 1.4m × 1.4 m. Originally, the experiments contained more than 12 progenies from 524 families at each of site, but after thinning activities in Höreda and Erikstorp in 2010 and 2008, respectively, fewer trees were left. In 2011 and 2012, six trees per site (524 * 12 ~ 6000 trees) were phenotyped 30 and the 524 mothers were genotyped using exome capture (single nucleotide polymorphism, SNP) 31 . Standing tree-based measurements with Pilodyn and Hitman were performed on the same trees in 2011 and 2013, respectively, after which further thinning was performed. For this study, in 2018, we generated genomic (SNP) data from 484 remaining progeny trees which belonged to 62 of the OP families (out of the original 524) and eight progenies per family on average. This new genotypic data was combined with available phenotypic data for the same trees were used.

Phenotypic data
The phenotypic data was previously described in 32 . The samples collected from the progenies in 2011 and in 2012 were taken as increment cores of 12 mm diameter from pith to bark. The samples were analyzed for from pith to bark variations in many woods and fiber traits with a SilviScan 33 instrument at Innventia (now RISE), Stockholm, Sweden.
This data is referred as increment core-based measurements through the text. The annual rings of all samples were identified, as well as their parts of earlywood, transition wood and latewood, averages were calculated for all rings and their parts and dated with year of wood formation 34 .
The aim of breeding is not for properties of individual rings, but properties of the stem at harvesting target age. Therefore, this study focused on predictions of averages for stem 6 cross-sections, and we chose tree age 19 years as the reference age, with models trained on trait averages for all rings formed up to different younger ages. Three types of averages were calculated and predictions compared for density, MFA and MOE: 1) areaweighted averages, relating to the cross-section of the stem, 2) width-weighted, relating to a radius or an increment core, and 3) arithmetic averages, where all ring averages are weighted with same weight. For the calculation of area-weighted average we assumed that each growth ring is a circular around the pith, calculated the area of each annual ring from its inner and outer radii, and when calculating the average at a certain age, the trait average for each ring was weighted with the ring's proportion of the total cross-sectional area at that age. Similarly, for the calculation of the width-weighted average, the trait average for each ring was weighted with the ring's proportion of the total radius from pith to bark at that age. The similar results were obtained with the three average methods. For this reason, only the estimates based on the area-weighted method (the most relevant for breeding) are shown. Tree age 19 years was used as the reference age. Thus, all the selection methods investigated for density, MFA and MOE, phenotypic and genetic, were compared based on how well they predicted the cross-sectional averages of the trees at this age, with their last ring formed during the vegetation period of 2009.
In addition, estimates of the three solid wood traits were calculated based on data from Pilodyn and Hitman instruments, measured on the standing trees without removing the bark at age 22 and age 24 years, respectively. Pilodyn measures the penetration depth of a needle pressed into the stem, which is inversely correlated with wood density. Hitman measures the velocity of sound in the stem, which correlates with microfibril angle, MFA 35,36 . MOE is related to wood density and velocity of sound 37− 39 and can therefore be estimated by combining the Pilodyn and Velocity data, which estimates we here name 7 MOE ind (for standing-tree based). Further details on how this was performed in our study are given in 40 . The references show that these standing-tree-based measurements provide useful information and are very time and cost-efficient. However, they do not allow calculation of properties of the tree at younger ages. Therefore, we were not able to investigate from what early ages such data can be uses within genomic selection.

Population structure
As a first step, we conducted a principal component analysis to determine the presence of structure in our population. The spectral decomposition of the marker matrix revealed that only about 2% of the variation was captured by the first eigenvector, indicating low population structure. Therefore, population structure was not considered in the design of 8 cross-validation sets (see Modelling and cross-validation chapter for further details on the cross-validation sets design). Previously, low genotype by environment (GxE) interaction was detected for wood quality traits on these two trials 30 and consequently, subsequent analyses were conducted obviating population structure and GxE.

Narrow-sense heritability (h 2 ) estimation
For each trait, an individual tree model was fitted in order to estimate additive variance and breeding values: where y is a vector of measured data of a single trait, β is a vector of fixed effects including a grand mean, provenance and site effect, b is a vector of post-block effects and u is a vector of random additive (family) effects which follow a normal distribution u ~ N(0,Aσ 2 u ) and e is the error term with normal distribution N(0,Iσ 2 e ). X, Z and W are incidence matrices, A is the additive genetic relationship matrix and I is the identity matrix. σ 2 u equals to σ a 2 (pedigree-based additive variance) when random effect in Eq. 1 is pedigree-based in which case u ~ N(0,Aσ 2 u ), and σ 2 u equals to σ g 2 (marker-based additive variance) when random effect in Eq. 1 is marker-based in which case u ~ N(0,Gσ 2 u ). The G matrix is calculated as where M is the matrix of samples with SNPs encoded as 0, 1, 2 (i.e., the number of minor alleles), P is the matrix of allele frequencies with the ith column given by 2(pi − 0.5), 9 where pi is the observed allele frequency of all genotyped samples.
Pedigree-based individual narrow-sense heritability (ℎ 2 ) and marker-based individual narrow-sense heritability (h g 2 ) were calculated as respectively, 2 and 2 are phenotypic variances for pedigree-based and marker-based models, respectively.

Selection of the optimal training and validation sets ratio
Cross-validation was conducted after dividing randomly the whole dataset into a training and a validation set. To find the most suitable ratio between the two, we divided the data into sets with five different ratios between the training and the validation sets: 50, 60, 70, 80 and 90%. 100 replicate iterations were carried out for each tested ratio and trait.

Statistical method for model development
In the same context we aimed to find optimal methods. Several statistical methods were compared: pedigree-based best linear unbiased predictions (ABLUP), and four GS methods: genomic best linear unbiased predictions (GBLUP) 44  Operationally, it is also important to develop protocols to assess wood quality in resources at minimum cost and time, and with minimal impact on the trees. Therefore, on coring, it is not only important to know the minimum age at which useful information can be obtained, but also from how many rings from the bark towards the pith information is required to train models with high predictive ability. To address these two practical questions for operational breeding, we trained prediction models based on data from different sets of rings, in order to mimic and compare PAs obtained when coring at different ages of the trees to different depths into the stem, or more precisely, using data from different numbers of rings, starting next to the bark. All the models were judged on, compared by their ability to predict the cross-sectional average of the trait at age    Search for optimal sampling and data for training of GS prediction models Figure 2 showed estimated PAs of models trained on data from sampling different years, using data from all rings available at that age (except for the innermost ring). In this section we instead estimate how much PAs can be reached if the increment core is not drilled all the way to the pith, but to different shorter depths, reducing the injury to the tree, Fig. 3a-d. This analysis was preformed based on tree age data only, as the cambial age of a ring can only be precisely known if the core is drilled to the pith, allowing all rings to be counted.
Each row of the figures represents a tree age when cores are samples, starting at age 3 years when the first 60 trees formed a ring at breast height, ending at the bottom with the reference age 19 years with17 rings. Each column represents a depth of coring, counted in numbers of rings. As one more ring is added each year, thus also to the 14 maximum possible depth on coring, the tables are diagonal. The uppermost diagonal represents models trained on data from the 60 (12%) trees which had reached breast height at age 3. The diagonal next below represents models based on the 243 (51%) trees with rings at age 4, etc. The PAs shown below the three uppermost diagonals represent models trained of data from more than 90% of the trees. The PAs were calculated from the cross-validation, based on data from the trees on which the respective models were trained. This means that the PAs of the three uppermost diagonals are based only on fastgrowing trees not fully representative for the trials. Many of the highest PAs found occur along these diagonals. Due to their trees' special growth, only PAs based on more than 90% of the trees will be further commented.

Discussion
We have conducted a genomic prediction study for solid wood properties assessed on increment cores from Norway spruce trees with SilviScan derived data from pith to bark, using properties of annual rings formed up to tree age 19 years as the reference age. We also compared this with genomic prediction of proxies for density, MFA and MOE estimated with data from same trees measured at the bark of the standing trees with Pilodyn and  errors.

E f f e c t s o n G S m o d e l p r e d i c t i v e a b i l i t y ( P A ) o f t r a i n i n g -t o -v a l i d a t i o n s e t s r a t i o s a n d s t a t i s t i c a l m e t h o d s
In conifers and Eucalyptus cross-validation is often performed on 9/1 training to validation sets ratio 8,12,15,16,28 . This coincide with the general conclusion from the present study, with exception for MFA and MOE, for which the best results were obtained at ratio 8/2. It has been suggested that when the trait has large standard deviation, more training data is needed to cover the variance in order to get high predictive ability 52  In the conifer literature it has more often been reported similar performance of different marker-based statistical models for wood properties 11,12,18,28,53 . This general Generally, pedigree-based PA estimates in conifer species have been reported to be higher or comparable to marker-based models 11,15,16,19,20,23 , but there are also some studies reporting marker-based PA estimates to be higher 13,24,55  Use of tree age versus cambial age (ring number) From a quick look at Fig. 2, one may get the impression that breeding based on cambial age data allows earlier selection than using tree age data. That would however be a too rushed conclusion. At tree age 3 years, after the vegetation period of 1993, only 12.5% of the trees had formed a first annual ring at breast height. Not until tree age 6 years, more than 90% of the trees had done so, that is by number 5 years "higher age". In contrast, all the trees had obviously a ring of cambial age 1. But if aiming for 90% representation, one must wait several years more for more rings were formed at breast height, more precisely, from 1993 to end of growth season 1996, tree age 6. And to train models based on data from 90% of the trees for cambial age say, 6 at breast height, samples cannot be collected until the end of growth season at tree age 11 years, or if a representation of 19 80% is judged as satisfactory, at tree age 10 years. This has to be considered if selection efficiencies are calculated based on cambial age data, which is common. Such results have for instance been published based on the almost 6000 trees sampled 2011 and 2012 30 .
Correctly compared based on minimum 90% of the trees, the estimated PAs shown in Fig. 2 are similar between the age alternatives, or slightly better for use of tree age. For example, the PA for MOE using cambial age data shows a smooth increase, reaching above 0.2 at cambial age (ring number) 7, which needs data from the tree of age 12. The corresponding curve from using tree age passed above 0.2 already at age 8 years.
However, curves based on tree age often show larger year-to-year variation. This is most likely an effect of the fact that the rings of same cambial age represent wood formed across a span of years with different weather. Thus, cambial age data reflect annual weather across a range of years, which does not happen when using tree age data. On the other hand, from a practical point of view, methods based on use of tree age may be easier to apply in operational breeding, especially in light of results as in Fig. 3, indicating that high PAs can be reached without coring all the way to the pith. To number the rings for precise cambial age, you need to find the innermost ring at the pith, but that may not be necessary for good results.

Implementation of GS for solid wood into operational breeding
The results indicate that GS can result in similar early selection efficiency or even higher than traditional pedigree-based breeding and offers further possibilities. Previously, in loblolly pine it was reported that models developed for diameter at breast height (DBH) and height with data collected on 1 to 4-year old trees had limited accuracy in predicting phenotypes at age 6-year old 21 . In British Colombia Interior spruce, the predictive 20 accuracy for tree height of models trained at ages 3 to 40 years, at certain intervals, and validated at 40 years revealed less opportunities for early model training, since the plateau was not reached until 30 years 28 .
In our study, the highest PA values were obtained for the subsets of fast-growing trees which had reached breast height already at tree age 3 and 4 years, 12% and 51% of the total number of trees, representing a limited number of the OP families included in the analysis. The trees of this fast-growing group are affected by high intensity of selection for alleles accelerating growth within each OP family. Also, on cross-validation the prediction abilities for this group were calculated based on the trees within the same group. In this elite group different factors could account for a higher PA value, such as lower phenotypic variance, decreased number of alleles of minor effect could also facilitate identification of major effects and/or higher consanguinity between those families which may share alleles for growth. These models are shown for completeness, but as they cannot be used for operational breeding they are not further discussed.
Models for genetic selection are useful in different steps of a breeding program. A first type of prediction models, here illustrated with Table 1, can be trained from existing trials, preferably based on trees of as old age as available, as the aim of the breeding is to predict tree qualities at age of harvesting when the major part of the stem will be dominated by mature wood. Training the models in older trees for wood properties also allows considering other properties which cannot be easily observed from trees of very young age, such as stem straightness and health. For wood density, the results indicate that models can be built without coring very deep into the stem. It may be expected that this is valid also for instance for tracheid dimensions which in combination determines the wood density 34 .
Due to the juvenility of the trees used in our study, the wood includes considerable proportions of both high MFA core wood and of lower MFA outer wood, with a pronounced shift in between. The results indicate that for such trees, information from the area around where the shift occurs is more useful for training of the GS model than information from the outermost rings, albeit resulting in generally low prediction abilities. This is an unexpected result. It may be expected that models for MFA and MOE trained on older trees would rely more on data from the mature wood outside the shift, similar to the models for density. Apart from models predicting traits of mature wood, ideally in trees of harvesting age, there is also an interest for models focusing on local features which can be detrimental for quality of wood-based products, such as traits of the inner core with high MFA and low MOE, and the size of this core with problematic properties for many solid wood products.
As illustrated in this work, two aspects of incorporating wood properties into operational GS breeding programs can be addressed with the same set of data. Firstly, as mentioned above, models for cost-effective selection based on genomic information from existing trees. In that case, models from data at old ages would normally be preferred, for example for wood density some model at bottom line of Fig. 3a. Secondly, models providing guidance on at what age it is reasonable to approach young trees for training of GS models for specific traits: a) trees in existing juvenile trials, or b) trees of new generations with different pools of genetics. As an example, the same Fig. 3b for wood density suggests GS model training at tree ages 10 to 12 on the third to fifth outermost rings to reduce costs and the negative impact on the tree.

1.
In comparison with phenotypic selection, Genomic selection methods showed similar to higher prediction abilities (PAs) for both increment core-and standing tree-based 22 phenotyping methods. This indicates that the standing tree-based measurements may be a cost-effective alternative method for GS, but higher PAs were obtained based on increment core-based wood analyses.  Estimated Predictive abilities (PA) for prediction of cross-sectional averages at tree age 19 years, based on cross-sectional averages at different tree ages (upper graphs) and cambial ages (lower graphs) from pith to bark.

Figure 3
Predictive ability from bark to pith at different tree ages (y-axis) and an increasing number of rings included in the estimation (x-axis). 3a) Number of trees at each tree age with different number of rings. 3b) PA of density at each tree age with different number of rings. 3c) PA of MFA at each tree age with different number of rings. 3d) PA of MOE at each tree age with different number of rings.