Skip to main content
  • Research article
  • Open access
  • Published:

Association and linkage mapping to unravel genetic architecture of phenological traits and lateral bearing in Persian walnut (Juglans regia L.)



Unravelling the genetic architecture of agronomic traits in walnut such as budbreak date and bearing habit, is crucial for climate change adaptation and yield improvement. A Genome-Wide Association Study (GWAS) using multi-locus models was conducted in a panel of 170 walnut accessions genotyped using the Axiom™ J. regia 700 K SNP array, with phenological data from 2018, 2019 and legacy data. These accessions come from the INRAE walnut germplasm collection which is the result of important prospecting work performed in many countries around the world. In parallel, an F1 progeny of 78 individuals segregating for phenology-related traits, was genotyped with the same array and phenotyped for the same traits, to construct linkage maps and perform Quantitative Trait Loci (QTLs) detection.


Using GWAS, we found strong associations of SNPs located at the beginning of chromosome 1 with both budbreak and female flowering dates. These findings were supported by QTLs detected in the same genomic region. Highly significant associated SNPs were also detected using GWAS for heterodichogamy and lateral bearing habit, both on chromosome 11. We developed a Kompetitive Allele Specific PCR (KASP) marker for budbreak date in walnut, and validated it using plant material from the Walnut Improvement Program of the University of California, Davis, demonstrating its effectiveness for marker-assisted selection in Persian walnut. We found several candidate genes involved in flowering events in walnut, including a gene related to heterodichogamy encoding a sugar catabolism enzyme and a cell division related gene linked to female flowering date.


This study enhances knowledge of the genetic architecture of important agronomic traits related to male and female flowering processes and lateral bearing in walnut. The new marker available for budbreak date, one of the most important traits for good fruiting, will facilitate the selection and development of new walnut cultivars suitable for specific climates.


Persian walnut (Juglans regia L.) is one of the oldest food sources known [1]. It is a monoecious and dichogamous tree species with 2n = 2x = 32 chromosomes [2], and grows in temperate regions [3]. Worldwide in-shell walnut production, mainly from China, California and Iran, exceeded 3800 kt in 2017, as reported by the Food and Agriculture Organization of the United Nations ( At more than 22,000 ha, Persian walnut is the second leading tree crop in France, after apple. In the last 3 years, France has oscillated between 7th and 9th position for in-shell walnut production (circa 40 kt) [4]. Increased yield, larger nut size, light kernel color, and ease of cracking are among the main goals of walnut breeding worldwide [5]. The ability to adapt to specific climatic conditions is also a breeding priority, especially in France where late spring frosts are prevalent [4]. In that respect, a better understanding of phenology and bearing habit, both key determinants of yield, is of upmost importance for walnut genetic improvement and cultivation [6].

Climate change, particularly global warming, is no longer to be proven within the scientific community [7], and researchers are studying its impact on phenology of temperate trees. In these species, growth is punctuated by an annually repeated phase of rest, called bud dormancy [8]. This dormancy period is influenced by various environmental factors, such as photoperiod and temperature, resulting in fulfilment of chilling and heat requirements [9]. In walnut, chilling and heat requirements were widely estimated in Iran [10] and showed, for instance, a range of chilling requirements from 650 h at + 4 °C for ‘Serr’ to 1000 h for ‘Hartley’ cultivars [11]. In France, the frost resistance of walnut were studied [12, 13] and many phenology studies of temperate tree species in Europe report a time shift of phenological events [14,15,16,17]. An advancing effect of warm springs on phenological events has been observed for walnut in California, particularly for leafing date [18]. Similar findings have been reported in Slovenia [19] and Romania [20]. Using phenological data recorded by the Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE) of Bordeaux from 1989 to 2016, we also observed an average advance in budbreak in France of 5 days over the last 3 decades [21]. In Iran, researchers assessed land suitability for walnut cultivation under present and future climatic conditions, and predict that the currently suitable area will be significantly reduced [22].

Genetic control of phenology-related traits is fundamental for the development of new, resilient cultivars, able to adapt to changing climatic conditions. Many studies have focused on genetic dissection of phenological traits (e.g., chilling requirements and flowering time) in diverse fruit crops, such as peach, apricot and sweet cherry [23, 24]. In walnut, a significant genotype effect has been identified for heat requirements [25]. Moreover, high heritability has been shown for leafing date (71–96%), type of heterodichogamy (90%), and female/male blooming (80%) [26, 27]. Persian walnut has two main types of bearing habit. Fruiting can occur only at the terminal position of new branches or at both terminal and lateral positions [28]. A genetic locus for lateral bearing has been identified based in an F2 progeny in the United-States [29], but has not been sufficiently robust for wider use in marker-assisted selection.

Release of the first walnut genome sequence [30] facilitated advanced genetic and genomic studies, including development of the first high-density Axiom™ J. regia 700 K SNP genotyping array [31]. Application of this powerful genotyping tool allowed genetic dissection of crucial traits in walnut, such as nut-related traits [32] and water use efficiency [33, 34]. A recent study, combining genome-wide association study (GWAS) and classical linkage mapping, found major loci for leafing and harvest dates on chromosome 1 (Chr1), and lateral fruitfulness on Chr11 [35].

Here, we studied for the first time in walnut, the genetic control of budbreak date and female/male flowering dates, using the Axiom™ J. regia 700 K SNP array to genotype both a panel of 170 walnut accessions of diverse geographical provenience and an F1 progeny segregating for these traits. This study sought to identify candidate genes for both female and male flowering dates and to develop the first Kompetitive Allele Specific PCR (KASP) marker for phenology in walnut. This will be useful for walnut breeding programs in selecting of new resilient varieties to climate change.


Phenotypic variations of phenology-related traits and lateral bearing

Two populations were used in this study: a GWAS panel of 170 diverse accessions of worldwide origin and an F1 mapping progeny of 78 individuals resulting from a bi-parental controlled cross between ‘Franquette’ (late flowering), and ‘UK 6–2’ (intermediate to early flowering). Both populations were maintained at the INRAE of Bordeaux field station and phenotyped during 2018 and 2019. For the GWAS panel, we also used previously collected (legacy) phenotypical data taken between 1989 and 2011.

For the GWAS panel, the 2018–2019 data exhibited high variation in phenology-related traits, particularly for budbreak which ranged in 2019 from 57 Julian days for ‘Early Ehrhardt’ to 128 for ‘Fertignac’ (Feb 27th to May 9th) (Figures S1 and S2). The F1 progeny in 2019 exhibited a smaller range of 76 to 102 Julian days (Figure S3 and S4). Generally, budbreak was earlier in 2019 (87.78 Julian days ±12.65 for the GWAS panel, 90.71 ± 5.48 for the F1 progeny) than in 2018 (92.47 ± 11.06 for the GWAS panel, 95.55 ± 4.97 for the F1 progeny).

We found significant positive correlations between budbreak date and female flowering stages for both the GWAS panel (0.83 to 0.84; Fig. 1a), and the F1 progeny (0.45 to 0.52; Fig. 1b). Similar significant positive correlations were found between budbreak date and male flowering stages for the GWAS panel (0.78 and 0.81; Fig. 1a), and the F1 progeny (0.61 to 0.84; Fig. 1b). Comparison of the 2 years shows that early accessions in 2018 were also early in 2019, suggesting genetic control of phenology-related traits in walnut. Female flowering was earlier in 2018 than 2019, but the accession order was consistent for both years. In addition, both female and male flowering durations showed low correlations and low statistical significances with other traits. We did not phenotype the F1 progeny for bearing habit, since this trait did not segregate in that population, but we observed great variability for fruit bearing within the GWAS panel.

Fig. 1
figure 1

Correlation matrices of the traits using two-year data. a Using the GWAS panel, and b using the F1 progeny

High broad-sense heritability values were observed for budbreak date, with H2 of 0.95 when using legacy data and 0.93 using only two-year data (Table 1). Overall, H2 values were lower within the F1 progeny (H2 = 0.67 for budbreak date). However, we found low values for male flowering duration (H2 = 0.22) and female flowering duration (H2 = 0.26) within GWAS panel (using the recent two phenotyping years), while no genetic effect was found for the F1 progeny. Therefore, we did not consider both male and female flowering durations in the GWAS and QTL mapping analyses.

Table 1 Descriptive statistics and broad-sense heritabilities

Population structure of the GWAS panel

A total of 364,275 SNPs were retained after filtering for high resolution SNPs categories (Poly High Resolution and No Minor Homozygotes), for genotyping rate > 90%, and minor allele frequency > 5% (Table 2). We investigated the population structure of our association panel using the Bayesian clustering approach implemented in fastSTRUCTURE, and Principal Component Analysis (PCA). The fastSTRUCTURE analysis infers accession ancestry from genotypic information and permitted us to determine the best number of clusters (K). The most likely K subpopulations were K = 2 and K = 3 (Figure S5). At K = 2, admixture proportions clustered the accessions according to their geographical origin. In particular, the cluster in purple named “Western Europe and America” includes 86 accessions from Austria, Chile, England, France, Germany, Netherlands, Portugal, Serbia, Slovenia, Spain, Switzerland and USA. The cluster in green named “Eastern Europe and Asia” includes 50 accessions from Afghanistan, Bulgaria, China, Greece, Hungary, India, Iran, Israel, Japan, Poland, Romania, Russia and Central Asia (Fig. 2). At K = 3, a new cluster includes all the hybrids and admixed accessions from France and USA (Fig. 2, Table S1).

Table 2 SNPs used for the GWAS analyses and the construction of the parental linkage maps ‘Franquette’ and ‘UK6–2’
Fig. 2
figure 2

Structure of the GWAS panel. The fastSTRUCTURE software was used. Bar plot of individual ancestry proportions (Q values) for the genetic cluster inferred using the whole set of 364,275 robust SNPs. For K = 2, accessions are geographically separated in two main groups: the purple group for ‘Western Europe and America’ accessions, and the green group for ‘Eastern Europe and Asia’ accessions. For K = 3, the blue group, highlights hybrids

PCA shows similar clustering of our germplasm collection as fastSTRUCTURE (Figure S6). PC1, which explains 7.37% of total variance, separated the “Western Europe and America” (WEAm) accessions from the “Eastern Europe and Asia” (EEAs) accessions. PC2 accounted for 5.80% of variance explained and separated the hybrids and admixed accessions from France and USA, observed with K = 3 in fastSTRUCTURE.

Relatedness of the GWAS panel

In addition to population structure, we investigated the familial relatedness within our association panel by estimating kinship coefficient (k) with the KING method. To identify first-degree relationships and differentiate “parent-offspring” from “full sibling” pairs, we used the estimates of k and the proportion of zero identical-by-state (IBS0) observed in the F1 progeny (Figure S7). In particular, we defined all pairwise relationships in the GWAS panel with k > 0.17 and 0 < IBS0 <  0.019 to be parent-offspring relationships. Results confirmed known pedigrees, particularly for the hybrids accessions and the modern cultivars from France and the USA. We also identified new relationships, such as that between ‘Grosvert n°1’ and ‘Verdelet’, French landraces from the departments of Dordogne and Corrèze, which may be full-sibs (Figure S8). Moreover, ‘Ashley’ and ‘Payne’, said to be identical, show the highest kinship coefficient.

Genome-wide analysis for bearing H abit

For bearing habit, we found no influence of population structure (PC = 0 according to the ‘model selection’ function implemented in GAPIT; Table S2). We used multi-locus mixed model (MLMM), and Fixed and random model Circulating Probability Unification method (FarmCPU). GWAS results using both models showed a significant association on Chr11 with bearing habit, using only the 2019 data (Fig. 3). The most significantly associated marker was the SNP ‘AX-171191765’ (physical position: 20,831,267 bp; p-value: 2.98E-14), and two additional associations are also found on Chr6 (SNP ‘AX-171108125’; p-value = 4.08E-09) and Chr8 (SNP ‘AX-171083929’; p-value = 1.47E-08), according to the false discovery rate (FDR) threshold (≥ 0.05).

Fig. 3
figure 3

GWAS results for bearing habit using 2019 data. Manhattan plots followed by Q-Q plots using a) MLMM model, b) FarmCPU model, and c) box plots of the allele effects for the 3 SNPs associated with bearing habit

The boxplots show the bearing habit phenotypes of 2019 for the different alleles of the three associated SNPs (Fig. 3). For the most significantly associated SNP ‘AX-171191765’, the allele G is linked to a terminal bearing habit, whereas the allele C is linked to a lateral bearing habit (R2 = 34.3%, allelic estimated effect = 2.59), leading to an increased yield.

Association and linkage mapping for Budbreak date and female flowering dates

Using 1937 SNPs (Table 2), the ‘Franquette’ and ‘UK 6–2’ parental genetic maps constructed have a length of 1015 and 1346 cM, and a number of markers of 849 and 1088 SNPs, respectively (Table S3). The marker names of the genetic maps were changed with the corresponding chromosome number and its physical position for a better visualization (Figure S9). For all the phenology-related traits, we also found that population structure did not influence phenology in our GWAS panel. Both GWAS and classical QTL mapping identified marker-trait associations for budbreak date in the same region on Chr1 (Fig. 4). The most significant associated SNP ‘AX-171179714’ on the Chr1 (physical position: 6,514,832 bp) was found using the Best Linear Unbiased Predictions (BLUPs) of two-year data and co-localizes with the major QTLs identified for both parents in 2019 using the F1 progeny data. The allele T of this SNP is linked to a late budbreak date (R2 = 30.6%, allelic estimated effect = 5.9) (Table 3). We also ran a Kruskal-Wallis test to find if the phenotypic differences were significant among the three genotypes using the different phenotypic datasets, and this allelic effect remains consistent (p-values = 1.84E-13 for two-year data, 6.63E-12 for 2018, 9.76E-13 for 2019, and 2.61E-09 for legacy data; Fig. 5). The co-localizing major QTLs found in 2019 in LG 1 in ‘Franquette’ and ‘UK 6–2’ explain 23.9 and 34.8% of the budbreak date variance, respectively. In addition, GWAS with two-year data found four additional associations on chromosomes 2, 4, 8 and 15, while the classical linkage mapping analysis identified minor QTLs on linkage groups 6, 11, 12 and 14.

Fig. 4
figure 4

GWAS and linkage mapping results for budbreak date. a Manhattan plot followed by Q-Q plots using BLUPs with two-year data and FarmCPU model, b focus on chromosome 1, and c) QTLs found using 2018 and 2019 data and the F1 progeny. The dotted green line indicates the physical position (6,514,832 bp) of the SNP found in GWAS transposed into the linkage maps

Table 3 Summary of association and linkage mapping results related to bearing habit, budbreak date and female flowering dates
Fig. 5
figure 5

Box plots of the allele effects for the SNP AX-171179714 associated with budbreak date. FarmCPU model was used and all datasets: BLUPs using two-year data, legacy data, 2018 data and 2019 data

The high power of our gene tagging approach based on both GWAS and QTL mapping, was also confirmed for beginning, peak, and end, of female flowering dates. The SNP ‘AX-170990138’ (physical position: 9,298,520 bp) on Chr1 was systematically found associated with all stages of female flowering (Table 3). For beginning female flowering date, we found this SNP associated using two-year data, and using each year separately. We also observed this marker-trait association for peak female flowering date using legacy data, and for end female flowering date with two-year data and with 2019 data. The SNP ‘AX-170990138’ is 2.8 Mbp apart from the most significant marker-trait association found for the budbreak date on Chr1, and the allele G of this SNP is linked to a delayed female flowering, with a R2 ranging from 34.8 to 39.6%, and an allelic estimated effect ranging from 3.4 to 4.5, depending on the stage and the dataset. We identified additional QTLs for all three stages of female flowering but the most significant ones, segregating in both parental maps, co-localize with those previously found associated with the budbreak date on Chr1 (Table 3).

Besides the major QTL on Chr1 identified with both GWAS and QTL mapping, we found three significant associations also on Chr7 for all three stages. These three SNPs are located in a region of about 23 to 25 Mb (Table 3). In addition, we found QTLs on LGs 2 and 14 in ‘Franquette’ map, and on LGs 3, 9, 11 and 12 in ‘UK 6–2’ map.

Association and linkage mapping for Heterodichogamy and male flowering dates

Results for male flowering are similar to the female flowering results in that a few SNPs in a very close region are associated with all three stages (Table 4). On Chr11, we found four associated SNPs depending on the stage and the dataset, in a region of about 31.8 Mbp and a window of 52 kb. The most significant QTLs for all three stages of male flowering for both parental maps co-localize with those previously identified as associated with budbreak date and the three stages of female flowering. Two additional QTLs on ‘UK 6–2’ map were found on LG 11 for peak male flowering date, using two-year data (26,141,527 – 35,537,934 bp), and for the end of male flowering using 2019 data (27,685,560 – 35,537,934 bp), supporting the GWAS results.

Table 4 Summary of association and linkage mapping results related to heterodichogamy and male flowering dates

For heterodichogamy trait (computed by subtracting peak female date from peak male date; Table S5), the significant associations found with GWAS using two-year data and legacy data co-localized with the associations identified for male flowering dates on Chr11 in the region of about 31.8 Mbp.

Candidate genes for bearing habit and phenology-related traits using the walnut genome

By combining GWAS and QTL results and considering their consistency over phenotypic datasets, we decided to focus on a robust subset of eight loci to find candidate genes for bearing habit and phenology-related traits (Table 5). Using HaploView v4.2 software and the new chromosome-scale reference walnut genome v2.0 [36], several interesting coding sequences were found within the defined Linkage Disequilibrium (LD) blocks for different traits. The SNP ‘AX-171557178’ on Chr2 and associated with budbreak date, falls within a candidate gene encoding for a putative BPI/LBP family protein At1g04970. The corresponding LD block of 78 kb also contains a candidate gene coding for a GrpE-like protein and one encoding a 65-kDa microtubule-associated protein 1-like. Only one candidate gene, encoding an uncharacterized protein LOC108987988, overlaps with the most significant SNP associated with budbreak date, ‘AX-171179714’ on Chr1.

Table 5 List of candidate genes for bearing habit and phenology-related traits

The two SNPs on Chr11 associated with all three stages of male flowering date and with heterodichogamy, belong to the same LD block of 19 kb. Within this block is located a candidate gene encoding for a probable trehalose-phosphate phosphatase D. The other SNP on Chr4 associated with all three stages of male flowering date, belongs to a LD block of 63 kb comprising a candidate gene encoding for a trichome birefringence-like 13 protein. Only one candidate gene was found in LD with the associated SNP on Chr1 for all three stages of female flowering date. The SNP ‘AX-170990138’ belongs to a small LD block on Chr1 spanning from 9,298,520 to 9,300,288 bp (Fig. 6). The identified candidate gene of 1.75 kb (interval from 9,298,602 to 9,300,352 bp) overlaps with the LD block and encodes a chromosome transmission fidelity protein 8 homolog.

Fig. 6
figure 6

LD block representation for the SNP AX-170990138 associated with female flowering dates. This SNP overlaps with a candidate gene, coding for a chromosome transmission fidelity protein 8 homolog

Development of KASP marker for the Budbreak date and validation using 96 accessions of the walnut improvement program of the University of California, Davis

Primers related to the marker-trait association found on Chr1 for budbreak date (SNP ‘AX-171179714’) were developed. These primers were used to genotype 96 unreleased breeding line accessions of the Walnut Improvement Program of the University of California, Davis. Among them, 48 are early leafing accessions and 48 are late leafing accessions. Leafing date occurs 3 days after budbreak date. Our GWAS results showed that the allele T of the SNP ‘AX-171179714’ (versus G) has an allelic estimated effect of 5.9 (R2 = 30.6%), linked with a delayed budbreak date. Since we designed the KASP primers using the complementary strand, we expect to find the allele A (versus C) as involved in late leafing genotypes. The three different genotypes were clearly separated for 95 accessions, whereas one of them could not be determined (Table S4). In order to determine if leafing dates were different among the three genotypes, we ran a Kruskal-Wallis test which showed significance (p-value = 6.88E-13). Then, we ran a pairwise comparison using Dunn’s test with Bonferroni p-value adjustment. Results of pairwise comparisons showed that A/A vs. C/A, and A/A vs. C/C have a significant effect on leafing date (p-values = 1.4E-06 and 9.4E-12, respectively), contrary to C/A vs. C/C genotypes (p-value = 1). The homozygous accessions A/A have a median leafing date of almost 110 Julian days, while genotypes C/A and C/C have a median leafing date of only 76 Julian days (Fig. 7).

Fig. 7
figure 7

Allelic effect of the KASP marker associated to the leafing date. 96 unreleased breeding line accessions of the Walnut Improvement Program of the University of California, Davis, were used


Analysis of population structure and its effect on bearing habit and phenology-related traits

We found that the most likely K subpopulations were K = 2, with a clustering in two major groups according to their geographical origin, and K = 3 that highlights admixed accessions, including hybrids or modern cultivars from France and USA. The EEAs cluster in PCA resulted more scattered, indicating a higher level of SNP-based genetic diversity, and the three most diverse accessions of the EAAs cluster, located at the bottom-right corner of the PCA plot, are ‘Shinrei’ from Japan, ‘JinLong1’ from China, and ‘EAA6’ from Greece. The two last were previously found to be highly diverse by SSRs and included in an SSR-based core collection [37].

For the entire set of studied traits, we observed no influence of population structure on the phenotypes. Most likely, the familial relatedness among accessions accounted for most of the genetic variation affecting the traits studied, leading to redundant PCA information [38]. Even though our collection included accessions from different geographical areas, it consisted of 30% modern cultivars likely brought to the INRAE of Bordeaux as potential genitors for breeding purposes. Therefore, it is not surprising to observe accessions of the same geographical origin that are both early and late in phenology.

Strong correlations between traits Lead to close associations

Results found using the F1 progeny show that strong QTLs for all the stages of female flowering date segregating in both parents on LG 1, co-localize with strong QTLs for budbreak date. In walnut trees, budbreak phenotyping relies on buds that will produce both leaves and female flowers. Those traits are, therefore, correlated and likely controlled by common genetic and metabolic pathways, as suggested by the identification of QTLs within the same genomic region on Chr1. Similar results were observed in walnut for leafing and harvest dates [35]. On the other hand, GWAS enabled further dissection of the genetic control of these traits by identifying a location for the most significant marker-trait associations for budbreak date at about 2.8 Mbp apart from the SNPs associated to all stages of the female flowering date. This is one of the major advantages of GWAS, whose higher resolution defines more precisely the underlying regions of traits of interest compared to classical QTL mapping.

While the major loci controlling the traits studied were identified with both GWAS and linkage mapping, this did not happen for the minor associations. For instance, for budbreak date, GWAS found a second trait-locus association on Chr2, while one minor QTL was detected on LG 6 of the ‘UK 6–2’ map. This finding can be explained in two ways: (i) the loci identified in the GWAS likely are not segregating in the F1 progeny, and (ii) the QTLs found in the linkage mapping analysis have rare alleles in the association panel. For both GWAS results and QTLs mapping, the minor trait-loci associations were not stable across years, suggesting they are more affected by environmental conditions. For this reason, we decided to focus mostly on the GWAS results based on the two-year BLUPs. Nevertheless, results based on the legacy data allowed confirmation of the major SNP associated with budbreak date.

We observed a different scenario for all the stages of male flowering date: while the major QTL found on LG 1 co-localizes with those for female flowering dates and budbreak date, these results were not confirmed in the GWAS panel. Here, instead, we identified the most significant associations on chromosomes 11 and 4. Then, since heterodichogamy is the measurement of how the male and female flowering dates overlap, it is not surprising to find a marker-trait association on Chr11 with more consistency across years. Since the F1 progeny used for QTL mapping included only 78 individuals, likely leading to an overestimation of the QTL effects (Beavis effect) [39], we decided to analyze the GWAS results in more depth. In addition, a few regions were not segregating in our F1 progeny, due to monomorphism in the parents, likely preventing accurate QTL detection. This is the case, for example, of the QTLs detected on LG 2 for both parental maps. By checking the physical position of the associated markers, we observed that the beginning of the Chr2 lacks regions from 0 to 5 Mbp for the ‘Franquette’ map, and from 0 to 7.7 Mbp for the ‘UK 6–2’ map.

Combining multi-locus models for GWAS has become widespread [40], and the associations found by multiple methods are usually reliable [41]. Similar work on walnut, also using MLMM and FarmCPU models [35], found three associations within a physical genomic region spanning from 3,187,214 to 4,805,396 bp for the leafing date. This is very close to our association for budbreak date, detected at 6,514,832 bp, and our QTL for the same trait using the ‘Franquette’ parental map, located at 87,347 to 3,051,320 bp. These results indicate that most likely on Chr1 there is a genomic region controlling budbreak, leafing, and female flowering dates in walnut. Since strong positive correlations exist among these traits, with high allelic effects of the marker-trait associations, this would have an impact for selection. For instance, breeding for budbreak and female flowering (and more broadly for all phenological-related traits) would be a difficult job, and would complicate the creation of shorter-cycle cultivars.

In this work, classical QTL mapping mostly confirmed the major marker-trait associations identified. We found a major locus on Chr1 for budbreak date and female flowering dates, and a major locus on Chr11 for male flowering dates. Separating the flowering stages (beginning, peak and end) allowed “repeat” phenotyping, generating robust data for identification of the major loci required for marker development purposes. Also, the high complexity of phenology-related traits indicates that many loci may be involved in their expression [23], making the detection of minor loci difficult.

Genetic bases for the expression of lateral bearing

Lateral bearing is a crucial trait in walnut breeding since it contributes to increased yield [42]. Fruiting at lateral positions on shoots was originally introduced into California from the cultivar ‘Payne’, which is in the background of all current varieties released by the Walnut Improvement Program of the University of California, Davis [4]. Using a large F2 progeny from ‘Chandler’, a lateral bearing cultivar heterozygous for this locus, the genetic basis of lateral fruitfulness was found to be located in the centromeric region of Chr11 [29]. The same location was obtained using F1 progeny from ‘Chandler’ by ‘Idaho’ cross [43]. Additional associations for lateral bearing were found using GWAS in a region covering 25 Mb on Chr11, and a major QTL was detected in ‘Chandler’ in proximity to the centromeric region (peak at 17,341,634 bp) of Chr11 [35].

In the present study, we reconfirmed that walnut bearing habit is controlled by a major QTL in the centromeric region of Chr11. In line with previous studies, we observed that an allelic substitution at the most associated locus, ‘AX-171191765’, causes lateral bud fruiting. We confirm previous results regarding genetic control of lateral bearing by performing GWAS in genetically different plant materials, further supporting the power and high resolution of this method in walnut. In our F1 progeny, this trait did not segregate and all hybrids are lateral bearing. ‘Franquette’ is terminal bearing and homozygous G/G for this locus, whereas ‘UK 6–2’ is lateral bearing and heterozygous G/C for this locus. Maybe because our progeny is too small or because of segregation distorsion, we did not observe any terminal bearing individuals.

Gene annotations may give clues to dissect genetic control of flowering process

Availability of the new chromosome-scale reference genome Chandler v2.0 [36] allowed us to explore the gene space surrounding the most significant marker-trait associations identified for bearing habit and phenology. For instance, we found the chromosome transmission fidelity (ctf) protein 8 homolog gene as candidate gene for all female flowering dates. Previous studies in Arabidopsis thaliana suggest the involvement of the ctf genes family in the cell division processes [44]. In particular, the inactivation of the ctf7–1 and ctf7–2 genes resulted in developmental defects, including aberrations in flower morphology and male and female gametophytes. Interestingly, a TPX2 encoding gene required for spindle assembly during cell division process was also found likely to be involved in harvest date determination in walnut [35]. These findings suggest that cell division is a crucial process for correct flower and fruit development in walnut, and suggest the ctf8 protein homolog locus is an excellent candidate gene for female flowering.

Trichomes consist of only one cell type, appearing as a long, slender appendage [45]. In J. regia, during staminate flower development, the tepals differentiate into anthers and filaments shortly before anthesis, enveloping the stamens [46]. Since these tepals become hirsute, it is not surprising to find a trichome birefringence-like 13 protein encoding gene within the LD block surrounding the SNP on Chr4 which is associated with male flowering dates. Also, the SNP on Chr11 associated with heterodichogamy and male flowering dates is in LD with a candidate gene encoding a probable trehalose-phosphate phosphatase D. This dephosphorylating enzyme and other members of the gene family are highly expressed in male flowers of Jatropha, a perennial woody plant, and in transgenic plants of A. thaliana with delayed flowering [47], suggesting the putative involvement of this class of genes in the flower development process. Other authors reviewed the roles of sugars in the control of flowering time, and apart from its involvement in carbohydrate metabolism, trehalose-6-phosphate seems to serve as a signal for inducing flowering transition in plants [48, 49]. In this regard, we found a probable trehalose-phosphate phosphatase D encoding gene for heterodichogamy.

Surprisingly, we did not find widely known genes to be involved in flowering process or dormancy in our candidate genes. In A. thaliana, sweet cherry or other species, these mechanisms are regulated by flowering locus, early flowering, and MADS-box homologous genes, among others [50,51,52]. For this reason, we decided to list several homologous genes within the RefSeq walnut gene annotation using three keywords: ‘flower’, ‘dormancy’, ‘MADS’, and we checked the physical positions of these genes. We then compared them with our marker-trait associations, regardless of the LD blocks, and we found interesting results. We found a agamous-like MADS-box protein AGL11 encoding gene (approx. 2.0 Mbp) and a MADS-box transcription factor 6-like encoding gene (8.8 Mbp), both at the beginning of the Chr1. The first one co-localizes with the major QTL found for budbreak date using the F1 progeny and the second is very near to the major marker-trait association found for female flowering dates (9.3 Mbp). Then, our results showed several marker-trait associations for both female and male flowering dates on Chr7 between 23.0 and 48.0 Mbp. Interestingly, we found the following encoding genes within this window: a flowering time control protein FCA-like (33 Mbp), a flowering-promoting factor 1-like protein 3 (34.4 Mbp), a dormancy-associated protein homolog 4 (42.9 Mbp), a protein early flowering 3-like (44.3 Mbp), a MADS-box protein SOC1-like (48.0 Mbp), and a protein flowering locus D-like (48.6 Mbp). Finally, we noticed that two genes are near to our marker-trait associations for the dichogamy (31.9 Mbp) and the male flowering dates (from 29.4 to 31.9 Mbp) on Chr11: a MADS-box transcription factor 27-like (29.9 Mbp) and a protein embryonic flower 1 (34.3 Mbp).

A KASP marker related to phenology is released for the first time in walnut

For the first time in walnut, a KASP marker related to phenology is released. By designing specific primers for the SNP AX-171179714, we found that the homozygous accessions A/A are significantly later leafing. Since this marker is dominant, it will greatly help breeders to accurately select individuals with delayed budbreak. However, due to the complex genetic basis of phenology-related traits in perennials [23], additional markers, especially for minor loci, will be needed to improve the selection. Finally, this SNP is located in a gene encoding an uncharacterized protein, and it would be interesting to know more about the functional role of this gene. The marker can predict a significant portion of the phenotype but we still do not know the regulatory networks involved in this complex trait.


Due to the significant influence of environment on phenology-related traits in walnut, unravelling their genetic architecture is of upmost importance for the development of markers that could assist the selection of superior genotypes and, therefore, the release of new walnut cultivars adapted to different climatic conditions. Using GWAS with two different multi-locus models, we identified significant associations for budbreak date, and male and female flowering dates, confirmed by classical QTL mapping. In addition, we provide a list of candidate genes for these traits, that will be fundamental in future studies of functional genomics and understanding the metabolic pathways underlying phenology in walnut. We also developed and validated the first KASP marker for budbreak date in walnut, which will allow accurate selection of individuals with a delayed budbreak and, therefore, suitable for cultivation in France and other regions where late spring frosts are challenging. In parallel, the genetic bases for the expression of lateral bearing were confirmed. Since the future French walnut breeding program needs cultivars with high quality kernels, efforts are underway phenotype the entire collection regarding chemical content (e.g. fatty acids and tocopherols) and physical characteristics (nut length, filling ratio, and ease of removal). Our F1 progeny is too small to pursue investigations but this study confirms that the INRAE walnut germplasm repository contains an array of plant material suitable for this type of work. New genome-wide analyses now are being initiated to further increase our knowledge concerning the genetic architecture of the main traits of interest in walnut.


Plant material

The panel for GWAS consists of 170 unique J. regia grafted trees accessions originating worldwide and maintained at the Prunus and Juglans Genetic Resources Center located in the Fruit Experimental Unit of the INRAE, Toulenne, France (44°34′37.442″N – 0°16′51.48″W), near Bordeaux. This INRAE walnut germplasm collection is publicly available and is a result of important collecting work performed between 1988 and 2000 by Eric Germain (retired and former head of the INRAE walnut breeding program) in 23 countries including the European, American, and Asian continents. Eric Germain initiated an international cooperation during his activity concerning the exchange of walnut plant materials (especially in Europe) and obtained all permissions to bring them. Experimental research on this plant material complies with our INRAE institutional guidelines. The original source of each accession (institute/lab source or collecting source) is given (Table S1). The panel choice was based on previous genetic diversity work using 13 SSR markers and evaluation of phenotypic variability [37].

The intraspecific F1 mapping progeny results from bi-parental controlled crosses made from 1997 to 2004 by Eric Germain between two cultivars with contrasting phenology-related traits, for his needs under his breeding program. The female parent was the ‘Franquette’ cultivar, a French landrace from Isère valley and having a late budbreak date and terminal bearing. The male parent was the accession ‘UK 6–2’, obtained from the Kiev Botanical Garden in Ukraine and thought to originate from the center of origin of J. regia in Central Asia (Uzbekistan, Tajikistan, or Kyrgyzstan). ‘UK 6–2’ has an early to intermediate budbreak date and lateral bearing. The progeny consists of 78 individuals, also located at the Fruit Experimental Unit of the INRAE in Toulenne, which were previously genotyped using the same SSR markers as those of the GWAS panel [37] to confirm that they were true progeny. This plant material also complies with our INRAE institutional guidelines.

Phenotypic evaluation and data analysis

Phenotypic evaluation for the following 10 traits was conducted in 2 years (2018 and 2019) for both the GWAS panel and the F1 progeny, at a rate of two to three visits in orchards per week during the months of March, April and May: budbreak date, beginning, peak, end, and duration of female flowering, heterodichogamy, and beginning, peak, end, and duration of male flowering (Table S5). Heterodichogamy, degree of male and female flowering overlap, was computed by subtracting peak female flowering date from peak male flowering date. A negative value (categories ‘1’ and ‘3’, Table S5) means the accession is protandrous since peak male flowering in Julian days occurs before the peak female date. Categories ‘7’ and ‘9’ indicate protogyny.

For the GWAS panel, we also used phenotypical data based on the same scoring scales used for many years previously (mainly between 1989 and 2011) on three sites. These legacy data are available from the INRAE walnut germplasm collection database [21]. Moreover, as bearing habit is not affected by environment conditions, this trait was recorded only in 2019. This trait did not segregate within the F1 progeny, so only the GWAS panel was evaluated. Lateral bearing was found in recently released cultivars from France (‘Ferbel’, ‘Fertignac’) and the USA (‘Pedro’, ‘Chico’, ‘Cisco’, ‘Chandler’), and in accessions coming from the Central Botanical Garden of Kiev, Ukraine (Table S1), supposedly originating from the walnut center of domestication, according to the introduction book of the INRAE. Terminal bearing was observed mainly in French landraces such as ‘Franquette’, ‘Grosvert’ and ‘Verdelet’. Data management and visualisation, and Spearman correlation matrices (more appropriate for discrete variables) were performed using “R” software [53], with the packages “tidyverse” [54] and “corrplot” [55], respectively.

For both the GWAS panel and the F1 progeny, the means of genotypic effects were obtained for each accession by adjusting for known environmental factors using the BLUPs. When using two-year data, the means were predicted using a mixed linear model considering the year effect (a). When using phenological data from many years and three sites only available for the GWAS panel [21], the means were predicted using a mixed linear model considering the effects of year, site and combined of year and site (b):

$$ {P}_{ik}=\mu +{Y}_i+{g}_k+{e}_{ik} $$
$$ {P}_{ijk}=\mu +{Y}_i+{S}_j+\left({Y}_i\times {S}_j\right)+{g}_k+{e}_{ijk} $$

where Pi(j)k refers to the observed phenotype of the kth accession in the ith year in the jth site; μ is the mean value of the trait; Yi is the fixed effect of the ith year, Sj is the fixed effect of the jth site, gk is the random effect of the k genotype; and ei(j)k is the residuals of the model. The BLUPs were performed using “R” package “lme4” [56].

Based on the previous mixed linear models, broad-sense heritability of each trait was estimated using the variance components. When using two-year data, we accounted for the variance component of genotype × year interaction (a). When using phenological data from many years and three sites, we accounted for the variance components of genotype × year and genotype × site interactions (b):

$$ {\mathrm{H}}^2={\sigma^2}_G/\left[\right({\sigma^2}_G+\left({\sigma^2}_{GxY}/{\mathrm{n}}_{\mathrm{y}}\right)+\left({\sigma^2}_{\varepsilon }/{\mathrm{n}}_{\mathrm{y}}\right)\Big] $$
$$ {\mathrm{H}}^2={\sigma^2}_G/\left[\right({\sigma^2}_G+\left({\sigma^2}_{GxY}/{\mathrm{n}}_{\mathrm{y}}\right)+\left({\sigma^2}_{GxS}/{\mathrm{n}}_{\mathrm{s}}\right)+\left({\sigma^2}_{\varepsilon }/{\mathrm{n}}_{\mathrm{y}}{\mathrm{n}}_{\mathrm{s}}\right)\Big] $$

where σ2G is the genotypic effect variance; σ2GxY is the genotype × year interaction variance; σ2GxS is the genotype × site interaction variance; σ2ε is the variance of residuals; ny is the number of years; and ns is the number of sites. We have always only one replication by genotype. For two-year data, ny = 2 (2018 and 2019). Nevertheless, the legacy data are very unbalanced for the years and sites available considering each genotype. We consider nyns = 5 because, in average, we have 5 observations by genotype.

SNP genotyping and quality control

The genomic DNAs of both the GWAS panel (J. regia accessions) and F1 progeny (78 individuals) were extracted from young leaves as described in Bernard et al. [37]. The accessions were genotyped using the Axiom™ J. regia 700 K SNP array containing 609,658 SNPs uniformly distributed over the 16 J. regia chromosomes [31]. These SNPs were then filtered through several criteria (Table 2). First, the filtering metrics were performed by ThermoFisher considering default thresholds: dish quality control greater than or equal to 0.82, and quality control call rate of 97%. The quality control steps were performed using “PLINK 1.9” software [57]. SNPs with Mendelian errors in the F1 progeny were removed. After, Poly High Resolution (PHR) and No Minor Homozygotes (NMH) SNPs were filtered using stringent thresholds: SNP call rate (> 90%), minor allele frequency (MAF > 5%), and redundancy in the genome (SNP probes aligning in duplicated regions). Finally, 364,275 robust SNPs (59.8% of the total number of SNPs) were retained for the following genome-wide analysis.

The same steps were performed for the F1 progeny, except filtering for MAF, gaining a final panel of 478,458 SNPs. In addition, individuals were checked for outlying heterozygosity rate.

SNP linkage map construction and quantitative trait locus analyses

Based on the pseudo-testcross strategy [58], 849 SNPs were retained for the ‘Franquette’ female parent, and 1088 for the ‘UK 6–2’ male parent. The two parental genetic linkage maps were constructed using JoinMap® 4.0 software [59] and distorted markers were omitted. A minimum LOD value of 16.0 was chosen for mapping. Kosambi’s mapping function was used to convert recombination frequencies into map distances [60], and graphical display of linkage maps was performed using MapChart 2.32 software [61].

QTLs were determined using MultiQTL 2.6 software (; MultiQTL Ltd., Institute of Evolution, Haifa University, Haifa, Israel). Single trait analysis was performed using a Multiple Interval Mapping (MIM) method [62], in which the highest effect QTL is taken as a cofactor to control the genetic background, whereas another QTL is searched in a different position. All LGs for each parent were scanned using the one QTL model, with 1000 runs of permutation tests, to compare H1 hypothesis (one QTL is present in the LG) and H0 hypothesis (no QTL in the LG). The threshold for MIM was 0.05 and computation of Type I error rate for each QTL (αchr) was performed as follows:

$$ {\alpha}_{\mathrm{chr}}=1-\left\{1-\right[1-{\left(1-0.05\right)}^{1/M}\Big\}{}^m $$

where M is the total number of markers used for the QTL detection on each parental map, and m is the number of markers in the LG carrying the QTL [63].

Population structure and kinship analyses

For the GWAS panel, the R packages “SNPRelate” and “gdsfmt” [64] were used to perform PCA and relatedness estimations based on a LD pruned set of 29,825 SNPs to avoid capturing too much variance of high LD regions. For the PCA, the ten largest eigenvalues were obtained to check for structure. The KING method of moment was used to estimate identity-by-descent (IBD) proportions between all pairwise comparisons [65]. Population structure was also investigated using the “fastSTRUCTURE” software [66], and the most likely K was determined using the ΔK method [67]. Then, “CLUMPP” [68] and “distruct” [69] softwares were used for clustering accessions and graphical presentation respectively.

Genome-wide association analysis

GWAS was carried out using the R package “GAPIT” [70], accounting for the familial relationships in the form of a kinship matrix. The ‘model selection’ function implemented in GAPIT was used to select the best number of PCs by running a mixed linear model (MLM) with a maximum of ten PCs tested. Then, the best number of PCs to include in the GWAS analysis was selected according to the Bayesian Information Criterion (BIC). Using BLUPs previously calculated as phenotypic data and each year separately, two multi-locus models were applied: multi-locus mixed model (MLMM) [71] and Fixed and random model Circulating Probability Unification method (FarmCPU) [72]. Thresholding with FDR [73] classically implemented in “GAPIT” was applied to define the significant associations.

Search of annotations within LD blocks of associated loci

LD levels around the most associated loci were estimated using HaploView v4.2 software [74]. Each physical position of these trait-SNPs associations was investigated to explore the extension of the surrounding LD blocks and determine the genomic regions where to search for candidate genes. The LD blocks were investigated using the method of confidence interval [75], and “solid spine of LD” method, in which the first and last SNPs in a block are in strong LD with all intermediate markers but the intermediate ones are not necessarily in LD with each other. The identified LD blocks were then searched for candidate genes using RefSeq walnut gene annotation ( and mapped onto the new chromosome-scale reference genome v2.0 [36] (

Kompetitive allele specific PCR marker development for Budbreak date and validation

Kompetitive allele specific PCR (KASP) markers are based on the dual Fluorescence Resonance Energy Transfer (FRET) method, in which the sample DNA is amplified with allele specific primers conjugated to fluorometric dyes HEX and FAM at their 5′ end. Based on GWAS results regarding the budbreak date, the following KASP primers were developed (LGC Genomics, Hoddesdon, UK) to target the most significantly associated SNP with this trait:



Allele A tail (FAM tail): 5′-GAAGGTGACCAAGTTCATGCT-3′.

Allele C tail (HEX tail): 5′-GAAGGTCGGAGTCAACGGATT-3′.


These primers were developed by the BioGEVES laboratory (Beaucouzé, France) on the complementary strand. The KASP reaction and its components are explained at The KASP reaction was prepared as follows: 1.95 μL of PCR mix PACE (3CR Biosciences Limited, Harlow, UK), 2 μL of genomic DNA (2 ng/μL), and 0.053 μL of the three primers (Integrated DNA Technologies, Leuven, Belgium), for a total volume reaction of 4.003 μL. Amplification was performed in a hydrocycler, starting with 15 min at 94 °C, a touchdown phase of 10 cycles at 94 °C for 20 s and at 61 °C for 60 s with a 0.6 °C decrease in temperature per cycle, followed by 35 cycles of 94 °C for 20 s and 55 °C for 60 s. Once the cycle was complete, Fluostar Omega reader (BMG Labtech, Paris, France) was used to read the fluorescence signal. The KASP method was tested using DNA from a set of 96 unreleased breeding line accessions from the Walnut Improvement Program of the University of California, Davis.

Availability of data and materials

The phenotypic raw datasets generated and analyzed during the current study for the GWAS panel and for the F1 progeny are available respectively in Table S6, and Table S7.

The SNP genotyping raw datasets in “hapmap” format are freely and openly accessed on the “Portail Data INRA” repository, via the identifier “INRA’s Walnut Hapmap” and the following Digital Object Identifier (DOI): The dataset called “GWAS_hapmap.txt” is related to the GWAS panel, and the one called “Progeny_hapmap.txt” is related to the F1 progeny. The additional file called “List of” allows to link the array identifier name of the accessions with the identifier name used in this study.



Agence nationale de la recherche et de la technologie


Bayesian information criterion


Basic local alignment search tool


Best linear unbiased predictions






Centre technique interprofessionnel des fruits et légumes


Food and agriculture organisation


Fixed And random model circulating probability unification


False discovery rate


Fluorescence resonance energy transfer


Genome-wide efficient mixed model association


Groupe d’etude et de contrôle des variétés et des semences


Genome-wide association study


Identity by descent


Identity by state


Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement


Kompetitive Allele Specific PCR


Linkage disequilibrium


Linkage group


Logarithm of odds


Minor allele frequency


Multiple interval mapping


Mixed linear model


Multi-locus mixed model


National center for biotechnology information


No minor homozygote


Principal component analysis


Poly high resolution


Quantitative trait locus


Single nucleotide polymorphism


Simple sequence repeat


  1. Aradhya MK, Potter D, Simon CJ. Cladistic Biogeography of Juglans (Juglandaceae) Based on Chloroplast DNA Intergenic Spacer Sequences. In: Motley TJ, Zerega N, Cross H, editors. Darwins Harvest New Approaches Orig Evol Conserv Crops: New York: Columbia University Press; 2006. p. 143–70.

    Google Scholar 

  2. Woodworth RH. Meiosis of micro-sporogenesis within the Juglandaceae. Am J Bot. 1930;17:863–9.

    Article  Google Scholar 

  3. McGranahan G, Leslie C. Breeding walnuts (Juglans regia). In: Jain SM, Priyadarshan PM, editors. Breeding Plantation Tree Crops: Temperate Species. Springer Science; 2009. p. 249–73.

    Chapter  Google Scholar 

  4. Bernard A, Lheureux F, Dirlewanger E. Walnut: past and future of genetic improvement. Tree Genet Genomes. 2018;14:1.

    Article  Google Scholar 

  5. Vahdati K, Arab MM, Sarikhani S, Sadat-Hosseini M, Leslie CA, Brown PJ. Advances in Persian Walnut (Juglans regia L.) Breeding Strategies. In: Al-Khayri J, Jain SM, Johnson DV, editors. Advances in Plant Breeding Strategies: Nut and Beverage Crops: Springer International Publishing Switzerland; 2019. p. 401–72.

  6. McGranahan G, Leslie C. Walnuts (Juglans). In: Moore JN, Ballington Jr JR, editors. Genetic resources of temperate fruit and nut crops, part 2. Wageningen: International Society of Horticultural Sciences; 1991. p. 907–51.

    Google Scholar 

  7. Cook J, Oreskes N, Doran PT, et al. Consensus on consensus: a synthesis of consensus estimates on human-caused global warming. Environ Res Lett. 2016;11:1–8.

    Article  Google Scholar 

  8. Cooke JEK, Eriksson ME, Juntilla O. The dynamic nature of bud dormancy in trees: environmental control and molecular mechanisms. Plant Cell Environ. 2012;35:1707–28.

    Article  CAS  PubMed  Google Scholar 

  9. Campoy JA, Ruiz D, Egea J. Dormancy in temperate fruit trees in a global warming context: a review. Sci Hort. 2011;130:357–72.

    Article  Google Scholar 

  10. Hassankhah A, Vahdati K, Rahemi M, Sarikhani S. Persian walnut phenology: effect of chilling and heat requirements on Budbreak and flowering date. Int J Hort Sci Tech. 2017;4:259–71.

    Google Scholar 

  11. Aslamarz AA, Vahdati K, Rahemi M, Hasani D. Estimation of chilling and heat requirements of some Persian walnut cultivars and genotypes. HortScience. 2009;44(3):697–701.

    Article  Google Scholar 

  12. Charrier G, Ngao J, Saudreau M, Améglio T. Effects of environmental factors and management practices on microclimate, winter physiology and frost resistance in trees. Front Plant Sci. 2015;6:259.

    PubMed  PubMed Central  Google Scholar 

  13. Charrier G, Chuine I, Bonhomme M, Améglio T. Assessing frost damages using dynamic models in walnut trees: exposure rather than vulnerability controls frost risks. Plant Cell Environ. 2018;41:1008–21.

    Article  CAS  Google Scholar 

  14. Chmielewski FM, Rötzer T. Response of tree phenology to climate change across Europe. Agric For Meteorol. 2001;108:101–12.

    Article  Google Scholar 

  15. Menzel A, Sparks TH, Estrella N, et al. European phenological response to climate change matches the warming pattern. Glob Change Biol. 2006;12:1969–76.

    Article  Google Scholar 

  16. Khanduri VP, Sharma CM, Singh SP. The effects of climate change on plant phenology. Environmentalist. 2008;28:143–7.

    Article  Google Scholar 

  17. Vitasse Y, François C, Delpierre N, et al. Assessing the effects of climate change on the phenology of European temperate trees. Agric For Meteorol. 2011;151:969–80.

    Article  Google Scholar 

  18. Luedeling E, Gassner A. Partial Least Squares Regression for analyzing walnut phenology in California. Agric For Meteorol. 2012:158, 43–9, 52.

    Article  Google Scholar 

  19. Črepinšek Z, Solar M, Štampar F, Solar A. Shifts in walnut (Juglans regia L.) phenology due to increasing temperatures in Slovenia. J Hortic Sci Biotechnol. 2009;84:59–64.

    Article  Google Scholar 

  20. Cosmulescu S, Ionescu MB. Phenological calendar in some walnut genotypes grown in Romania and its correlation with air temperature. Int J Biometeorol. 2018;62:2007–13.

    Article  PubMed  Google Scholar 

  21. Bernard A, Barreneche T, Delmas M, Durand S, Pommier C, et al. The walnut genetic resources of INRA: chronological phenotypic data and ontology. BMC Res Notes. 2019:12–662.

  22. Vahdati K, Massah Bavani AR, Khosh-Khui M, Fakour P, Sarikhani S. Applying the AOGCM-AR5 models to the assessments of land suitability for walnut cultivation in response to climate change: a case study of Iran. PLoS One. 2019;14(6):e0218725.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Dirlewanger E, Quero-García J, Le Dantec L, et al. Comparison of the genetic determinism of two key phenological traits, flowering and maturity dates, in three Prunus species: peach, apricot and sweet cherry. Heredity. 2012;109:280–92.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  24. Olukolu BA, Trainin T, Fan S, et al. Genetic linkage mapping for molecular dissection of chilling requirement and budbreak in apricot (Prunus armeniaca L.). Genome. 2009;52:819–28.

    Article  CAS  PubMed  Google Scholar 

  25. Charrier G, Bonhomme M, Lacointe A, Améglio T. Are budburst dates, dormancy and cold acclimation in walnut trees (Juglans regia L.) under mainly genotypic or environmental control? Int J Biometeorol. 2011;55:763–74.

    Article  PubMed  Google Scholar 

  26. Eskandari S, Hassani D, Abdi A. Investigation on genetic diversity of Persian walnut and evaluation of promising genotypes. Acta Hortic. 2005;705:159–66.

    Article  Google Scholar 

  27. Hansche PE, Beres V, Forde HI. Estimates of quantitative genetic properties of walnut and their implications for cultivar improvement. J Am Soc Hortic Sci. 1972;97:279–85.

    Google Scholar 

  28. Germain E. Inheritance of late leafing and lateral bud fruitfulness in walnut (Juglans regia L.). phenotypic correlations among some traits of the trees. Acta Hortic. 1990;284:125–34.

    Article  Google Scholar 

  29. Dvorak J, Aradhya M, Leslie C, Luo MC. Discovery of the causative mutation of the lateral bearing phenotype in walnut. California Walnut Board: Walnut Research Reports; 2015.

    Google Scholar 

  30. Martínez-García PJ, Crepeau MW, Puiu D, et al. The walnut (Juglans regia) genome sequence reveals diversity in genes coding for the biosynthesis of non-structural polyphenols. Plant J. 2016;87:507–32.

    Article  PubMed  CAS  Google Scholar 

  31. Marrano A, Martínez-García PJ, Bianco L, Sideli GM, Di Pierro EA, et al. A new genomic tool for walnut (Juglans regia L.): development and validation of the high-density axiom™ J. regia 700K SNP genotyping array. Plant Biotechnol J. 2019;17:1027–36.

    Article  CAS  PubMed  Google Scholar 

  32. Arab MM, Marrano A, Abdollahi-Arpanahi R, et al. Genome-wide patterns of population structure and association mapping of nut-related traits in Persian walnut populations from Iran using the Axiom J. regia 700K SNP array. Sci Rep. 2019;9:6376.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  33. Famula RA, Richards JH, Famula TR, Neale DB. Association genetics of carbon isotope discrimination in the founding individuals of a breeding population of Juglans regia L. Tree Genet Genomes. 2019;15:6.

    Article  Google Scholar 

  34. Arab MM, Marrano A, Abdollahi-Arpanahi R, Leslie CA, Cheng H, et al. Combining phenotype, genotype and environment to uncover genetic components underlying water use efficiency in Persian walnut. J Exp Bot. 2020;71:1107–1127.

  35. Marrano A, Sideli GM, Leslie CA, Cheng H, Neale DB. Deciphering of the genetic control of phenology, yield and pellicle color in Persian walnut (Juglans regia L.). front. Plant Sci. 2019;10:1140.

    Google Scholar 

  36. Marrano A, Britton M, Zaini PA, Zimin AV, Workman RE, et al. High-quality chromosome-scale assembly of the walnut (Juglans regia L) reference genome. bioRxiv. 809798.

  37. Bernard A, Barreneche T, Lheureux F, Dirlewanger E. Analysis of genetic diversity and structure in a worldwide walnut (Juglans regia L.) germplasm using SSR markers. PLoS ONE. 2018;13(11):e0208021.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  38. Astle W, Balding DJ. Population structure and cryptic relatedness in genetic association studies. Stat Sci. 2009;24:451–71.

    Article  Google Scholar 

  39. Paterson AH, Beavis WD. QTL analyses: power, precision, and accuracy. In: Paterson AH, editor. Molecular dissection of complex traits. CRC Press: New York; 1998. p. 145–62.

    Google Scholar 

  40. Wang SB, Feng JY, Ren WL, Huang B, Zhou L, et al. Improving power and accuracy of genome-wide association studies via multi-locus mixed linear model methodology. Sci Rep. 2016;6:19444.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Zhang YM, Jia Z, Dunwell JM. Editorial: the applications of new multi-locus GWAS methodologies in the genetic dissection of complex traits. Front Plant Sci. 2019;10:100.

    Article  PubMed  PubMed Central  Google Scholar 

  42. Amiri R, Vahdati K, Mohsenipoor S, et al. Correlations between some horticultural traits in walnut. HortScience. 2010;45:1690–4.

    Article  Google Scholar 

  43. Aradhya MK, Velasco D, Wang JR, Ramasamy R, You FM, et al. A fine-scale genetic linkage map reveals genomic regions associated with economic traits in walnut (Juglans regia). Plant Breed. 2019;00:1–12.

    Google Scholar 

  44. Bolaños-Villegas P, Yang X, Wang HJ, Juan CT, Chuang MH, et al. (2013). Arabidopsis CHROMOSOME TRANSMISSION FIDELITY 7 (AtCTF7/ECO1) is required for DNA repair, mitosis and meiosis. Plant J. 2013;75:927–40.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  45. Behnke HD. Plant trichomes – structure and ultrastructure: general terminology, taxonomic applications, and aspects of trichome-bacteria interaction in leaf tips of Dioscorea. In: Rodriguez E, Healey PL, Mehta I, editors. Biology and chemistry of plant trichomes. Plenum: New York; 1984. p. 1–21.

    Google Scholar 

  46. Lin RZ, Li RQ, Lu AM, Zhu JY, Chen ZD. Comparative flower development of Juglans regia, Cyclocarya paliurus and Engelhardia spicata: homology of floral envelopes in Juglandaceae. Bot J Linn Soc. 2016;181:279–93.

    Article  Google Scholar 

  47. Zhao ML, Ni J, Chen MS, Xu ZF. Ectopic expression of Jatropha curcas TREHALOSE-6-PHOSPHATE PHOSPHATASE J causes late-flowering and Heterostylous phenotypes in Arabidopsis but not in Jatropha. Int J Mol Sci. 2019;20:2165.

    Article  CAS  PubMed Central  Google Scholar 

  48. Cho LH, Pasriga R, Yoon J, Jeon JS, An G. Roles of sugars in controlling flowering time. J Plant Biol. 2018;61:121–30.

    Article  CAS  Google Scholar 

  49. Ponnu J, Wahl V, Schmid M. Trehalose-6-phosphate: connecting plant metabolism and development. Front Plant Sci. 2011;2:70.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Horvath D. Common mechanisms regulate flowering and dormancy. Plant Sci. 2009;177:523–31.

    Article  CAS  Google Scholar 

  51. Michaels SD, Amasino RM. FLOWERING LOCUS C encodes a novel MADS domain protein that acts as a repressor of flowering. Plant Cell. 1999;11:949–56.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Castède S, Campoy JA, Le Dantec L, Quero-García J, Barreneche T, Wenden B, et al. Mapping of candidate genes involved in bud dormancy and flowering time in sweet cherry (Prunus avium). PLoS One. 2015;10(11):e0143250.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  53. R Development Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2008. Available online at:

    Google Scholar 

  54. Wickham H. Tidyverse: Easily Install and Load the 'Tidyverse'. R package version 1.2.1. 2017.

    Google Scholar 

  55. Wei T, Simko VR. Package "corrplot": Visualization of a Correlation Matrix (Version 0.84). 2017.

    Google Scholar 

  56. Bates D, Maechler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67:1–48.

    Article  Google Scholar 

  57. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, et al. PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J hum genet. 2008;81 Available online at:

  58. Grattapaglia D, Sederoff R. Genetic linkage maps of Eucalyptus grandis and Eucalyptus urophylla using a pseudo-testcross: mapping strategy and RAPD markers. Genetics. 1994;137:1121–37.

    CAS  PubMed  PubMed Central  Google Scholar 

  59. Van Ooijen JW. JoinMap® 4, Software for the calculation of genetic linkage maps in experimental populations. Wageningen: Kyazma B.V; 2006.

    Google Scholar 

  60. Kosambi D. The estimation of map distances from recombination values. Ann Eugenics. 1944;12:172–5.

    Article  Google Scholar 

  61. Voorrips RE. MapChart: software for the graphical presentation of linkage maps and QTLs. J Hered. 2002;93:77–8.

    Article  CAS  PubMed  Google Scholar 

  62. Kao CH, Zeng ZB, Teasdale RD. Multiple interval mapping for quantitative trait loci. Genetics. 1999;152:1203–16.

    CAS  PubMed  PubMed Central  Google Scholar 

  63. Saintagne C, Bodénès C, Barreneche T, Pot D, Plomion C, Kremer A. Distribution of genomic regions differentiating oak species assessed by QTL detection. Heredity. 2004;92:20–30.

    Article  CAS  PubMed  Google Scholar 

  64. Zheng X, Levine D, Shen J, Gogarten S, Laurie C, Weir B. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 2012;28:3326–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  65. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26:2867–73.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Raj A, Stephens M, Pritchard JK. fastSTRUCTURE: Variational inference of population structure in large SNP data sets. Genetics. 2014;197:573–98.

    Article  PubMed  PubMed Central  Google Scholar 

  67. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software structure: a simulation study. Mol Ecol. 2005;14:2611–20.

    Article  CAS  PubMed  Google Scholar 

  68. Jakobsson M, Rosenberg NA. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics. 2007;23:1801–6.

    Article  CAS  PubMed  Google Scholar 

  69. Rosenberg NA. Distruct: a program for the graphical display of population structure. Mol Ecol Notes. 2004;4:137–8.

    Article  Google Scholar 

  70. Lipka AE, Tian F, Wang Q, Peiffer J, Li M, et al. GAPIT: genome association and prediction integrated tool. Bioinformatics. 2012;28:2397–9.

    Article  CAS  PubMed  Google Scholar 

  71. Segura V, Vilhjálmsson BJ, Platt A, Korte A, Seren Ü, Long Q, et al. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat Genet. 2012;44:825–30.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  72. Liu X, Huang M, Fan B, Buckler ES, Zhang Z. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 2016;12:e1005767.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  73. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B. 1995;57:289–300.

    Google Scholar 

  74. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21:263–5.

    Article  CAS  PubMed  Google Scholar 

  75. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, et al. The structure of haplotype blocks in the human genome. Science. 2002;296:2225–9.

    Article  CAS  PubMed  Google Scholar 

Download references


First, we would like to thank the late Eric Germain, former head of the breeding program at the INRAE of Bordeaux from 1977 to 2007. His remarkable work, then continued by Francis Delort, has given us the opportunity to study a rich set of plant material. The latter helped us to phenotype the bearing habit of the walnut trees. We thank the Fruit Tree Experimental Unit of the INRAE in Toulenne and the Prunus/Juglans Genetic Resources Center for the maintenance of the collection and for helping us to collect the samples. We acknowledge the BioGEVES laboratory for DNA extraction and KASP marker development. Then, the CTIFL, holder of the project “INNOV’noyer”, in partnership with the INRAE of Bordeaux, want to thank the “Région Nouvelle-Aquitaine”, and “Cifre” convention of “ANRT” (Agence Nationale de la Recherche et de la Technologie). It is also important to note that the project is supported by the “Agri Sud-Ouest Innovation” competitiveness cluster. We also want to thank Bénédicte Wenden (INRAE, UMR 1332 BFP, Villenave d’Ornon, France) for her expertise in data visualisation, and Hao Cheng (Department of Animal Sciences, University of California, Davis) for his help with linear model construction. Finally, our special thanks are addressed to the staff of the University of California, Davis, working on walnut for their involvement and support since the beginning of the project, and for welcoming Anthony Bernard.


This work has been mainly funded by the “Région Nouvelle-Aquitaine” in the project “INNOV’noyer”, coordinated by the CTIFL, and in partnership with the INRAE of Bordeaux. This work has been also funded by the “Cifre” convention number 2016/1558 of ANRT (Agence Nationale de la Recherche et de la Technologie). To finish, Anthony Bernard thanks the “Initiative d’Excellence” (IdEx) program of the University of Bordeaux and the “Dufrenoy-Crédit Agricole d’Ile-de-France Mécénat” grant for contributing to the funding of his travel to the University of California, Davis.

Author information

Authors and Affiliations



AB performed the phenotypic statistical analyses under the supervision of AM. Linkage mapping and QTL detection were carried out by AB and ED. GWAS methodology was carried out by AB and AM. AM and DN supervised the entire GWAS analyses and brought expertise about quantitative genetics. AD and AB performed the candidate genes analyses. KASP marker development was made possible thanks to PB and CL who have shared DNA and phenotypic data. FL, ED and AB conceived and coordinated the study. AB wrote the manuscript, assisted by AM. All authors critically reviewed the manuscript and they all approved the final version.

Corresponding author

Correspondence to Elisabeth Dirlewanger.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

ED is a member of the editorial board (Associate Editor) of BMC Genomics.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1: Table S1.

List of association panel accessions with their origin, breeding level, pedigree when known, and their membership to clusters based on fastSTRUCTURE results for K = 3.

Additional file 2: Table S2.

Number of principal components required for bearing habit trait using Bayesian Information Criterion.

Additional file 3: Table S3.

‘Franquette’ and ‘UK 6–2’ parental maps for QTL detection.

Additional file 4: Table S4.

Genotyping results of SNP AX-171179714 by KASP.

Additional file 5: Table S5.

Walnut trait ontology.

Additional file 6: Table S6.

Phenotypic raw dataset of the GWAS panel.

Additional file 7: Table S7.

Phenotypic raw dataset of the F1 progeny.

Additional file 8: Figure S1.

Scatter plots showing the two-year data related to phenological traits in Julian days for the 170 accessions of the GWAS panel.

Additional file 9: Figure S2.

Density plots showing the two-year data related to phenological traits in Julian days for the 170 accessions of the GWAS panel.

Additional file 10: Figure S3.

Scatter plots showing the two-year data related to phenological traits in Julian days for the 78 accessions of the F1 progeny.

Additional file 11: Figure S4.

Density plots showing the two-year data related to phenological traits in Julian days for the 78 accessions of the F1 progeny.

Additional file 12: Figure S5.

Detection of the number of clusters using a) Prob(K), and b) deltaK method (Evanno et al., 2005) in the GWAS panel.

Additional file 13: Figure S6.

Principal Component Analysis performed on the GWAS panel. The two first principal components show accessions colored according to fastSTRUCTURE results with EEAs for Eastern Europe and Asia, and WEAm for Western Europe and America.

Additional file 14: Figure S7.

Scatter plot showing the estimated kinship coefficients by the proportion of zero Identical-By-State (IBS0) in the F1 progeny.

Additional file 15: Figure S8.

Level plot showing the Identical-By-State (IBS) values for the 170 accessions of the GWAS panel.

Additional file 16: Figure S9.

Genetic maps and QTLs detected using two-year data. F(X) and U(X) are the linkage groups of ‘Franquette’ and ‘UK 6–2’ parental maps respectively. Legend of the QTLs: black for budbreak date, red for beginning female flowering date, deep green for full female flowering date, blue for end female flowering date, yellow for beginning male flowering date, pink for full female flowering date, and light green for end female flowering date. Solid bars indicate the 95% confidence interval of the QTL, and terminal bars indicate the 99.9% confidence interval of the QTL. The marker names were changed with the corresponding chromosome number and its physical position for a better visualization.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Bernard, A., Marrano, A., Donkpegan, A. et al. Association and linkage mapping to unravel genetic architecture of phenological traits and lateral bearing in Persian walnut (Juglans regia L.). BMC Genomics 21, 203 (2020).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: