Genome-wide association study revealed a promising region and candidate genes for eggshell quality in an F2 resource population

Background Eggshell is subject to quality loss with aging process of laying hens, and damaged eggshells result in economic losses of eggs. However, the genetic architecture underlying the dynamic eggshell quality remains elusive. Here, we measured eggshell quality traits, including eggshell weight (ESW), eggshell thickness (EST) and eggshell strength (ESS) at 11 time points from onset of laying to 72 weeks of age and conducted comprehensive genome-wide association studies (GWAS) in 1534 F2 hens derived from reciprocal crosses between White Leghorn (WL) and Dongxiang chickens (DX). Results ESWs at all ages exhibited moderate SNP-based heritability estimates (0.30 ~ 0.46), while the estimates for EST (0.21 ~ 0.31) and ESS (0.20 ~ 0.27) were relatively low. Eleven independent univariate genome-wide screens for each trait totally identified 1059, 1026 and 1356 significant associations with ESW, EST and ESS, respectively. Most significant loci were in a region spanning from 57.3 to 71.4 Mb of chromosome 1 (GGA1), which together account for 8.4 ~ 16.5 % of the phenotypic variance for ESW from 32 to 72 weeks of age, 4.1 ~ 6.9 % and 2.95 ~ 16.1 % for EST and ESS from 40 to 72 weeks of age. According to linkage disequilibrium (LD) and conditional analysis, the significant SNPs in this region were in extremely strong linkage disequilibrium status. Ultimately, two missense SNPs in GGA1 and one in GGA4 were considered as promising loci on three independent genes including ITPR2, PIK3C2G, and NCAPG. The homozygotes of advantageously effective alleles on PIK3C2G and ITPR2 possessed the best eggshell quality and could partly counteract the negative effect of aging process. NCAPG had certain effect on eggshell quality for young hens. Conclusions Identification of the promising region as well as potential candidate genes will greatly advance our understanding of the genetic basis underlying dynamic eggshell quality and has the practical significance in breeding program for the improvement of eggshell quality, especially at the later part of laying cycle. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1795-7) contains supplementary material, which is available to authorized users.


Background
A vast number of eggs are produced annually for human consumption worldwide. Avian calcified eggshell as a unique bioceramic material can provide protection to egg contents from physical damage and microorganic invasion [1,2]. Changes in eggshell properties are directly related to increasing risk of foodborne disease for the consumer [3]. Low eggshell quality will also lead to more cracked eggs in the automatic sorting and packing process in the modern egg industrial production [4]. It has been estimated that improving the mean eggshell strength by one Newton will lead to 0.5 % less broken eggs per hen per laying cycle [5]. On the other hand, eggshell is biologically significant for bird embryo developments by allowing gas exchange between the embryo and the environment and supplying calcium for the embryo bone deposition [6,7]. Hatching eggs with thin eggshell have high embryonic mortality, owing to more loss of water vapour during the incubation [8]. Furthermore, eggshell are subject to quality loss with the aging process of laying hens [9], which hinders the developmental trend to prolong the laying cycles of egg-type chicken in the future. Therefore, understanding the genetic control for dynamic eggshell quality with aging process is of great economic and biological importance.
In recent years, the genomic [5,10], transcriptomic [11][12][13], proteomic [14][15][16][17][18] and structural analyses [19][20][21] of eggshell have been carried out to provide new insights into better understanding on the eggshell mineralization and its ultrastructure or micostructure that contribute to eggshell quality. Identifying the quantitative trait loci (QTLs) that relate to eggshell quality is one of the most powerful strategies to illustrate the genetic mechanism underlying eggshell quality. Up to date, a total of 62 QTLs related to eggshell quality have been collected in the AnimalQTLdb (http://www.animalgenome.org/cgi-bin/QTLdb/index). However, most previously reported QTLs were identified by low-density linkage analysis with limited markers of microsatellites [22][23][24], which restricted the confidence intervals and mapping accuracy [25,26]. Recently, genome-wide association study (GWAS) has been utilized to identify the associations between genomic loci and phenotypes with relatively high-density SNP arrays in chicken [27]. For instance, Liu et al. [10] conducted the first GWA analysis with relatively highdensity SNA array (60 K) to refine the associations with egg production and egg quality traits in chicken. With the rapid advance in next-generation sequencing technologies, large amounts of SNPs in chicken have been discovered [28]. The development of 600 K Affymetrix Chicken SNP array allows more efficient screening of the causal loci and genes in relevance to target traits.
It is notable that many complex traits are dynamic varied with the aging process of animals [29,30]. However, previous studies utilized the phenotypes from limited age points. Animal phenotypes at different age should be utilized in GWA analysis to refine the associations with age-dependent complex traits and increase the statistical power. Growing evidences had been provided that this type of "longitudinal design" could efficiently identify the time-dependent or consistent loci for complex traits [31,32].
In the current study, we conducted GWA analysis on the dynamic eggshell quality traits at 11 time points from onset of laying to 72 weeks old utilizing a 600 K high-density SNP arrays in an F 2 chicken population. We aimed to explore the associated genomic loci and genes that contribute to the phenotypic variability and dynamic change in eggshell quality traits, and to provide potential breeding tools for the improvement of eggshell quality.

Phenotype description and genetic parameters
Means and standard deviations for eggshell quality traits, including ESW, EST and ESS, at 11 time points from onset of laying to 72 weeks of age are presented in Table 1 and Additional file 1: Figure S1. These three eggshell quality traits displayed a smooth curve with hen age except 36 weeks-old, at which ESW, EST and ESS showed a abrupt decrease (Additional file 1: Figure S1) and the results of GWA analysis in this week were also unusual ( Table 3). We speculated that this might be caused by the feed or environment changes in this period leading to eggshell quality decrease. For this reason, phenotypes collected at  Table S1) and ESS (0.20~0.27, Additional file 2: Table S2) were relatively low. Moreover, bivariate GCTA analyses indicated that ESW at different age were highly and positively correlated, specially at neighboring time points (r g > 0.90).

Identification of candidate loci by GWAS
Eggshell quality traits were collected at 11 time points from onset of laying to 72 weeks. At each time point, three separate genome-wide association tests were conducted for each eggshell quality trait with univariate method. A total of 1057, 1026 and 1356 genome-wide significant associations were identified with ESW, EST and ESS, respectively ( Table 2, Additional file 2: Table S3). Almost all the significant locus were in a region spanning from 57.3 to 71.4 M of chromosome 1 (GGA1) ( Table 3). Out of these loci in this region, 794 SNPs were observed in association with all these three eggshell quality traits at genome-wide significance level ( Fig. 1). It was notable that most of the SNPs in this region possess a MAF more than 0.3. In addition, 8 loci on chromosome 4 (GGA4) were significantly associated with ESW, but only for young hens from age of first egg (AFE) to 40 week old. The global view of the putative P-values for all SNPs affecting eggshell quality traits at 44weeks is given in Fig. 2, and the Manhattan and quantile-quantile (QQ) plots for the remaining time points in Additional file 1: Figure S2.
To enhance the statistical power, a joint GWA analysis of multi time points was performed by fitting these SNPs into a multivariate model. The time points that had significant SNPs were included in this model except for 36 weeks owing to the abnormal phenotype in this week. The 1531 significant SNPs related to eggshell quality identified by univariate method in GGA1 were further analyzed in this multivariate model. Consequently, a total of 503, 501 and 532 significant hits on GGA1 from 60.5 to 67.9 M presented consistent and compelling associations with   Table S4), and 388 SNPs out of these locus had pervasive effect on ESW, EST and ESS (Fig. 1b).

Linkage disequilibrium (LD) and conditional analysis
The linkage disequilibrium (LD) analysis were then carried out and the results showed the uncovered SNPs in GGA1 were in extremely strong linkage disequilibrium status, especially for the LD block from 64.0 to 67.5 Mb that include half (732/1531) of the total loci (Fig. 3). To detect the independent acting locus in this region, we then carried out stepwise conditional GWA analysis to prioritize separately associated SNPs. At locus rs312347405, a missense mutation in association with all three eggshell traits was then fitted into the multivariate model using its genotype as covariate to explore the change of the associations with other loci. As shown in Fig. 3, if the genotype of rs312347405 was used as covariate in multivariate analysis, the significance levels of all other loci were substantially attenuated below genome-wide significant level. Regional plots and conditional analysis in multivariate model for the traits of eggshell thickness (EST) and eggshell strength (ESS) were shown in Additional file 1: Figure S3.

Estimation of contribution to phenotypic variation (CPV)
The phenotypic variance explained by loci or genomic region were estimated by a tool of GCTA for three eggshell traits. The phenotypic differences among 3 genotypes at these two loci are displayed in Fig. 4. There was no difference in eggshell quality traits among different genotypes for the first egg. However, along with the aging process, profound differences in eggshell quality appeared among three genotypes. Eggshell of each genotype exerted different rate of quality decay with aging process, i.e., the homozygotes of advantageously effective alleles possessed the best eggshell quality and could partly counteract the negative effect of aging process. However, the change trends of eggshell traits among three genotypes were concordant along with age.

Promising genes related to eggshell quality
Utilizing gene annotation of the causal locus allowed us to screen the putative genes relating to eggshell quality. For SNPs-traits association analysis, the missense mutations on exons were more meaningful than the SNPs located on introns or intergenic regions. We totally found five missense loci at GGA1 and one at GGA4 using univariate genome-wide screens. They located on 6 independent genes including phosphatidylinositol-4phosphate 3-kinase, catalytic subunit type 2 gamma (PIK3 C2G), inositol 1,4,5-trisphosphate receptor type 2 (ITPR2), RecQ helicase-like (RECQL), ATP-binding cassette subfamily C member 9 (ABCC9) and cancer susceptibility candidate 1 (CASC1) on GGA1 and non-SMC condensin I complex subunit G (NCAPG) on GGA4. Nonetheless, only two SNPs, rs312347405 and rs316607577, located on PIK3C2G and ITPR2, remained significantly associated with eggshell quality traits after multivariate GWA analysis.
Furthermore, as illustrated above, these two SNPs could explain more phenotypic variance than others. Consequently, we first considered PIK3C2G and ITPR2 as the primary candidate genes associated with eggshell quality. The missense mutation in GGA4 that located on NCAPG was posterior putative locus for its roles in affecting early eggshell weight. In addition to missense mutations on exons, the mutations on 3′ or 5′ UTRs were also of biological significance for genetic variation underlying complex traits. We aggregately discovered 15 mutations that located on 3′ or 5′ UTRs corresponding to 13 independent genes (Additional file 2: Table S5). According to functional annotation and database research, however, most of these Manhattan plot indicates -log 10 (observed P-values) for genome-wide SNPs (y-axis) plotted against their respective positions on each chromosome (x-axis), and the horizontal green and black lines depict the genome-wide suggestive (1.69 × 10 −5 ) and significant (8.43 × 10 −7 ) threshold, respectively. For quantile-quantile plot, the x-axis shows the expected -log 10 -transformed P-values, and the y-axis represents the observed -log 10 -transformed P-values. The genomic inflation factors (λ) are shown on the top left in the QQ plot. Green points represent the genome-wide significant associations genes had no direct relevance to eggshell calcification or calcium metabolism, expect for rs316447591 which located on the 3′UTR of ITPR2 gene. The detailed message of promising loci and the corresponding candidate genes are listed in Table 5.

Discussion
It is known that eggshells are subject to quality loss along with the age process [9]. In egg-type chicken production, there exists a trend to prolong the laying cycles [33], whereas the decline of eggshell quality for old hens proposes a substantial challenge for this development pattern. In the present study, we elucidated genomic architecture underlying the age-dependent dynamic eggshell quality, which has biological and practical significance.
The QTL number detected for eggshell quality was limited in previous studies. The low to medium heritability of the eggshell trait revealed by our results and other study [34] suggested the necessary use of large population for causal mutant identification. Our F 2 population consisting of 1512 hens is the largest population used for eggshell quality GWA analysis so far, and therefore the novel genomic region and loci revealed by the current study should be accurate and reliable. Our GWA analysis for dynamic eggshell quality traits identified thousands of significant associations, that were much more than those screened by previous GWA studies in chicken [10,35]. These significant mutations were almost in a same region with an extremely strong linkage disequilibrium status, which may result from the insufficient recombination events in the F 2  The 1531 significant SNPs in this region were distributed on 116 independent genes. We tried to perform Gene Ontology (GO) [39] and pathway analysis [40] on these 116 genes, whereas no significant GO terms or pathways were enriched. This suggested there were not a series of functionally similar genes in this region to influence eggshell quality together. Most SNPs might passively display significant associations for their linkages to a sort of real causal mutants on one or a few genes. Hence, the phenotypic variance explained by all loci in this region was only slightly higher than single locus. It was notable that more associations were discovered for old hens (40-72 weeks) than young hens (AFE to 36 weeks), suggesting certain genetic variants were age-dependent. This was partially consistent with the other GWA studies, indicating no significant associations with both early and late shell quality [10,41]. However, our results also proved the existence of common genetic variants consistently influencing eggshell quality from 40 to 72 weeks.
Six significant missense mutations in this genomic region were considered as the most important putative variants. After multivariate GWA analysis, only two loci, rs312347405 and rs316607577 on PIK3C2G and ITPR2 in GGA1 remained significantly associated with dynamic eggshell quality traits. The loci of rs312347405 on PIK3C2G could explain the most phenotypic variance of eggshell quality traits among the five missense mutations. Hens with homozygotes of advantageously effective alleles (GG) could produce eggs with excellent ESS, which decreased relatively within a narrow range along with the aging process. PIK3C2G encoded proteins belonging to the phosphoinositide 3-kinase (PI3K) family, containing a lipid kinase catalytic domain as well as a C-terminal C2 domain [42], which acted as calcium-dependent phospholipid binding motifs [43]. Previous proteomic screens revealed that a high proportion of lipid-binding proteins abundantly existed in eggshell matrix such as extracellular fatty acid-binding protein (Ex-FABP), prosaposin and apolipoprotein D [14]. Considering the low abundance of lipids in eggshell, the existence of high abundance of lipidbinding proteins in eggshell matrix was perplexed [14]. Furthermore, one association analysis revealed another lipid-related gene, low-density lipoprotein receptor-related protein 8 (LRP8) gene, as a new member of eggshell matrix protein, was significantly associated with eggshell traits [44]. Now in the current study PIK3C2G possessing the C2 domain acting as lipid binding motif was also implicated in eggshell property. The role of lipid binding proteins in eggshell or during eggshell formation should be re-recognized. PIK3C2G possessing the C2 domain could mediate translocation of proteins to lipid membranes, and also regulate protein-protein interactions in human and mammals [45]. The interoperable matrix proteins and calcite together formed the bioceramic eggshell [46]. It could be hypothesized that lipid binding proteins Estimated allelic substitution effect per copy of the effect allele (EA); SE standard error of the beta b Beta means the effect size of minor alleles. Positive/ negative effect size value means that the substitution of major allele to minor allele can lead to heavier/lighter yolk or ovary weight might act as carrier or modifier of matrix precursors, which mediate eggshell calcification process or act as frame proteins deposited into eggshell structure. Consequently, PIK3C2G was considered as candidate gene for eggshell quality.
The gene of inositol 1,4,5-trisphosphate receptor type 2 (ITPR2) harboring the loci of rs316607577 (exon 25) was the positional and functional candidate gene for eggshell quality revealed by the current GWA analysis. The mutation of rs316607577 was a nonconservative substitution of serine by glycine (S1072G), with the glycine-encoding allele being associated with stronger eggshell. A gene could be inactivated by a mutation either in a control site or in a coding region [47]. So rs316447591 on 3′URT of ITPR2 was also the susceptibility locus. ITPR2 was known for its mediation in endoplasmic reticulum (ER) calcium release and inositol 1,4,5-trisphosphate (IP3) could mobilize Ca 2+ from intracellular calcium stores to many types of cells [48]. The existence of ITPR2 in uterine epithelial cell of chicken had been verified [11,12], furthermore, the expression of ITPR2 in uterus during eggshell calcification was significantly higher than that in magnum and duodenum which also possess active calcium metabolism [49]. This provided the evidences that ITPR2 play roles in the regulation of intra-cellular Ca 2+ transportation in uterus and contributed to the process of eggshell calcification. Recently, a genome-wide association study in humans identified ITPR2 as a susceptibility gene for Kashin-Beck disease which is a chronic osteochondropathy [50], mainly characterized by cartilage degeneration, cartilage matrix degradation, chondrocyte necrosis and apoptosis [51]. This uncovered association with bone disease indicated that it might play a role in general mineralization processes. Birds process medullary bone, a nonstructural type of woven bone, to act as a reservoir for the minerals required for shell calcification [52]. Our previous comparative proteomic analysis for uterine fluid and eggshell matrix proteins also identified that many bone-development related proteins were up-presented in strong eggshell group [15]. Up to date, there only existed one paper referred to the association of ion transporter genes with eggshell quality. They found one SNP on a sodium channel gene (SCNN1b) had effects on eggshell strength [53], however, the phenotypic variance of ESS it could explained was relatively low (1.14 %) compared to that of ITPR2 (2.43-11.52 %) in the current study. All the above evidences suggested ITPR2 should be a crucial and promising candidate gene relating to eggshell calcification as well as eggshell mechanical property.
Another missense mutation (rs14491030) located on non-SMC condensin I complex subunit G (NCAPG) gene in GGA4 was discovered in association with eggshell weight for young hens in our GWA analysis. Many other studies reported multiple genomic regions containing this gene were identified to be associated with body weight [23,54] and egg weight [35,55] in chickens and many other growth traits in livestocks [56,57]. The current study suggested that NCAPG also involved in influencing eggshell weight. Larger eggs generally owned heavier eggshell. Hence we could not exclude the potential cause that the significant association of NCAPG with ESW was just due to its relevance to egg weight. Nevertheless, we still considered NCAPG as a candidate gene for its consistent association with eggshell weight from 32 to 40 week.
Previous studies hypothesized that eggshell organic matrix as a complicated mixture of proteins might play a regulatory role in assembly of the calcite zone with calcium carbonate to ultimately determine its strength, and many evidences had been proposed to support this [17], such as the different matrix protein concentrations in strong and weak eggshells [15,58]. Furthermore, many works conducted to investigate the associations between polymorphisms in genes encoding eggshell matrix proteins and eggshell quality traits [59,60]. Takahashi et al. [60] conducted association analysis between ovocalyxin-32 gene haplotypes and eggshell quality traits in an F 2 intercross and Dunn et al. [59] carried out association analysis between polymorphisms in 10 eggshell organic matrix genes and eggshell quality measurements in two lines. These two studies found some significant associations, however, the loci could only explain limited phenotypic variances. Our current study discovered no association between SNPs on eggshell matrix genes and eggshell quality, suggesting that the influences of eggshell matrix genes on eggshell quality were not caused by their nucleotide polymorphisms.

Conclusions
The dynamic eggshell quality traits at 11 time points from onset of laying to 72 weeks were collected and used for genome-wide association analysis with a 600 K highdensity SNP array. According to univariate and multivariate GWA analysis, we discovered a genomic region spanning from 57.3 to 71.4 M in GGA1 significantly associated with eggshell quality. LD and conditional analysis suggested this region were in extremely strong linkage disequilibrium status, especially for the LD block from 64.0 to 67.5 Mb that almost include half (732/1531) of the SIFT is a program that predicts whether an amino acid substitution affects protein function. Small values means deleterious amino acid change total significant loci. Ultimately, three genes, PIK3C2G, ITPR2 and NCAPG identified from three missense mutations were considered as promising candidate genes that implicated in dynamic eggshell quality. The homozygotes of advantageously effective alleles on PIK3C2G and ITPR2 possessed the best eggshell quality and could partly counteract the negative effect of aging process. These promising loci and genes could be helpful to engineer practical breeding programs for the improvement of eggshell quality for old hens to meet the need of prolonging the laying cycle. The new findings also have greatly advanced our understanding of the genetic basis underlying eggshell quality.

Resource population
The reciprocal crosses between a standard breed White  [61].

Genotyping and quality control (QC)
Genomic DNA was extracted by standard phenol/chloroform method and genotyped with the 600 K Affymetrix Axiom Chicken Genotyping Array (Affymetrix, Inc. Santa Clara, CA, USA). Affymetrix Power Tools v1. 16.0 (APT) software was then used for quality control and genotype calling. Specifically, only samples with dish quality control (DQC) > 0.82 and call rate > 97 % were included into the subsequent analyses. An R script supplied by Affymetrix was run to compute the SNP QC metrics and filter out individual SNPs falling below given thresholds. After these QC steps, 1512 individuals and 532,299 SNPs remained valid. Given the current statistical methods were more powerful for the detection of the associations between phenotypes and autosomal genotypes, all SNPs on sex chromosomes were excluded in QC process. The PLINK v1.90 package [62] were then used for further quality control to improve the detect power, in which SNPs with minor allele frequency (MAF) < 5 % and Hardy-Weinberg equilibrium (HWE) test P < 1 × 10 −6 were removed from downstream analysis. Some sporadic missing genotypes were imputed using the BEAGLE v4.0 procedure [63], only SNPs with imputation quality score R 2 > 0.5 were retained. Ultimately, up to 1512 individuals and 435,867 SNPs were valid for the following GWA analysis.

Genome-wide Association (GWA) analysis
Principal component analysis (PCA) implemented in PLINK package was first conducted prior to GWA analysis to eliminate spurious associations resulting from the presence of cryptic relatedness or hidden population stratification. To properly decide the thresholds for genome-wide significant/suggestive associations, a method of simpleM [64] was used to corrected the number of multiple tests. With this approach, an effective number of 59,308 independent tests was suggested, hence the genome-wide significant and suggestive P-values were 8.43 × 10 −7 (0.05/59,308) and 1.69 × 10 −5 (1.00/59,308), respectively. Univariate association test equipped with an exact mixed model approach in GEMMA v0.94 software [65] was first performed with the valid individuals and SNPs. The centered relatedness matrix was calculated by those independent SNPs, and then the derived Wald test Pvalue was calculated for the significance level between SNPs and phenotypes. The univariate linear mixed model is as follows: y is an n × 1 vector of phenotypic values for n individuals, W is an n × c matrix of covariates (fixed effects, i.e., top five PCs) including a column vector of 1, α is a c × 1 vector of corresponding coefficients including the intercept, x is an n × 1 vector of marker genotypes at the locus tested, β is the corresponding effect size of the marker and all effects are reported for the minor allele in each marker, u is an n × 1 vector of random polygenic effects with a covariance structure as u~N (0,KVg), where K represents a known n × n genetic relatedness matrix derived from SNP markers and Vg is the polygenic additive variance, and ε is an n × 1 vector of random residuals with ε~N (0,IV e ), where I is an n × n identity matrix, and V e is the residual variance component. We used the Wald test statistic F Wald ¼β 2 =Varβ for each SNP to test the null hypothesis β = 0, where the best linear unbiased estimate (BLUE) of β and the corresponding sampling variance V arβ are obtained by solving the mixed model equations (MME) based on estimated Vg and V e .
The manhattan plots and quantile-quantile (QQ) plots were drawn by the "gap" packages [66] in R project. The genomic inflation factor λ which were the judgements for the extent of false positive signals was also calculated with the function of estlambda implemented in the GenABEL package [61] in R project.
To further analyze genetic variants affecting the dynamic eggshell quality traits along with aging process, a multivariate association analysis [67] that directly utilized phenotypes from multiple time points on an individual was applied. Only the time points and SNPs that had significant associations in univariate model were also included in this multivariate model. For each SNP marker, a multivariate linear mixed model could be fitted in the following form: where Y is an n by d matrix of d phenotypes for n individuals, W = (w 1 , ⋅ ⋅⋅, w c ) is an n × c matrix of covariates (fixed effects, i.e., top five PCs) including a column of 1 s, A is a c by d matrix of corresponding coefficients including the intercept, x is an n-vector of marker genotypes, β is a d vector of marker effect sizes for the d phenotypes. It should be noted that G is an n by d matrix of random effects with G ∼ MVN n × d (0, K, V g ) ∼ MVN n × d (0, K, V g ) where V g is a d by d symmetric positive definite matrix of genetic variance component, and E is an n by d matrix of residual errors with E ∼ MVN n × d (0, I, V e ) where V e is a d by d symmetric positive definite matrix of residual variance component (K and I are the same in the two models). MVN n × d (0, V 1 , V 2 ) denotes the n × d matrix normal distribution with mean 0, row covariance matrix V 1 (n by n) and column covariance matrix V 2 (d by d).

Conditional and Linkage disequilibrium (LD) analysis
Notably, many SNPs maybe passively significant associated with target traits, resulting from their strong linkage to a really causal mutants. To demarcate independent association signals in a putative region, the conditional analyses as well as the LD analysis were both performed in multivariate and univariate models, through fitting the genotypes of one candidate signal as covariates [62]. In general, GWAS does not distinguish a genuine causal locus from those statistically significant loci within a strong LD region. Therefore, in order to characterize potential candidate genes responsible for a trait, we firstly conducted a LD analysis and inferred the haplotype blocks containing peak SNPs by Haploview v4.2 [68]. A block is derived using the solid spine algorithm, and defined as that the first and last SNPs in a region are in strong LD (D ' ≥ 0.8) with all intermediate SNPs.

Estimation of heritability and contribution to phenotypic variance (CPV)
Univariate restricted maximum likelihood (REML) implemented in GCTA v1.24 program [69] was performed to estimate the heritability explained by the eligible SNPs (h snp 2 ). We also quantified the pair-wise genetic and phenotypic correlations for each trait at multiple time points with the bivariate mixed model [70]. A genetic relationship matrix (GRM) derived from all genotyped SNPs on autosomes and two linkage groups was created, and the top five PCs calculated by the GCTA tool were included as covariates to account for the potential population structure. We then estimated the phenotypic variance contributed by these associated loci or genomic region.

Gene identification
Variant Effect Predictor (VEP) and Biomart tools based on the Galgal4 assembly supported by Ensembl were used for the identification of candidate genes that the significant loci located on [71] and detection the genes within in a given genomic region [72].

Additional files
Additional file 1: Figure S1. Change curve of eggshell weight (ESW), eggshell percentage (ESP), eggshell thickness (EST) and eggshell strength (ESS) along with the age of laying hens. Plots A, B, C, D displayed the change curve of ESW, ESP, EST and ESS respectively. Figure S2. Manhattan plot (left) and quantile-quantile plot (right) of the observed P-values for ESW, EST and ESS at age of first egg and at 32,36,40,48,52,56,60,66,72 weeks of old. The Manhattan plot indicates -log 10 (observed P-values) for genome-wide SNPs (y-axis) plotted against their respective positions on each chromosome (x-axis), and the horizontal green and black lines depict the genome-wide suggestive (1.69 × 10 −5 ) and significant (8.43 × 10 −7 ) threshold, respectively. For quantile-quantile plot, the x-axis shows the expected -log 10 -transformed P-values, and the y-axis represents the observed -log 10 -transformed P-values. The genomic inflation factors (λ) are shown on the top left in the QQ plot. Green points represent the genome-wide significant associations. Figure S3. Regional plots and conditional analysis in multivariate model for eggshell thickness (EST) and eggshell strength (ESS). Plot A: regional and conditional plot for EST. Plot B: regional and conditional plot for ESS. (PDF 4277 kb) Additional file 2: Table S1. Summary of genetic analysis for eggshell thickness at different wks of age. Table S2. Summary of genetic analysis for eggshell strength at different wks of age. Table S3. Genome-wide significant SNPs for eggshell weight (ESW), eggshell thickness (EST) and eggshell strength (ESS) at different age by univariate model. Table S4. Genome-wide significant SNPs for eggshell weight (ESW),eggshell thickness (EST) and eggshell strength (ESS) by multivariate model. (XLSX 1372 kb). Table S5. Significant loci that located on 3' or 5' UTRs.