Genome-wide detection of multiple variants associated with teat number in French Yorkshire pigs
BMC Genomics volume 25, Article number: 722 (2024)
Abstract
Background
Teat number is a vital reproductive trait in sows, crucial for providing immunity and nutrition to piglets during lactation. However, “missing heritability” in Single Nucleotide Polymorphism (SNP)-based Genome-Wide Association Studies (GWAS) has led to an increasing focus on structural variations in the genetic analysis of complex biological traits.
Results
In this study, we generated a comprehensive CNV map in a population of French Yorkshire pigs (n = 644) and identified 429 CNVRs. Notably, 44% (189 CNVRs) of these were detected for the first time. Subsequently, we conducted GWAS for teat number in the French Yorkshire pig population using both the 80K chip data and the corresponding imputed data, as well as a GWAS based on CNV regions (CNVRs). Interestingly, the 80K chip GWAS identified two SNPs located on Sus scrofa chromosome 5 (SSC5) that were simultaneously associated with Total Teat Number (TTN), Left Teat Number (LTN), and Right Teat Number (RTN). The leading SNP (WU_10.2_5_76130558) explained 3.33%, 2.69%, and 2.67% of the phenotypic variance for TTN, LTN, and RTN, respectively. Moreover, through the imputed GWAS, we identified 30 genetic variants associated with TTN located within the 73.22–73.30 Mb region on SSC5. The two SNPs identified in the 80K chip GWAS were also located in this region. In addition, the CNVR-based GWAS revealed three significant CNVRs associated with TTN. Finally, through gene annotation, we pinpointed two candidate genes, TRIM66 and PRICKLE1, which are related to diverse processes such as breast cancer and abnormal vertebral development.
Conclusions
Our research provides an in-depth analysis of the complex genetic structure underlying teat number, contributing to the genetic enhancement of sows with improved reproductive performance and, ultimately, bolstering the economic benefits of swine production enterprises.
Background
Teat number is an important reproductive trait that strongly affects the lactation ability of sows and may therefore directly affect the weight gain of piglets [1, 2]. However, the complex genetic architecture of teat number has left its molecular mechanism unclear. With the rapid development and application of high-throughput sequencing technology, genome-wide association analysis (GWAS) combined with molecular marker technology is considered a powerful approach for dissecting the genetic architecture of complex traits in livestock [3,4,5]. Several previous GWAS have indicated that VRTN [6] and ABCD4 [7] are candidate genes that may regulate teat development, but traditional GWAS based on single nucleotide polymorphisms (SNPs) have accounted for only part of the total heritability [8].
Some of the missing heritability has been attributed to copy number variation (CNV) in humans [9]. CNV refers to structural variation of DNA segments ranging from 50 bp to several Mb relative to a reference genome, and CNVs are widely distributed across the genome [10]. Overlapping CNVs are merged into larger regions known as copy number variation regions (CNVRs) [11]. CNVs may lead to phenotypic variation and disease by altering gene structure, altering gene regulation, and exposing recessive deleterious alleles [12]. After researchers revealed the existence of large-scale copy number variation in the human genome [13, 14], CNV research was gradually carried out in various fields [15,16,17]. For example, Chen et al. [18] identified a CNV affecting the MSRB3 gene that increases pig ear size through a mechanism involving miR-584-5p; Wang et al. [19] found that a gain at a CNVR decreased total number born and number born alive in Large White sows. Overall, research into novel CNVs in pigs can capture part of the heritability missed by SNP-based GWAS and explain more genomic structural variation.
In this study, multi-dataset GWAS were conducted for teat number in the French Yorkshire pig population. Our research aimed to identify genetic variants and candidate genes associated with teat number in pigs and to elucidate the potential molecular genetic mechanisms. Additionally, the genome-wide CNV detection provides a valuable complement to the CNV map of the French Yorkshire pig genome.
Results
Phenotype and heritability statistics
In this study, we analyzed three traits, Total Teat Number (TTN), Left Teat Number (LTN), and Right Teat Number (RTN), in 644 French Yorkshire pigs (Table 1). The means (± standard deviation) of TTN, LTN, and RTN were 14.10 ± 0.92, 7.04 ± 0.50 and 7.06 ± 0.49, respectively. The coefficients of variation (C.V.) of all three traits were over 6.50%. In addition, we estimated the SNP-based heritability (h²) for the three traits based on the 53,869 SNPs; the genomic h² ranged from 0.14 ± 0.06 to 0.17 ± 0.07, suggesting that teat number is a low-heritability trait.
Detection of genome-wide CNVs
A total of 8,746 CNVs (583 losses and 8,163 gains) were detected using PennCNV software v1.0.5 [20], with an average of 13.48 per individual, ranging from two to 32 (Fig. 1a). CNV gains occurred more frequently than losses in individuals (Fig. 1a). CNV length ranged from 10.4 kb to 1.25 Mb with an average of 128.80 kb, and the median length of gains was 99.85 kb, which was longer than that of losses (82.63 kb, Fig. 1b). As shown in Table S1, all CNVs were merged into 429 CNVRs, including 103 losses, 296 gains, and 30 mixed events (gains and losses occurring in the same region). The total length of CNVRs was 66.78 Mb, occupying 2.95% of the total length of the pig autosomal genome (Sus scrofa version 11.1). CNVR length ranged from 10.40 kb to 1.25 Mb with an average of 155.66 kb, and the median length of gains (88.21 kb) was shorter than that of losses (120.86 kb) and of mixed events (301.20 kb, Fig. 1c). In addition, we observed that the majority of CNVs and CNVRs were under 500 kb in size.
Figure 2a and Table 2 show the number and proportion of CNVRs on each autosome. The number of CNVRs ranged from 11 on chromosome 18 (SSC18) to 36 on SSC2, accounting for 2.56% and 8.39% of all CNVRs, respectively. CNVRs were densest on SSC12, with an average distance between adjacent CNVRs of 2.12 Mb. We also found that the density distributions of SNPs and CNVRs were remarkably consistent, suggesting that increasing SNP density improves the detection rate of CNVRs [20] (Fig. 2a).
Gene content of CNVRs
A total of 1,558 genes from the Ensembl annotation of the Sscrofa 11.1 genome were identified to be overlapping with our detected 429 CNVRs, including 870 known genes and 688 unknown genes. Of these, 74.33% were protein-coding genes, 16.11% were long noncoding RNA (lncRNA), and others belonged to pseudogenes, small nuclear RNA (snRNA), microRNA (miRNA), small nucleolar RNA (snoRNA), processed pseudogenes, miscellaneous RNA (miscRNA), small cajal body-specific RNA (scaRNA), and T cell receptors (TR) V gene.
To further study the 1,558 genes contained in CNVRs, we performed GO analysis and KEGG pathway analysis, as shown in Table S2. The GO analysis showed that genes annotated with the terms proteolysis, calcium ion binding, and endoplasmic reticulum were predominantly represented in the CNVRs (Fig. S1a), and the KEGG pathway analysis revealed that these genes were mainly represented in the endocytosis and estrogen signaling pathways (Fig. S1b). Compared with the reported quantitative trait loci (QTLs) in pigs, a total of 419 (97.67%) CNVRs were included in or partially overlapped with 17,674 QTLs (Table S3), which are associated with a variety of traits, such as teat number, average daily gain, and body weight. Among these QTLs, 408 were associated with teat number.
Comparison of CNVRs detected in previous studies
The CNVR data set detected in this study was compared with the data of previous studies on CNVR [19, 21,22,23,24,25,26,27,28,29,30,31,32], as shown in Table S4. The results of Zheng et al. [31] showed the highest overlap, with 609 CNVRs, whereas those of Wang et al. [22] showed the lowest, with only three CNVRs, indicating that a large number of CNVs in the pig genome have not yet been discovered. In total, 189 CNVRs were newly identified, and 240 CNVRs overlapped with previously reported CNVRs.
CNVR frequency in French Yorkshire population
The frequencies of CNVRs in the French Yorkshire population were also calculated and grouped into four categories: singleton (present in one individual), rare (present in more than one individual but with a frequency ≤ 0.01), low (0.01 < frequency ≤ 0.05), and common (frequency > 0.05), as shown in Fig. 2b. Singletons (151) accounted for 35.2% of all CNVRs, while there were 148 rare (34.5%), 78 low (18.2%) and 52 common (12.1%) CNVRs. Of the 189 novel CNVRs, 42.9%, 35.4%, 16.4% and 5.3% were singleton, rare, low and common, respectively. We found that the proportion of novel CNVRs was larger when the frequency of the identified CNVRs was lower (Fig. 2c). The frequencies of CNVRs ranged from 0.15% (detected in one pig) to 71.76% (detected in 465 pigs) and were concentrated in the singleton and rare categories, indicating that most CNVRs occur in only a few individuals and are hard to measure reliably [33]. Thus, we used the 130 CNVRs with a frequency greater than 1% for the subsequent GWAS.
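The four frequency categories above amount to a simple threshold rule. The following Python sketch illustrates it; the function name `classify_cnvr` is hypothetical, and the sample size of 648 is an assumption chosen only because 465/648 ≈ 71.76% matches the reported maximum frequency. It is not taken from the authors' pipeline.

```python
def classify_cnvr(n_carriers: int, n_samples: int) -> str:
    """Assign a CNVR to a frequency category using the thresholds in the text.

    n_carriers: number of individuals carrying a CNV in the region.
    n_samples:  number of individuals screened (assumed value in the demo).
    """
    if n_samples <= 0 or not 1 <= n_carriers <= n_samples:
        raise ValueError("need 1 <= n_carriers <= n_samples")
    if n_carriers == 1:
        return "singleton"          # present in exactly one individual
    freq = n_carriers / n_samples
    if freq <= 0.01:
        return "rare"               # more than one carrier, frequency <= 0.01
    if freq <= 0.05:
        return "low"                # 0.01 < frequency <= 0.05
    return "common"                 # frequency > 0.05


if __name__ == "__main__":
    N = 648  # assumed number of samples after CNV quality control
    for carriers in (1, 6, 7, 33, 465):
        print(carriers, round(carriers / N, 4), classify_cnvr(carriers, N))
```

Under this rule, only the low and common categories (frequency above 1%) would be carried forward to the association analysis.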
SNP-based GWAS results
PCA based on SNPs showed that there was no population stratification in this population (Fig. S2). Additionally, quantile–quantile (Q–Q) plots were used to illustrate the level of potential P value inflation (Fig. S3). The genomic inflation factors (λ) of the GWAS ranged from 0.976 to 0.992, indicating that there is no obvious evidence of population stratification. Significant SNPs detected through the 80K chip GWAS are presented in Fig. 3 and Table 3. We found that two SNPs (WU_10.2_5_76130558 and WU_10.2_5_76207514) located on SSC5 were simultaneously associated with TTN, LTN, and RTN. The leading SNP (WU_10.2_5_76130558) explained 3.33%, 2.69% and 2.67% of the phenotypic variance for TTN, LTN and RTN, respectively. Subsequently, we performed a haplotype block analysis and found that these two SNPs are in complete linkage disequilibrium (LD) and located within a 46 kb haplotype block (SSC5: 73.19–73.24 Mb), which suggests that mutations near the potential QTL may have an important effect on teat number (Fig. S4).
Furthermore, GWAS conducted by imputed data revealed 30 significant variants for TTN (Fig. S5 and Table S5). The leading variant is 5_73264327_C located on SSC5 (P = 4.54E-07). Additionally, the 30 variants are located between 73.22 Mb and 73.30 Mb on SSC5, suggesting the presence of potentially significant variants affecting TTN. Notably, both the WU_10.2_5_76130558 and WU_10.2_5_76207514, identified by the 80K chip GWAS, are also within this region.
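The genomic inflation factor λ reported above is conventionally computed as the median of the observed 1-df chi-square statistics divided by the median of the null distribution (about 0.455). The sketch below shows that calculation from a list of P values using only the Python standard library; the function name `genomic_inflation` is an assumption, and this is an illustration of the usual definition rather than the authors' code.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549),
# obtained from the 75th percentile of the standard normal distribution.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda = median(observed chi-square) / median(null chi-square).

    Each two-sided P value is converted back to a 1-df chi-square statistic
    as z**2, where z is the standard normal quantile at p / 2.
    P values must lie in (0, 1].
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced P values mimic a well-calibrated test: lambda is close to 1.
    uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(genomic_inflation(uniform_p), 3))
```

Values close to 1, such as the 0.976 to 0.992 range reported here, indicate little inflation of the test statistics.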
CNVR-based GWAS results
PCA based on CNVRs also showed no population stratification (Fig. S6). To further dissect the genetic basis of teat number, CNVR-based GWAS were performed on 644 pigs with phenotypic records for TTN, LTN and RTN. Figure 4 shows the Manhattan plots for TTN, LTN, and RTN obtained from the association analysis. As shown in Table 4, we identified two CNVRs, located on SSC1 and SSC15, that are associated with both TTN and RTN, indicating associations with multiple traits. We also identified one CNVR associated with LTN, located on SSC9.
Functional analysis of candidate genes
A total of 12 candidate protein-coding genes that overlapped with significant CNVRs or were located within a 1 Mb region surrounding the significant variants were detected based on Sus scrofa 11.1. Subsequently, we used the GeneCards and Mouse Genome Informatics databases and conducted an extensive literature review to explore the functional roles of the identified genes. As a result, we identified TRIM66 and PRICKLE1 as genes exhibiting promising associations with teat number based on their known functions and previous research findings.
Discussion
Comparison of CNVR detected in this study with previous studies
In this study, a total of 429 CNVRs were detected using the GeneSeek Porcine 80K SNP chip on 649 French Yorkshire pigs, which provides a valuable supplement to the pig CNV map. The results showed that gain CNVRs outnumbered loss CNVRs, which may be related to the greater tolerance of the genome for duplication than for deletion [34]. In addition, as in our previous study, we found that CNVRs occur more frequently in telomeric regions (seven of the top 10 largest CNVRs) [35]; telomeres ensure the stability and integrity of the genome and are associated with the replication of genetic material [36].
In previous studies, Wang et al. [27] performed CNVR detection in 12 pigs from nine breeds using 1M aCGH and obtained 758 CNVRs (Sus Scrofa 10.2), 20 of which overlapped with our results. Xie et al. [26] detected 172 CNVRs (Sus Scrofa 10.2) in 125 pigs using the pig 60K SNP chip, with only six overlapping with ours. These differences may stem from differences in platforms, detection software, algorithms, breeds, and sample sizes [37,38,39,40].
We also compared the sizes of CNVRs detected in different studies. The average length of CNVRs identified in the current study is 128.80 kb. In contrast, previous studies using pig SNP chips reported CNVR sizes ranging from 148.99 kb to 1835.44 kb, while those using next-generation sequencing data reported sizes from 4.16 kb to 7.04 kb. The uneven distribution of SNPs in Illumina high-density SNP genotyping arrays results in some small CNVs being easily missed in detection [20, 41]. We therefore conclude that increasing marker density could improve the efficiency and accuracy of CNV detection, especially for small CNVs.
Candidate genes associated with teat number
In this study, the SNP-based GWAS and CNVR-based GWAS did not detect overlapping signals. SNPs typically affect single nucleotide changes, whereas CNVs involve large segments of DNA with duplications or deletions that can encompass multiple genes or non-coding regions. Therefore, SNPs and CNVRs may regulate phenotypes by influencing different biological processes or pathways.
Despite this, through multi-dataset GWASs, we identified two candidate genes associated with teat number, namely TRIM66 and PRICKLE1. These two genes have known associations with various processes that could potentially influence teat number, such as breast cancer and abnormal vertebral development.
TRIM66 (Tripartite Motif Containing 66) is a protein-coding gene and a member of the tripartite motif (TRIM) protein family. Ning et al. [42] integrated several datasets and software tools to perform a comprehensive analysis of TRIM expression patterns and found that TRIM66 is significantly downregulated in breast cancer. Zhang et al. [43] also discovered that knocking down TRIM66 inhibits the proliferation of breast cancer cells. Similar to our findings, other GWAS on teat number have also identified several candidate genes with functions related to breast cancer [5, 44]. From a biological perspective, mammary gland development is fundamental to teat formation. Genes that affect breast cancer may also influence normal mammary gland development and function, thereby indirectly affecting teat formation. Although our current results require extensive validation, they hold promise for enhancing our understanding of the underlying regulatory mechanisms of teat number in pigs.
PRICKLE1 is located on SSC5 at 73.71–73.83 Mb, approximately 416.48 kb from the variant region related to teat number in our results. PRICKLE1 encodes a nuclear receptor and is associated with abnormal vertebral development [45] and other processes. Previous research suggests that PRICKLE1 is involved in the Wnt signaling pathway [45], which is crucial for mammary gland and thymus development [46]. In previous studies, many strong candidate genes associated with teat number have also been shown to be related to vertebral development, such as VRTN and MKX [6, 47]. Additionally, previous studies indicated that VRTN may be an important gene affecting teat number [48,49,50], but Zhuang et al. [7] showed that VRTN is not the most significant gene affecting teat number and that genetic heterogeneity of its insertion may exist in different populations. No VRTN signal was detected in our analysis, which may be due to breed differences, the small population size, and a high degree of inbreeding, leading to a low minor allele frequency and the removal of the relevant variants during quality control.
Heritability and asymmetry of teat number in pigs
In this study, the heritability estimates for TTN, LTN, and RTN were 0.17 ± 0.07, 0.14 ± 0.06, and 0.17 ± 0.07, respectively, with LTN having a lower heritability than TTN and RTN. Similar results were observed in previous studies. Wei et al. [51] investigated teat number traits in a large sample of pigs, defining not only TTN, LTN, and RTN, but also the maximum per side of teat number (TNMPS), teat number symmetry (TNSYM), and the difference between sides of teat number (TNUMD). They found that the heritabilities for TTN and RTN were moderate (0.142 – 0.146), whereas those for LTN, TNMPS, TNSYM, and TNUMD were lower (0.048 – 0.097). After accounting for epistatic effects, the heritability for RTN decreased (0.047), and those for LTN and TNMPS were moderate (0.107 – 0.126). Additionally, studies by Li et al. [5] and Liu et al. [44] also found differing heritability estimates for left and right teat numbers, likely due to asymmetry. In our study, excluding individuals without recorded teat numbers, 68 samples exhibited asymmetry between the left and right teat numbers. Furthermore, some samples lacked records for one side of the teats during collection, contributing to discrepancies in heritability estimates.
Conclusion
In summary, we identified 429 CNVRs in the French Yorkshire pig population, covering approximately 2.95% of the total Sus Scrofa 11.1 autosomal genome length. These findings complement the CNV map of the Yorkshire pig genome. Our GWAS results revealed 32 variants and three CNVRs significantly associated with teat number. Two candidate genes, TRIM66 and PRICKLE1, were related to teat number. Combining GWAS across multiple types of genetic variants is a valuable approach for analyzing the genetic mechanisms underlying pig breeding traits and contributes to genetic improvement in pig breeding.
Materials and methods
Ethics statement
The animals and experimental procedures used in this study follow the guidelines of the Animal Care and Use Committee of the South China Agricultural University (SCAU) (Guangzhou, China). The ethics committee of SCAU approved all animal experiments (SCAU#2014–0136). The experimental animals were not anesthetized or euthanized in this study. We confirmed that all methods are reported in accordance with ARRIVE guidelines (https://arriveguidelines.org) for the reporting of animal experiments.
Animals and phenotype
In this study, a total of 659 French Yorkshire sows were raised in four nucleus pig breeding farms of the Wens Foodstuff Group Co., Ltd. (Guangdong, China) between 2012 and 2016: Yuhe Farm 1 (YH1), Yuhe Farm 2 (YH2), Qingyuan Farm 2 (QY2), and Baizi Farm (BZ). All pigs were kept under normal management conditions. The left teat number (LTN) and right teat number (RTN) were counted separately after birth, and total teat number (TTN) was the sum of the teat numbers on both sides, in accordance with our previous study [7].
SNP genotyping and quality control
DNA of each pig was extracted from ear tissue following standard protocols. The quality of DNA in all samples (659 DNA samples) was assessed by light absorption ratios (A260/280 and A260/230) and gel electrophoresis. All DNA samples were diluted to a concentration of 50 ng/μL. The samples were genotyped with the GeneSeek Porcine 80K SNP chip, which contains 68,528 SNPs uniformly spanning the pig genome. Genotype quality control was conducted with PLINK v1.9 software [52]. SNPs located on the sex chromosomes or without positional information were excluded, and a set of 62,078 SNPs from 659 high-quality genotyping samples (call rate ≥ 90%) was retained for CNV detection. Furthermore, to improve the accuracy of the GWAS results, variants with call rates < 90%, minor allele frequency < 5%, or P value < 1.00E-6 for the Hardy–Weinberg equilibrium test were also excluded, and individuals with call rates larger than 95% were retained.
After quality control, we extended the genotype data to the whole-genome sequence level using an imputation strategy. We used the Swine Imputation (SWIM) Server tool [53] with default parameter settings to perform genotype imputation, bridging the target and reference genotype data. The reference haplotype panels were constructed from whole-genome sequencing data collected from 2,259 pigs representing 44 breeds. The genotype imputation showed a high average concordance rate exceeding 97%, a non-reference concordance rate of 91%, and an r² value of 0.89, supporting the reliability and robustness of our imputed data. We applied the same quality control criteria to the imputed data as to the 80K SNP chip data. After quality control, 50,294 SNPs from the 80K chip data and 14,656,673 variants (including SNPs and INDELs) from the SWIM imputed data, from 644 French Yorkshire sows, were retained for subsequent analysis.
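The variant-level filters described above (call rate, minor allele frequency, Hardy–Weinberg P value) can be expressed as a single predicate. The study applied these filters with PLINK; the Python sketch below only illustrates the thresholds. The function name `variant_passes_qc`, the 0/1/2 genotype coding, and the precomputed `hwe_p` argument are all assumptions made for illustration.

```python
def variant_passes_qc(genotypes, hwe_p,
                      min_call_rate=0.90, min_maf=0.05, min_hwe_p=1e-6):
    """Return True if a variant survives the three filters from the text.

    genotypes: alternate-allele counts per animal (0, 1 or 2), None if missing.
    hwe_p:     Hardy-Weinberg test P value, assumed to be computed elsewhere.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False                       # nothing genotyped at this variant
    call_rate = len(called) / len(genotypes)
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)    # frequency of the rarer allele
    return (call_rate >= min_call_rate
            and maf >= min_maf
            and hwe_p >= min_hwe_p)


if __name__ == "__main__":
    example = [0, 1, 2, 1, 0, 1, 1, 0, 2, None]   # 9 of 10 animals called
    print(variant_passes_qc(example, hwe_p=0.5))
```

A variant is kept only when it passes all three conditions, which mirrors the exclusion rules stated in the paragraph.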
CNVRs detection and functional enrichment analysis
The PennCNV software v1.0.5 [20] was used to identify CNVs by incorporating the SNP signal data of log R ratio (LRR) and B allele frequency (BAF) for each individual. CNV calling and CNVR determination were carried out following our previous study [35]. In brief, poor-quality samples (n = 11) were removed from the raw CNV dataset using the following criteria: standard deviation of LRR > 0.3, BAF drift > 0.01, and GC wave factor of LRR > 0.05. CNVs with ≤ 3 consecutive SNPs and length ≤ 10 kb were then filtered out to obtain more reliable CNV calls. Afterward, both BEDTools software v2.26.0 [54] and CNVRuler software v1.3.3.2 [55] were used to merge CNVs with at least 1 bp overlap across all samples to determine the CNVRs [31]. Finally, 8,746 CNVs and 429 CNVRs were identified. KOBAS v3.0 [56] was used for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of the genes located in the CNVR map. In the enrichment analysis, Fisher's exact test was used to retain the GO entries and pathways with P value < 0.05. In addition, the CNVRs were mapped to pig QTLs from the Animal QTL database (https://www.animalgenome.org/cgi-bin/QTLdb/index) [57]. To ensure the accuracy and validity of the GWAS results, we filtered the CNVR dataset by removing CNVRs with frequencies smaller than 1%, and 130 CNVRs were retained for GWAS analysis.
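The merging step above (CNVs that overlap by at least 1 bp are combined into one CNVR) is an interval union. The study used BEDTools and CNVRuler for this; the following Python sketch is only a minimal illustration of the idea, assuming 1-based inclusive coordinates, string chromosome names, and a hypothetical input tuple layout of `(chrom, start, end, state)`.

```python
from collections import defaultdict


def _label(states):
    """'gain' or 'loss' if all merged CNVs agree, otherwise 'mixed'."""
    return "mixed" if len(states) > 1 else next(iter(states))


def merge_cnvs(cnvs):
    """Merge CNV calls that share at least 1 bp into CNV regions (CNVRs).

    cnvs: iterable of (chrom, start, end, state), state being 'gain' or 'loss'.
    Returns a list of (chrom, start, end, kind) sorted by chromosome and start.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, state in cnvs:
        by_chrom[chrom].append((start, end, state))

    regions = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        states = set()
        for start, end, state in sorted(by_chrom[chrom]):
            if cur_start is not None and start <= cur_end:
                # Overlaps the open region by >= 1 bp: extend it.
                cur_end = max(cur_end, end)
                states.add(state)
            else:
                if cur_start is not None:
                    regions.append((chrom, cur_start, cur_end, _label(states)))
                cur_start, cur_end, states = start, end, {state}
        if cur_start is not None:
            regions.append((chrom, cur_start, cur_end, _label(states)))
    return regions


if __name__ == "__main__":
    calls = [("1", 100, 200, "gain"), ("1", 200, 300, "loss"),
             ("1", 301, 400, "gain"), ("2", 50, 80, "loss")]
    for region in merge_cnvs(calls):
        print(region)
```

Sorting by start position lets a single pass per chromosome build the regions, and tracking the set of states yields the gain, loss, and mixed labels used in the Results.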
GWAS for teat number
To identify candidate variants associated with teat number, GWAS were performed with the CNVR dataset and the SNP dataset, separately.
In this study, GWASs were performed separately using a univariate linear mixed model implemented in the GEMMA software v0.98.1 [58]. Before GWAS, genomic relatedness matrix (GRM) and principal component analysis (PCA) were estimated using the GEMMA and GCTA software v1.92.4beta [59] based on SNP datasets. The statistical model used was as follows:
\(y = W\alpha + X\beta + u + \varepsilon\)
where \(y\) represents a vector of the phenotypic values for all animals; \(W\) is the incidence matrix of covariates, including fixed effects of the top five eigenvectors of PCA; \(\alpha\) represents the vector of corresponding coefficients including the intercept; \(X\) is the vector of all marker genotypes; \(\beta\) specifies the corresponding effect size of the marker; \(u\) is the vector of random effects, with \(u \sim MVN_{n}(0, \lambda \tau^{-1} K)\); \(\varepsilon\) is the vector of random residuals, with \(\varepsilon \sim MVN_{n}(0, \tau^{-1} I_{n})\); \(\lambda\) signifies the ratio between the two variance components; \(\tau^{-1}\) is the variance of the residual errors; \(K\) is the GRM; \(I_{n}\) is an n × n identity matrix; and \(MVN_{n}\) denotes the n-dimensional multivariate normal distribution.
In the 80K chip GWAS, the significance cutoff was defined by the Bonferroni method; a stringent genome-wide threshold (significant) and a more lenient chromosome-wide threshold (suggestive) were set at P < 0.05/N and P < 1/N, respectively, where N is the number of variants or CNVRs tested in the analyses. Based on human GWAS results, we set the genome-wide significance threshold and suggestive significance threshold for the GWAS based on imputed data at 5.00E-8 and 1.00E-6, respectively [60, 61]. In addition, Haploview [62] was used for haplotype block analysis to detect linkage disequilibrium (LD) among SNPs, with the settings "Ignore pairwise comparisons of markers > 500 kb apart" and "Exclude individuals with > 50% missing genotypes".
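As a worked illustration of the Bonferroni-style cutoffs (0.05/N for significant, 1/N for suggestive), the short Python sketch below evaluates them for two values of N. The function name `bonferroni_thresholds` is hypothetical, and the choice of N = 50294 (SNPs retained after quality control) and N = 130 (CNVRs retained for GWAS) is an assumption about which counts were used as the number of tests.

```python
def bonferroni_thresholds(n_tests, alpha=0.05):
    """Return (significant, suggestive) P value cutoffs for n_tests tests.

    significant = alpha / n_tests, suggestive = 1 / n_tests.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests, 1.0 / n_tests


if __name__ == "__main__":
    # Assumed numbers of tests, taken from counts reported in the Methods.
    for label, n in (("80K chip SNPs", 50294), ("CNVRs", 130)):
        significant, suggestive = bonferroni_thresholds(n)
        print(f"{label}: significant P < {significant:.2e}, "
              f"suggestive P < {suggestive:.2e}")
```

Because N is far smaller for CNVRs than for SNPs, the CNVR-based analysis has a much more lenient cutoff on the P value scale.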
Candidate genes identification
In this study, the positions of SNPs were based on the Sus Scrofa 11.1 version of the pig reference genome. We conducted functional gene annotation to identify candidate genes using the Mouse Genome Informatics website (https://www.informatics.jax.org/), the GeneCards website (http://www.genecards.org/), and the Ensembl website (www.ensembl.org/biomart/martview).
Availability of data and materials
The variation data reported in this paper have been deposited in the Genome Variation Map in National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, under accession number GVM000796 (https://bigd.big.ac.cn/gvm/getProjectDetail?Project=GVM000796).
Abbreviations
- GWAS: Genome-wide association analysis
- SNP: Single nucleotide polymorphism
- CNV: Copy number variation
- CNVR: Copy number variation region
- TTN: Total teat number
- LTN: Left teat number
- RTN: Right teat number
- SSC: Sus scrofa chromosome
- QTL: Quantitative trait locus
References
Verardo LL, Silva FF, Lopes MS, Madsen O, Bastiaansen JW, Knol EF, Kelly M, Varona L, Lopes PS, Guimarães SE. Revealing new candidate genes for reproductive traits in pigs: combining Bayesian GWAS and functional pathways. Genet Sel Evol. 2016;48:9.
Zhou LS, Zhao WM, Tu F, Wu YH, Ren SW, Fang XM. Physiology and genetics research progress of teat traits in pigs. Yi Chuan. 2019;41(5):384–90.
Arakawa A, Okumura N, Taniguchi M, Hayashi T, Hirose K, Fukawa K, Ito T, Matsumoto T, Uenishi H, Mikawa S. Genome-wide association QTL mapping for teat number in a purebred population of Duroc pigs. Anim Genet. 2015;46(5):571–5.
Rohrer GA, Nonneman DJ. Genetic analysis of teat number in pigs reveals some developmental pathways independent of vertebra number and several loci which only affect a specific side. Genet Sel Evol. 2017;49(1):4.
Li Y, Pu L, Shi L, Gao H, Zhang P, Wang L, Zhao F. Revealing new candidate genes for teat number relevant traits in Duroc pigs using genome-wide association studies. Animals. 2021;11(3):806.
Yang J, Huang L, Yang M, Fan Y, Li L, Fang S, Deng W, Cui L, Zhang Z, Ai H, et al. Possible introgression of the VRTN mutation increasing vertebral number, carcass length and teat number from Chinese pigs into European pigs. Sci Rep. 2016;6: 19240.
Zhuang Z, Ding R, Peng L, Wu J, Ye Y, Zhou S, Wang X, Quan J, Zheng E, Cai G, et al. Genome-wide association analyses identify known and novel loci for teat number in Duroc pigs using single-locus and multi-locus models. BMC Genomics. 2020;21(1):344.
Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, et al. Finding the missing heritability of complex diseases. Nature. 2009;461(7265):747–53.
Du RQ, Jin L, Zhang F. [Copy number variations in the human genome: their mutational mechanisms and roles in diseases]. Yi Chuan. 2011;33(8):857–69.
Wong KK, deLeeuw RJ, Dosanjh NS, Kimm LR, Cheng Z, Horsman DE, MacAulay C, Ng RT, Brown CJ, Eichler EE, et al. A comprehensive analysis of common copy-number variations in the human genome. Am J Hum Genet. 2007;80(1):91–104.
Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W, et al. Global variation in copy number in the human genome. Nature. 2006;444(7118):444–54.
Zhang F, Gu W, Hurles ME, Lupski JR. Copy number variation in human health, disease, and evolution. Annu Rev Genomics Hum Genet. 2009;10:451–81.
Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, Scherer SW, Lee C. Detection of large-scale variation in the human genome. Nat Genet. 2004;36(9):949–51.
Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Månér S, Massa H, Walker M, Chi M, et al. Large-scale copy number polymorphism in the human genome. Science. 2004;305(5683):525–8.
Buckland PR. Polymorphically duplicated genes: their relevance to phenotypic variation in humans. Ann Med. 2003;35(5):308–15.
Yang P, Zhang Z, Xu J, Qu K, Lyv S, Wang X, Cai C, Li Z, Wang E, Xie J, et al. The association of the copy number variation of the MLLT10 gene with growth traits of Chinese cattle. Animals. 2020;10(2):250.
Kang X, Li M, Liu M, Liu S, Pan MG, Wiggans GR, Rosen BD, Liu GE. Copy number variation analysis reveals variants associated with milk production traits in dairy goats. Genomics. 2020;112(6):4934–7.
Chen C, Liu C, Xiong X, Fang S, Yang H, Zhang Z, Ren J, Guo Y, Huang L. Copy number variation in the MSRB3 gene enlarges porcine ear size through a mechanism involving miR-584-5p. Genet Sel Evol. 2018;50(1):72.
Wang Y, Zhang T, Wang C. Detection and analysis of genome-wide copy number variation in the pig genome using an 80 K SNP Beadchip. J Anim Breed Genet. 2020;137(2):166–76.
Wang K, Li M, Hadley D, Liu R, Glessner J, Grant SF, Hakonarson H, Bucan M. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 2007;17(11):1665–74.
Chen C, Qiao R, Wei R, Guo Y, Ai H, Ma J, Ren J, Huang L. A comprehensive survey of copy number variation in 18 diverse pig populations and identification of candidate copy number variable genes associated with complex traits. BMC Genomics. 2012;13: 733.
Wang J, Wang H, Jiang J, Kang H, Feng X, Zhang Q, Liu JF. Identification of genome-wide copy number variations among diverse pig breeds using SNP genotyping arrays. PLoS One. 2013;8(7): e68683.
Fernández AI, Barragán C, Fernández A, Rodríguez MC, Villanueva B. Copy number variants in a highly inbred Iberian porcine strain. Anim Genet. 2014;45(3):357–66.
Wang Y, Tang Z, Sun Y, Wang H, Wang C, Yu S, Liu J, Zhang Y, Fan B, Li K, et al. Analysis of genome-wide copy number variations in Chinese indigenous and western pig breeds by 60 K SNP genotyping arrays. PLoS One. 2014;9(9): e106780.
Wiedmann RT, Nonneman DJ, Rohrer GA. Genome-wide copy number variations using SNP genotyping in a mixed breed swine population. PLoS One. 2015;10(7): e0133529.
Xie J, Li R, Li S, Ran X, Wang J, Jiang J, Zhao P. Identification of copy number variations in Xiang and Kele Pigs. PLoS One. 2016;11(2): e0148565.
Wang J, Jiang J, Wang H, Kang H, Zhang Q, Liu JF. Improved detection and characterization of copy number variations among diverse pig breeds by array CGH. G3 (Bethesda). 2015;5(6):1253–61.
Stafuzza NB, Silva RMO, Fragomeni BO, Masuda Y, Huang Y, Gray K, Lourenco DAL. A genome-wide single nucleotide polymorphism and copy number variation analysis for number of piglets born alive. BMC Genomics. 2019;20(1):321.
Wang C, Chen H, Wang X, Wu Z, Liu W, Guo Y, Ren J, Ding N. Identification of copy number variations using high density whole-genome SNP markers in Chinese Dongxiang spotted pigs. Asian Australas J Anim Sci. 2019;32(12):1809–15.
Keel BN, Nonneman DJ, Lindholm-Perry AK, Oliver WT, Rohrer GA. A survey of copy number variation in the porcine genome detected from whole-genome sequence. Front Genet. 2019;10: 737.
Zheng X, Zhao P, Yang K, Ning C, Wang H, Zhou L, Liu J. CNV analysis of Meishan pig by next-generation sequencing and effects of AHR gene CNV on pig reproductive traits. J Anim Sci Biotechnol. 2020;11:42.
Bovo S, Ribani A, Muñoz M, Alves E, Araujo JP, Bozzi R, Charneca R, Di Palma F, Etherington G, Fernandez AI, et al. Genome-wide detection of copy number variants in European autochthonous and commercial pig breeds by whole-genome sequencing of DNA pools identified breed-characterising copy number states. Anim Genet. 2020;51(4):541–56.
Pearson TA, Manolio TA. How to interpret a genome-wide association study. JAMA. 2008;299(11):1335–44.
Brewer C, Holloway S, Zawalnyski P, Schinzel A, FitzPatrick D. A chromosomal duplication map of malformations: regions of suspected haplo- and triplolethality–and tolerance of segmental aneuploidy–in humans. Am J Hum Genet. 1999;64(6):1702–8.
Qiu Y, Ding R, Zhuang Z, Wu J, Yang M, Zhou S, Ye Y, Geng Q, Xu Z, Huang S, et al. Genome-wide detection of CNV regions and their potential association with growth and fatness traits in Duroc pigs. BMC Genomics. 2021;22(1):332.
Lu W, Zhang Y, Liu D, Songyang Z, Wan M. Telomeres-structure, function, and regulation. Exp Cell Res. 2013;319(2):133–41.
Jiang L, Jiang J, Yang J, Liu X, Wang J, Wang H, Ding X, Liu J, Zhang Q. Genome-wide detection of copy number variations using high-density SNP genotyping platforms in Holsteins. BMC Genomics. 2013;14: 131.
Zhang D, Qian Y, Akula N, Alliey-Rodriguez N, Tang J, Gershon ES, Liu C. Accuracy of CNV detection from GWAS data. PLoS One. 2011;6(1): e14511.
Jiang J, Wang J, Wang H, Zhang Y, Kang H, Feng X, Wang J, Yin Z, Bao W, Zhang Q, et al. Global copy number analyses by next generation sequencing provide insight into pig genome variation. BMC Genomics. 2014;15(1): 593.
Liu GE, Hou Y, Zhu B, Cardone MF, Jiang L, Cellamare A, Mitra A, Alexander LJ, Coutinho LL, Dell’Aquila ME, et al. Analysis of copy number variations among diverse cattle breeds. Genome Res. 2010;20(5):693–703.
Steemers FJ, Gunderson KL. Whole genome genotyping technologies on the BeadArray platform. Biotechnol J. 2007;2(1):41–9.
Ning L, Huo Q, Xie N. Comprehensive analysis of the expression and prognosis for tripartite motif-containing genes in breast cancer. Front Genet. 2022;13: 876325.
Zhang H, Zheng Y, Zhang Y. Knockdown of TRIM66 in MDA-MB-468 triple negative breast cancer cell line suppresses proliferation and promotes apoptosis through EGFR signaling. Pol J Pathol. 2021;72(2):160–6.
Liu Z, Li H, Zhong Z, Jiang S. A whole genome sequencing-based genome-wide association study reveals the potential associations of teat number in Qingping Pigs. Animals. 2022;12(9):1057.
Liu C, Lin C, Gao C, May-Simera H, Swaroop A, Li T. Null and hypomorph Prickle1 alleles in mice phenocopy human Robinow syndrome and disrupt signaling downstream of Wnt5a. Biol Open. 2014;3(9):861–70.
Roarty K, Serra R. Wnt5a is required for proper mammary gland development and TGF-beta-mediated inhibition of ductal growth. Development. 2007;134(21):3929–39.
Peng WF, Xu SS, Ren X, Lv FH, Xie XL, Zhao YX, Zhang M, Shen ZQ, Ren YL, Gao L, et al. A genome-wide association study reveals candidate genes for the supernumerary nipple phenotype in sheep (Ovis aries). Anim Genet. 2017;48(5):570–9.
Tang J, Zhang Z, Yang B, Guo Y, Ai H, Long Y, Su Y, Cui L, Zhou L, Wang X, et al. Identification of loci affecting teat number by genome-wide association studies on three pig populations. Asian Australas J Anim Sci. 2017;30(1):1–7.
Lopes MS, Bastiaansen JW, Harlizius B, Knol EF, Bovenhuis H. A genome-wide association study reveals dominance effects on number of teats in pigs. PLoS One. 2014;9(8): e105867.
Wang L, Zhang Y, Zhang T, Zhang L, Yan H, Liu X, Wang L. Genotyping by sequencing reveals a new locus for pig teat number. Anim Genet. 2017;48(4):470–2.
Wei C, Cai X, Diao S, Teng J, Xu Z, Zhang W, Zeng H, Zhong Z, Wu X, Gao Y, et al. Integrating genome-wide association study with multi-tissue transcriptome analysis provides insights into the genetic architecture of teat traits in pigs. J Genet Genomics. 2023;50(10):795–8.
Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7.
Ding R, Savegnago R, Liu J, Long N, Tan C, Cai G, Zhuang Z, Wu J, Yang M, Qiu Y, et al. The SWine IMputation (SWIM) haplotype reference panel enables nucleotide resolution genetic mapping in pigs. Commun Biol. 2023;6(1):577.
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
Kim JH, Hu HJ, Yim SH, Bae JS, Kim SY, Chung YJ. CNVRuler: a copy number variation-based case-control association analysis tool. Bioinformatics. 2012;28(13):1790–2.
Bu D, Luo H, Huo P, Wang Z, Zhang S, He Z, Wu Y, Zhao L, Liu J, Guo J, et al. KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis. Nucleic Acids Res. 2021;49(W1):W317–25.
Hu ZL, Park CA, Reecy JM. Bringing the animal QTLdb and CorrDB into the future: meeting new challenges and providing updated services. Nucleic Acids Res. 2022;50(D1):D956–61.
Zhou X, Stephens M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 2012;44(7):821–4.
Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88(1):76–82.
Johnson RC, Nelson GW, Troyer JL, Lautenberger JA, Kessing BD, Winkler CA, O’Brien SJ. Accounting for multiple comparisons in a genome-wide association study (GWAS). BMC Genomics. 2010;11: 724.
Huang Y, Cai L, Duan Y, Zeng Q, He M, Wu Z, Zou X, Zhou M, Zhang Z, Xiao S, et al. Whole-genome sequence-based association analyses on an eight-breed crossed heterogeneous stock of pigs reveal the genetic basis of skeletal muscle fiber characteristics. Meat Sci. 2022;194: 108974.
Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21(2):263–5.
Acknowledgements
The authors would like to thank all the staff of the pig core breeding farms of Wens Foodstuff Group Co., Ltd. (Guangdong, China) for their help with sample collection.
Funding
This research was supported by the Key Technologies R&D Program of Guangdong Province project (2022B0202090002) and the National Key Research and Development Program of China (2021YFD1301103).
Author information
Contributions
Z.W. and E.Z. conceived and designed the experiment. D.L., Y.Q., F.Z., X.L., S.D., Q.C., and J.Y. collected the samples and recorded the phenotypes. D.L., Y.Q., F.Z., X.L., and S.D. extracted the DNA for genotyping. Y.Q., D.L., and F.Z. analyzed the data. D.L. and Y.Q. visualized the experimental results. D.L. and Y.Q. wrote the manuscript. G.C., J.Y., and E.Z. revised the manuscript. Z.W. contributed to the materials. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
The animals and experimental methods used in this study followed the guidelines of the Ministry of Agriculture of China and the Use Committee of South China Agricultural University (SCAU). The ethics committee of SCAU (Guangzhou, China) approved all animal experiments. Informed consent for data collection was obtained from Wens Foodstuff Group Co., Ltd. (Guangdong, China). No human participants, data, or tissues were used. We confirm that all methods are reported in accordance with the ARRIVE guidelines (https://arriveguidelines.org) for the reporting of animal experiments.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
12864_2024_10611_MOESM3_ESM.jpg
Additional file 3: Figure S1. Top 10 enriched KEGG pathways (a) and top 10 enriched GO terms (b).
12864_2024_10611_MOESM5_ESM.docx
Additional file 5: Table S4. Comparison of CNVRs identified in this study with other studies (based on the Sscrofa 11.1 genome assembly).
12864_2024_10611_MOESM6_ESM.jpg
Additional file 6: Figure S2. PCA plot of population structure based on SNPs. YH1, YH2, QY2, and BZ represent the four pig farms where we collected samples, namely Yuhe Farm 1, Yuhe Farm 2, Qingyuan District 2, and Baizi Farm.
12864_2024_10611_MOESM7_ESM.png
Additional file 7: Figure S3. Q–Q plots showing the observed versus expected log P-values for LTN, RTN, and TTN.
12864_2024_10611_MOESM8_ESM.jpg
Additional file 8: Figure S4. Haplotype blocks on SSC5. Haplotype blocks are marked with triangles. Values in boxes are the linkage disequilibrium (r²) between SNP pairs. The haplotype blocks are colored according to the standard D′/LOD color scheme.
12864_2024_10611_MOESM9_ESM.png
Additional file 9: Figure S5. Manhattan plots of the imputed-data GWAS in this population for TTN (a), LTN (b), and RTN (c). The x-axis represents the chromosomes, and the y-axis represents the -log10(P-value). The solid and dashed lines indicate the genome-wide significance threshold (P = 5.00E-8) and the suggestive significance threshold (P = 1.00E-6), respectively.
12864_2024_10611_MOESM11_ESM.jpg
Additional file 11: Figure S6. PCA plot of population structure based on CNVRs. YH1, YH2, QY2, and BZ represent the four pig farms where we collected samples, namely Yuhe Farm 1, Yuhe Farm 2, Qingyuan District 2, and Baizi Farm.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Lin, D., Qiu, Y., Zhou, F. et al. Genome-wide detection of multiple variants associated with teat number in French Yorkshire pigs. BMC Genomics 25, 722 (2024). https://doi.org/10.1186/s12864-024-10611-9
DOI: https://doi.org/10.1186/s12864-024-10611-9