The chicken 60 k SNP array was originally developed for high-throughput SNP genotyping in genome-wide association studies (GWAS). Although CNV detection is feasible with this SNP panel, its power is limited by low marker density, non-uniform SNP distribution along chicken chromosomes, and a lack of non-polymorphic probes specifically designed for CNV identification. Thus, typically only large CNVRs can be identified with this array.
Several algorithms have been developed to identify CNVs, including CNVPartition, QuantiSNP, PennCNV, Birdsuite, Cokgen, Gada, and CONAN [9–14]. Each algorithm has its own strengths and weaknesses [15, 16]. In the present study, the PennCNV algorithm was used for CNV detection, and CNVPartition was employed to verify the CNVs detected by PennCNV. We found that 99% of the CNVRs detected by PennCNV could be verified by CNVPartition. This high concordance indicates that the CNVRs detected in this study are credible; the following discussion is therefore based on the PennCNV results.
We used two lines divergently selected for abdominal fat content to detect CNVs in the chicken, and found that the lean line had more CNVs than the fat line (438 vs 291). One possible reason is the different number of animals (203 and 272 individuals in the lean and fat lines, respectively). Additionally, these two lines have different selection signatures, as reported previously, which suggests that artificial selection for abdominal fat could also have led to CNV differences between the two lines.
We compared our results with several previous reports on chicken CNVs. The first study was reported by Griffin et al., who used CGH arrays and detected 12 CNVs in broiler and layer genomes compared with the Red Jungle Fowl. Two of these 12 CNVs overlapped with our results (Additional file 4: Table S4). Wang et al. detected 96 CNVs in three chicken lines (Cornish Rock broiler, Leghorn, and Rhode Island Red) using whole-genome tiling arrays. Of these 96 CNVs, 14 overlapped with our results (Additional file 4: Table S4). In 2012, Wang et al. detected 130 CNVRs in four chicken breeds (Cobb broiler, White Leghorn, Chinese Dou and Chinese Dehong) using CGH arrays, 16 of which overlapped with our results (Additional file 4: Table S4). In the same year, Jia et al. identified 209 CNVRs in two distinct chicken lines (White Leghorn and dwarf) using chicken 60 k SNP arrays, 47 of which overlapped with our results (Additional file 4: Table S4). Luo et al. identified 45 CNVs in four chicken lines (L63, L72, RCS-L, and RCS-M), two of which overlapped with our CNVRs. Crooijmans et al. detected 1556 CNVRs using CGH arrays in a wide variety of chicken breeds, 140 of which overlapped with CNVRs in our current study. In total, 181 of the 459 CNVRs (39%) detected in our study (271 and 188 CNVRs in the lean and fat lines, respectively) were also detected in previous studies (Additional file 4: Table S4). There are three potential reasons for the observed differences. First, the populations differ in size and genetic background. Second, different array platforms were used, either SNP genotyping or CGH arrays. Third, genomic waves can interfere with accurate CNV detection [41, 65]. Genomic waves refer to signal intensity patterns across all chromosomes, with different samples showing highly variable magnitudes of waviness. In our study, we adjusted for genomic waves using the -gcmodel option in PennCNV, whereas genomic waves were generally not considered in other studies. Low overlap rates between CNV studies are not unique to the chicken; the same issue has been encountered in other animals.
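The overlap counts discussed above depend on how two CNVR sets are intersected. The following is a minimal sketch of one simple criterion (any shared base pair on the same chromosome); the function name `count_overlapping` and the coordinates are hypothetical and are not taken from this study, which does not state its exact overlap rule here. The last line only re-derives the 39% figure from the reported counts.

```python
from collections import defaultdict


def count_overlapping(query, reference):
    """Count query regions that overlap at least one reference region.

    Regions are (chrom, start, end) tuples with inclusive coordinates.
    Assumption: any shared base pair counts as an overlap.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query:
        # Two intervals overlap when each one starts before the other ends.
        if any(start <= r_end and r_start <= end
               for r_start, r_end in by_chrom[chrom]):
            hits += 1
    return hits


# Hypothetical coordinates for illustration only, not data from this study.
ours = [("1", 100, 500), ("1", 2000, 2600), ("Z", 50, 90)]
previous = [("1", 450, 800), ("2", 2000, 2600), ("Z", 10, 60)]

n_hits = count_overlapping(ours, previous)
toy_rate = n_hits / len(ours)

# Reported counts from the text: 181 of 459 CNVRs seen in previous studies.
reported_percent = round(100 * 181 / 459)
```

A stricter rule, such as requiring a minimum reciprocal overlap fraction, would give lower counts, which is one more reason overlap rates are hard to compare across studies.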
Previous studies have observed that CNVs are preferentially located in gene-poor regions [68, 69]. It is speculated that CNVs present in gene-rich regions may be deleterious and under purifying selection. The chicken genome contains approximately 28,000 genes (data from the GeneChip® Chicken Genome Array Profile), and we identified 886 annotated genes (3.16%) located in the 271 and 188 CNVRs of the lean and fat lines, respectively. These CNVRs covered 3.92 and 2.98% of the chicken genome in the lean and fat lines, respectively. Therefore, we cannot state whether these CNVRs are located in gene-poor or gene-rich regions.
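The comparison in this paragraph can be reproduced as simple arithmetic. The sketch below uses only the figures quoted in the text; the variable names are illustrative. It is a rough check only, because the gene count is pooled across both lines while genome coverage is reported per line.

```python
# Figures quoted in the text.
total_genes = 28_000          # approximate number of chicken genes
genes_in_cnvrs = 886          # annotated genes inside CNVRs (both lines pooled)
coverage = {"lean": 0.0392, "fat": 0.0298}  # genome fraction covered by CNVRs

gene_fraction = genes_in_cnvrs / total_genes

# Ratio near 1: CNVRs hold roughly as many genes as their genome share predicts.
# Ratio well below 1 would suggest gene-poor regions; well above 1, gene-rich.
ratios = {line: gene_fraction / cov for line, cov in coverage.items()}

for line, ratio in ratios.items():
    print(f"{line}: gene share / genome share = {ratio:.2f}")
```

Under these assumptions both ratios fall close to 1, one slightly below and one slightly above, which is consistent with the statement that no clear gene-poor or gene-rich tendency can be claimed.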
Quantitative PCR (qPCR) is often used to validate novel CNVRs, but confirmation rates are usually not very high [11, 20, 21]. For instance, Fadista et al. and Hou et al. confirmed 50 and 60% of the CNVRs selected for validation, respectively. Our validation rate was 61% (204 of the 333 qPCR assays), comparable to the results of other studies.
By comparing the CNVs detected in our current study with known QTLs (in the QTL database) and selective sweeps for abdominal fat content, we identified 14 genes (8 and 6 in the lean and fat lines, respectively). Of the eight genes in the lean line, SLC9A3, GNAL, ANXA10, MYLK, CCDC14, and SPAG9 were expressed in chicken pre-adipocytes, and SLC9A3, GNAL, ANXA10, HELIOS, MYLK, CCDC14, and SPAG9 were expressed in both chicken abdominal fat and liver tissues (unpublished data). Of the six genes in the fat line, SOX5, VSNL1, SMC6, and GEN1 were expressed in chicken pre-adipocytes, GEN1, SMC6, SOX5, and VSNL1 in chicken abdominal fat tissue, and SOX5, VSNL1, and SMC6 in chicken liver tissue (unpublished data). The basic functions of these 14 genes are described below.
SLC9A3 is also known as sodium–hydrogen antiporter 3, or sodium–hydrogen exchanger 3 (NHE3). SLC9A3 is expressed in human intestine, stomach, respiratory tract, kidneys, and glandular and epithelial cells. SLC9A3 is present in the brush border of intestinal Na+-absorptive cells and renal proximal tubules, where it plays an important role in gastrointestinal and renal Na+ absorption, suggesting that it may be involved in food digestion and nutrient absorption and, in turn, abdominal fat deposition.
G-proteins are divided into four subfamilies according to their α-subunits (Gαs, Gαi/o, Gαq, and Gα12). Gα subunits interact with both receptor and effector molecules, and are considered the functional component of G-proteins. GNAL shares 88% amino acid homology with Gαs, and is considered a member of the Gαs family. Although GNAL was originally discovered in olfactory neuroepithelium and striatum, it is also present in pancreatic β-cells, testis, spleen, lung, and heart. In addition, this gene is highly expressed in adipose tissue (http://www.genatlas.org/), indicating that it may be associated with abdominal fat deposition.
SPOCK3 encodes a member of a novel family of calcium-binding proteoglycan proteins that contain thyroglobulin type-1 and Kazal-like domains. The encoded SPOCK3 protein may play a key role in adult T-cell leukemia by inhibiting membrane-type matrix metalloproteinase activity. SPOCK3 is expressed in the mouse nervous system.
ANXA10 belongs to the annexin family, and is over-expressed in oral squamous cell carcinoma-derived cell lines. ANXA10 plays an important role in cellular functions including endocytosis and exocytosis, anticoagulant activity, cytoskeletal interactions, differentiation, and cellular proliferation [79, 80]. Moreover, ANXA10 has been linked to malignancy in Barrett’s esophagus, gastric cancer, and bladder cancer [81–83]. ANXA10 is expressed in the digestive system, including liver and stomach tissues (http://www.genatlas.org/), indicating that it may affect food digestion and absorption, and consequently be associated with fat deposition.
Helios is a member of the Ikaros transcription factor family, and is preferentially expressed by regulatory T cells. Previous work has shown that obese patients with insulin resistance have decreased HELIOS but increased FOXP3 mRNA expression in visceral adipose tissue. Helios is expressed in ectodermal and neuroectodermal-derived tissues.
MYLK is a muscle member of the immunoglobulin gene superfamily, and encodes myosin light chain kinase, a calcium/calmodulin-dependent enzyme. Genetic and functional studies show that heterozygous loss-of-function mutations in MYLK are associated with aortic dissection. MYLK is highly expressed in heart, prostate, and trachea tissues, and in the digestive system (including esophagus and small intestine), suggesting that this gene may be involved in food digestion and absorption, and consequently associated with fat deposition.
CCDC14 is a protein-coding gene with unknown function. CCDC14 is expressed in male testis tissue (http://www.genatlas.org/).
SPAG9 is a novel member of the c-Jun NH2-terminal kinase (JNK) interacting proteins, and is exclusively expressed in testis. SPAG9 may play a key role in reproductive processes, and in tumor growth and development.
SOX5 is a member of the SOX (SRY-related HMG-box) family, and is involved in the regulation of embryonic development and cell fate determination. In chicken, a CNV in intron 1 of SOX5 can cause the Pea-comb phenotype. SOX5 is expressed in brain, spinal cord, testis, lung, and kidney, and can control cell cycle progression in neural progenitors by interfering with the WNT-beta-catenin pathway. A recent study indicated that SOX5 may play an important role in the regulation of left ventricular mass, a trait that may be affected by abdominal obesity.
VSNL1 is a member of the visinin/recoverin subfamily of neuronal calcium sensor proteins, and is highly expressed in human heart and brain [93, 94]. Previous results suggest that VSNL1 regulates heart natriuretic peptide receptor B. The VSNL1 gene also plays a critical role in regulating cell adhesion and migration via downregulation of fibronectin receptor expression. The VSNL1 gene is highly expressed in the nervous system.
Structural maintenance of chromosomes (SMC) proteins are a family of related proteins that form the core of three protein complexes. Smc1 and Smc3 ensure that sister chromatids remain associated after DNA replication, and also play roles in gene expression and DNA repair. Smc2 and Smc4 are responsible for chromosome condensation during mitosis. The Smc5-6 complex is required for DNA repair by homologous recombination, although its exact role is not fully understood.
GEN1 is a member of the Rad2/XPG family of monomeric, structure-specific nucleases. This protein family includes N-terminal and internal XPG nuclease motifs, and a helix–hairpin–helix domain. The GEN1 gene is expressed in pancreas, thymus, brain, testis, lung, and kidney, and has Holliday junction resolvase activity in vitro, presumably functioning in homology-driven repair of DNA double-strand breaks.
Msgn1 is a basic helix–loop–helix transcription factor that is specifically expressed in the presomitic mesoderm (PSM). Msgn1 controls differentiation and movement of PSM progenitor cells, and mouse embryos lacking Msgn1 exhibit a severely reduced PSM and an absence of trunk somites.
The vertebrate egg envelope is constructed by a set of related proteins encoded by the zona pellucida (ZP) genes. Vertebrate ZP genes have six subfamilies: ZPA/ZP2, ZPB/ZP4, ZPC/ZP3, ZP1, ZPAX, and ZPD. The Zpb pseudogene was identified in the mouse genome, the Zp1 pseudogene in the dog and bovine genomes, and the Zpax pseudogene in the human, chimpanzee, macaque, and bovine genomes. ZP genes may play an important role in sperm-egg recognition.
All these genes are located in QTLs for abdominal fat weight or percentage in the chicken. Based on their known functions, GNAL, HELIOS, and SOX5 are directly related to adipose tissue metabolism or obesity, while SLC9A3, SPOCK3, ANXA10, MYLK, and VSNL1 may be indirectly related. The roles of CCDC14, SPAG9, SMC6, GEN1, MSGN1, and ZPAX in adipose tissue development are unknown. Further investigations are needed to examine the functional implications of all these genes in chicken adipogenesis.