Massive screening of copy number population-scale variation in Bos taurus genome
© Cicconardi et al; licensee BioMed Central Ltd. 2013
Received: 2 July 2012
Accepted: 11 February 2013
Published: 26 February 2013
Copy number variations (CNVs) represent a significant source of genomic structural variation. Their length ranges from approximately one hundred to millions of base pairs. Genome-wide screenings have clarified that CNVs are a ubiquitous phenomenon affecting essentially the whole genome. Although Bos taurus is one of the most important domestic animal species worldwide and one of the most studied ruminant models for metabolism, reproduction, and disease, relatively few studies have investigated CNVs in cattle, and little is known about how CNVs contribute to normal phenotypic variation and to disease susceptibility in this species, compared to humans and other model organisms.
Here we characterize and compare CNV profiles in 2654 animals from five dairy and beef Bos taurus breeds, using the Illumina BovineSNP50 genotyping array (54001 SNP probes). We applied the two most commonly used algorithms for CNV discovery (QuantiSNP and PennCNV) and identified 4830 unique candidate CNVs belonging to 326 regions. These regions overlap with 5789 known genes, and 76.7% of the CNVs are co-localized with segmental duplications (SD), a significant enrichment.
This large-scale screening significantly contributes to the enrichment of the Bos taurus CNV map, demonstrates the ubiquity, great diversity and complexity of this type of genomic variation, and sets the basis for testing the influence of CNVs on Bos taurus complex functional and production traits.
Keywords: Copy number variants, Structural variations, Cattle, Bos taurus
Copy number variants (CNVs) represent a significant source of genomic structural variation. Their length ranges from 100 bp to several megabases (up to 5 Mb) and they comprise insertions, deletions, and duplications [1–5]. CNVs were initially thought to be associated only with diseases, but genome-wide screenings have clarified that they are ubiquitous and widespread in many animal genomes [6–11].
Recent studies have shown that genomic structural variations (including CNVs) are common among normal and healthy individuals [12–14]. They account for more differences between individuals, in terms of total bases involved, and have a higher per-locus mutation rate than SNPs. Understanding their distribution in the population at large is crucial in order to clarify their role in determining the phenotype and/or disease state. In humans, several studies have attempted to characterize CNVs in populations using data from the International Human HapMap Consortium [1, 9, 13, 17, 18] and other reference groups [2, 3, 16]. These studies have confirmed that CNVs are widespread throughout the genome and show a broad variation in their frequency of occurrence in populations. In addition, they are present throughout the genomes of all taxa investigated so far: mammals [19–26], birds and invertebrates [28, 29].
CNVs exist in at least two distinct, although non-exclusive, states. Common CNV polymorphisms (i.e. frequency > 1%) often have multiple allelic states defined by variations in copy number and/or genomic structure. Rare CNVs typically lead to deletion or duplication of larger genomic segments and exist in fewer allelic states (i.e., hemizygous or trisomic). This latter class of CNVs is highly penetrant and short-lived in the population, either occurring de novo or persisting for only a few generations, and is subject to purifying selection. While these structural variations are often benign, they can sometimes influence or even disrupt biological functions. For example, CNVs have been identified as the cause of a number of human diseases [5, 11].
Bos taurus is one of the most important domestic animal species worldwide. It is one of the most studied ruminant models for metabolism, reproduction, and disease. Consequently, understanding the genetic basis of the differences in productive and functional traits in this species has great economic importance and biological significance. In this context, knowledge of the abundance and distribution of CNVs and of their association with phenotypes is of major interest. However, until now, relatively few studies have investigated CNVs in cattle [32–40], none using a population-wide analysis. Therefore, little is known about how CNVs contribute to normal phenotypic variation and disease susceptibility in cattle, compared to humans and other model organisms.
The recent focus of the research community on single nucleotide polymorphisms (SNPs) as a means of assessing genetic variation in cattle has promoted the use of genotyping arrays that map thousands of loci throughout the genome (e.g. the Illumina BovineSNP50 BeadChip, with 54,001 informative SNP probes). This type of array is now readily available to scan thousands of individuals at an affordable cost, allowing CNVs to be investigated on a wide scale. Comparative genomic hybridization (CGH) arrays have a higher probe density and detect copy number changes at the level of 5–10 kb. SNP arrays, however, provide normalized intensities (Log R ratio, LRR), allelic intensity ratios (B allele frequency, BAF) and a better estimate of the loss of heterozygosity (LOH), which makes CNV detection more robust. Several algorithms are able to detect CNVs using the intensity of fluorescent signals from SNP arrays. In this study we applied the two most commonly used and efficient ones, as implemented in the QuantiSNP and PennCNV software, to investigate the genome-wide characteristics of CNVs in five Bos taurus breeds. We scanned the 29 autosomal chromosomes in a panel of 2654 animals and identified 4830 unique CNV candidates belonging to 326 regions. We then compared our findings with existing publicly available information on cattle CNVs and investigated the identity and function of genes located within the duplicated regions. Our results significantly enrich the current knowledge about copy-number variants in the Bos taurus genome by determining their distribution across the genome in five dairy and beef cattle breeds (Italian Friesian, Italian Brown, Italian Simmental, Marchigiana and Piedmontese). These findings are an important resource for follow-up studies on cattle genome structure and CNV-trait association [44, 45].
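As an illustration of the two SNP-array signals mentioned above, the following minimal Python sketch computes LRR and BAF for a single probe. It assumes the commonly described definitions (LRR as the log2 ratio of observed to expected total intensity, BAF as a linear interpolation of the allelic angle theta between genotype cluster centres). The function names, the cluster-centre parameters and the example intensities are hypothetical and are not taken from the Illumina software or from this study.

```python
import math


def log_r_ratio(x, y, r_expected):
    """LRR = log2(R_observed / R_expected), with R = X + Y (total intensity)."""
    return math.log2((x + y) / r_expected)


def b_allele_frequency(x, y, theta_aa, theta_ab, theta_bb):
    """Interpolate the allelic angle theta between the AA, AB and BB cluster centres."""
    theta = (2 / math.pi) * math.atan2(y, x)  # 0 = pure A signal, 1 = pure B signal
    if theta <= theta_aa:
        return 0.0
    if theta >= theta_bb:
        return 1.0
    if theta < theta_ab:
        return 0.5 * (theta - theta_aa) / (theta_ab - theta_aa)
    return 0.5 + 0.5 * (theta - theta_ab) / (theta_bb - theta_ab)


# Hypothetical probe: total intensity is half of the expected value,
# which is the pattern a one-copy loss would tend to produce.
lrr = log_r_ratio(0.25, 0.25, r_expected=1.0)
baf = b_allele_frequency(0.25, 0.25, theta_aa=0.0, theta_ab=0.5, theta_bb=1.0)
print(lrr, baf)
```

In this simplified picture, a run of consecutive probes with negative LRR and no heterozygous BAF values points to a loss, while positive LRR with BAF clusters away from 0, 0.5 and 1 points to a gain.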
CNV discovery and distribution
[Table: summary of CNV discovery. Row labels that survive: average number per sample; average size of CNVs (kb); median size of CNVs (kb); no. of common CNVs (freq > 1%); no. of gain; no. of loss; no. of homo/heterozygous loss; no. of both. Values not preserved.]
Eleven copy-number variation regions of homozygous and heterozygous deletions and duplications (Additional file 2: Table S2) were validated by quantitative real-time PCR. These were randomly selected across eleven autosomal chromosomes. Each CNV was amplified in a minimum of three and a maximum of seven specimens belonging to different breeds, for a total of 50 validation tests. The CNV copy number estimated by qRT-PCR was plotted against the BeadChip copy number determination (Figure 1c). Linear regression analysis showed a high level of correlation (R² = 0.92) and a slope of 1.00 (Standard Error: 0.05; p-value = 2.2e-16).
The analysis of the distribution of CNV size indicates that, with the BF values used, fewer than 2% of CNVs are ≤ 100 kb, 12% are between 100 and 250 kb, 27% between 250 and 500 kb, 33% between 500 and 1000 kb, and 25% are longer than 1 Mb. In a few samples we identified CNVs about 8 Mb long. CNVR number and length are not significantly correlated with chromosome length. BTA29 hosts three CNVRs, while BTA6 has 20 CNVRs, the highest value. Out of the 326 CNVRs, 192 include loss-only events, 31 gain-only events and 103 include both. Loss events are approximately 6.2-fold more common than gain events in CNVRs, while the corresponding ratio is 2.5-fold for CNVs. CNVRs affected by loss events are, on average, smaller than gain regions, in line with the recently published results of Hou et al.
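The comparison between qRT-PCR and array copy numbers can be sketched as an ordinary least-squares fit. The snippet below is a minimal Python illustration only (the study itself used the lm function in R); the function name and the example values are hypothetical and are not the validation data.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical copy numbers: array calls (x) and qPCR estimates (y).
array_cn = [0, 1, 1, 2, 2, 3, 3]
qpcr_cn = [0.1, 0.9, 1.2, 2.0, 1.8, 3.1, 2.9]
slope, intercept, r_squared = linear_fit(array_cn, qpcr_cn)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r_squared:.2f}")
```

A slope close to 1 with a high R² indicates that the array-based calls and the qPCR estimates agree on the copy number scale, which is the pattern reported above.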
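The size classes used above can be reproduced from a list of CNV lengths with a few lines of code. This is a minimal Python sketch; the function names and the example lengths are hypothetical, and the class boundaries are simply those quoted in the text.

```python
from collections import Counter

# Upper bounds (bp) and labels of the size classes quoted in the text.
SIZE_CLASSES = [
    (100_000, "<=100 kb"),
    (250_000, "100-250 kb"),
    (500_000, "250-500 kb"),
    (1_000_000, "500-1000 kb"),
]


def size_class(length_bp):
    """Return the label of the size class a CNV length falls into."""
    for upper, label in SIZE_CLASSES:
        if length_bp <= upper:
            return label
    return ">1 Mb"


def size_distribution(lengths_bp):
    """Percentage of CNVs in each size class (classes with no CNVs are omitted)."""
    counts = Counter(size_class(length) for length in lengths_bp)
    total = len(lengths_bp)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical CNV lengths in base pairs.
print(size_distribution([80_000, 300_000, 450_000, 2_500_000]))
```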
CNV association with segmental duplications and gene content
Although the complete set of mechanisms responsible for generating CNVs is unknown, studies on cattle [2, 37] and other mammalian species [5, 29, 40] highlighted an enrichment of CNVs near segmental duplications (SD). Segmental duplications, defined as genomic regions of high sequence identity (greater than or equal to 90%) to more than one genomic locus, may mediate CNV genesis by acting as a substrate for non-allelic homologous recombination. These recombination events may result in amplification, deletion, inversion, or copy number variants. We tested whether there is a non-random association between the CNVs that we discovered and known SD regions and found a significant overlap: 76.7% of the CNVs intersect with SDs (p-value < 0.001 as estimated by a random permutation test).
The 4830 non-redundant CNVs found within autosomes overlap with a total of 5789 known genes (Additional file 4: Table S4 and Additional file 5: Table S5). Among them, 5019 (87%) are protein coding genes, 676 (12%) are non-coding RNAs (229 miRNA, 73 rRNA, 211 snRNA, 131 snoRNA, 32 misc_RNA), and 94 (1%) are pseudogenes and retrotransposable elements. The ~5000 loci included in CNVs account for about 25% of the estimated total number of genes of the species (Additional file 4: Table S4). This number is higher than what has been reported in similar papers (Hou et al., 1,263; Bae et al., 538) but comparable with the results of the population-scale study in humans carried out by Mills and colleagues, who mapped genomic structural variations affecting more than 10000 genes.
KEGG pathway and Gene ontology enrichment
GO:0022627~cytosolic small ribosomal subunit
GO:0015935~small ribosomal subunit
GO:0044448~cell cortex part
GO:0070013~intracellular organelle lumen
GO:0043232~intracellular non-membrane-bounded organelle
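Table 2. Category enrichment for Italian Simmental (only some term names and the gene lists of the enriched rows are preserved, in their original order)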
LOC751562; PRP1,3,6,9; LOC751563; CSH2; PRP-VII; PRL
LOC751562; PRP1,3,6,9; LOC751563; CSH2; PRP-VII; PRL
LOC751562; PRP1,3,6,9; LOC751563; CSH2; PRP-VII; PRL
IPR018116:Somatotropin hormone, conserved site
LOC751562; PRP1,3,9; LOC751563; CSH2; PRP-VII; PRL
IPR012351:Four-helical cytokine, core
LOC751562; PRP1,3,4,6; CSH2; PRL
LOC751562; PRP1,3,4,6; CSH2; PRL
LOC751562; PRP1,3,6,9; LOC751563; CSH2; PRP-VII; PRL
CSH2; PRP1,3,4; PRL
PRP1,3,4,6; CSH2; PRL
CSH2, PRP1, PRP4, PRL, PRP3
PRP1,3,4,6; CSH2; PRL
The other two datasets, obtained by Fadista et al. and Liu et al. using CGH arrays, show a more limited overlap with our dataset, namely 19% and 18%. The lower overlap in these cases is very likely due to the fact that the CGH array they used has a much higher density of probes (420 bases of average probe spacing) than the BovineSNP50 BeadChip (49 kb of average probe spacing). Identifying short CNVs (< 50 kb) with high confidence, even the more frequent ones [35, 40], is much harder with the Illumina genotyping chip, which detects CNVs with a size distribution skewed towards large events. We also measured the percentage of overlap between the CNVs detected by us and by two other studies based on next-generation sequencing [39, 40]. Even though the authors of these studies examined fewer samples (two samples in one study and six in the other), their more accurate methodology, at nucleotide resolution, shows a moderately higher overlap with our data (33% and 22% respectively, Additional file 1: Table S1). The only partial overlap of the CNVs we find with those detected in other studies can, in principle, be explained by the different breeds used here. Many CNVs appear to be breed specific and may contribute to breed differentiation. On the other hand, several studies suggest that the bulk of CNV variability is individual rather than breed specific, and therefore the larger number we find is most likely due to the large number of individuals we tested.
Bos taurus CNV features among breeds
We looked at the differences among the five Bos taurus breeds investigated: Italian Friesian (dairy), Italian Brown (dairy), Italian Simmental (dairy/beef), Piedmontese (beef), and Marchigiana (beef).
[Table: CNV events by Bos taurus breeds. Headers that survive: protein coding genes; total length (Mb); mean length (kb); min length (kb); max length (kb); Italian Simmental %; Italian Friesian %; Italian Brown %. Values not preserved.]
The distribution of CNVs among chromosomes (Figure 5b) is, in general, homogeneous and consistent across breeds, with the exception of two breeds that each show a peak in CNV frequency on a different chromosome (BTA5 and BTA17). On BTA5 the percentage of CNVs in the other four breeds is only 3.4%, while in Marchigiana this chromosome carries 18.1% of all observed CNVs (107/591 CNVs; p-value < 1e-12). Similarly, BTA17 carries 18.5% of the CNVs in Italian Simmental (107/578 CNVs), compared with 7.8% in the other breeds (p-value < 0.04). Considering all the other CNV features (length, population frequency and chromosome position), no significant difference was observed among breeds. Overall, these findings also suggest that differences between individuals are much larger than differences between breeds.
Gene ontology enrichment was computed taking into account the genes involved in CNVs for each breed. Only the 17 genes of the Italian Simmental (Additional file 6: Table S6, Additional file 7: Table S7) showed functional enrichment (Table 2). In particular, we observed a significant enrichment for GO terms related to somatotropin and prolactin/lactogen/growth activity, caused by a single breed-specific CNV (chr23:33,906,415-36,330,036; three copies) that contains 12 loci (LOC751562-3, PRP1,3,4,6,9, CSH2, PRP-VII, PRL, HDGFL1, MIR2284C). These genes belong to the PRL family (prolactin-related proteins); they are expressed in the placenta around the first 60 days of gestation and are involved in the establishment and maintenance of pregnancy. Prolactin genes (PRL) are known to have undergone rapid evolution in the lineage leading to ruminants [51–54] and to be duplicated in all well-studied ruminant species. The evidence presented here suggests that this cluster may contribute to the genetic variation of production traits.
In this investigation we find more CNVs than in previous studies [34–36, 39, 40, 51]. This is likely due to the large number of individuals analysed. A further, probably less relevant, difference lies in the analysis tools: in addition to PennCNV, used in previous studies, we used QuantiSNP, which is reported to be more efficient. Given the high number of individuals analysed, we detected a number of previously unidentified rare CNVs. It has been reported that in humans, for example, the bulk of the observed copy-number variation is present at ~0.02%–1% frequency.
We cannot exclude the presence of false positives in our dataset, but the results of the qRT-PCR validation of 11 CNVs in 50 individuals (see Figure 1c, R² = 0.92) suggest that the BF threshold adopted to limit false positive CNVs (BF = 15 vs the commonly used threshold of 10) was rather effective. Only the validation reported by Fadista et al. is comparably extensive (65 individuals and 6 CNVs). Furthermore, the number of CNVs per individual in our study averages 2.8, lower than the value found in other studies (around 3.6 in Bos taurus with the same SNP chip). We are therefore confident that the rate of false positives is reasonably low and does not affect the overall picture.
Notwithstanding the high number of samples examined and CNVs identified, we have likely not yet drawn a complete picture of CNV presence in cattle, mainly because of the limitations of the genotyping array used. The relatively low density of the Illumina arrays compared with other methods (CGH arrays, whole-genome re-sequencing) makes the detection of short CNVs very hard, while deep-sequencing studies have documented that in Homo sapiens [18, 55], and more recently in Bos taurus [39, 40], the most populated class of CNVs is that of variants shorter than 50 kb. This limitation will only be partially overcome by using the more recent higher-density BovineHD BeadChip (777 k SNPs). This chip, with its 3430 bp average probe distance, is ~8 times less dense than the available CGH arrays and therefore would not solve the problem of incompleteness. It is unlikely that any single available technology will capture all genome structural variations, and the use of multiple experimental methods (sequence assembly comparisons, paired-end sequencing, sequencing analysis and high-resolution tiling arrays) will be needed to unravel the complexity of genome variations.
Our study presents the first population-scale description of copy number variants in Bos taurus, obtained by analysing data from more than 2500 individuals belonging to five different dairy and beef breeds and using two different bioinformatics algorithms. We found that CNVs collectively span ~20% of the genome and that a significant portion of the genome is potentially subject to variation in copy number, as observed in humans. We described here the frequencies, patterns, and potential impact on the gene landscape of such cattle-specific and breed-specific CNVs. Many CNVs include genes having specific biological roles, e.g. in metabolism, and are thus likely to be functional. Our population-scale analysis reveals that, because of their very low frequency, many CNVs are likely to arise independently, generating increased diversity among individuals and providing insight into the penetrant behaviour of CNVs in the population. This cattle CNV map provides information that complements SNP information and may be added to SNP-based genome-wide association and selection studies. A more comprehensive knowledge of the full landscape of bovine genetic variation permits a better understanding of ruminant biology and a further improvement of selection methods in this species.
Animal handling and DNA extraction were carried out following national guidelines and were approved by the animal ethics committee.
Systematic genome-wide CNV analysis
We studied CNVs in a sample of 2654 Italian bulls (B. taurus males used for reproductive purposes in Italian breeding). Only bulls were selected because males are usually the ones genotyped and genetically evaluated on the basis of the production traits of their offspring. The animals belong to five different breeds (891 Italian Friesian, 705 Italian Brown, 482 Italian Simmental, 369 Piedmontese, 207 Marchigiana). Genomic DNA of all samples was analysed using the BovineSNP50 v1 BeadChip with 54001 probes (Illumina, San Diego, CA) according to the standard protocol. Sex chromosomes were excluded from the analysis and only autosomes were used. The QuantiSNP and PennCNV tools were used to identify copy number deletions and duplications. Both methods are based on a Hidden Markov Model for the detection of CNVs from Illumina high-density SNP genotyping data. PennCNV is the most frequently used algorithm for CNV studies of this type, partly because of the user-friendly design of the program and partly because of its low false positive rate. QuantiSNP, in turn, outperformed six other methods in a recent evaluation study of CNV calling algorithms. We therefore deemed the combined use of both algorithms to be a valid strategy.
Samples with LogR ratio (the normalized total intensity at each locus) higher than 0.30 were filtered out, together with individuals carrying CNVs longer than 8 Mb, which are likely to be affected by diseases. For both QuantiSNP and PennCNV, a quality control step was performed to check the GC-wave factor, which was subsequently taken into account to correct the bias in the analysis. To tune the parameters, such as the GC-wave factor correction, a training dataset composed of 10% of the data was used. Next, a quality filter for CNV calling based on Bayes Factor (BF) thresholds, using previously reported parameters [44–47], was applied, followed by quantitative PCR (qRT-PCR). The qRT-PCR results were used to select the BF threshold with the lowest false positive rate. When the QuantiSNP and PennCNV algorithms detected overlapping CNVs, the call with the higher BF was selected. All statistical tests to estimate differences in CNV features among breeds were performed using the Wilcoxon-Mann–Whitney rank sum test as implemented in R (wilcox.test, http://www.r-project.org).
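The filtering and selection steps described above (a BF threshold, then keeping the call with the higher BF when calls overlap) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `CnvCall` record, its field names and the greedy selection strategy are hypothetical, not the output format of QuantiSNP or PennCNV nor the authors' actual pipeline. The default threshold of 15 is the BF value quoted in the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnvCall:
    """Hypothetical record for one CNV call; not a real tool's output format."""
    sample: str
    chrom: str
    start: int
    end: int
    bayes_factor: float
    caller: str


def overlaps(a, b):
    """True if two calls are in the same sample and chromosome and share a base."""
    return (a.sample == b.sample and a.chrom == b.chrom
            and a.start <= b.end and b.start <= a.end)


def merge_calls(calls, min_bf=15.0):
    """Drop calls below min_bf; among overlapping calls keep the highest BF."""
    kept = []
    # Visit the strongest calls first so that weaker overlapping calls are skipped.
    for call in sorted(calls, key=lambda c: c.bayes_factor, reverse=True):
        if call.bayes_factor < min_bf:
            continue
        if any(overlaps(call, k) for k in kept):
            continue
        kept.append(call)
    return sorted(kept, key=lambda c: (c.sample, c.chrom, c.start))


# Hypothetical calls for illustration only.
calls = [
    CnvCall("bull_1", "6", 1_000_000, 1_400_000, 22.0, "QuantiSNP"),
    CnvCall("bull_1", "6", 1_100_000, 1_500_000, 31.0, "PennCNV"),
    CnvCall("bull_1", "17", 5_000_000, 5_300_000, 11.0, "PennCNV"),
]
for call in merge_calls(calls):
    print(call)
```

Note that this greedy scheme also resolves overlaps between calls from the same algorithm, which may or may not match the original procedure.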
Association between CNV, segmental duplication and gene content
The non-random association between CNVs and segmental duplications was tested by determining the direct overlap of CNV boundaries with the segmental duplication locations available from the literature. The association test was performed by comparing the observed data with those obtained by randomly drawing, 1000 times, a segment length from the distribution of CNV lengths and a valid chromosomal location.
Gene content of the cattle CNV regions was obtained via the Ensembl BioMart tool using the genome version Btau_4.0. The obtained list of protein coding genes was used to determine the GO terms and pathway enrichment using the DAVID Bioinformatics resource. The Benjamini method for multiple testing correction was used.
To validate the discovered CNVs, TaqMan quantitative real-time PCR was performed on 50 individuals in 11 regions (Additional file 1: Table S1). Reactions were performed in triplicate in a volume of 25 μl with the Maxima Probe qPCR master mix (Fermentas) on a LightCycler® 480 System (Roche). The PCR cycling conditions were: pre-incubation for 15 min at 95°C, followed by 55 cycles of 15 s at 95°C and 30 s at 58°C. The PCR products were also sequenced to verify the correctness of the amplified region. Primer efficiency was tested for each primer pair (Additional file 1: Table S1) over five dilution points using Maxima SYBR Green qPCR master mix (Fermentas). BTF3 was used as the reference gene for all qPCR experiments, as in Bae et al. 2010. The quantification analysis was performed with the ΔΔCt method [21, 62] using the R package qpcR (http://www.dr-spiess.de/qpcR.html). The regression analyses were calculated with the linear model fit function (lm) implemented in R (http://www.r-project.org).
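A random permutation test of this kind can be sketched as below. This is a minimal Python illustration, not the authors' code: the data layout (tuples of chromosome, start, end), the function names, the choice of drawing chromosomes in proportion to their length and the add-one p-value estimate are all assumptions made for the example.

```python
import random


def overlaps_any(chrom, start, end, sd_by_chrom):
    """True if the segment shares at least one base with a segmental duplication."""
    return any(s <= end and start <= e for s, e in sd_by_chrom.get(chrom, []))


def overlap_fraction(segments, sd_by_chrom):
    """Fraction of segments (chrom, start, end) that intersect a segmental duplication."""
    hits = sum(overlaps_any(c, s, e, sd_by_chrom) for c, s, e in segments)
    return hits / len(segments)


def random_segments(lengths, chrom_sizes, rng):
    """One random segment per CNV: length drawn from the observed lengths,
    placed at a valid position on a chromosome chosen in proportion to its size."""
    chroms = list(chrom_sizes)
    weights = [chrom_sizes[c] for c in chroms]
    segments = []
    for _ in lengths:
        length = rng.choice(lengths)
        while True:  # redraw until the segment fits on the chosen chromosome
            chrom = rng.choices(chroms, weights=weights)[0]
            if length <= chrom_sizes[chrom]:
                break
        start = rng.randint(1, chrom_sizes[chrom] - length + 1)
        segments.append((chrom, start, start + length - 1))
    return segments


def permutation_p_value(cnvs, sd_by_chrom, chrom_sizes, n_perm=1000, seed=0):
    """Observed overlap fraction and the share of random sets that reach it."""
    rng = random.Random(seed)
    lengths = [end - start + 1 for _, start, end in cnvs]
    observed = overlap_fraction(cnvs, sd_by_chrom)
    as_extreme = sum(
        overlap_fraction(random_segments(lengths, chrom_sizes, rng), sd_by_chrom) >= observed
        for _ in range(n_perm)
    )
    return observed, (as_extreme + 1) / (n_perm + 1)
```

With 1000 random sets and the add-one estimate used here, the smallest attainable p-value is 1/1001, which is reached when no random set overlaps segmental duplications as often as the observed CNVs do.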
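The multiple testing correction can be illustrated with a short sketch. It assumes that the "Benjamini method" refers to the Benjamini-Hochberg step-up procedure; the function name and the example p-values are hypothetical, and this is not DAVID's own implementation.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four enrichment terms.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A term is then reported as enriched when its adjusted value falls below the chosen false discovery rate.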
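The ΔΔCt quantification can be sketched in a few lines. This minimal Python example assumes 100% amplification efficiency for both target and reference assays and a calibrator sample carrying two copies of the target region; the function names and Ct values are hypothetical, and the study itself used the R package qpcR. Ct inputs are taken to be the means of the triplicate reactions.

```python
def delta_delta_ct(ct_target_sample, ct_ref_sample, ct_target_calibrator, ct_ref_calibrator):
    """ΔΔCt = (Ct_target - Ct_ref) in the sample minus the same difference in the calibrator."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return delta_sample - delta_calibrator


def copy_number(ddct, calibrator_copies=2):
    """Estimated copy number, assuming perfect doubling at each PCR cycle."""
    return calibrator_copies * 2 ** (-ddct)


# Hypothetical mean Ct values; the reference gene would be BTF3 in this study.
ddct = delta_delta_ct(ct_target_sample=26.0, ct_ref_sample=24.0,
                      ct_target_calibrator=25.0, ct_ref_calibrator=24.0)
print(copy_number(ddct))
```

In this example the target amplifies one cycle later in the sample than in the calibrator, relative to the reference gene, which corresponds to half the calibrator copy number (a heterozygous loss).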
- BTA: Bos taurus autosome
- CGH: Comparative genomic hybridization
- CNV: Copy number variation
- CNVR: Copy number variation region
- GO term: Gene Ontology term
- LRR: Log R ratio
- qRT-PCR: Quantitative real-time polymerase chain reaction
- snoRNA: Small nucleolar RNA
- SNP: Single nucleotide polymorphism
- snRNA: Small nuclear RNA
Research was funded by the Italian Ministry of Agriculture, grants SelMol and Innovagen. The authors wish to thank ANABIC, ANABORAPI, ANAFI, ANAPRI, ANARB, the Regione Lazio and EPIGEN.
- Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W, Cho EK, Dallaire S, Freeman JL, González JR, Gratacòs M, Huang J, Kalaitzopoulos D, Komura D, MacDonald JR, Marshall CR, Mei R, Montgomery L, Nishimura K, Okamura K, Shen F, Somerville MJ, Tchinda J, Valsesia A, Woodwark C, Yang F, Zhang J, Zerjal T, Zhang J, Armengol L, Conrad DF, Estivill X, Tyler-Smith C, Carter NP, Aburatani H, Lee C, Jones KW, Scherer SW, Hurles ME: Global variation in copy number in the human genome. Nature. 2006, 444: 444-454. 10.1038/nature05329.PubMed CentralView ArticlePubMedGoogle Scholar
- Sharp AJ, Locke DP, McGrath SD, Cheng Z, Bailey JA, Vallente RU, Pertz LM, Clark RA, Schwartz S, Segraves R, Oseroff VV, Albertson DG, Pinkel D, Eichler EE: Segmental duplications and copy-number variation in the human genome. Am J Hum Genet. 2005, 77: 78-88. 10.1086/431652.PubMed CentralView ArticlePubMedGoogle Scholar
- Tuzun E, Sharp AJ, Bailey JA, Kaul R, Morrison VA, Pertz LM, Haugen E, Hayden H, Albertson D, Pinkel D, Olson MV, Eichler EE: Fine-scale structural variation of the human genome. Science. 2004, 304: 581-584. 10.1126/science.1092500.View ArticleGoogle Scholar
- Feuk L, Carson AR, Scherer SW: Structural variation in the human genome. Nat Rev Genet. 2006, 7: 85-97.View ArticlePubMedGoogle Scholar
- Zhang F, Gu W, Hurles ME, Lupski JR: Copy number variation in human health, disease, and evolution. Annu Rev Genomics Hum Genet. 2009, 10: 451-481. 10.1146/annurev.genom.9.081307.164217.PubMed CentralView ArticlePubMedGoogle Scholar
- Goidts V, Cooper DN, Armengol L, Schempp W, Conroy J, Estivill X, Nowak N, Hameister H, Kehrer-Sawatzki H: Complex patterns of copy number variation at sites of segmental duplications: an important category of structural variation in the human genome. Hum Genet. 2006, 120: 270-284. 10.1007/s00439-006-0217-y.View ArticlePubMedGoogle Scholar
- Khaja R, Zhang J, MacDonald JR, He Y, Joseph-George AM, Wei J, Rafiq MA, Qian C, Shago M, Pantano L, Aburatani H, Jones K, Redon R, Hurles M, Armengol L, Estivill X, Mural RJ, Lee C, Scherer SW, Feuk L: Genome assembly comparison identifies structural variants in the human genome. Nat Genet. 2006, 38: 1413-1418. 10.1038/ng1921.PubMed CentralView ArticlePubMedGoogle Scholar
- Komura D, Shen F, Ishikawa S, Fitch KR, Chen W, Zhang J, Liu G, Ihara S, Nakamura H, Hurles ME, Lee C, Scherer SW, Jones KW, Shapero MH, Huang J, Aburatani H: Genome-wide detection of human copy number variations using high-density DNA oligonucleotide arrays. Genome Research. 2006, 16: 1575-1584. 10.1101/gr.5629106.PubMed CentralView ArticlePubMedGoogle Scholar
- McCarroll SA, Hadnott TN, Perry GH, Sabeti PC, Zody MC, Barrett JC, Dallaire S, Gabriel SB, Lee C, Daly MJ, Altshuler DM: Common deletion polymorphisms in the human genome. Nat Genet. 2005, 38: 86-92.View ArticleGoogle Scholar
- Newman TL, Rieder MJ, Morrison VA, Sharp AJ, Smith JD, Sprague LJ, Kaul R, Carlson CS, Olson MV, Nickerson DA, Eichler EE: High-throughput genotyping of intermediate-size structural variation. Hum Mol Genet. 2006, 15: 1159-1167. 10.1093/hmg/ddl031.View ArticlePubMedGoogle Scholar
- Wong KK, deLeeuw RJ, Dosanjh NS, Kimm LR, Cheng Z, Horsman DE, MacAulay C, Ng RT, Brown CJ, Eichler EE, Lam WL: A Comprehensive Analysis of Common Copy-Number Variations in the Human Genome. Am J Hum Genet. 2007, 80: 91-104. 10.1086/510560.PubMed CentralView ArticlePubMedGoogle Scholar
- Feuk L, MacDonald JR, Tang T, Carson AR, Li M, Rao G, Khaja R, Scherer SW: Discovery of Human Inversion Polymorphisms by Comparative Analysis of Human and Chimpanzee DNA Sequence Assemblies. PLoS Genet. 2005, 1: 10-10.1371/journal.pgen.0010010.View ArticleGoogle Scholar
- Conrad DF, Andrews TD, Carter NP, Hurles ME, Pritchard JK: A high-resolution survey of deletion polymorphism in the human genome. Nat Genet. 2006, 38: 75-81. 10.1038/ng1697.View ArticlePubMedGoogle Scholar
- Hinds DA, Kloek AP, Jen M, Chen X, Frazer KA: Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat Genet. 2006, 38: 82-85. 10.1038/ng1695.View ArticlePubMedGoogle Scholar
- Lupski JR: Genomic rearrangements and sporadic disease. Nat Genet. 2007, 39: S43-S47. 10.1038/ng2084.View ArticlePubMedGoogle Scholar
- Pinto D, Marshall C, Feuk L, Scherer SW: Copy-number variation in control population cohorts. Hum Mol Genet. 2007, 16 Spec No: R168-R173.View ArticleGoogle Scholar
- Locke DP, Sharp AJ, McCarroll SA, McGrath SD, Newman TL, Cheng Z, Schwartz S, Albertson DG, Pinkel D, Altshuler DM, Eichler EE: Linkage disequilibrium and heritability of copy-number polymorphisms within duplicated regions of the human genome. Am J Hum Genet. 2006, 79: 275-290. 10.1086/505653.PubMed CentralView ArticlePubMedGoogle Scholar
- Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK, Chinwalla A, Conrad DF, Fu Y, Grubert F, Hajirasouliha I, Hormozdiari F, Iakoucheva LM, Iqbal Z, Kang S, Kidd JM, Konkel MK, Korn J, Khurana E, Kural D, Lam HYK, Leng J, Li R, Li Y, Lin C-Y, Luo R, Mu XJ, Nemesh J, Peckham HE, Rausch T, Scally A, Shi X, Stromberg MP, Stütz AM, Urban AE, Walker JA, Wu J, Zhang Y, Zhang ZD, Batzer MA, Ding L, Marth GT, McVean G, Sebat J, Snyder M, Wang J, Ye K, Eichler EE, Gerstein MB, Hurles ME, Lee C, McCarroll SA, Korbel JO: Mapping copy number variation by population-scale genome sequencing. Nature. 2011, 470: 59-65. 10.1038/nature09708.PubMed CentralView ArticlePubMedGoogle Scholar
- Kehrer-Sawatzki H, Cooper DN: Structural divergence between the human and chimpanzee genomes. Hum Genet. 2007, 120: 759-778. 10.1007/s00439-006-0270-6.View ArticlePubMedGoogle Scholar
- Lee AS, Gutiérrez-Arcelus M, Perry GH, Vallender EJ, Johnson WE, Miller GM, Korbel JO, Lee C: Analysis of copy number variation in the rhesus macaque genome identifies candidate loci for evolutionary and human disease studies. Hum Mol Genet. 2008, 17: 1127-1136. 10.1093/hmg/ddn002.View ArticlePubMedGoogle Scholar
- Graubert TA, Cahan P, Edwin D, Selzer RR, Richmond TA, Eis PS, Shannon WD, Li X, McLeod HL, Cheverud JM, Ley TJ: A High-Resolution Map of Segmental DNA Copy Number Variation in the Mouse Genome. PLoS Genet. 2007, 3: 9-10.1371/journal.pgen.0030009.View ArticleGoogle Scholar
- Egan CM, Sridhar S, Wigler M, Hall IM: Recurrent DNA copy number variation in the laboratory mouse. Nat Genet. 2007, 39: 1384-1389. 10.1038/ng.2007.19.View ArticlePubMedGoogle Scholar
- Snijders AM, Nowak NJ, Huey B, Fridlyand J, Law S, Conroy J, Tokuyasu T, Demir K, Chiu R, Mao J-H, Jain AN, Jones SJM, Balmain A, Pinkel D, Albertson DG: Mapping segmental and sequence variations among laboratory mice using BAC array CGH. Genome Res. 2005, 15: 302-311. 10.1101/gr.2902505.PubMed CentralView ArticlePubMedGoogle Scholar
- Guryev V, Saar K, Adamovic T, Verheul M, Van Heesch SAAC, Cook S, Pravenec M, Aitman T, Jacob H, Shull JD, Hubner N, Cuppen E: Distribution and functional impact of DNA copy number variation in the rat. Nat Genet. 2008, 40: 538-545. 10.1038/ng.141.View ArticlePubMedGoogle Scholar
- Chen W-K, Swartz JD, Rush LJ, Alvarez CE: Mapping DNA structural variation in dogs. Genome Res. 2009, 19: 500-509.PubMed CentralView ArticlePubMedGoogle Scholar
- Ramayo-Caldas Y, Castelló A, Pena RN, Alves E, Mercadé A, Souza CA, Fernández AI, Perez-Enciso M, Folch JM: Copy number variation in the porcine genome inferred from a 60 k SNP BeadChip. BMC Genomics. 2010, 11: 593-10.1186/1471-2164-11-593.PubMed CentralView ArticlePubMedGoogle Scholar
- Wang X, Nahashon S, Feaster TK, Bohannon-Stewart A, Adefope N: An initial map of chromosomal segmental copy number variations in the chicken. BMC Genomics. 2010, 11: 351-10.1186/1471-2164-11-351.PubMed CentralView ArticlePubMedGoogle Scholar
- Emerson JJ, Cardoso-Moreira M, Borevitz JO, Long M: Natural selection shapes genome-wide patterns of copy-number polymorphism in Drosophila melanogaster. Science. 2008, 320: 1629-1631. 10.1126/science.1158078.View ArticlePubMedGoogle Scholar
- Maydan JS, Lorch A, Edgley ML, Flibotte S, Moerman DG: Copy number variation in the genomes of twelve natural isolates of Caenorhabditis elegans. BMC Genomics. 2010, 11: 62-10.1186/1471-2164-11-62.PubMed CentralView ArticlePubMedGoogle Scholar
- Itsara A, Cooper GM, Baker C, Girirajan S, Li J, Absher D, Krauss RM, Myers RM, Ridker PM, Chasman DI, Mefford H, Ying P, Nickerson DA, Eichler EE: Population Analysis of Large Copy Number Variants and Hotspots of Human Genetic Disease. Am J Hum Genet. 2009, 84: 148-161. 10.1016/j.ajhg.2008.12.014.PubMed CentralView ArticlePubMedGoogle Scholar
- Elsik CG, Tellam RL, Worley KC: The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science. 2009, 324: 522-528.PubMed CentralView ArticlePubMedGoogle Scholar
- Ibeagha-Awemu EM, Kgwatalala P, Ibeagha AE, Zhao X: A critical analysis of disease-associated DNA polymorphisms in the genes of cattle, goat, sheep, and pig. Mamm Genome. 2008, 19: 226-245. 10.1007/s00335-008-9101-5.PubMed CentralView ArticlePubMedGoogle Scholar
- Liu GE, Van Tassel CP, Sonstegard TS, Li RW, Alexander LJ, Keele JW, Matukumalli LK, Smith TP, Gasbarre LC: Detection of germline and somatic copy number variations in cattle. Dev Biol. 2008, 132: 231-237.Google Scholar
- Bae JS, Cheong HS, Kim LH, NamGung S, Park TJ, Chun J-Y, Kim JY, Pasaje CFA, Lee JS, Shin HD: Identification of copy number variations and common deletion polymorphisms in cattle. BMC Genomics. 2010, 11: 232. 10.1186/1471-2164-11-232.
- Fadista J, Thomsen B, Holm L-E, Bendixen C: Copy number variation in the bovine genome. BMC Genomics. 2010, 11: 284. 10.1186/1471-2164-11-284.
- Seroussi E, Glick G, Shirak A, Yakobson E, Weller JI, Ezra E, Zeron Y: Analysis of copy loss and gain variations in Holstein cattle autosomes using BeadChip SNPs. BMC Genomics. 2010, 11: 673. 10.1186/1471-2164-11-673.
- Hou Y, Liu GE, Bickhart DM, Cardone MF, Wang K, Kim E, Matukumalli LK, Ventura M, Song J, VanRaden PM, Sonstegard TS, Van Tassell CP: Genomic characteristics of cattle copy number variations. BMC Genomics. 2011, 12: 127. 10.1186/1471-2164-12-127.
- Kijas JW, Barendse W, Barris W, Harrison B, McCulloch R, McWilliam S, Whan V: Analysis of copy number variants in the cattle genome. Gene. 2011, 482: 73-77. 10.1016/j.gene.2011.04.011.
- Stothard P, Choi J-W, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS: Whole genome resequencing of Black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics. 2011, 12: 559. 10.1186/1471-2164-12-559.
- Bickhart DM, Hou Y, Schroeder SG, Alkan C, Cardone MF, Matukumalli LK, Song J, Schnabel RD, Ventura M, Taylor JF, Garcia JF, Van Tassell CP, Sonstegard TS, Eichler EE, Liu GE: Copy number variation of individual cattle genomes using next-generation sequencing. Genome Res. 2012, 22: 778-790. 10.1101/gr.133967.111.
- Dellinger AE, Saw S-M, Goh LK, Seielstad M, Young TL, Li Y-J: Comparative analyses of seven algorithms for copy number variant identification from single nucleotide polymorphism arrays. Nucleic Acids Res. 2010, 38: e105. 10.1093/nar/gkq040.
- Colella S, Yau C, Taylor JM, Mirza G, Butler H, Clouston P, Bassett AS, Seller A, Holmes CC, Ragoussis J: QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data. Nucleic Acids Res. 2007, 35: 2013-25. 10.1093/nar/gkm076.
- Wang K, Li M, Hadley D, Liu R, Glessner J, Grant SFA, Hakonarson H, Bucan M: PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 2007, 17: 1665-1674. 10.1101/gr.6861907.
- Pagnamenta AT, Bacchelli E, De Jonge MV, Mirza G, Scerri TS, Minopoli F, Chiocchetti A, Ludwig KU, Hoffmann P, Paracchini S, Lowy E, Harold DH, Chapman JA, Klauck SM, Poustka F, Houben RH, Staal WG, Ophoff RA, O’Donovan MC, Williams J, Nöthen MM, Schulte-Körne G, Deloukas P, Ragoussis J, Bailey AJ, Maestrini E, Monaco AP: Characterization of a Family with Rare Deletions in CNTNAP5 and DOCK4 Suggests Novel Risk Loci for Autism and Dyslexia. Biol Psychiatry. 2010, 68: 320-328. 10.1016/j.biopsych.2010.02.002.
- Liu GE, Ventura M, Cellamare A, Chen L, Cheng Z, Zhu B, Li C, Song J, Eichler EE: Analysis of recent segmental duplications in the bovine genome. BMC Genomics. 2009, 10: 571. 10.1186/1471-2164-10-571.
- Pagnamenta AT, Wing K, Sadighi Akha E, Knight SJL, Bölte S, Schmötzer G, Duketis E, Poustka F, Klauck SM, Poustka A, Ragoussis J, Bailey AJ, Monaco AP: A 15q13.3 microdeletion segregating with autism. Eur J Hum Genet. 2009, 17: 687-692. 10.1038/ejhg.2008.228.
- Cronin S, Blauw HM, Veldink JH, Van Es MA, Ophoff RA, Bradley DG, Van Den Berg LH, Hardiman O: Analysis of genome-wide copy number variation in Irish and Dutch ALS populations. Hum Mol Genet. 2008, 17: 3392-3398. 10.1093/hmg/ddn233.
- Griswold AJ, Ma D, Cukier HN, Nations LD, Schmidt MA, Chung R-H, Jaworski JM, Salyakina D, Konidari I, Whitehead PL, Wright HH, Abramson RK, Williams SM, Menon R, Martin ER, Haines JL, Gilbert JR, Cuccaro ML, Pericak-Vance MA: Evaluation of copy number variations reveals novel candidate genes in autism spectrum disorder-associated pathways. Hum Mol Genet. 2012, 21: 3513-23. 10.1093/hmg/dds164.
- Leblond CS, Heinrich J, Delorme R, Proepper C, Betancur C, Huguet G, Konyukh M, Chaste P, Ey E, Rastam M, Anckarsäter H, Nygren G, Gillberg IC, Melke J, Toro R, Regnault B, Fauchereau F, Mercati O, Lemière N, Skuse D, Poot M, Holt R, Monaco AP, Järvelä I, Kantojärvi K, Vanhala R, Curran S, Collier DA, Bolton P, Chiocchetti A, Klauck SM, Poustka F, Freitag CM, Waltes R, Kopp M, Duketis E, Bacchelli E, Minopoli F, Ruta L, Battaglia A, Mazzone L, Maestrini E, Sequeira AF, Oliveira B, Vicente A, Oliveira G, Pinto D, Scherer SW, Zelenika D, Delepine M, Lathrop M, Bonneau D, Guinchat V, Devillard F, Assouline B, Mouren M-C, Leboyer M, Gillberg C, Boeckers TM, Bourgeron T: Genetic and Functional Analyses of SHANK2 Mutations Suggest a Multiple Hit Model of Autism Spectrum Disorders. PLoS Genet. 2012, 8: e1002521. 10.1371/journal.pgen.1002521.
- Huang DW, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009, 4: 44-57.
- Liu GE, Hou Y, Zhu B, Cardone MF, Jiang L, Cellamare A, Mitra A, Alexander LJ, Coutinho LL, Dell’Aquila ME, Gasbarre LC, Lacalandra G, Li RW, Matukumalli LK, Nonneman D, De A, Regitano LC, Smith TPL, Song J, Sonstegard TS, Van Tassell CP, Ventura M, Eichler EE, McDaneld TG, Keele JW: Analysis of copy number variations among diverse cattle breeds. Genome Res. 2010, 20: 693-703. 10.1101/gr.105403.110.
- Takahashi T, Yamada O, Soares MJ, Hashizume K: Bovine prolactin-related protein-I is anchored to the extracellular matrix through interactions with type IV collagen. J Endocrinol. 2008, 196: 225-234. 10.1677/JOE-07-0069.
- Wallis M: The molecular evolution of vertebrate growth hormones: a pattern of near-stasis interrupted by sustained bursts of rapid change. J Mol Evol. 1996, 43: 93-100. 10.1007/BF02337353.
- Wallis M: Episodic evolution of protein hormones: molecular evolution of pituitary prolactin. J Mol Evol. 2000, 50: 465-473.
- Durbin RM, Altshuler DL, Abecasis GR: A map of human genome variation from population-scale sequencing. Nature. 2010, 467: 1061-1073. 10.1038/nature09534.
- Jakobsson M, Scholz S, Scheet P, Gibbs J, VanLiere J, Fung H, Szpiech Z, Degnan J, Wang K, Guerreiro R, Bras J, Schymick J, Hernandez D, Traynor B, Simon-Sanchez J, Matarin M, Britton A, Van De Leemput J, Rafferty I, Bucan M, Cann H, Hardy J, Rosenberg N, Singleton A: Genotype, haplotype and copy-number variation in worldwide human populations. Nature. 2008, 451: 998-1003. 10.1038/nature06742.
- Steemers FJ, Chang W, Lee G, Barker DL, Shen R, Gunderson KL: Whole-genome genotyping with the single-base extension assay. Nat Methods. 2006, 3: 31-33. 10.1038/nmeth842.
- Ballif BC, Hornor SA, Jenkins E, Madan-Khetarpal S, Surti U, Jackson KE, Asamoah A, Brock PL, Gowans GC, Conway RL, Graham JM, Medne L, Zackai EH, Shaikh TH, Geoghegan J, Selzer RR, Eis PS, Bejjani BA, Shaffer LG: Discovery of a previously unrecognized microdeletion syndrome of 16p11.2–p12.2. Nat Genet. 2007, 39: 1071-1073. 10.1038/ng2107.
- Diskin SJ, Li M, Hou C, Yang S, Glessner J, Hakonarson H, Bucan M, Maris JM, Wang K: Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms. Nucleic Acids Res. 2008, 36: e126. 10.1093/nar/gkn556.
- Durinck S, Moreau Y, Kasprzyk A, Davis S, De Moor B, Brazma A, Huber W: BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics. 2005, 21: 3439-40. 10.1093/bioinformatics/bti525.
- Benjamini Y, Hochberg Y: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J Roy Stat Soc B. 1995, 57: 289-300.
- Livak KJ, Schmittgen TD: Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2^(−ΔΔCT) Method. Methods. 2001, 25: 402-408. 10.1006/meth.2001.1262.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.