An integrated approach of comparative genomics and heritability analysis of pig and human on obesity trait: evidence for candidate genes on human chromosome 2
© Kim et al.; licensee BioMed Central Ltd. 2012
Received: 7 May 2012
Accepted: 4 December 2012
Published: 19 December 2012
The traditional candidate gene approach has been widely used to study complex diseases, including obesity. However, this approach is largely limited by its dependence on existing knowledge of the presumed biology of the phenotype under investigation. Our combined strategy of comparative genomics and chromosomal heritability estimate analysis of obesity traits, namely subscapular skinfold thickness in Korean cohorts and back-fat thickness in pig (Sus scrofa), may overcome the limitations of candidate gene analysis and allow us to better understand genetic predisposition to human obesity.
We found common genes including FTO, the fat mass and obesity associated gene, identified from significant SNPs by association studies of each trait. These common genes were related to blood pressure and arterial stiffness (P = 1.65E-05) and type 2 diabetes (P = 0.00578). Through the estimation of variance of genetic component (heritability) for each chromosome by SNPs, we observed a significant positive correlation (r = 0.479) between genetic contributions of human and pig to obesity traits. Furthermore, we noted that human chromosome 2 (syntenic to pig chromosomes 3 and 15) was most important in explaining the phenotypic variance for obesity.
Obesity genetics still awaits further discovery. Navigating syntenic regions suggests obesity candidate genes on chromosome 2 that are previously known to be associated with obesity-related diseases: MRPL33, PARD3B, ERBB4, STK39, and ZNF385B.
Keywords: Obesity, Synteny, Comparative genomics, Heritability, Back-fat thickness, Subscapular skinfold thickness, Chromosome 2, Pig, Human
The candidate gene approach has proven to be an extremely powerful and effective method for studying the genetic architecture of complex traits. This approach is, however, criticized for non-replication of results in subsequent association studies. The traditional candidate gene approach is also greatly limited by its reliance on a priori knowledge about the physiological, biochemical or functional aspects of possible candidates, and unfortunately, the detailed molecular anatomy of most complex traits remains veiled. This limitation results in a severe information bottleneck, and comparative genomics serves as an extended strategy to overcome it. This strategy makes use of a cross-species approach to characterize the effect of putative candidate genes [1, 2].
To date, several comparative analyses of human and pig obesity-related traits have confirmed that human obesity genes affect fatness traits in pigs (Additional file 1: Table S1). The pig is an exceptional biomedical model for energy metabolism and obesity in humans. It shares many physiological similarities with humans, which also makes it an optimal species for preclinical experimentation. In particular, the back-fat thickness (BFT) of the pig is closely related to total body fat, so the study of this trait can provide a unique approach to understanding the causes of human obesity. Fontanesi et al. found an association of an SNP in FTO with intermuscular fat deposition in Italian Duroc pigs, which was confirmed in their later study and in a study by Fan et al. [6–8]. In addition, Du and collaborators found an association between TCF7L2, a type 2 diabetes gene, and BFT. Nowacka-Woszuk et al. mapped 13 candidate genes in the pig genome and found that most of them were located within known quantitative trait loci (QTL), supporting their association with pig fatness traits.
Nevertheless, published comparative studies still do not break through the information bottleneck, because they start with the presumed biology of one species and apply it to another. We therefore previously conducted local de novo genomic sequencing of a porcine QTL region affecting fatness traits, carried out an SNP association study for BFT, and related the result to a human association study of subscapular skinfold thickness (SUB). That study allowed us to extend the QTL results observed in pig to common forms of human obesity, but it was still a hypothesis-driven genetic approach, as it considered only a known QTL region instead of the whole genome. To overcome the lack of thoroughness and inclusiveness for which the candidate gene approach is criticized, we conducted genome-wide comparative studies on traits related to common forms of obesity.
Our study is also an integrated analysis, as it adopts the concept of population-based heritability estimates, which can provide a valuable metric of available genetic risk information. The GCTA (Genome-wide Complex Trait Analysis) tool implements a method for estimating the variance explained by all SNPs and extends the method to partition the genetic variance onto each of the chromosomes. By fitting the effects of all the SNPs as random effects in a mixed linear model (MLM), this tool partially addresses the “missing heritability” problem caused by the inability to detect a large number of common variants with small effects and rare variants with large effects [12, 13].
We integrated heritability analysis and a comparative genomics strategy both to identify causal genetic factors in the pig genome and to expand the knowledge of genetic risk factors predisposing to common forms of obesity in humans. This combined strategy should provide a more powerful and comprehensive means to counter the criticism of candidate gene studies.
The prevalence of obesity has increased greatly; it has tripled in the last five decades in America, and over 400 million people are obese [14, 15]. The amount of fat in the body, or adiposity, is regulated as part of energy homeostasis, which is controlled by circulating signals related to the size of the fat mass (adiposity signals) integrated with signals from the gastrointestinal system (satiety signals). Adiposity signals are connected through central autonomic pathways to centers that process satiety signals such as cholecystokinin (CCK). These integrated signals are known to regulate meal size and body fat. Obesity can cause various health problems, including type 2 diabetes, cardiovascular diseases, and hypertension. For instance, it can lead to the development of insulin resistance, one of the causes of pancreatic islet β-cell dysfunction and apoptosis, resulting in progression to impaired glucose tolerance, followed by an increased risk of type 2 diabetes [9, 16].
Because there is considerable evidence that obesity, a worldwide epidemic, is highly heritable, numerous studies, including genome-wide association studies (GWAS), have elucidated much of the genetic architecture of obesity. Despite extensive efforts to investigate obesity at the gene and nucleotide levels, the substantial heritability estimate of 40 to 70% indicates that further research is still needed. To this end, we analyzed obesity-related traits of human and pig on a genome-wide scale in a cross-species approach to identify potential susceptibility genes for human obesity. We present the results of our combined approach of comparative genomics and chromosomal heritability estimation in an effort to elucidate the genetic basis of human obesity.
We used the US National Center for Biotechnology Information (NCBI) site as the source of the H. sapiens genomic sequence (version GRCh37.p5) and Sscrofa10.2 for Pig (Sus scrofa) genome.
The pig genome was mapped onto the human genome using a large-scale alignment tool, LAST. The chromosomal summary of autosomal pig SNPs mapped onto Sscrofa10.2 is described in Additional file 2: Table S5. We used the default DNA scoring scheme: a match score of 1, a mismatch cost of 1, a gap existence cost of 7, and a gap extension cost of 1. The minimum alignment score was set at 150. After sorting the alignments with the “maf-sort.sh” script, we applied the “maf-cull” step to remove redundant alignments. The synteny map was drawn with the Circos software using the Bundling Links function. We considered only autosomal chromosomes for this study, and the number of syntenic SNPs of pig and human for each chromosome is described in Additional file 2: Table S4.
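To make the scoring scheme concrete, the following is a minimal sketch that scores an already-aligned pair of sequences with the parameters quoted above. It is not part of the LAST software; the function names are hypothetical, and the sketch assumes an affine gap model in which a gap of length k costs 7 + k × 1.

```python
# Scoring parameters quoted in the text (default DNA scoring scheme).
MATCH_SCORE = 1
MISMATCH_COST = 1
GAP_EXISTENCE_COST = 7
GAP_EXTENSION_COST = 1
MIN_ALIGNMENT_SCORE = 150


def alignment_score(top: str, bottom: str) -> int:
    """Score two equal-length aligned rows; '-' marks a gap position.

    Assumption: a gap of length k costs GAP_EXISTENCE_COST + k * GAP_EXTENSION_COST.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    score = 0
    open_gap_row = None  # row that currently holds an open gap, if any
    for a, b in zip(top.upper(), bottom.upper()):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if a == "-" or b == "-":
            row = "top" if a == "-" else "bottom"
            if open_gap_row != row:
                # A new gap starts here, so charge the existence cost once.
                score -= GAP_EXISTENCE_COST
                open_gap_row = row
            score -= GAP_EXTENSION_COST
        else:
            open_gap_row = None
            score += MATCH_SCORE if a == b else -MISMATCH_COST
    return score


def passes_threshold(top: str, bottom: str, minimum: int = MIN_ALIGNMENT_SCORE) -> bool:
    """Return True if the alignment reaches the minimum score used in the text."""
    return alignment_score(top, bottom) >= minimum
```

Under this scheme an ungapped alignment needs at least 150 more matches than mismatches to be retained, which restricts the synteny map to fairly long, well-conserved segments.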
The genomic DNAs of pig were genotyped on the Illumina Porcine 60 K SNP Beadchip. We discarded the markers with low MAF (<0.01), significant deviation from Hardy-Weinberg equilibrium (P < 10⁻³), and low genotype call rate (<95%). This quality-control process left 45,013 autosomal SNPs. The SNP probes were mapped on the Sus scrofa genome 10.2 from the NCBI FTP site using BLAT.
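The three marker filters can be sketched for a single SNP as follows. This is an illustrative sketch rather than the pipeline used in the study: the function names are hypothetical, genotypes are assumed to be coded as 0, 1 or 2 allele copies with None for a missing call, and the Hardy-Weinberg check uses a 1-degree-of-freedom chi-square test as a simple stand-in (tools such as PLINK typically use an exact test).

```python
import math


def snp_qc_stats(genotypes):
    """Return call rate, minor allele frequency and an HWE chi-square P value.

    genotypes: list of 0/1/2 allele counts, with None for missing calls.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes) if genotypes else 0.0
    if n == 0:
        return {"call_rate": call_rate, "maf": 0.0, "hwe_p": 1.0}
    observed = [called.count(0), called.count(1), called.count(2)]
    p = (observed[1] + 2 * observed[2]) / (2 * n)  # frequency of the counted allele
    maf = min(p, 1 - p)
    if maf == 0:
        hwe_p = 1.0  # monomorphic marker: the test is undefined, MAF filter removes it
    else:
        expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Survival function of a chi-square distribution with 1 degree of freedom.
        hwe_p = math.erfc(math.sqrt(chi2 / 2))
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p}


def passes_qc(genotypes, maf_min=0.01, hwe_p_min=1e-3, call_rate_min=0.95):
    """Apply the thresholds quoted in the text for the pig data."""
    stats = snp_qc_stats(genotypes)
    return (
        stats["maf"] >= maf_min
        and stats["hwe_p"] >= hwe_p_min
        and stats["call_rate"] >= call_rate_min
    )
```

The defaults follow the pig thresholds given above; a different data set would pass its own cut-offs through the keyword arguments.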
Using the synteny information provided by the LAST alignment tool, we filled the regions of each human chromosome with the corresponding DNA segments of pig and the SNPs located on them. In this way, we defined the pig genome at the level of human chromosomes.
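The regrouping of pig SNPs by human chromosome can be sketched as an interval lookup. The data layout below (tuples of pig chromosome, start, end and human chromosome) and the function name are assumptions made for illustration, not the format produced by LAST.

```python
from collections import defaultdict


def assign_snps_to_human_chromosomes(blocks, snps):
    """Group pig SNPs by the human chromosome their syntenic block maps to.

    blocks: iterable of (pig_chr, pig_start, pig_end, human_chr) tuples (hypothetical layout).
    snps:   iterable of (snp_id, pig_chr, pig_position) tuples.
    SNPs that fall outside every block are left unassigned.
    """
    blocks_by_pig_chr = defaultdict(list)
    for pig_chr, start, end, human_chr in blocks:
        blocks_by_pig_chr[pig_chr].append((start, end, human_chr))

    assigned = defaultdict(list)
    for snp_id, pig_chr, position in snps:
        for start, end, human_chr in blocks_by_pig_chr.get(pig_chr, []):
            if start <= position <= end:
                assigned[human_chr].append(snp_id)
                break  # use the first block that contains the SNP
    return dict(assigned)
```

With this grouping, pig SNPs on blocks of pig chromosomes 3 and 15 that align to human chromosome 2 would be collected under the same human chromosome key.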
The genomic DNAs of human were genotyped on the Affymetrix Genome-Wide Human SNP array 5.0 containing 500,568 SNPs. Markers (GRCh37) with a high missing genotype call rate (>5%), low MAF (<0.01) or significant deviation from Hardy-Weinberg equilibrium (P < 10⁻⁶) were excluded, leaving a total of 326,262 markers to be examined in 8,842 individuals.
We analyzed an inbred Berkshire population, in which a total of 14 meat quality traits were measured. The traits include back-fat thickness, carcass weight, meat pH, meat color, muscle shear force, drip loss, heat loss, water holding capacity, and intramuscular fat content. The back-fat thickness, which we used for this study, was measured between the 10th and 11th ribs. The phenotype (back-fat thickness) was adjusted for the age effect using the linear model y = b0 + b1 × age + e, and the residuals were then standardized to z-scores in each sex group separately. We used the age (in days) at the time of slaughter. After excluding samples without age information, we examined 697 samples.
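The phenotype adjustment can be sketched as an ordinary least-squares fit of y = b0 + b1 × age + e followed by standardization of the residuals. The function name and record layout are hypothetical, and the sketch assumes that both the regression and the standardization are carried out within each sex group, which is one reading of the description above.

```python
import statistics
from collections import defaultdict


def age_adjusted_zscores(records):
    """Return {sample_id: z-score of the age-adjusted phenotype}.

    records: iterable of (sample_id, sex, age_days, phenotype) tuples.
    Assumption: the model y = b0 + b1 * age + e is fitted separately per sex.
    """
    rows_by_sex = defaultdict(list)
    for record in records:
        rows_by_sex[record[1]].append(record)

    zscores = {}
    for rows in rows_by_sex.values():
        ages = [r[2] for r in rows]
        values = [r[3] for r in rows]
        mean_age = statistics.fmean(ages)
        mean_value = statistics.fmean(values)
        sxx = sum((a - mean_age) ** 2 for a in ages)
        sxy = sum((a - mean_age) * (y - mean_value) for a, y in zip(ages, values))
        b1 = sxy / sxx if sxx else 0.0
        b0 = mean_value - b1 * mean_age
        residuals = [y - (b0 + b1 * a) for a, y in zip(ages, values)]
        sd = statistics.stdev(residuals)  # needs at least two samples per group
        if sd == 0:
            raise ValueError("residuals have zero variance in one sex group")
        for row, residual in zip(rows, residuals):
            # OLS residuals already have mean zero, so dividing by sd gives a z-score.
            zscores[row[0]] = residual / sd
    return zscores
```

Standardizing within each sex puts the adjusted phenotypes of both groups on a common scale with mean 0 and standard deviation 1 before the association and heritability analyses.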
The subscapular skinfold thickness (human) was also adjusted by age and standardized in each gender and area group (rural and urban cities). Subscapular skinfold thickness is a measurement for upper body fat distribution. It is measured just below the angle of the left scapula with the fold either in a vertical line or slightly inclined. The sampling base for both cohorts is in Gyeonggi Province, close to the capital of the Republic of Korea. We used the data of Korean cohorts (KARE) of 8,842 individuals aged 40 to 69 and analyzed 8,801 samples that had SUB phenotype data available.
GCTA & GWAS
We used the GCTA tool to calculate heritability for SUB and BFT. We calculated the genetic relationship matrix (GRM) between all pairs of samples from all the autosomal SNPs with the “make-grm” option. We then estimated the variance of the genetic component, or heritability, for each trait by restricted maximum likelihood analysis. We also estimated the variance explained by each chromosome using a joint analysis (with the multiple GRMs option).
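As an illustration of what a genetic relationship matrix contains, the sketch below computes pairwise relationships from SNP genotypes standardized by allele frequency, following the off-diagonal estimator described in the GCTA paper [12]. It is a simplified sketch, not GCTA itself: the function name is hypothetical, missing genotypes are not handled, and the diagonal uses the same formula as the off-diagonal elements, whereas GCTA uses a modified estimator for the diagonal.

```python
def genetic_relationship_matrix(genotypes):
    """Return an n x n relationship matrix as nested lists.

    genotypes[j][i]: number of copies (0, 1 or 2) of the reference allele
    carried by individual j at SNP i. Missing calls are not supported here.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    freqs = [sum(row[i] for row in genotypes) / (2 * n_ind) for i in range(n_snp)]
    # Monomorphic SNPs carry no information and would divide by zero.
    usable = [i for i, p in enumerate(freqs) if 0 < p < 1]
    if not usable:
        raise ValueError("no polymorphic SNPs available")

    grm = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(j + 1):
            total = 0.0
            for i in usable:
                p = freqs[i]
                total += (
                    (genotypes[j][i] - 2 * p)
                    * (genotypes[k][i] - 2 * p)
                    / (2 * p * (1 - p))
                )
            grm[j][k] = grm[k][j] = total / len(usable)
    return grm
```

Restricting the SNP columns to those of one chromosome gives a chromosome-specific matrix, which is the kind of input a joint analysis with multiple GRMs partitions the genetic variance over.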
Linear regression analysis was performed under an additive model with the data adjusted for sex and age, using the “linear” option of PLINK. GWAS have important limitations, such as the potential for false-positive and false-negative results and for biases related to the selection of study participants and genotyping errors. Hence, the result of a single GWAS should be interpreted with care. Taking these factors into consideration, we investigated markers that were significant in both human and pig, although the pig data had their own limitations of a relatively small population size and a low number of SNPs. The Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 was used to determine gene-disease associations [25, 26].
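The additive model tests each SNP by regressing the adjusted phenotype on the allele count. The sketch below shows the calculation for one SNP; it is an illustration rather than a reimplementation of PLINK, the function name is hypothetical, and the phenotype is assumed to have been adjusted for sex and age beforehand. The P value, which would be obtained from a t distribution with n − 2 degrees of freedom, is left out to keep the sketch free of external libraries.

```python
import math


def additive_association(dosages, phenotypes):
    """Regress phenotype on allele count (0, 1 or 2) for a single SNP.

    Returns (beta, standard_error, t_statistic) from simple linear regression.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx  # change in phenotype per additional allele copy
    intercept = mean_y - beta * mean_x
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, phenotypes))
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    if standard_error == 0:
        t_statistic = math.copysign(math.inf, beta) if beta else 0.0
    else:
        t_statistic = beta / standard_error
    return beta, standard_error, t_statistic
```

Running this once per SNP in each species, and then intersecting the genes near the significant SNPs, corresponds to the comparison of human and pig association results described above.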
Results and discussion
Common genes - disease association enriched by DAVID tool
Framingham Heart Study 100 K Project: genome-wide associations for blood pressure and arterial stiffness. Genes: SLC9A9, CDH13, CAMK4, TNR, GPC6, EXOC4, C14ORF118
Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Genes: FTO, JAZF1, TSPAN8, CDKAL1, TCF7L2, CAMK1D
Genes identified on chromosome 2 that possibly alter obesity risk
Mammalian mitochondrial ribosomal proteins; Type 2 Diabetes Mellitus
zinc finger protein; Sudden cardiac arrest
cell cycle/cell division; Type 2 Diabetes, chronic kidney disease
Tyr protein kinase
Measurement error in assessing skinfold thickness in humans may be considerably large, and this likely resulted in lower statistical power than that for pigs. Because we analyzed only the syntenic regions of the genome of each species with a limited sample size, the sampling error of the heritability estimate was increased and the power of the conventional GWAS was reduced. Also, the extent of linkage disequilibrium in pig is known to be much larger than that in human, and this can be a confounding effect when detecting causal variants in pig. Finally, the sequence difference between species, especially in noncoding regions, is so large that prospective SNPs might not have been identified.
Racial differences in genetic effects for complex traits are frequently debated in clinical and molecular research, and thus our results might have differed if European cohorts had been considered instead of Korean cohorts. As much of the work in GWAS has focused on European populations, extending GWAS to different populations may provide new discoveries. In addition, various fat-related traits, including BMI in human and intramuscular fat content in pig, can be evaluated in this type of comparative study; however, to make a direct comparison of common traits between the two species possible, we focused our research specifically on subscapular skinfold thickness and back-fat thickness.
This work demonstrates a new approach to comparative study, as it adopts an important parameter in genetics, heritability. Heritability is the proportion of phenotypic variation that is attributed to genetic components and thus provides insights into the biological significance of a given trait. We observed that human chromosome 2 (syntenic to SSC 3 and 15) explained the largest proportion of heritability for common obesity traits. Therefore, we hypothesized that chromosome 2 is crucial for the remaining complexities of the genetic architecture of obesity. Based on this knowledge, we further investigated this chromosome to suggest candidate markers and genes that possibly control obesity.
SUB: Subscapular skinfold thickness
This work was supported by grants (PJ008068 and PJ008116) from Next-Generation BioGreen 21 Program, Rural Development Administration, Republic of Korea. It was also supported by the Swine Genome Sequencing Consortium.
- Zhu M, Zhao S: Candidate gene identification approach: progress and challenges. Int J Biol Sci. 2007, 3 (7): 420.
- Tabor HK, Risch NJ, Myers RM: Candidate-gene approaches for studying complex genetic traits: practical considerations. Nat Rev Genet. 2002, 3 (5): 391-396. 10.1038/nrg796.
- Lee KT, Byun MJ, Kang KS, Park EW, Lee SH, Cho S, Kim HY, Kim KW, Lee TH, Park JE: Neuronal genes for subcutaneous fat thickness in human and pig are identified by local genomic sequencing and combined SNP association study. PLoS One. 2011, 6 (2): e16356. 10.1371/journal.pone.0016356.
- Vodička P, Smetana JRK, Dvořánková B, Emerick T, Xu YZ, Ourednik J, Ourednik V, Motlík J: The miniature pig as an animal model in biomedical research. Ann N Y Acad Sci. 2005, 1049 (1): 161-171. 10.1196/annals.1334.015.
- Houpt KA, Houpt TR, Pond WG: The pig as a model for the study of obesity and of control of food intake: a review. Yale J Biol Med. 1979, 52 (3): 307.
- Fontanesi L, Scotti E, Buttazzoni L, Davoli R, Russo V: The porcine fat mass and obesity associated (FTO) gene is associated with fat deposition in Italian Duroc pigs. Anim Genet. 2009, 40 (1): 90-93. 10.1111/j.1365-2052.2008.01777.x.
- Fontanesi L, Scotti E, Buttazzoni L, Dall’Olio S, Bagnato A, Lo Fiego DP, Davoli R, Russo V: Confirmed association between a single nucleotide polymorphism in the FTO gene and obesity-related traits in heavy pigs. Mol Biol Rep. 2010, 37 (1): 461-466. 10.1007/s11033-009-9638-8.
- Fan B, Du ZQ, Rothschild MF: The fat mass and obesity-associated (FTO) gene is associated with intramuscular fat content and growth rate in the pig. Anim Biotechnol. 2009, 20 (2): 58-70. 10.1080/10495390902800792.
- Du ZQ, Fan B, Zhao X, Amoako R, Rothschild MF: Association analyses between type 2 diabetes genes and obesity traits in pigs. Obesity. 2008, 17 (2): 323-329.
- Nowacka-Woszuk J, Szczerbal I, Fijak-Nowak H, Switonski M: Chromosomal localization of 13 candidate genes for human obesity in the pig genome. J Appl Genet. 2008, 49 (4): 373-377. 10.1007/BF03195636.
- Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A: Finding the missing heritability of complex diseases. Nature. 2009, 461 (7265): 747-753. 10.1038/nature08494.
- Yang J, Lee SH, Goddard ME, Visscher PM: GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011, 88 (1): 76-82. 10.1016/j.ajhg.2010.11.011.
- Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW: Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010, 42 (7): 565-569. 10.1038/ng.608.
- Thorleifsson G, Walters GB, Gudbjartsson DF, Steinthorsdottir V, Sulem P, Helgadottir A, Styrkarsdottir U, Gretarsdottir S, Thorlacius S, Jonsdottir I: Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat Genet. 2008, 41 (1): 18-24.
- Woods S, Seeley R: Understanding the physiology of obesity: review of recent developments in obesity research. Int J Obes Relat Metab Disord. 2002, 26: S8. 10.1038/sj.ijo.0802211.
- Kahn SE, Hull RL, Utzschneider KM: Mechanisms linking obesity to insulin resistance and type 2 diabetes. Nature. 2006, 444 (7121): 840-846. 10.1038/nature05482.
- Bochukova EG, Huang N, Keogh J, Henning E, Purmann C, Blaszczyk K, Saeed S, Hamilton-Shield J, Clayton-Smith J, O’Rahilly S: Large, rare chromosomal deletions associated with severe early-onset obesity. Nature. 2009, 463 (7281): 666-670.
- Comuzzie AG, Allison DB: The search for human obesity genes. Science. 1998, 280 (5368): 1374-1377. 10.1126/science.280.5368.1374.
- Kiełbasa SM, Wan R, Sato K, Horton P, Frith MC: Adaptive seeds tame genomic sequence comparison. Genome Res. 2011, 21 (3): 487-493. 10.1101/gr.113985.110.
- Krzywinski M, Schein J, Birol İ, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA: Circos: an information aesthetic for comparative genomics. Genome Res. 2009, 19 (9): 1639-1645. 10.1101/gr.092759.109.
- Kent WJ: BLAT—the BLAST-like alignment tool. Genome Res. 2002, 12 (4): 656-664.
- Tanner JM, Whitehouse RH: Revised standards for triceps and subscapular skinfolds in British children. Arch Dis Child. 1975, 50 (2): 142-145. 10.1136/adc.50.2.142.
- Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, De Bakker PIW, Daly MJ: PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007, 81 (3): 559-575. 10.1086/519795.
- Pearson TA, Manolio TA: How to interpret a genome-wide association study. JAMA. 2008, 299 (11): 1335-1344. 10.1001/jama.299.11.1335.
- Huang DW, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2008, 4 (1): 44-57. 10.1038/nprot.2008.211.
- Huang DW, Sherman BT, Lempicki RA: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009, 37 (1): 1-13. 10.1093/nar/gkn923.
- Yerle M, Lahbib-Mansais Y, Pinton P, Robic A, Goureau A, Milan D, Gellin J: The cytogenetic map of the domestic pig (Sus scrofa domestica). Mamm Genome. 1997, 8 (8): 592-607. 10.1007/s003359900512.
- Williams F: Reasoning with statistics. 1979, New York: Holt, Rinehart and Winston, vol. 124
- Comuzzie AG, Hixson JE, Almasy L, Mitchell BD, Mahaney MC, Dyer TD, Stern MP, MacCluer JW, Blangero J: A major quantitative trait locus determining serum leptin levels and fat mass is located on human chromosome 2. Nat Genet. 1997, 15 (3): 273-276.
- Rotimi CN, Comuzzie AG, Lowe WL, Luke A, Blangero J, Cooper RS: The quantitative trait locus on chromosome 2 for serum leptin levels is confirmed in African-Americans. Diabetes. 1999, 48 (3): 643-644. 10.2337/diabetes.48.3.643.
- Rasche A, Al-Hasani H, Herwig R: Meta-analysis approach identifies candidate genes and associated molecular networks for type-2 diabetes mellitus. BMC Genomics. 2008, 9 (1): 310. 10.1186/1471-2164-9-310.
- Wang Y, O’Connell JR, McArdle PF, Wade JB, Dorff SE, Shah SJ, Shi X, Pan L, Rampersaud E, Shen H: Whole-genome association study identifies STK39 as a hypertension susceptibility gene. Proc Natl Acad Sci. 2009, 106 (1): 226-231. 10.1073/pnas.0808358106.
- Adeyemo A, Gerry N, Chen G, Herbert A, Doumatey A, Huang H, Zhou J, Lashley K, Chen Y, Christman M: A genome-wide association study of hypertension and blood pressure in African Americans. PLoS Genet. 2009, 5 (7): e1000564. 10.1371/journal.pgen.1000564.
- Levy D, Ehret GB, Rice K, Verwoert GC, Launer LJ, Dehghan A, Glazer NL, Morrison AC, Johnson AD, Aspelund T: Genome-wide association study of blood pressure and hypertension. Nat Genet. 2009, 41 (6): 677-687. 10.1038/ng.384.
- Bradley A, Eric V, Stacy M, Ludmila P, Pui-Yan K, Jeffrey O, Zian T: GWAS for discovery and replication of genetic loci associated with sudden cardiac arrest in patients with coronary artery disease. BMC Cardiovasc Disord. 2011, 11 (1): 29. 10.1186/1471-2261-11-29.
- Below J, Gamazon E, Morrison J, Konkashbaev A, Pluzhnikov A, McKeigue P, Parra E, Elbein S, Hallman D, Nicolae D: Genome-wide association and meta-analysis in populations from Starr County, Texas, and Mexico City identify type 2 diabetes susceptibility loci and enrichment for expression quantitative trait loci in top signals. Diabetologia. 2011, 54 (8): 2047-2055. 10.1007/s00125-011-2188-3.
- Miettinen PJ, Ustinov J, Ormio P, Gao R, Palgi J, Hakonen E, Juntti-Berggren L, Berggren PO, Otonkoski T: Downregulation of EGF receptor signaling in pancreatic islets causes diabetes due to impaired postnatal β-cell growth. Diabetes. 2006, 55 (12): 3299-3308. 10.2337/db06-0413.
- Zhang C, Plastow G: Genomic diversity in Pig (Sus scrofa) and its comparison with human and other livestock. Curr Genomics. 2011, 12 (2): 138. 10.2174/138920211795564386.
- Ioannidis JPA, Ntzani EE, Trikalinos TA: ‘Racial’ differences in genetic effects for complex diseases. Nat Genet. 2004, 36 (12): 1312-1318. 10.1038/ng1474.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.