Mouse husbandry
Mice were housed in polycarbonate cages for a total of 6 weeks after weaning. Water and Purina LabDiet® 5008 chow (23.5% protein; Purina Mills Inc., St Louis MO) were offered ad libitum. A constant temperature (21°C ± 2°C) and humidity (40-70%) were maintained. Mice received 14 h of light, starting at 7:00 AM, and 10 h of dark. Mice were weighed at 14 days, and weaned at 21 days. Tail clips for DNA extractions were collected between 3 and 6 weeks of age. Animals were managed according to the guidelines of the American Association for Accreditation of Laboratory Animals and the Institutional Animal Care and Use Committee (IACUC).
Development of congenic and subcongenic strains and F2 intercrosses
Five overlapping subcongenic strains were developed by selecting a five different recombinant mice from the HG2D F2 cross (Figure 1) [9]. Recombinant male mice were selected and used as founders of HG.CAST–(D2Mit43–D2Mit103) (referred to as HG2D-1), HG.CAST – (D2Mit420 – D2Mit490) (referred to as HG2D-2), and HG.CAST – (D2Mit456 – D2Mit148) (referred to as HG2D-5) strains. Recombinant female mice were selected and used as founders of the HG.CAST – (rs27385876 – rs27703094) (referred to as HG2D-4) and HG.CAST – (D2Mit194 – rs27335891) (referred to as HG2D-3) strains. Recombinant female mice were backcrossed to HG males to produce heterozygous congenic males. A male from these backcrosses was selected to propagate the strain.
To characterize the genotypic effects of the congenic region, F2 crosses were developed for each congenic strain. A total of 1,990 F2 mice were analyzed from five subcongenic strains, 384 for HG2D-1, 208 from HG2D-2, 382 from HG2D-3, 353 from HG2D-4 and 385 from HG2D-5; and 278 from (the founder HG2D founder strain previously analyzed by Farber and Medrano (2007b). Data from all congenics was merged and the analysis was carried out in the merged data set. Development and original analysis of the HG2D founder congenic strain is described in Farber and Medrano [9]. Sex ratios and genotype frequencies were tested with a χ2 test with 1 degree of freedom and did not differ from expected segregation ratios in any of the subcongenic strains (p > 0.05).
Genotyping
Each subcongenic strain was initially genotyped with three microsatellite markers per strain (Additional file 3) using DNA isolated from Proteinase K digested tail clips and genotyped as described by Farber and Medrano [30]. The PCR reactions contained approximately 100ng of DNA (5 μl of diluted lysate), 0.1 units of Taq DNA Polymerase (ABI), 1X PCR Buffer (ABI), 1.5 to 2.0 mM of MgCl2 (Invitrogen, Carlsbad, CA), 0.17 mM of each dNTP (Invitrogen), 1μM of each primer in a total volume of 10 μl. PCR products were analyzed in 4% 0.5 TBE agarose gels containing 0.06 μg/ml EtBr.
To increase resolution for fine mapping after the initial QTL scans with microsatellite data, 366 mice with recombination events between 145 and 181 Mb from the HG2D founder, HG2D-3, HG2D-4 and HG2D-5 strains were genotyped with 48 SNP markers from 145 Mb to the end of MMU 2 at an average density of 1 SNP/~600 Kb based on the Genome reference mm9. Genotyping was carried out using the Sequenom MassArray® platform at GeneSeek Inc. (Lincoln, NE). DNA for SNP genotyping was purified by first incubating 100 μL of undiluted tail lysate with RNAse A at 37°C for 30 min. DNA was then precipitated with 200 μL of cold ethanol (EtOH), and centrifuged at 1300 RPM at 4°C for 15 min. The remaining pellets were washed with 300 μL of 70% EtOH, dried in a SpeedVac centrifuge for 4 min and resuspended in 32 μL of 10mM Tris pH 8. Purified DNA was diluted as necessary to keep the DNA concentration between 30 and 60 ng/μL.
Phenotypic characterization
Phenotypes were selected based on their relevance to body weight (BW) and body fat. Mice were weighed to the nearest 0.1 g at 2, 3, 6 and 9 weeks of age and prior to sacrifice, 63 ± 4 days, (SAC). At sacrifice mice were anesthetized with isoflourane until they lost consciousness and were then placed over a grid without stretching to measure their lengths. Following sacrifice, dissection of the femoral (FFP; subcutaneous fat on the outer thigh), gonadal (GFP; interstitial fat surrounding the testis or uterus and ovaries), mesenteric (MFP; intraperitoneal fat surrounding the gastro-intestinal tract from the duodenum to the start of the rectum) and retroperitoneal (RFP; fat behind the kidney and along the lumbar muscle) fat pads. Tissue weights were collected immediately after dissection. Total Fat mass (TF) was calculated based on the data as the sum of all fat pads. Gonadal Fat Pad, whole brain, pituitary, gastrocnemius muscle and liver were snap frozen in liquid nitrogen and stored at -80o C for future RNA extractions. Additionally collected trunk blood, and weighed liver, spleen, kidney, testis, empty carcass (skinned carcass without organs, fat, gastrocnemius, or tail), and gastrocnemius muscle; and femurs were measured to the nearest 0.1 mm with a Vernier scale. Mice were dissected in accordance with the University of California, Davis IACUC approved protocols. Genotype and phenotype data supporting this work are available at https://github.com/RodrigoGM/Mmu2QTL.git (doi:10.5281/zenodo.12793).
RNA isolation and cDNA preparation
Total RNA was extracted from whole brain and GFP using TRIzol® (Invitrogen, Carlsbad, CA) according to manufacturer’s protocol. Whole brain and GFP were homogenized in 2 ml TRIzol® using a Mini BeadBeater–8 for 5 to 7 sec. For Real Time PCR, complementary DNA (cDNA) was prepared by taking 5 μg of total RNA from brain or GFP and incubating it with DNAse I (Ambion, Austin, TX) to remove any DNA contamination. Then first strand cDNA was synthesized using Superscript III® (Invitrogen, Carlsbad, CA) with poly-T and random primers according to manufacturer’s protocol. A final RNAse H (Ambion) incubation was done to eliminate single stranded RNA.
Traits and statistical analyses
Characterization of subcongenic strains for body weight and body fat traits
Prior to QTL analysis, all subcongenics were independently analyzed using multiple linear regression models for GFP, MFP, RFP, FFP and TF traits. A detailed description of model selection, statistical analysis performed for each subcongenic strains is given in Additional file 1 and LS Means for all fat traits for males and females are in Table 1.
QTL Analysis of body fat using congenic and subcongenic strains
Linkage mapping using interval mapping
QTL analyses were performed on the combined body fat data from the five HG2D subcongenic F2 intercrosses (described above), and the HG2D founder F2 intercross [9]. Initially, linkage analyses were performed in a stepwise procedure first using microsatellite genotypic data (Additional file 2: Figure S2), and then combining genotypic data from 48 SNPs to increase our mapping resolution of the peaks detected with microsatellites (see Genotyping methods). Covariates to correct for known environmental effects were selected based on our Subcongenic Analysis (Additional file 1).
Body fat phenotypes were adjusted for strain, sex and SAC weight. Residuals were used for the QTL analysis using the R/qtl package of the R Language and Environment [31,32]. Genotype probabilities were calculated using the calc.genoprob function at 0.1 Mb intervals. This is known as a step size and is further utilized in the fine mapping stage (see next section). Initial linkage analysis was performed using Interval Mapping (IM) on the adjusted phenotypes with the scanone function using Haley-Knott regression [33,34] over the physical map (NCBI37/mm9 assembly), since a genetic map could not be calculated accurately from the combined genotypic data. Confidence intervals were estimated with the bayesint function at a probability of 0.95. No specific sex × QTL interaction was detected. Significance thresholds were calculated by 1,000 permutations of each trait to estimate a LOD score to declare significance at α = 0.05 and α = 0.01 [35].
Fine mapping QTL using composite interval mapping
To reduce the critical region of the Total Fat (TF) QTL composite interval mapping (CIM) [36,37] was used by applying the cim function of R/qtl using Haley-Knott regression, with a 2 Mb window and 0.2 Mb step size. Confidence intervals for the CIM were estimated using the bayesint function at a probability of 0.95. The cim function in R/qtl imputes missing genotypes using adjacent marker information and uses different markers as covariates in every run. This lead to an unstable location of the QTL peak in multiple sequential CIM runs. To accommodate this instability, the CIM analysis was replicated 400 times with a 2 Mb window (where three markers are used as covariates), and a 0.2 Mb step size in order to identify the most likely QTL peak location on the high density marker map. The Median LOD of the 400 replicates was used to summarize the replicates as the LOD score distribution at the tested marker were skewed due to changes in the imputed genotypes of missing markers being tested or used as covariates (data not shown). In addition, replicated CIM analyses for TF were repeated with a step size of 0, 1, and 0.5 Mb, and a window size of 1, 0.5, and 0.25 Mb arranged in a 3 × 3 factorial design to identify the most frequent location of the Fatq2b QTL peak. In total CIM was replicated 4,000 times (400 times with the initial parameters, and 400 times for each of the 9 factorial groups). The first replicates of CIM with a window of 2 Mb and step of 0.2 Mb serve as a baseline to compare the 3 × 3 factorial. This approach will be referred here as replicated CIM. A significance threshold value for the replicated CIM was not considered. This is because at this stage replicated CIM was used to pin point the most likely location of the peak within an already mapped QTL that is isolated within a 10 Mb congenic strain with phenotypic differences in body fat [38].
Analysis of gene expression using microarrays
The results of Verdugo et al. [16], corresponding to the GSE22042 dataset were used to screen for genes with differential expression in the 2.3 Mb Fatq2b interval. This dataset is a microarray experiment where global gene expression of three tissues (GFP, whole brain, and liver) from 4 cast/cast and 4 b6/b6 F2 mice that were non-recombinant for the entire HG2D congenic donor region was compared. The details of the analysis performed on the microarray data is described in Verdugo et al. [16]. We considered all genes within the confidence intervals of the Fatq2b QTL as differentially expressed genes if p ≤ 0.05 for the genotype effect. The p-values were not corrected for multiple comparisons since we are focused on specific genomic locations and wanted to maximize the number of genes to verify with real time qPCR (RT-qPCR).
Analysis of differential expression in Fatq2b candidate genes using RT-PCR
To determine which of our primary candidate genes for Fatq2b were differentially expressed, gene expression was quantified in Atp5e, Ctsz, Gnas, Rab22a and Stx16, and two housekeeping Gus and SDHA. Gene expression was quantified using real time PCR in 20 b6/b6 (7 females, 13 males), 20 b6/cast (11 females, 9 males) and 20 cast/cast (11 females, 9 males) mice of the HG2D-4 strain, and 21 b6/b6 (11 females, 10 males), 20 b6/cast (10 females, 10 males) and 19 cast/cast (9 females, 10 males) of the HG2D-5 using FAST SYBR Green® or “Standard” SYBR Green® detectors for brain and GFP, respectively. Default PCR cycling times were used for each detector. Reactions were optimized to 1X FAST (or standard) SYBR Green® Master Mix (Applied Biosystems, Foster City, CA), 200nM of each primer, and 50 ng of cDNA in a total volume of 12 μL for the HG2D-4 samples and 20 μL for the HG2D-5 samples.
Primer design
The mRNA sequence for each gene was re-sequenced with overlapping primers in two HG2D-4 homozygous mice, one cast/cast and one b6/b6 homozygous mouse. Accession numbers for the transcripts that were re-sequenced are shown in Additional file 4.
Primers used for SYBR green gene expression assays were designed from our CAST and B6 mRNA sequences using Primer Express® v.2.0 (Applied Biosystems, Foster City, CA) to ensure that the primer was not designed over a SNP not previously reported in dbSNP. These genes were used based on three lines of information: 1) Differential gene expression between C57BL/6J and CAST/EiJ in microarray experiments, 2) localization within the confidence interval from the replicated CIM analysis, and 3) association with growth or obesity based on the results of transgenic and/or knockout experiments. Any gene meeting at least one of these criteria was considered as a putative candidate.
The comparative Ct method was used to asses differential expression across genotypes [39]. Briefly, ∆∆Ct calculations were performed using the ddCt package from Bioconductor in the R: Language and Environment [40,41] as follows: the median Ct value of two housekeeping genes were used for the estimation of the ∆Ct, thus ∆Ct was estimated as CtTarget – Ctmedian(Gus, SDHA) in both strains. The reference sample for the ∆∆Ct was the mean Ct value between a pool of B6 Females and B6 Males, this was used only in the HG2D-4 strain. For the HG2D-5 a control sample was used instead. Finally, relative gene expression was estimated as 2–∆∆Ct. The 2–∆∆Ct had different variances among genotypes, thus data was analyzed using a natural log (ln) transformation. Finally, to address differential expression across genotypes, gene expression as ln(2–∆∆Ct) was fitted to a linear model that accounted for the fixed effects of sex and genotype, and using PROC GLM in SAS® v. 9.2 (SAS Institute, Cary, NC). P-values were adjusted for multiple comparisons using the SIMULATE adjustment of LSMEANS statement using a sample size (nsamp) of 100,000 in SAS [42]. Comparisons between strains were not performed, as focus was placed to compare genotypes within each line.
Analyses in conserved non-coding and promoter sequence analyses of Fatq2b
The entire 2.3 Mb genomic interval of Fatq2b was compared to the human, and dog genomes on the Vista Genome Browser to identify Conserved Non-coding Sequences (CNS) and conserved promoter regions and gene elements (UTR, exons) [43]. Also, the entire 2.3 Mb region and its intergenic CNS were screened for micro RNA (miRNA) from miRBase (Rel. 15) [44] using BLAST with default search parameters. Thresholds for considering a putative miRNA site were an e-value cut off of 0.01, alignment length greater than 80 bp and identity greater than 90%.
Three thousand base pairs upstream of the first exon or UTR were considered as the promoter region for each gene. These promoter sequences and the CNS within it were screened for putative Transcription Factor Binding Sites (TFBS) using the MATCH™ tool, which uses positional weight matrices from TRANSFAC® [45]. SNP and In/Dels polymorphic between CAST/EiJ and C57BL/6J were obtained from the Sanger Institute Mouse Genomes Project [46].