Overlapping mouse subcongenic strains successfully separate two linked body fat QTL on distal MMU 2

Background Mouse chromosome 2 is linked to growth and body fat phenotypes in many mouse crosses. With the goal of identifying the underlying genes regulating growth and body fat on mouse chromosome 2, we developed five overlapping subcongenic strains that contained CAST/EiJ donor regions in a C57BL/6Jhg/hg background (hg is a spontaneous deletion of 500 Kb on mouse chromosome 10). To fine map QTL on distal mouse chromosome 2, a total of 1,712 F2 mice from the five subcongenic strains, plus 278 F2 mice from the HG2D founder congenic strain, were phenotyped and analyzed. Interval mapping (IM) and composite IM (CIM) were performed on body weight and body fat traits using a combination of SNP and microsatellite markers, which generated a high-density genotyping panel. Results Phenotypic analysis and interval mapping of total fat mass identified two QTL on distal mouse chromosome 2: one between 150 and 161 Mb, Fatq2a, and the second between 173.3 and 175.6 Mb, Fatq2b. The two QTL reside in different congenic strains with significant total fat differences between homozygous cast/cast and b6/b6 littermates. Both of these QTL were previously identified only as a single QTL affecting body fat, Fatq2. Furthermore, through a novel approach referred to here as replicated CIM, Fatq2b was mapped to the Gnas imprinted locus. Conclusions The integration of subcongenic strains, high-density genotyping, and CIM successfully partitioned two previously linked QTL 20 Mb apart, and the strongest QTL, Fatq2b, was fine mapped to a ~2.3 Mb interval encompassing the Gnas imprinted locus. Electronic supplementary material The online version of this article (doi:10.1186/s12864-014-1191-8) contains supplementary material, which is available to authorized users.


Background
Mouse mutant models have been extensively used to identify candidate genes regulating complex traits, such as growth and body fat [1][2][3][4], providing insight into the metabolic pathways regulating these traits. In the present study, we used the high growth (hg) mouse mutant model in combination with chromosome 2 (MMU2) congenic and subcongenic strains to map genes that affect body composition, and genes that may interact with the Growth Hormone/IGF signaling pathways [5][6][7]. Briefly, high growth mice are 30-50% larger than wild type littermates, with body composition and organ weights proportional to their body weight. These differences in growth rate are due to a spontaneous deletion of three genes (Socs2, Raidd/Cradd and Plexin C1) on mouse chromosome 10. The key gene regulating growth in this mouse model is Socs2, which is part of the Jak/Stat signaling pathway and mediates the actions of the Growth Hormone/IGF pathways.
Major QTL affecting body weight (Wg2), carcass ash (Cara1) and carcass protein (Carp1) were previously identified in our laboratory on mouse chromosome 2 between 115 Mb and 150 Mb using a C57BL/6J hg/hg (HG) × CAST/EiJ (CAST) F2 intercross [2]. Here, "hg" refers to the hg locus, i.e. the deletion itself, whereas HG refers to the high growth mouse on a C57BL/6J background, i.e. C57BL/6J hg/hg. Mouse chromosome 2 has a high density of genes, many affecting relevant metabolic processes such as appetite (Agouti, Mc3r, Scg5), energy expenditure and storage (Pcsk2, Atp5e, Hnf4α, Pck1), secondary signaling cascades, and transport proteins (Gnas, Rab22a, Vapb). The high density of plausible functional candidate genes makes it difficult to ascertain whether the QTL is the product of a single gene with one large effect, or the combined effect of several genes in close proximity, each with a small effect. Thus, to further study MMU2, congenic strains were developed by introgressing CAST/EiJ segments into the backgrounds of HG and C57BL/6J (B6) to fine map QTL identified by Corva et al. and Farber et al. [2,8]. Among the MMU2 congenics developed in our laboratory, the HG.CAST-(D2Mit329-D2Mit457) (HG2D) strain is of particular importance as it contains the Fatq1, Fatq2, Wg2, Wg5 and Wg6 QTL within its donor region [2,8]. This strain was used to develop a panel of subcongenic strains on an HG background whose donor regions target the peaks of these previously identified QTL.
Here we report a significant improvement in the mapping resolution of the Fatq2 QTL using the combined data from five overlapping CAST subcongenic strains and the data from a HG2D F2 intercross [9], which provides strong evidence that Fatq2 is the result of at least two independent QTL approximately 20 Mb apart in mice homozygous for hg. Furthermore, we report an integrative approach used to fine map one of these QTL, Fatq2b, to a critical interval of 2.3 Mb containing the "Gnas imprinted locus" on distal MMU2.

Results
A graphical overview of the congenic phenotypic effects is presented in Figure 1, showing the location of the CAST allele donor regions in both the founder congenic (HG2D) and the five subcongenic strains. The figure also summarizes the significant phenotypes from each subcongenic (LS Means and CAST allelic effects are found in Table 1), and the location of body fat QTL previously mapped by Farber and Medrano in the HG2D founder strain [9]. Unless stated in the legend and on the x-axis, all figures display results between 74.9 and 181 Mb on mouse chromosome 2, which represents the CAST donor region in the HG2D founder congenic.

QTL partitioning and fine mapping analyses
QTL partitioning of body fat mass to individual subcongenic regions
Additive effects for each subcongenic strain were estimated by multiple linear regression as described in the supplemental methods (Additional file 1). Significant additive effects of CAST alleles on Total Fat (TF), Gonadal (GFP), Mesenteric (MFP), Retroperitoneal (RFP) and Femoral (FFP) fat pad weights were observed among the five subcongenic strains (Figure 1, Table 1). Differences in fat distribution were observed across the congenics (described in a separate paragraph), but in general CAST alleles tended to decrease all fat pad weights (Table 1). Thus only the Total Fat (TF: the sum of all fat pad weights) results are discussed in depth. The CAST alleles significantly reduced TF in HG2D-3, HG2D-4 and HG2D-5 (Figure 1, Table 1). Only the additive effect was significant (p < 0.009) in these strains, i.e. the dominance effect was not significant. A sex × additive (sex × a) interaction was detected only in HG2D-2, where CAST alleles increase TF in male mice.
Figure 1 caption: The panel of HG2D-subcongenic strains was developed by selecting recombinant mice from an HG2D F2 intercross (HG2D donor region shown in green). The top horizontal lines in red represent body fat QTL identified previously in the HG2D F2 (Fatq1, Fatq2) (QTL peak is indicated by a circle) [9]. Solid boxed areas represent regions with known alleles (green or yellow = CAST/EiJ, and gray = C57BL/6J). Textured areas correspond to recombinant ends with unknown genotype. Below the five HG2D subcongenics, the arrows indicate the direction of the additive genetic effects of the CAST alleles at distinct genomic intervals on each fat pad trait. These analyses were conducted in each subcongenic separately. See Additional file 1 for more details. Markers, locations on mm9 and putative candidate genes are shown below on the black horizontal line, corresponding to MMU 2; blue arrows indicate the approximate location of the genes.
These phenotypic results support the location of Fatq1 and Fatq2 [9], and suggest a QTL × sex interaction for Fatq1. In contrast, phenotypic results of the subcongenics HG2D-3 and HG2D-5 demonstrate that what was previously described as Fatq2 is actually two QTL: Fatq2a and Fatq2b (Figure 2). Farber and Medrano [9] previously reported that these QTL are sex-biased, and in these subcongenics this bias is mainly observed in HG2D-3. The additive effect of CAST alleles was similar in males from HG2D and HG2D-4, whereas females of HG2D show a greater reduction in TF than any other strain. From a theoretical standpoint, additive effects observed in HG2D should be the cumulative sum of the independent additive effects observed in the subcongenic strains ($a_{\text{HG2D}} = a_{\text{HG2D-1}} + a_{\text{HG2D-2}} + a_{\text{HG2D-3}} + a_{\text{HG2D-4}} + a_{\text{HG2D-5}} + e$). HG2D females show a reduction of −0.175 g TF, whereas the sum of HG2D-3, HG2D-4 and HG2D-5 is roughly 1.2x that value (−0.216 g = −0.057 g + −0.101 g + −0.058 g, respectively), suggesting an undetected transgressive QTL that may account for the remaining 0.041 g difference [10,11]. In contrast, HG2D males have a reduction of −0.099 g TF, which is roughly equivalent to the sum of HG2D-2, HG2D-3, HG2D-4 and HG2D-5 (−0.0803 g = +0.072 g + −0.012 g + −0.095 g + −0.045 g, respectively). A similar comparison can be made within HG2D-4 males, where the reduction of −0.095 g TF is roughly 1.7x the sum of HG2D-3 and HG2D-5 (−0.057 g = −0.012 g + −0.045 g), suggesting an additional QTL on HG2D-4 that accounts for the remaining −0.038 g (Additional file 2: Figure S1). Differences in the distribution of body fat were also observed across strains. The HG2D-3 strain had significant reductions in GFP and MFP, whereas no significant differences were found in RFP and FFP between cast/cast and b6/b6 littermates. In contrast, HG2D-4 and HG2D-5 show significant reductions in GFP, RFP and FFP, but no significant differences in MFP between cast/cast and b6/b6 littermates.
Thus, loci within the unique donor region of the HG2D-3 strain primarily regulate MFP since this is the only strain with significant differences in MFP. All other fat pads (GFP, RFP and FFP) are primarily regulated by the unique donor region of the HG2D-4 and its overlap with the HG2D-5 strain, suggesting that loci regulating these fat pads are contained within the overlapping regions of these congenics (Figures 1 and 2, and Table 1).
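The comparison of additive effects in this section is simple arithmetic, and it can be rechecked with a short script. The following is a minimal sketch in Python that uses only the rounded effect sizes quoted in the text; the helper names (`summed_effect`, `unexplained`) are hypothetical and not part of the original analysis. Because the inputs are rounded, the male sum comes out as −0.080 g rather than the −0.0803 g reported, presumably because the published value was computed from unrounded estimates.

```python
# CAST additive effects on total fat (g), copied from the text of this section.
FEMALE = {"HG2D": -0.175, "HG2D-3": -0.057, "HG2D-4": -0.101, "HG2D-5": -0.058}
MALE = {"HG2D": -0.099, "HG2D-2": 0.072, "HG2D-3": -0.012,
        "HG2D-4": -0.095, "HG2D-5": -0.045}


def summed_effect(effects, strains):
    """Sum the additive effects of the listed subcongenic strains."""
    return sum(effects[s] for s in strains)


def unexplained(effects, parent, strains):
    """Effect of the parent strain minus the sum over the listed subcongenics."""
    return effects[parent] - summed_effect(effects, strains)


female_sum = summed_effect(FEMALE, ["HG2D-3", "HG2D-4", "HG2D-5"])
male_sum = summed_effect(MALE, ["HG2D-2", "HG2D-3", "HG2D-4", "HG2D-5"])
female_gap = unexplained(FEMALE, "HG2D", ["HG2D-3", "HG2D-4", "HG2D-5"])
hg2d4_gap = unexplained(MALE, "HG2D-4", ["HG2D-3", "HG2D-5"])

print(f"female sum = {female_sum:+.3f} g, HG2D minus sum = {female_gap:+.3f} g")
print(f"male sum   = {male_sum:+.3f} g")
print(f"HG2D-4 minus (HG2D-3 + HG2D-5), males = {hg2d4_gap:+.3f} g")
```

A positive gap means the parent strain shows a smaller reduction than its subcongenics combined; a negative gap means the parent strain carries an additional reduction not captured by the listed subcongenics.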

Linkage analysis of Fatq2
Individual interval mapping of the four fat pads and Total Fat reveals the presence of two QTL. The strongest QTL, affecting all fat pads, was located in the overlap region of strains HG2D-4 and HG2D-5 at 174 Mb (LOD = 14, referred to as Fatq2b), with a 95% confidence interval (CI) between 170.3 and 175.7 Mb (Figure 2). A second QTL affecting total body fat and body weight was identified with a peak at 155 Mb (LOD = 5.5, referred to as Fatq2a) within the overlap of the HG2D-3 and HG2D-4 strains (CI: 150 to 161 Mb) (Figure 2). These results suggest the presence of two QTL segregating in the HG2D-4 strain that explain the effects of Fatq2.
This locus was previously observed as one large QTL by Farber and Medrano [9]; however, the fine mapping provides evidence that Fatq2 is the result of two QTL that flank the peak of the original Fatq2, suggesting that the peak of Fatq2 in the previous analysis reflected the combined effects of two independent QTL with additive effects on body fat (Figure 2). These two QTL reduce total fat in the non-overlapping HG2D-3 and HG2D-5 subcongenics (p < 0.01 for the additive effect of CAST alleles; p ≥ 0.05 when comparing both strains). [...] Figures S3 and S4), as described in the Fine Mapping methods. These analyses narrow the peak of Fatq2b to a region containing 30 putative candidate genes that regulate body fat deposition (Table 2), of which 3 are associated with body weight, body fat, and energy metabolism [12][13][14][15].

Differential expression of Fatq2b positional candidates
The QTL Fatq2b contains 14 known genes and 16 additional transcripts, all of which can be considered putative candidate genes. Verdugo et al. identified Atp5e, Ctsz, Gnas and Rab22a as differentially expressed in adipose tissue and/or brain between B6/B6 and HG2D homozygous congenics (Table 2) [16]. We measured gene expression in brain and GFP using real time qPCR (RT-qPCR) with SYBR green for the four genes identified by Verdugo et al. (Atp5e, Ctsz, Gnas and Rab22a) and also for Stx16. Results show that Ctsz was differentially expressed between b6/b6 and cast/cast genotypes in brain tissue from the HG2D-4 strain only (p < 0.05) (Figure 4). Rab22a showed differential expression among genotypes in both brain and GFP tissues in both the HG2D-4 and HG2D-5 strains (p < 0.001) (Figure 4). None of the other tested genes showed differential expression among genotypes in either tissue.

Identification of putative micro RNA and transcription factor binding sites in Fatq2b
A screen for micro RNA (miRNA) in the entire 2.3 Mb genomic sequence yielded 14 miRNA sequences between 174.09 and 175.25 Mb (Figure 3C and Table 3). However, no putative miRNAs were located within intergenic conserved non-coding sequences using the parameters described in the Methods.
The Match™ search for transcription factor (TF) binding sites showed 23 TFBS in the promoter sequences of Rab22a, Gnas and Ctsz. The 3000 bp upstream of the Gnas transcription start site had one region at −2000 bp whose sequence is conserved between mouse and human. This region contains three TF binding sites (HNF-1, C/EBP, and HLF), of which only HNF-1 has one SNP (Additional file 2: Figure S5). In humans, one SNP, G(−1211)A, in the functional promoter of Gnas showed an association with weight loss during a 7-day fasting period. Individuals with the GG genotype had a 5 ± 1.5 kg weight loss, whereas the AA genotype had a 3.2 ± 1.2 kg difference [17]. However, Gnas was not found to be differentially expressed between cast/cast and b6/b6 homozygotes in fat and whole brain from either HG2D-4 or HG2D-5 F2 littermates.

Discussion
The present subcongenic experiments provide strong evidence that the Fatq2 QTL, previously reported by Farber and Medrano [9], is composed of two QTL, each with small effects. Congenic F2 intercrosses of five subcongenic strains with overlapping MMU2 donor regions were used to fine map Fatq2b and, in combination with high density genotyping in a large population, allowed the QTL peaks to be reduced to 2.3 Mb intervals and 22 positional candidate genes to be identified. Differential expression experiments reported here and known mouse knockout phenotypes suggest Rab22a and Gnas as candidate genes for the Fatq2b QTL, although additional experiments are required to confirm any gene as the causal gene for Fatq2b.
The original peak of Fatq1 mapped to 136 Mb [9], within the donor region of the HG2D-2 strain. Our results suggest that CAST alleles in this region increase total fat in congenic males by 0.072 ± 0.32 g (p = 0.03) (Table 1). Other fat QTL, such as Aibl, Epfp1, Mob5, and Scfq1, have also been localized to this region [18][19][20]. However, the effects of the latter QTL on body fat were much larger than the effects observed in the HG2D-2 strain and were not sex dependent. The Fatq1 QTL was originally identified by Farber and Medrano [9] in a population of 270 F2 mice from the HG2D congenic strain. These data were merged with the subcongenic data since the HG2D congenic is comparable to our five strains, and a small peak at 140 Mb for FFP (LOD = 2.7), corresponding to Fatq1, was observed (Figure 2).
The original peak of Fatq2 in Farber and Medrano [9] maps to 164 Mb, corresponding to the unique donor region of the HG2D-4 congenic. However, the present analysis suggests that the Fatq2 QTL is the product of two independent QTL: the first QTL located at 156.9 Mb, Fatq2a, and the second QTL at 174 Mb, Fatq2b (Figure 2). Results from the HG2D-4 congenic show decreases in GFP, RFP and FFP, consistent with the effects of Fatq2. Subcongenic analysis of this region suggests that the effects of CAST alleles on fat pad weights are greater in the HG2D-4 strain than in the HG2D-5 strain, consistent with the hypothesis that alleles present in the overlap region of the HG2D-4 and HG2D-5 strains affect these traits. However, this does not completely explain the reductions in body fat by CAST alleles in HG2D-4 (Figure 1). The significant decrease of TF observed in the HG2D-3, HG2D-4 and HG2D-5 strains is likely the result of cumulative allelic effects on individual fat pads, each controlled by more than one QTL (Figure 1).
The combined results from the HG2D-3 and HG2D-4 strains suggest that GFP and RFP are affected by loci present in the overlap region of these strains (Figure 1), and that MFP is affected by loci within the unique region of the HG2D-3 strain, as HG2D-4 mice did not show significant differences in MFP. Furthermore, CAST alleles present in the HG2D-3 strain reduce GFP in female mice, and CAST alleles from the HG2D-4 strain reduce GFP equally in both sexes (Table 1). This suggests sex-specific regulation of GFP among loci of the Fatq2a QTL present in the unique region of HG2D-3 and HG2D-5. The combined results strongly support the notion that Fatq2 is not the result of a single QTL with one large effect, but rather the combined cumulative effect of at least two QTL, each with a smaller contribution to TF, as shown by the reduced TF of cast/cast mice from the HG2D-3 and HG2D-5 strains (Figure 2). With the current panel of subcongenic strains, the presence or absence of a third QTL at 163.5 Mb, the location of the original Fatq2 QTL, cannot be confirmed as it is not uniquely isolated within a subcongenic strain. The physical limit of Fatq2b is the HG2D-5 donor region, which spans MMU 2 from 168.5 to 178.5 Mb. However, integrating results from the linkage analysis, the most significant region is a 2.3 Mb interval between 173.3 Mb and 174.6 Mb. This region contains 22 known genes, of which the expression of Rab22a, Stx16, Atp5e, Gnas and Ctsz was quantitated by RT-qPCR in GFP and whole brain from 60 HG2D-4 F2 and 60 HG2D-5 F2 mice.
Figure 4 caption: Differential expression of Ctsz in whole brain was strongest in HG2D-4 males (data not shown), though both sexes were differentially expressed (p < 0.05). Rab22a was differentially expressed in both whole brain and GFP in both the HG2D-4 and the HG2D-5 strains (p < 0.001). Brackets represent the comparisons being made with the corresponding p-value: *p = 0.05, **p < 0.01, and ***p < 0.001.
Ctsz showed differential expression in brain of HG2D-4 mice only, whereas Rab22a showed differential expression in both tissues and both strains analyzed (Figure 4). The region proximal to the peak of Fatq2b, between 168.7 and 173.3 Mb, contains 27 predicted genes, 23 known genes, 13 Riken transcripts and one microRNA. Of the 23 known genes and 13 Riken transcripts, Dok5, Mc3r, Bmp7, and Pck1 have been associated with obesity [21][22][23][24]. The region distal to the Fatq2b peak, between 174.6 and 178.5 Mb, has a segmental duplication that may be polymorphic in copy number between C57BL/6J and CAST/EiJ. Differential expression of the genes listed above was first checked in microarrays between cast/cast congenic and control littermates. Microarrays are known to have limitations for QTL gene discovery; therefore, other positional candidate genes should not be discarded based on expression data alone [16]. In this study, microarray data were used for an initial screening of differentially expressed genes that were later analyzed by RT-qPCR, but untested genes are still considered plausible candidates.
Fine Mapping analyses, combined with the HG2D microarray and the RT-qPCR expression data in whole brain and GFP, suggest Rab22a as a plausible candidate for the Fatq2b QTL affecting body fat. However, published literature also shows body weight and obesity phenotypes that differ between Gnas knockout and control mice [25], and Gnas has been implicated in performance traits in cattle [26] and carcass quality in pigs [27]. Other possible explanations for the observed phenotypic variation are SNP in the regulatory regions of the genes or variation at the transcription factor binding sites that may change the affinity of the transcription factor to regulate a gene [28]. With our current data it is not possible to discard genes outside the QTL peak that remain within the HG2D-5 congenic. However, the replicated CIM approach encourages further analysis of a 0.6 Mb region around D2Mit213. Although the accuracy of replicated CIM for positionally cloning individual genes has not been fully investigated, this method prioritized seven genes among the 51 known genes contained in the HG2D-5 strain. Several sources contributed information to the fine mapping of Fatq2b and support the use of this approach as an entry point to prioritize candidate genes in a QTL with a high density of genes.
The statistical methods IM and CIM used to fine map Fatq2b have several limitations when applied to mapping using subcongenic strains [29]. These are 1) the inability to detect epistasis, 2) the generation of ghost QTL, and 3) over fitting the analysis by using too many markers as covariates. The inability to detect epistasis is caused by changing the native genotypes of interacting genes in the engineered panel of subcongenics, thus changing the interaction itself or confounding the epistatic nature of the interaction with an additive effect in the individual subcongenics. Furthermore, any epistatic interactions with background alleles outside the congenic region are undetectable. The generation of ghost QTL is of major concern. However, this is unlikely in this experiment as the two identified QTL are observed in two independent subcongenics. At this stage, however, it is still unknown whether a single gene is responsible for each QTL or whether each QTL is the cumulative effect of several small effects from many genes. The last limitation is over fitting the QTL model by adding too many covariate markers, which may increase the number of false positives. For this study, three covariate markers were used in all CIM analyses. Decreasing to two markers or increasing up to five markers did not have a significant effect on the results presented here.
The novel approach of replicated CIM, which uses a summary statistic of the LOD scores to generate a LOD profile, has not been reported in the scientific literature. This approach averages the results of multiple CIM runs, each with a different imputation of missing markers and with unique covariate markers, leaving only one or very few markers with detectable signals. Changing parameters such as step size and window size provides an assessment of the robustness of the location of the peak. However, once LOD scores are averaged, only robust markers that consistently have high LOD scores will result in a peak in the LOD profile. This can explain why only the strongest QTL, Fatq2b, was able to retain a median LOD score of 3 after the 4,000 replications of CIM. Although this method has not been used elsewhere and alternative methods for subcongenic strains are lacking, the results obtained by this approach warrant further investigation of the "Gnas imprinted locus" and the effects that CAST alleles have on obesity at this locus.
The 0.6 Mb peak of Fatq2b suggested by the replicated CIM is the combined result of having a large mapping population (1,990 F2 mice), a high density SNP panel (1 SNP/600 Kb), and the replication of CIM with several parameter combinations. Although Fatq2b contains several genes involved in energy metabolism, protein transport and GTP cell signaling, analyses of gene expression in five genes suggest Rab22a as a candidate, and the obesity phenotype of the Gnas knockout suggests Gnas as well, making these two genes strong candidates for Fatq2b. Further analyses are required to confirm these candidates and to ascertain the biological mechanisms of these genes in order to understand how Fatq2b exerts its effects on body fat.

Conclusions
An integrative approach that included congenic and subcongenic QTL analysis, a large mapping population, and high density genotyping successfully partitioned one major QTL, Fatq2, into at least two QTL, and fine mapped the Fatq2b QTL on distal chromosome 2 to 174.5 Mb. The use of replicated composite interval mapping provided a 0.6 Mb region corresponding to the peak of Fatq2b. The confidence interval of Fatq2b contained several genes for which gene expression was measured. Of these, Rab22a and Gnas are positional candidates that show differential gene expression for Fatq2b.

Mouse husbandry
Mice were housed in polycarbonate cages for a total of 6 weeks after weaning. Water and Purina LabDiet® 5008 chow (23.5% protein; Purina Mills Inc., St Louis MO) were offered ad libitum. A constant temperature (21°C ± 2°C) and humidity (40-70%) were maintained. Mice received 14 h of light, starting at 7:00 AM, and 10 h of dark. Mice were weighed at 14 days, and weaned at 21 days. Tail clips for DNA extractions were collected between 3 and 6 weeks of age. Animals were managed according to the guidelines of the American Association for Accreditation of Laboratory Animals and the Institutional Animal Care and Use Committee (IACUC).
To characterize the genotypic effects of the congenic region, F2 crosses were developed for each congenic strain. A total of 1,990 F2 mice were analyzed: 1,712 from the five subcongenic strains (384 from HG2D-1, 208 from HG2D-2, 382 from HG2D-3, 353 from HG2D-4 and 385 from HG2D-5) and 278 from the HG2D founder strain previously analyzed by Farber and Medrano [9]. Data from all congenics were merged and the analysis was carried out on the merged data set. Development and original analysis of the HG2D founder congenic strain are described in Farber and Medrano [9]. Sex ratios and genotype frequencies were tested with a χ² test with 1 degree of freedom and did not differ from expected segregation ratios in any of the subcongenic strains (p > 0.05).
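As an illustration of the segregation test mentioned above, the following is a minimal Python sketch of a χ² goodness-of-fit test with 1 degree of freedom applied to a 1:1 sex ratio. The counts are hypothetical and are not study data; the function name `chi_square_1df` is likewise an assumption for illustration. The p-value uses the identity that, for 1 degree of freedom, the upper tail probability equals erfc(sqrt(χ²/2)).

```python
import math


def chi_square_1df(observed, expected):
    """Goodness-of-fit statistic and p-value for two classes (1 df)."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical sex counts for one strain (illustrative only, not study data).
males, females = 180, 173
n = males + females
stat, p_value = chi_square_1df([males, females], [n / 2, n / 2])
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

A p-value above 0.05, as in this hypothetical case, would indicate no detectable departure from the expected 1:1 ratio.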

Genotyping
Each subcongenic strain was initially genotyped with three microsatellite markers per strain (Additional file 3) using DNA isolated from Proteinase K digested tail clips and genotyped as described by Farber and Medrano [30]. The PCR reactions contained approximately 100 ng of DNA (5 μl of diluted lysate), 0.1 units of Taq DNA Polymerase (ABI), 1X PCR Buffer (ABI), 1.5 to 2.0 mM of MgCl2 (Invitrogen, Carlsbad, CA), 0.17 mM of each dNTP (Invitrogen), and 1 μM of each primer in a total volume of 10 μl. PCR products were analyzed in 4% 0.5X TBE agarose gels containing 0.06 μg/ml EtBr.
To increase resolution for fine mapping after the initial QTL scans with microsatellite data, 366 mice with recombination events between 145 and 181 Mb from the HG2D founder, HG2D-3, HG2D-4 and HG2D-5 strains were genotyped with 48 SNP markers from 145 Mb to the end of MMU 2 at an average density of 1 SNP/~600 Kb based on the Genome reference mm9. Genotyping was carried out using the Sequenom MassArray® platform at GeneSeek Inc. (Lincoln, NE). DNA for SNP genotyping was purified by first incubating 100 μL of undiluted tail lysate with RNAse A at 37°C for 30 min. DNA was then precipitated with 200 μL of cold ethanol (EtOH), and centrifuged at 1300 RPM at 4°C for 15 min. The remaining pellets were washed with 300 μL of 70% EtOH, dried in a SpeedVac centrifuge for 4 min and resuspended in 32 μL of 10mM Tris pH 8. Purified DNA was diluted as necessary to keep the DNA concentration between 30 and 60 ng/μL.

Phenotypic characterization
Phenotypes were selected based on their relevance to body weight (BW) and body fat. Mice were weighed to the nearest 0.1 g at 2, 3, 6 and 9 weeks of age and prior to sacrifice at 63 ± 4 days (SAC). At sacrifice, mice were anesthetized with isoflurane until they lost consciousness and were then placed over a grid without stretching to measure their lengths. Following sacrifice, the femoral (FFP; subcutaneous fat on the outer thigh), gonadal (GFP; interstitial fat surrounding the testis or uterus and ovaries), mesenteric (MFP; intraperitoneal fat surrounding the gastro-intestinal tract from the duodenum to the start of the rectum) and retroperitoneal (RFP; fat behind the kidney and along the lumbar muscle) fat pads were dissected. Tissue weights were collected immediately after dissection. Total Fat mass (TF) was calculated as the sum of all fat pads. Gonadal fat pad, whole brain, pituitary, gastrocnemius muscle and liver were snap frozen in liquid nitrogen and stored at −80°C for future RNA extractions. Additionally, trunk blood was collected; liver, spleen, kidney, testis, empty carcass (skinned carcass without organs, fat, gastrocnemius, or tail) and gastrocnemius muscle were weighed; and femurs were measured to the nearest 0.1 mm with a Vernier scale. Mice were dissected in accordance with the University of California, Davis IACUC approved protocols. Genotype and phenotype data supporting this work are available at https://github.com/RodrigoGM/Mmu2QTL.git (doi:10.5281/zenodo.12793).

RNA isolation and cDNA preparation
Total RNA was extracted from whole brain and GFP using TRIzol® (Invitrogen, Carlsbad, CA) according to manufacturer's protocol. Whole brain and GFP were homogenized in 2 ml TRIzol® using a Mini BeadBeater-8 for 5 to 7 sec. For Real Time PCR, complementary DNA (cDNA) was prepared by taking 5 μg of total RNA from brain or GFP and incubating it with DNAse I (Ambion, Austin, TX) to remove any DNA contamination. Then first strand cDNA was synthesized using Superscript III® (Invitrogen, Carlsbad, CA) with poly-T and random primers according to manufacturer's protocol. A final RNAse H (Ambion) incubation was done to eliminate single stranded RNA.

Traits and statistical analyses
Characterization of subcongenic strains for body weight and body fat traits
Prior to QTL analysis, all subcongenics were independently analyzed using multiple linear regression models for GFP, MFP, RFP, FFP and TF traits. A detailed description of model selection and of the statistical analysis performed for each subcongenic strain is given in Additional file 1, and LS Means for all fat traits for males and females are in Table 1.
QTL Analysis of body fat using congenic and subcongenic strains
Linkage mapping using interval mapping
QTL analyses were performed on the combined body fat data from the five HG2D subcongenic F2 intercrosses (described above), and the HG2D founder F2 intercross [9]. Initially, linkage analyses were performed in a stepwise procedure, first using microsatellite genotypic data (Additional file 2: Figure S2), and then combining genotypic data from 48 SNPs to increase our mapping resolution of the peaks detected with microsatellites (see Genotyping methods). Covariates to correct for known environmental effects were selected based on our Subcongenic Analysis (Additional file 1).
Body fat phenotypes were adjusted for strain, sex and SAC weight. Residuals were used for the QTL analysis using the R/qtl package of the R Language and Environment [31,32]. Genotype probabilities were calculated using the calc.genoprob function at 0.1 Mb intervals. This is known as a step size and is further utilized in the fine mapping stage (see next section). Initial linkage analysis was performed using Interval Mapping (IM) on the adjusted phenotypes with the scanone function using Haley-Knott regression [33,34] over the physical map (NCBI37/mm9 assembly), since a genetic map could not be calculated accurately from the combined genotypic data. Confidence intervals were estimated with the bayesint function at a probability of 0.95. No specific sex × QTL interaction was detected. Significance thresholds were calculated by 1,000 permutations of each trait to estimate a LOD score to declare significance at α = 0.05 and α = 0.01 [35].
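The permutation procedure used to set significance thresholds can be sketched as follows. This is a simplified Python illustration, not the R/qtl implementation used in the study: it assumes single-marker additive regression at the markers only (no Haley-Knott regression between markers), simulated genotypes coded as the number of CAST alleles, and hypothetical function names (`lod_score`, `permutation_thresholds`). In each permutation the phenotypes are shuffled relative to the genotypes, the maximum LOD across markers is recorded, and the threshold is the (1 − α) quantile of those maxima.

```python
import math
import random


def lod_score(pheno, geno):
    """LOD for a single-marker additive regression: -(n/2) * log10(1 - r^2)."""
    n = len(pheno)
    mp, mg = sum(pheno) / n, sum(geno) / n
    sxy = sum((p - mp) * (g - mg) for p, g in zip(pheno, geno))
    sxx = sum((g - mg) ** 2 for g in geno)
    syy = sum((p - mp) ** 2 for p in pheno)
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = min(sxy * sxy / (sxx * syy), 1 - 1e-12)
    return -(n / 2) * math.log10(1 - r2)


def permutation_thresholds(pheno, genos, n_perm=1000, alphas=(0.05, 0.01), seed=1):
    """Genome-wide LOD thresholds from the permutation distribution of max LOD."""
    rng = random.Random(seed)
    shuffled = list(pheno)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype association
        max_lods.append(max(lod_score(shuffled, g) for g in genos))
    max_lods.sort()
    out = {}
    for alpha in alphas:
        idx = min(n_perm - 1, max(0, math.ceil((1 - alpha) * n_perm) - 1))
        out[alpha] = max_lods[idx]
    return out


# Simulated data (illustrative only): 100 F2 mice, 10 unlinked markers,
# with an additive effect placed at the fifth marker.
sim = random.Random(42)
genos = [[sim.choice((0, 1, 1, 2)) for _ in range(100)] for _ in range(10)]
pheno = [1.0 * g + sim.gauss(0.0, 1.0) for g in genos[4]]
observed = max(lod_score(pheno, g) for g in genos)
thresholds = permutation_thresholds(pheno, genos)
print(f"max LOD = {observed:.2f}, thresholds = {thresholds}")
```

In a real scan, the markers are linked and the test positions include inter-marker steps, so the threshold values from this sketch are not comparable to those of the study.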

Fine mapping QTL using composite interval mapping
To reduce the critical region of the Total Fat (TF) QTL, composite interval mapping (CIM) [36,37] was applied using the cim function of R/qtl with Haley-Knott regression, a 2 Mb window and a 0.2 Mb step size. Confidence intervals for the CIM were estimated using the bayesint function at a probability of 0.95. The cim function in R/qtl imputes missing genotypes using adjacent marker information and uses different markers as covariates in every run. This led to an unstable location of the QTL peak across multiple sequential CIM runs. To accommodate this instability, the CIM analysis was replicated 400 times with a 2 Mb window (where three markers are used as covariates) and a 0.2 Mb step size in order to identify the most likely QTL peak location on the high density marker map. The median LOD of the 400 replicates was used to summarize the replicates, as the LOD score distributions at the tested markers were skewed due to changes in the imputed genotypes of missing markers being tested or used as covariates (data not shown). In addition, replicated CIM analyses for TF were repeated with a step size of 0, 1, and 0.5 Mb, and a window size of 1, 0.5, and 0.25 Mb, arranged in a 3 × 3 factorial design to identify the most frequent location of the Fatq2b QTL peak. In total, CIM was replicated 4,000 times (400 times with the initial parameters, and 400 times for each of the 9 factorial groups). The first replicates of CIM, with a window of 2 Mb and a step of 0.2 Mb, serve as a baseline against which to compare the 3 × 3 factorial. This approach will be referred to here as replicated CIM. A significance threshold value for the replicated CIM was not considered, because at this stage replicated CIM was used to pinpoint the most likely location of the peak within an already mapped QTL that is isolated within a 10 Mb congenic strain with phenotypic differences in body fat [38].
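The summarization step of replicated CIM (taking the median LOD at each position across replicates, then locating the peak) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the CIM runs themselves are replaced by a hypothetical simulator (`simulated_cim_run`) that produces a stable peak at an arbitrary position plus occasional spurious spikes, mimicking the skew caused by changing imputations and covariates. Only the replicate bookkeeping (400 baseline runs plus 400 for each of the 9 factorial groups) and the use of the median come from the text.

```python
import math
import random
import statistics

BASELINE_REPS = 400
FACTORIAL_GROUPS = 3 * 3  # three step sizes x three window sizes
TOTAL_REPS = BASELINE_REPS + FACTORIAL_GROUPS * BASELINE_REPS


def median_lod_profile(replicates):
    """Median LOD at each position across replicate LOD profiles."""
    return [statistics.median(column) for column in zip(*replicates)]


def peak_position(positions, profile):
    """Position (Mb) with the highest summarized LOD."""
    best = max(range(len(profile)), key=lambda i: profile[i])
    return positions[best]


def simulated_cim_run(positions, rng, true_peak=174.4):
    """Hypothetical stand-in for one CIM run (not an R/qtl call)."""
    profile = [3.0 * math.exp(-((x - true_peak) / 0.3) ** 2) + rng.uniform(0.0, 0.3)
               for x in positions]
    if rng.random() < 0.2:  # occasional spurious spike from an unlucky imputation
        profile[rng.randrange(len(profile))] += 5.0
    return profile


rng = random.Random(7)
positions = [round(173.0 + 0.2 * i, 1) for i in range(16)]  # 0.2 Mb step
replicates = [simulated_cim_run(positions, rng) for _ in range(BASELINE_REPS)]
profile = median_lod_profile(replicates)
print(TOTAL_REPS, peak_position(positions, profile))
```

The median is used here, as in the text, because a minority of replicates with inflated LOD scores at scattered positions would pull a mean upward but leave the median essentially unchanged.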

Analysis of gene expression using microarrays
The results of Verdugo et al. [16], corresponding to the GSE22042 dataset, were used to screen for genes with differential expression in the 2.3 Mb Fatq2b interval. This dataset is a microarray experiment in which global gene expression was compared in three tissues (GFP, whole brain, and liver) from 4 cast/cast and 4 b6/b6 F2 mice that were nonrecombinant for the entire HG2D congenic donor region. The details of the analysis performed on the microarray data are described in Verdugo et al. [16]. We considered a gene within the confidence interval of the Fatq2b QTL to be differentially expressed if p ≤ 0.05 for the genotype effect. The p-values were not corrected for multiple comparisons, since we focused on specific genomic locations and wanted to maximize the number of genes to verify with real-time qPCR (RT-qPCR).

Primer design
The mRNA sequence for each gene was resequenced with overlapping primers in two homozygous HG2D-4 mice, one cast/cast and one b6/b6. Accession numbers for the transcripts that were resequenced are shown in Additional file 4.
Primers used for SYBR green gene expression assays were designed from our CAST and B6 mRNA sequences using Primer Express® v.2.0 (Applied Biosystems, Foster City, CA) to ensure that no primer was designed over a SNP not previously reported in dbSNP. Genes were selected based on three lines of information: 1) differential gene expression between C57BL/6J and CAST/EiJ in microarray experiments, 2) localization within the confidence interval from the replicated CIM analysis, and 3) association with growth or obesity based on the results of transgenic and/or knockout experiments. Any gene meeting at least one of these criteria was considered a putative candidate.
The comparative Ct method was used to assess differential expression across genotypes [39]. Briefly, ΔΔCt calculations were performed using the ddCt package from Bioconductor in the R Language and Environment [40,41] as follows: the median Ct value of two housekeeping genes was used for the estimation of the ΔCt, thus ΔCt was estimated as Ct(target) - median(Ct(Gus), Ct(SDHA)) in both strains. The reference sample for the ΔΔCt was the mean Ct value between a pool of B6 females and a pool of B6 males; this reference was used only in the HG2D-4 strain. For the HG2D-5 strain a control sample was used instead. Finally, relative gene expression was estimated as 2^(-ΔΔCt). The 2^(-ΔΔCt) values had different variances among genotypes, thus the data were analyzed after a natural log (ln) transformation. To assess differential expression across genotypes, gene expression as ln(2^(-ΔΔCt)) was fitted to a linear model that accounted for the fixed effects of sex and genotype using PROC GLM in SAS® v. 9.2 (SAS Institute, Cary, NC). P-values were adjusted for multiple comparisons using the SIMULATE adjustment of the LSMEANS statement with a sample size (nsamp) of 100,000 in SAS [42]. Comparisons between strains were not performed, as the focus was on comparing genotypes within each line.
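The arithmetic of the comparative Ct method can be sketched as below. This is a hand-written illustration, not the ddCt package; the function names are hypothetical, and the reference ΔCt is assumed to have been computed beforehand from the reference sample in the same way as for the test samples.

```python
import math
import statistics


def delta_ct(ct_target, ct_housekeeping):
    """ΔCt: target Ct minus the median Ct of the housekeeping genes."""
    return ct_target - statistics.median(ct_housekeeping)


def relative_expression(ct_target, ct_housekeeping, ref_delta_ct):
    """Relative expression 2^(-ΔΔCt), where ΔΔCt = ΔCt(sample) - ΔCt(reference)."""
    dd_ct = delta_ct(ct_target, ct_housekeeping) - ref_delta_ct
    return 2.0 ** (-dd_ct)


def ln_relative_expression(ct_target, ct_housekeeping, ref_delta_ct):
    """Natural log of 2^(-ΔΔCt), the scale used for the linear model."""
    return math.log(relative_expression(ct_target, ct_housekeeping, ref_delta_ct))
```

For example, a target Ct of 24 with housekeeping Ct values of 20 and 22 gives ΔCt = 3; against a reference ΔCt of 4, ΔΔCt = -1 and the relative expression is 2. Note that ln(2^(-ΔΔCt)) equals -ΔΔCt × ln 2, so the log transformation returns the analysis to a scale that is linear in Ct.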

Analyses of conserved non-coding and promoter sequences of Fatq2b
The entire 2.3 Mb genomic interval of Fatq2b was compared to the human and dog genomes on the Vista Genome Browser to identify Conserved Non-coding Sequences (CNS), conserved promoter regions, and conserved gene elements (UTR, exons) [43]. In addition, the entire 2.3 Mb region and its intergenic CNS were screened for microRNA (miRNA) sequences from miRBase (Rel. 15) [44] using BLAST with default search parameters. The thresholds for considering a putative miRNA site were an e-value cutoff of 0.01, an alignment length greater than 80 bp, and an identity greater than 90%.
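The three thresholds for a putative miRNA site can be expressed as a simple filter over BLAST hits. The sketch below is illustrative only: the BlastHit record and its field names are hypothetical, and treating the e-value cutoff as inclusive (≤ 0.01) is an assumption, since the text does not state whether the bound is inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastHit:
    """Minimal, hypothetical record for one BLAST alignment."""
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    evalue: float


def is_putative_mirna_site(hit, max_evalue=0.01, min_length=80, min_identity=90.0):
    """Apply the three thresholds: e-value, alignment length, identity.

    Length and identity must be strictly greater than their thresholds;
    the e-value bound is assumed to be inclusive.
    """
    return (
        hit.evalue <= max_evalue
        and hit.alignment_length > min_length
        and hit.percent_identity > min_identity
    )


def filter_hits(hits):
    """Keep only the hits that satisfy all three thresholds."""
    return [hit for hit in hits if is_putative_mirna_site(hit)]
```

A hit must satisfy all three conditions at once; a long, highly similar alignment with an e-value above the cutoff is discarded, as is a significant alignment of 80 bp or shorter.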
The three thousand base pairs upstream of the first exon or UTR were considered the promoter region for each gene. These promoter sequences and the CNS within them were screened for putative Transcription Factor Binding Sites (TFBS) using the MATCH™ tool, which uses positional weight matrices from TRANSFAC® [45]. SNP and In/Dels polymorphic between CAST/EiJ and C57BL/6J