Gene expression identifies metabolic and functional differences between intramuscular and subcutaneous adipocytes in cattle

Background This study used a genome-wide screen of gene expression to better understand the metabolic and functional differences between commercially valuable intramuscular fat (IMF) and commercially wasteful subcutaneous (SC) fat depots in Bos taurus beef cattle. Results We confirmed many findings previously made at the biochemical level and made new discoveries. The fundamental lipogenic machinery, such as ACACA and FASN encoding the rate limiting Acetyl CoA carboxylase and Fatty Acid synthase were expressed at 1.6–1.8 fold lower levels in IMF, consistent with previous findings. The FA elongation pathway including the rate limiting ELOVL6 was also coordinately downregulated in IMF compared to SC as expected. A 2-fold lower expression in IMF of ACSS2 encoding Acetyl Coenzyme A synthetase is consistent with utilisation of less acetate for lipogenesis in IMF compared to SC as previously determined using radioisotope incorporation. Reduced saturation of fat in the SC depot is reflected by 2.4 fold higher expression of the SCD gene encoding the Δ9 desaturase enzyme. Surprisingly, CH25H encoding the cholesterol 25 hydroxylase enzyme was ~ 36 fold upregulated in IMF compared to SC. Moreover, its expression in whole muscle tissue appears representative of the proportional representation of bovine marbling adipocytes. This suite of observations prompted quantification of a set of oxysterols (oxidised forms of cholesterol) in the plasma of 8 cattle exhibiting varying IMF. Using Liquid Chromatography-Mass Spectrometry (LC-MS) we found the levels of several oxysterols were significantly associated with multiple marbling measurements across the musculature, but (with just one exception) no other carcass phenotypes. Conclusions These data build on our molecular understanding of ruminant fat depot biology and suggest oxysterols represent a promising circulating biomarker for cattle marbling.


Background
Deposition of marbling (IMF) fat in cattle is commercially valuable. It has a positive impact on organoleptic properties of meat such as flavour, juiciness and tenderness [1]. On the other hand, the other fat depots including subcutaneous and organ fats do not add value to meat cuts. Excessive amounts of these undesirable depots are often associated with carcasses expressing high levels of IMF. Therefore, there is a continued interest in developing our understanding of the metabolic and functional differences between the various fat depots with a view to better uncouple IMF and SC deposition.
Marbling has been considered a late maturing trait only becoming visible after the other depots, this despite relative rates of increase in IMF being similar to other fat depots [2][3][4]. Furthermore, while breeds of cattle like Wagyu and Hanwoo are pre-disposed to precociously and selectively develop IMF, the underlying genetic, cellular, biochemical and physiological mechanisms have not been well established [5]. We know from previous work that marbling adipocytes tend to be relatively small [6] and comprising more saturated fatty acids [7] compared to those in the SC depot. Developmentally, marbling adipocytes are thought to arise from differentiation and lipid filling of fibroblasts within perimysial connective tissue [8].
In terms of differential metabolism between depots, previous biochemical evidence points to IMF having relatively slow rates of lipogenesis in both cattle [9] and pigs [10] and under certain nutritional circumstances a substrate preference for glucose carbon over acetate when compared to SC [6,11]. Post-weaning diets tailored to these specific metabolic properties of IMF, such as strategic feeding with high energy concentrate, have had mixed success [12] for reasons not certain but which probably include net energy available for tissue deposition. A recent review emphasises castration, digestion and absorption of feed, glucose availability and vitamin A, D and C levels as important factors in marbling development [13]. However, overall it is clear that there is scope for a deeper understanding of ruminant fat depot metabolism and biology that may inform new animal management strategies.
The emergence of genome-wide transcriptome screening technologies provides an opportunity to assess entire biochemical pathways in quantitative detail not yet possible at other levels of biological organisation. Here, we analyse data from 5 bovine fat depots (IMF, SC, intermuscular, kidney and omental), with a particular focus on the IMF versus SC depot comparison. These functional genomic data are one component of a much larger animal experiment exploring cattle genotype by nutritional effects on fat depot biology [12]. Tissue samples for the present study were taken from 26 month-old steers of 3 genotypes, Angus, Hereford and Wagyu x Angus following high energy nutrition in a feedlot for 259 days. The Herefords had relatively low IMF and high SC whereas the Angus and Wagyu x Angus were higher IMF and lower SC.
We explored two analytical approaches both focussing on differential fat depot biology, but with one hypothesis-driven and one hypothesis-free. The former tested the mRNA expression of canonical ruminant fatty acid synthesis and degradation pathways and compared the output against prior biochemical expectation. The latter explored genome-wide patterns of gene expression with an aim of making new metabolic and functional discoveries in an unbiased manner. The across depot genome-wide transcriptome data submitted with this research article represents a uniquely powerful data resource within the field of ruminant fat biology.

Hypothesis-free screen Data driven hierarchical clustering
Each fat depot could be clearly discriminated by gene expression as the data from each breed clustered at the depot level (Fig. 1). Put another way, the gene expression differences between fat depots clearly overwhelm any breed differences within a depot. Moreover, IMF was separated from the other 4 fat depots. Of the remaining 4 depots, Inter and Omen were most closely related, followed by Kid then SC. Given SC appears the most functionally divergent of the 'pure' (i.e. we can make a confident assertion of no muscle contamination) fat depot samples, we elected to compare all depots to SC in turn.

Differential expression (DE) analysis
SC versus all other fat depots We plotted all fat depots minus SC and annotated the extreme differentially expressed (DE) genes (Fig. 2). Two consistent outlier genes by expression profile in SC are HOXA10 and DLK1 (P < 0.01). The log2 normalised mean expression of these two genes in each fat depot is compared in Table 1. Other genes of interest for their extreme expression in at least one depot are TDH, TMEFF2 and CLDN10, also tabulated in Table 1.
We next focussed on the particular IMF versus SC comparison in more detail.
IMF versus SC In the IMF versus SC comparison the most extreme 1% (145 out of 14, 476) downregulated genes in IMF enriched for 'humoral immune response' (Hypergeometric statistic, FDR q-value = 0.00017) based on the presence of genes including CFB, CXCL3, CD163, CD36 and CD163L1). To generate this ranked list we used a modified DE metric called Phenotypic Impact Factor (PIF), which is a product of DE and average abundance across the treatments of interest. The extreme 20 most downregulated genes are shown in Table 2.
From Table 2 it can be seen that various aspects of fat metabolism (SCD, DGAT2, FASN, ACSS2, AGPAT2, CIDEA, G0S2), extracellular matrix biology (TIMP4, SPARC, CCDC80) and some inflammation related genes (CXCL3, CD163) are prominently featured in those transcripts whose expression is lower in IMF than SC. The extreme 5% downregulated genes (PIF) in IMF enriched for 'lipid metabolic process' (FDR q-value = 1.04 e- 16) and 'defense response' (FDR q-value 1.88 e-15) in line with those observations, but these functional enrichments were not as extreme as the top hit 'regulated exocytosis' (FDR q-value = 5.11 e-27).
On the other hand, the upregulated 1% in IMF enriched very significantly for 'muscle system process' (FDR q-value = 3.54 e-46) based on 44 genes generally regarded as either muscle-specific (e.g. myoglobin) or very highly expressed in muscle cells (such as numerous specialised myosin light and heavy chain isoforms as illustrated by MYL2 and MYH2). A more lenient 5% extreme upregulated PIF (or 724 genes) marginally lessens the impact of the muscle specific detection (FDR q-value = 3.44 e-44). Fig. 1 A dendrogram of relationships between the various fat depots based on the expression profiles of 10,000 genes selected at random. The treatment labels are breed (Ang = Angus, Her = Hereford, Wag = Wagyu x Angus), diet (past = pasture, supp = supplement), Kill number (2 or 5) and finally tissue (LD = longissimus dorsi muscle, IMF = intramuscular fat, Inter Musintermuscular fat, Omental = omental fat, Kidney = kidney fat, SC rump = subcutaneous rump fat). The first major split shows that the LD muscle is discriminated from all fat depots, reflecting muscle-specific patterns of gene expression. All fat depots are clearly resolved i.e. the three breeds form fat depot specific clusters in all cases which shows the various depots all possess diagnostic genome-wide expression signatures. IMF was awarded a unique branch within the fat tree, but this is presumably influenced by muscle derived gene expression arising from small amounts of LD muscle contamination It is clear from these functional enrichments that the dissected IMF must have contained a very small amount of longissimus dorsi (LD) muscle contamination not present in the other fat depots. Genes strongly expressed by the muscle cells in the IMF sample therefore appear to make up the majority of the atypical triangle shaped protuberance in the IMF versus SC MA plot (as highlighted by the red circle on Fig. 2a). We have elected not to tabulate the extreme upregulated genes in IMF versus SC, as this would simply return a list of genes dominated by mRNA encoding muscle structural and muscle metabolic proteins.
To better determine the cellular origin (marbling adipocyte versus myocyte) of many of the upregulated genes in the dissected IMF we first computed Differential Expression (DE) between pure LD (nearly all muscle, small amount of IMF) and SC (all fat, no muscle contamination). We then identified genes which had nominally > 4 fold higher expression in LD than SC as likely being transcribed predominantly from muscle cells, not fat cells. This yielded a list of 171 genes (hypergeometric FDR q value of 2.04e-39 for "muscle system process"), which we have then highlighted on the IMF versus SC plot ( Fig. 3 panel a; Additional file 1). The highlighted  genes, which have been identified using a numerical strategy, all clearly fall in the triangular shaped protuberance distorting the overall IMF versus SC MA distribution, indicating that this atypical data distribution is indeed a consequence of muscle contamination. This set of analyses reinforces the conclusion that prima facie those outlier genes in the IMF versus SC are almost certainly driven by the presence of some LD muscle in the 'pure' IMF sample and therefore need to be interpreted cautiously. The impact of the 'contaminating' muscle derived RNA on the expression of most of the remaining genes is harder to foresee, as there will be a continuum of shared expression between the myocytes and adipocytes, depending on the particular gene being investigated. In this submission, we have provided the normalised mean expression for LD in addition to all the fat depots for the 34,227 probes (Additional file 2) so the interested reader can assess the possible impact of contaminating LD on a case by case basis. This is also the reason why we have included the LD normalised mean expressions in each table for comparison. The outcome of a multiple criteria thresholding process is described below ("Structural differences between marbling adipocytes and the other fat depot adipocytes"). This multiple thresholding does allow us to define lists of genes whose expression likely arises from the IMF adipocytes themselves and therefore can be considered biologically informative of the IMF depot.

Mitoproteome
In an effort to explore the behaviour of the mitochondria in our understanding of the metabolism of IMF versus SC metabolism we quantitated the collective expression of those mRNA known to encode mitochondrial proteins. Using the downloaded mitochondrial protein database we matched 886 of these in our data, 595 of which were higher in IMF than SC, but only 287 of which were lower ( Fig. 3 panel b). This is a significant deviation (P = 2.29 e-26) from the null expectation of equilibrium (i.e. symmetrically distributed around 0, with 443 above and 443 below) assessed by binomial distance. This upward skew is consistent with IMF having a higher mitochondrial content and / or mitochondrial activity than SC, but the presence of some high mitochondrial content LD muscle in the IMF depot is presumably influencing the result. Notable among the mitochondrial genes strongly downregulated in IMF compared to SC (despite the probable impact of the contaminating muscle) is PCK2.
Structural differences between marbling adipocytes and the other fat depot adipocytes To better account for and visualise the effect of the contaminating LD on the IMF gene expression we colour coded the Minus Average (MA) plot comparing IMF to SC based on a formal numerical analysis that accounts for the presence of contaminating LD in the IMF sample (Fig. 4). Colour coding each gene individually in this manner visually highlights those genes (such as CH25H) that are most likely higher in IMF than SC because of particularly high expression from the IMF adipocytes per se and not because of the contaminating LD muscle. Yellow / orange dots above 0 on the y axis are colour coded in this manner because they have a higher expression in IMF than LD which implies their observed higher expression in IMF than SC is a true feature of the marbling adipocytes. This logic also applies to the purple dots below 0 on the y axis.
To identify a short list of these genes whose high expression is confidently ascribed to the marbling adipocytes and not contaminating skeletal muscle we used a multiple criteria thresholding approach. To begin with, we asked the question, "which mRNA are more highly expressed in IMF than the average of all other fat depots by <1.32 fold (a difference of 0.4 on the log2 scale) and also much more highly expressed in IMF than LD (>2 fold)." This analysis returns a list of 49 genes whose expression is more confidently derived from marbling adipocytes (Additional file 1). There is no functional enrichment for 'muscle systems process.' Tabulating the top 20 of these ranked on IMF SC Phenotypic Impact Factor (PIF) yields the gene list in Table 3.
In this adapted list whose expression signals are derived from marbling adipocytes there is functional enrichment for components of the cytoskeletal architecture (CNN1, ACTA2, MYH11, ACTB, ACTRT2, SORBS2, KRT18). Other genes of interest include a) CH25H which encodes an enzyme that catalyses the production of a particular oxysterol metabolite b) the microRNA MIR145 and c) CTPS2 which catalyses the production of CTP, a high energy analog to ATP but whose hydrolysis is coupled to a restricted subset of metabolic reactions including glycerophospholipid synthesis. Similarly, querying the full list for those genes a Red dots are those 168 mRNA 4 fold higher in LD than SC indicating the atypical triangular protuberance comes from contaminating LD (b) the 886 mRNA encoding mitochondrial proteins for which we detected matches in our data, blue above 0, red below 0. There is a significant skew upwards (595 above the 0 line) from the null expectation of equilibrium (data centred on 0, with 443 falling on either side of the line) whose expression > 1.68 fold high in IMF than the other fat depots combined and also higher in IMF than LD by any value yields a list of 76 genes (Additional file 1). There is no functional enrichment for 'muscle systems process. ' Moreover, in another analysis directed at the specific IMF versus SC comparison, we reported those genes > 2 fold higher expression in IMF than SC (without considering expression in the other fat depots), and higher expression in IMF than LD (by any amount). This produced a list of 73 genes (Additional file 1) including those encoding proteins in extracellular matrix organisation (GFAP, COL28A1, COL2A1, ITGA8, TNC, SNCA, MYH11, MKX, COL4A6 and TNFRSF11B). Manual curation of the gene list highlighted 3 additional functional groups of interest that are relatively upregulated in IMF versus SC: cholesterol metabolism (PMP2, CH25H, CYP4B1), retinoic acid metabolism (STRA6, MEST) and insulin and carbohydrate metabolism (GRB14, NR4A3 and MOXD1). The expression profiles for the genes within these three functional groupings are shown in Table 4.
Cluster analysis of the normalised mean expression of the 73 genes across the 5 fat depots and LD muscle indicates the expression of this panel of genes is diagnostic of the IMF depot (Fig. 5). This can be contrasted with the original clustering performed on a randomly selected 10,000 genes whose first branch separates the LD muscle, and not IMF, from all the fat depots. The clustering on rows clusters genes who are co-expressed across tissues. Some of the clusters reflect known functional relatedness, with KRT8 with KRT18 reflecting keratin biology and ACTA2 and MYH11 reflecting cytoskeletal biology.
Of those 41 genes we identified as more lowly expressed in IMF than SC (by a minimum of 1.32 fold, but unlikely to be due to low expression in the contaminating LD because IMF expression is lower than LD) (Additional file 1) the most extreme 10 are shown in Table 5. There is no functional enrichment for 'muscle systems process.' Importantly, none of these refined gene lists generated from our multiple criteria approach yield a significant hypergeometric enrichment for 'muscle system process.' This indicates that the multiple criteria thresholding method we have adopted here has successfully eliminated those genes representative of muscle contamination in the IMF sample. The summaries of the number of genes identified by the various single and multiple criteria approaches, and their respective hypergeometric functional enrichments, are found in Tables 6 and 7, respectively.

Hypothesis-driven analysis Lipogenesis in adipocytes
To formally connect these mRNA data to traditional biochemical knowledge, we identified and tabulated the expression profiles of those genes encoding ratelimiting enzymes and other proteins considered influential in the various lipogenic processes (Table 8). This includes the following biochemical processes: precursor transport into the adipocyte cells (glucose and free FA), aspects of intermediate energy metabolism (glycolysis and pyruvate metabolism), de novo FA synthesis, FA elongation, FA desaturation, FA esterification with glycerol and finally the supply of reducing power equivalents.
We can see that some of these canonical lipogenic pathways show clear, consistent patterns of gene expression based on the key enzymes.  Here, we have colour coded each mRNA using a formal numerical approach such that the expression of those highlighted in red and yellow tones (above 0) and purple (below 0) are most likely derived from the IMF adipocytes themselves and not from contaminating LD muscle. To achieve the colour coding we exploited the difference in expression detected between the 'pure' IMF and 'pure' LD muscle samples. For example, if an mRNA has higher expression in IMF than in LD then we conclude it likely derives from the marbling adipocytes. The mRNA encoding CH25H exemplifies this logic. In addition to being higher in IMF than SC, it is also higher in IMF than LD. In terms of exact position on the plot, the mRNA encoding CH25H has an A value of 9.77 and an M value of 1.52 rather consistently downregulated in IMF versus SC. On the other hand, there are patterns of both up and downregulation within other pathways. Glycolytic flux and pyruvate metabolism are two such pathways, comprising key genes exhibiting both higher and lower expressed in IMF than SC. The~4 fold upregulation of PFKM and PKM (i.e. muscle specific isoforms of the enzymes) in IMF versus SC glycolytic flux can most likely be attributed to LD contamination.
We examined a subset of core lipogenic processes in more detail at the whole pathway level. The normalised mean expression data for the FA biosynthesis and FA elongation pathways are shown in Additional file 3. We next manually explored the gene lists, detecting a number of particular isoforms as being upregulated in IMF for esterification and glycerolipid synthesis. These have been tabulated (Table 9) and are contrary to the pathway output (based on GPAM, DGAT) outlined in Table 8. The expression of AGPAT9 (esterification) and MBOAT2 (glycerolipid synthesis) are potentially noteworthy, with a DE in excess of 1.4 fold higher in IMF compared to SC that is not attributable to LD contamination.

qRT-PCR
CH25H was found to be significantly (P < 0.0001; 2 Tailed Mann Whitney U Test) more highly expressed in dissected IMF than SC by~34-fold using the first primer pair (Fig. 6) and 38-fold using the second primer pair, yielding a 36 fold average. The direction of change is the same as for the microarray probe but the absolute difference is~10 fold higher. Table 3 Genes more highly expressed in IMF than all other fat depots by at least 1.32 fold whose expression appears driven by marbling adipocytes (IMF expression greater than 2 fold higher than LD

Analysis of LD muscle in Wagyu x Hereford crosses versus Piedmontese x Hereford crosses
The microarray based expression profile for CH25H in intact mature postnatal LD muscle is higher in Wagyu x Hereford crosses than Piedmontese x Hereford crosses by 20 months of age, the difference in expression increases with increasing developmental time and by 30 months of age is 2 fold higher (Fig. 7). This 2 fold difference very closely approximates the close to 2 fold difference in carcass IMF previously reported (8.8% IMF in Wagyu x Hereford and 5.1% IMF in Piedmontese x Hereford animals). The increasing significance of the observed differences at 20 m, 25 m and 30 m are reflected by P values of 0.478, 0.158 and 0.003 respectively.

Oxysterol metabolite quantitation
The relationships of the oxysterols to the IMF phenotypes are documented below in Table 10 and Fig. 8. The Full SAS output to all 15 phenotypes can be found in Additional file 4. Despite most of the phenotypes provided (8/15) being non-marbling related, at the P < 0.05 threshold, the only phenotypes significantly associated with the various oxysterols are marbling related phenotypes (the one exception being 7α,25-dihydroxycholesterol and Eye Muscle Area with r = 0.72, P = 0.04). This means the non-IMF fat depot phenotypes did not reach significance with any of the quantitated oxysterols.
We have detected 4 positive correlations and 3 negative correlations. In terms of absolute correlations to IMF phenotypes, 24S-hydroxycholesterol's relationship to Eye Round IMF is the top performer (r = 0.91; P < 0.001) (Additional file 4). No significant relationship to loin IMF was detected for any of the oxysterols although 7α,26-diHCO approached significance (r = 0.67; P = 0.0646). Finally, 25-hydroxycholesterol is detected at much higher levels in cattle plasma than in human plasma, consistent with our prediction that the metabolite is largely derived from IMF and humans are essentially a zero IMF species.

Ruminant fat metabolism
In ruminants the adipocytes are the primary lipogenic site. Consequently, we have focussed our study on the metabolic properties of the various fat depots. Within a ruminant adipocyte, a number of biochemical processes play a role in taking the basic metabolic building blocks (namely pre-formed FA, acetate, D-3 hydroxybutyrate and glucose) from the circulation and converting them into mature TAG. Using a genome-wide transcriptome approach we find evidence for coordinate downregulation of lipogenesis in IMF compared to SC in line with expectation. For example, de novo FA synthesis (FASN and ACACA 1.63-1.79 fold), FA elongation (ELOVL6 1.61 fold), desaturation (SCD 2.2 fold), supply of reducing power (G6PD, ME1 and PGD 1.33, 1.43 and 1.51 fold) and esterification (GPAM and DGAT2 1.3 to 1.82 fold) are rather consistently downregulated in IMF versus SC. However, elevated expression of MBOAT2 and AGPAT9 does complicate the picture for our understanding of esterification and glycerolipid synthesis in IMF. Table 4 Gene expression patterns of those genes identified by multiple criteria (>2 fold higher expression in IMF than SC and also higher expression in IMF than LD) that encode proteins involved in cholesterol metabolism, retinoic acid metabolism and carbohydrate metabolism. CH25H not shown here as its expression profile is documented in In a given pathway we find that the rate limiting enzyme is sometimes among the member(s) of the pathway subject to the highest DE (e.g. FASN and ACACA in FA synthesis and ELOVL6 in elongation). Intriguingly, a number of metabolic pathways (lipolysis, pyruvate metabolism, pentose phosphate pathway) have complex patterns of up and down regulated genes between IMF and SC which are challenging to interpret. The major intracellular organelles involved in regulating this collection of lipogenic processes include the endoplasmic reticulum, the peroxisome and the mitochondria and we observe patterns of differential expression associated with all 3 organelles. For example, mitochondrial TDH is strongly downregulated in IMF versus SC despite most mitochondrial mRNA for which we have data tending to be elevated in the IMF depot. TDH encodes the enzyme threonine dehydrogenase which converts the amino acid threonine into pyruvate during catabolism.

All fat depots gene expression comparison
All 5 fat depots could be clearly discriminated by the genome-wide gene expression patterns, with the fat depot variation substantially greater than any breed variation within a depot. This analysis clearly shows that at the molecular level all 5 bovine fat depots are biologically quite distinct from each other. However, it is beyond the scope of this article to discuss all depots in detail. Nevertheless, the clustering output does represent a robust quality checking metric, implying the various tissue samples have been dissected, RNA extracted, microarray hybridised and statistically normalised without any major mix-up or substantive analytic error.
IMF was awarded a separate branch from the other 4 fat depots suggesting it is by some margin the most functionally divergent. However, the presence of a small amount of LD muscle in the IMF sample but not the other fat depots cannot be ruled out as the major cause of this clustering. On this note, the muscle contamination in IMF had to be formally accounted for in subsequent analyses aimed at understanding the particular biology of IMF. Of the other 4 fat depots that we are confident contain no muscle contamination at all, SC appears to be the most functionally divergent. Plotting the MA plots of SC versus all the other depots in turn consistently points to high expression of HOXA10 and low expression of DLK1 in SC. Analogous gene expression observations have been made previously in the SC depots of humans and rodents and have been attributed to variable proportions of populations of pre-adipocytes [15,16].

IMF versus SC PART 1Basic structural differences
It has been previously established that marbling adipocytes are substantially smaller than SC adipocytes with peak diameters of 104 ± 2 μm and 141 ± 5 μm respectively [6]. For geometric reasons IMF samples will possess more cell membrane per cytoplasmic volume on a per unit tissue basis. This will be reflected in the total RNA (See figure on previous page.) Fig. 5 Hierarchical cluster analysis on the 73 genes whose expression suggest particular relevance to the biology of the IMF depot. This list was generated by seeking genes that satisfied two criteria, higher expression in IMF than LD by any amount, and higher expression in IMF than SC by at least 2 fold. The hierarchical clustering on tissues supports the selected gene list as being IMF diagnostic given the first branch in the tree separates the IMF samples from not only the other fat depots but the LD muscle samples too. The clustering on genes identifies positive patterns of co-expression across the treatments  [17]. The presence of several genes highly expressed in nervous tissue (MPZ, MBP, METRN, PMP2, GFAP) as upregulated in IMF is consistent with the notion that the IMF depot has a modified and/or denser innervation than the SC depot. Because these findings have accounted for LD contamination, we suggest the result is unlikely to be attributed to the simple presence of motor nerves.
Furthermore, IMF appears to possess a unique cytoskeletal architecture, with the marbling adipocytes expressing substantially higher levels of a subset of cytoskeletal genes (CNN1, ACTA2, MYH11, ACTG2, ACTB, ACTRT2, SORBS2). It has been known for some time that cytoskeletal architecture changes substantially during adipocyte differentiation from a structured to disorganised state [18]. In line with our data, CNN1 encoding calponin 1 was found to be upregulated in the muscle transcriptomes of high IMF Meishan compared to low IMF Landrace pigs [19], and also to be more than 2 fold upregulated in 20 month Wagyu by Hereford crosses compared to Piedmontese by Hereford crosses (data from [20]). Our analysis also suggests that IMF adipocytes are embedded in an extracellular matrix environment that is either structurally distinct (or overall more pronounced) when compared to SC, given elevated expression of these genes encoding various extracellular matrix (ECM) components (GFAP, COL28A1, Table 6 The numerical and biological outcomes of the single criteria approach we applied to first diagnose the impact of contaminating LD muscle on the IMF samples. The background list was composed of 34,228 genes including duplicates (14,477 unique    GPAM (alias GPAT1) a quantitatively influential enzyme of esterification according to [14] b Cytosolic NADP isocitrate dehydrogenase (IDH1) is not registered as expressed in our fat depot mRNA data c While not considered rate-limiting for esterification (and glycerolipid synthesis) we wish to draw attention to the expression profiles for a number of additional genes in these two pathways as some have expression profiles indicating higher activity in the IMF than the SC (that cannot be explained by LD muscle contamination) (Table 9) COL2A1, ITGA8, TNC, SNCA, MYH11, MKX, COL4A6 and TNFRSF11B).
An additional functional group that appears diagnostic of the IMF depot includes a subset of solute transporters (SLC6A6 and SLC3SF1). From a biomarker perspective we have also identified several genes elevated in IMF that encode proteins that have been connected to human obesity and potentially could be detectable in the plasma; these include INHBA and ANGPTL7.

IMF versus SC part II metabolic biochemistry
It is believed SC adipocytes contain lower levels of saturated fats, such as stearic acid (C18:0), than do IMF adipocytes [7]. Prior work profiling bovine fatty acids by gas chromatography and corresponding calculations of desaturase indices are broadly consistent with IMF expressing 2.4 fold less SCD (encoding the Δ9 desaturase enzyme) in the present data. Furthermore, marbling fat has been shown in vitro to have a relative preference for glucose versus acetate carbon for de novo FA synthesis compared to SC as measured by radioisotope incorporation into bovine adipose tissue slices [6]. In that study glucose was found to be the primary precursor in IMF (51-76%), whereas acetate was dominant in SC (67-77%). In our present data we note that ACSS2 encoding acetyl coenzyme A synthetase was expressed at~2 fold Table 9 Expression profiles of particular isoforms in esterification and glycerolipid synthesis where IMF shows higher expression than SC that is not attributable to LD contamination. Normalised mean expression values (log2) for LD muscle, dissected IMF and SC. P values are reported for both DE and PIF (based on the SC versus IMF comparison). These non significant P values need to be interpreted cautiously as these genes do satisfy the independent requirement of higher expression in IMF than LD which we argue makes them potentially noteworthy higher levels in SC than IMF consistent with a relative preference for acetate in the SC depot. Furthermore, we wish to point out that NR4A3 (alias NOR1) encoding a nuclear hormone receptor in the steroid-thyroid hormone retinoid superfamily was upregulated more than 2 fold in IMF compared to SC. This transcription factor has been found to augment insulin's ability to stimulate glucose transport in rodent 3 T3-L1 adipocytes [21]. In fact, NR4A3 was one of several genes identified that indicate substantive differences in carbohydrate (GRB14, NR4A3, MOXD1), retinoic acid (STRA6, MEST) and cholesterol (PMP2, CH25H, CYP4B1) metabolism when comparing IMF to SC.
Malic enzyme (ME1) and ATP citrate lyase (ACLY), both presumed missing or unimportant in ruminant adipose tissue based on enzymatic analyses under standard metabolic conditions [22] are apparently reasonably highly expressed in these samples at the messenger RNA level. Surprisingly, we did not detect cytoplasmic NADPisocitrate dehydrogenase (IDH1) mRNA in our data, which has previously been shown to be important lipogenic player in both depots, and particularly in SC [6].

IMF versus SC PART III CH25H and the 25 hydroxylation of cholesterol
A qRT-PCR assay for CH25H yielded a~34-fold higher expression in IMF than SC. CH25H encodes the enzyme cholesterol 25 hydroxylase which converts cholesterol into the metabolite 25 hydroxycholesterol (25-HC). This metabolite has never previously been linked to marbling fat specifically but is certainly known as a potent regulator of cholesterol metabolism [23,24], fibroblastmediated tissue remodelling [25] and inflammation [26] all of which have been considered relevant to adiposity in various biological circumstances [27][28][29]. The tissue specific gene expression profile of CH25H in humans favours soft tissue (which includes adipose tissue) according to the TiGER gene expression database [30] supporting the possibility that adipose tissue may have a disproportionate influence over the systemic expression of this enzyme and its products. The identification of CH25H as an enzyme of interest prompted us to re-examine a published time course experiment where intact bovine LD muscle had been sampled in low marbling Piedmontese x Hereford breed crosses versus high marbling Wagyu x Hereford breed crosses over development [20].
Interestingly, between 20 and 30 months when IMF is accumulating most rapidly the CH25H expression profile in the Wagyu sired cattle increases at a faster rate than Fig. 8 Hierarchical clustering of phenotypes and oxysterol metabolites. This approach clusters (correlations of correlations) on positive relationships only. The metabolite with the tightest positive cluster to a phenotype is metabolite 1 (24S-HC) and eye round IMF (0.91; P < 0.001). While 25-HC is not significantly associated with any phenotype, 25-HC does cluster with the metabolites 26-HC and 7-OC which are in turn both correlated with chuck IMF. The key connecting metabolite number to metabolite name is documented in Table 10 the Piedmontese sired cattle, and ends up~2 fold higher. This final expression difference almost exactly mirrors the close to 2 fold difference in actual IMF content observed between the breed crosses, 8.8% in Wagyu sired animals versus 5.1% in Piedmontese sired animals [31]. In other mammalian species, oxysterols have been shown to be present at detectable levels in the circulation using mass spectrometry approaches. Consequently we became interested in the possibility that circulating oxysterols such as 25-HC might be a) detectable in cattle plasma and b) reflect carcass-wide marbling.

LC-MS estimation of oxysterol metabolites
To test our hypothesis regarding a potential link between circulating oxysterols and carcass-wide marbling we identified 8 animals displaying variation in IMF for whom we had fresh frozen plasma samples. LC-MS estimation of the 10 oxysterols using the methods of [32] was correlated to 15 carcass phenotypes (of which 7 were IMF related). Intriguingly, we found significant associations at the P < 0.05 significance threshold for 5 oxysterols to 8 carcass phenotypes, and 7 of those 8 phenotypes were IMF related. This implies that, despite the relatively small sample size, the IMF correlations we have detected are unlikely to be false positives. It appears that circulating oxysterols could form the basis of a diagnostic blood test for IMF content. Such a test could in principle help enable a decision on whether or not to invest in expensive feedlotting.

Conclusions
We have characterised gene expression patterns discriminating bovine fat depots with a particular focus on IMF versus SC. We found the expression patterns were suggestive of a number of observations previously made using traditional biochemical techniques, notably a generalised reduction in lipogenic activity and an apparent preference for glucose carbon in the IMF depot. A number of new discoveries were made, including expression changes consistent with coordinate modifications in the manner IMF handles cholesterol, retinoic acid and carbohydrate metabolism compared to the other depots. These gene expression signals would have remained hidden from view if we had not corrected for the presence of a small amount of muscle in the dissected IMF, and this correction was in turn made possible only by collective normalisation of the fat depot data with the LD muscle data. Of particular interest was an indication of very elevated activity of the cholesterol 25 hydroxylase enzyme in the IMF depot. A suite of oxysterol metabolites were quantitated in plasma using mass spectrometry and appear to be promising circulating biomarkers for whole carcass IMF content.

Beef cooperative research Centre III marbling and fat distribution experiment
The basic experimental details relating to this animal trial have been previously reported [12]. In brief we studied 15 individual 259 day grain fed Angus, Hereford and Wagyu x Angus steers (n = 5 per breed) slaughtered at~26 months of age as part of the larger experiment detailed by [12]. Management was standardised for all individuals during growth on pasture from weaning through long-feedlotting. For the transcriptome analysis of the fat depots we focussed on animals' pasture-fed only during the immediate post-weaning period, rather than animals that received supplement during the immediate post-weaning period [12].
A total of 10 sires were represented across the 15 individuals, with 3 sires per genotype and only 1 or 2 individuals per sire. Fat depot samples were dissected from each carcass as soon as possible after slaughter. Dissected IMF was derived from the M. longissimus dorsi (IMF), with other fat samples derived from intermuscular fat (Inter), omental fat (Omen), kidney fat (Kid) and SC (over the rump) depots. The longissimus dorsi muscle with IMF in situ (LD) was also sampled. Further LD samples were also included in our analysis from animals slaughtered at an earlier time point (kill 2) representing both supplement fed and pasture fed animals. For this transcriptome analysis of the fat depots per se we focussed on animals pasture-fed only during the immediate post-weaning period.

RNA extraction, microarray hybridisation and normalisation
Total RNA was phenol chloroform extracted using Trizol (Invitrogen) following the manufacturer's instructions. RNA yield and purity were determined using spectrophotometry and RNA integrity was determined by agarose gel electrophoresis. For each breed by depot comparison 4 of the potential 5 RNA samples were prioritised for microarray hybridisation on the basis of RNA integrity and purity. The RNA was submitted to the Ramaciotti Institute (Randwick, NSW, Australia) for hybridisation to the 4x44K one colour Agilent bovine array. Data was normalised using a previously described mixed model approach [33] and all data in all samples expressed as log2 values (Additional file 2). Given the hybridisation to the microarray is performed based on a given total amount of input RNA, the expression measurements reported effectively represent mRNA abundance on a per unit total RNA basis. All the normalised gene expression data is publicly available via accession GSE136981 and allows for the recreation of our chosen plots and analyses.

Hypothesis-free screen
Data driven hierarchical clustering As part of a quality checking procedure and to gain a high level view of the across tissue relationships, the expression profiles of 10, 000 genes chosen at random were imported into Per-mutMatrix [34] for across-depot cluster analysis, with each breed by depot group being represented (with 4 individual animals used to produce the mean breed by depot values) (Fig. 1). With a large number of rows performance is very slow. This random 10,000 can be considered representative of the full genome-wide data. Figure labels are built from concatenating breed and tissue. Although we only used pasture fed animals from kill 5 for the fat depots gene expression, the LD muscle samples we included here are derived from a combination of pasture and supplement fed animals and 2 different kills. Given this, we included the diet and kill information in the figure labels for completeness. Global relationships between depots based on the molecular data were then determined using unsupervised hierarchical clustering performed on columns for all tissues. LD muscle was included in this analysis as an outgroup.

Differential expression (DE) analysis SC versus the other fat depots
For the next set of analyses to contrast the depots we first combined the (breed) data within each depot. This meant each depot value is the mean average of 12 individual samples (n = 4 for each of the 3 breeds). Two statistical strategies were used to identify differential expression between IMF and SC. First, we used a standard t test for the differences in means assuming equal variances (Additional file 5). Secondly, we used z scores from a modified DE (PIF) that we processed through an inverse normal distribution to obtain 1 tailed P values (Additional file 6). In terms of visualisation of the data, all fat depots were contrasted for differential expression (DE) against SC, expressed in the form of Minus Average (MA) rocket plots (Fig. 2). For the remainder of the analyses we mainly focussed on the specific comparison between SC and IMF.
IMF versus SC We quantitated the cellular basis of an anomaly (the atypical triangular shaped protuberance) in the structure of the IMF versus SC MA plot. This was achieved by querying the data using the "=AND" function in Microsoft Excel. We asked for all genes that satisfy the following criteria to be returned: expression > 4 fold higher in LD muscle than in SC. We then visually highlighted those genes (i.e highly expressed in LD versus SC) on the IMF versus SC plot (Fig. 3a). The SC fat depot can be considered representative of a clean (i.e. muscle free) sample.
In order to perform function enrichment analysis we computed a modified DE (called Phenotypic Impact Factor or PIF) which we have previously found to account for the distribution of DE with different abundances. This approach is analogous to binning the data by abundance and prioritising the extreme DE within each bin. We view this as attractive because ranking on simple DE runs the risk of prioritising those mRNA that are lowly abundant and near the detection limit of the microarray platform, which is likely the noisy part of the data distribution. PIF is calculated by multiplying the DE by the average abundance across the treatment of interest [35]. After ranking on PIF one can identify the extreme 1% (145 / 14,476) or 5% (724 / 14,476) which can then be submitted for hypergeometric functional enrichment.
Mitoproteome Next, we aimed to establish whether there was a major intracellular composition change in terms of adipocyte mitochondrial content. To achieve this we downloaded the human mitochondrial proteome (http://mitominer. mrc-mbu.cam.ac.uk/release-3.1/begin.do) and after filtering for duplicates generated a unique list of 1045 nuclear and mitochondrially-encoded genes which encode all known mitochondrial proteins. The derivation of this list of mRNA encoding mitochondrial proteins and its use in a similar biological context has been previously described [36]. We overlaid these mitoproteome genes onto the IMF versus SC MA plot and also estimated the deviation from null equilibrium (i.e. to what extent is the mass of the mRNA expression data skewed upwards or downwards with respect to 0) as a proxy for overall mitochondrial content / activity using the method of binomial distance.
Structural differences between marbling adipocytes and the other fat depot adipocytes In a further attempt to analyse (and also better visualise) the effect of contaminating LD on the dissected IMF gene expression we overlaid the IMF versus SC MA plot with a continuous colour spectrum reflecting genes that are most likely arising from the IMF adipocytes per se and not the contaminating LD muscle. This colour coding was achieved in an objective numerical manner by mapping a continuous colour spectrum to the DE calculated between the intact LD muscle (representing expression from mainly muscle cells) and the dissected IMF samples (reflecting expression from mainly marbling fat cells). This means that on the Y axis (DE) of the MA plot those values above 0 that are coloured yellow / red (higher expression in IMF than LD, as well as higher expression in IMF than SC) and those values below 0 that are coloured purple can be confidently ascribed to gene expression arising from the IMF adipocytes themselves.
In a related set of quantitative criteria based analyses aimed at highlighting IMF expression derived from marbling adipocytes and not contaminating muscle cells, we queried the full data set for those genes whose expression satisfied at least one of two criteria: higher expression in IMF than LD muscle and higher expression in IMF than the other depots. There is no standardised approach in the literature for handling the contamination problem identified here. The exact thresholds used in our case were developed empirically through an exploratory, iterative process.
We systematically explored a number of thresholds and observed which ones produced short, manageable gene lists functionally in line with existing biological knowledge. This particular process yields thresholds that are not formal but we wish to emphasise that the numerical decisions we settled on are a) made across the board and are thus transparent and fair b) reproducible from the original data c) are entirely pragmaticthey serve our stated purpose of eradicating those genes whose expression in IMF is likely derived from the contaminating muscle signal and d) hypergeometric analysis of the target list can then be calculated with respect to the background (whole genome) gene list using the GOrilla webtool.
The multiple criteria queries were performed in Microsoft Excel using the "=AND" function. The 4 queries we have reported were generated as follows: 1) identify genes likely expressed from myocytes not adipocytes (expression in LD muscle > 4 fold higher than SC); 2) identify genes diagnostic of IMF adipocytes compared to all the other fat depots 1 (expression in IMF > 2 fold higher than LD and expression in IMF > 1.32 fold higher than the average expression in the other 4 fat depots); 3) identify genes diagnostic of IMF adipocytes compared to all the other fat depots 2 (expression in IMF > LD by any amount and expression in IMF > 1.68 fold higher than the average expression in the other 4 fat depots); 4) identify genes diagnostic of IMF adipocytes compared to SC specifically (expression in IMF > LD by any amount and expression in IMF is > 2 fold higher than in SC).
The output from the last of these queries (higher expression in IMF than LD, at least 2 fold higher expression in IMF than SC) was used as input for hierarchical cluster analysis in Permut matrix following production of a tab delimited .txt file with as many rows as genes and as many columns as tissues.
Finally, we also identified those genes more lowly expressed in IMF than SC (and probably not derived from the LD contamination) as follows: expression in IMF < LD by any amount and < SC by at least 0.4 units).

Hypothesis-driven analysis
Lipogenesis in adipocytes Canonical KEGG pathways (https://www.genome.jp/kegg/pathway.html) were used to define lipogenic genes of interest selecting the Bos taurus pathway in all cases. For the purposes of this analysis we developed criteria to retrieve one value per gene as most genes in the microarray platform are represented by more than 1 probe. In many cases the probes show a general agreement in direction and magnitude of change, but this is not always the case. In an attempt to generate a single value per gene we decided to prioritise using the Phenotypic Impact Factor (PIF) metric. PIF is computed by multiplying DE by abundance. It has a number of appealing characteristics, both numericalsuch as tracking the distribution of the MA plots and de-emphasising noisy rarely expressed transcripts -and biological, such as clearly identifying gross myofibre composition changes when comparing skeletal muscle samples [35]. This prioritisation produces a file containing unique values for 14,476 genes (Additional file 7). The Bos taurus KGML files for "Fatty acid synthesis" (BTA00061), "Fatty acid degradation" (BTA00071), "Adipocyte lipolysis" (BTA04923) "Fatty acid elongation" (BTA00061), "Glycerolipid metabolism" (BTA00561), "Pyruvate metabolism" (BTA00620) and "Pentose phosphate metabolism" (BTA00030) were used as a resource to help us identify the mRNA encoding the relevant proteins in each biochemical pathway.
In an effort to further explore the mRNA associated with canonical pathways relevant to ruminant fat metabolism, a number of lipogenic and lipolytic pathways were identified through literature searches in NCBI Pubmed. The component enzymes and genes of the key pathways were then identified through the KEGG pathway database (http://www.genome.jp/kegg/pathway.html) using Bos taurus as the species. Further targeted literature searches were also used to determine which enzymes and other proteins could be considered influential in overall pathway flux, including the rate-limiting enzymes.

qRT-PCR
We designed 2 sets of primers for CH25H (target gene) and 1 set for RPLP0 (housekeeper) using Primer 3 and accepting the default settings but forcing the amplicon to be 150-250 bp long. The first primer sequences for CH25H were forward ttcagtcgcccttctttcct and reverse gtcatggggaacacgaacac. The second primer sequences for CH25H were gcaccatcagaattcgtcccand agccagatgttgacaacgtg. For RPLP0 the sequences were forward caaccctgaagtgcttgacat and reverse aggcagatggatcagcca. These primer pairs produce amplicons of 208 bp, 153 bp and 227 bp respectively. cDNA from IMF and SC was made from total RNA using Superscript III following the manufacturer's instructions. Primers were first tested by conventional PCR then estimating product size on a 1.6% agarose gel. qRT-PCR was performed on an Applied Biosystems machine using Sybr Green fluorescence chemistry in 10 ul total reaction volumes. No template controls were included on the reaction plate. 12 individual cDNA samples were quantitated for each of the two tissues of interest i.e. IMF and for SC.

Analysis of LD muscle gene expression in Wagyu x Hereford versus Piedmontese x Hereford crosses
We made use of a previously published microarray experiment where LD muscle had been sampled across pre and postnatal development in two breed crosses, one with higher IMF and one with lower IMF and higher muscle mass [20,35]. The normalised mean expression values for the last 3 postnatal time points (20 months, 25 months and 30 months) were retrieved from the data and the expression profiles for CH25H plotted for the breed crosses.
The 15 phenotypes we assessed can be categorised as follows: 7 are marbling related (Meat Standards Australia Marbling Score = MSA, AUSMARBLE Score = AUS MB, Biceps femoris IMF%, Loin IMF%, Chuck tender IMF%, Eye round IMF%, Oyster blade IMF%) and 8 are non-marbling related. The 8 non marbling related phenotypes can be further broken down as follows: 3 are muscle mass related (LD eye muscle area = EMA, Carcass weight derived from hot sides = W1 and cold sides = CW2), 2 are non-IMF Fat related (Rump SC fat depth = P8, rib fat depth = RFT), 2 are meat quality related (temperature at pH 6 = TEMP, ultimate pH = pH) and 1 reflects physiological maturity through assessment of bone development (ossification score = OSS).

Statistical analyses
The microarray data were normalised using a mixed model approach as previously described [33]. We used a number of complementary methods to assess significance of differential expression. Genome-wide patterns of DE between IMF and SC were assigned significance based on two approaches: firstly, DE was assessed using a standard t test assuming equal variances; secondly, a modified DE called PIF (Phenotypic Impact Factor) was z score transformed then those z scores were processed through an inverse normal distribution to obtain 1 tailed P values. For the hypothesisfree genome-wide screens we ranked the entire gene lists using criteria of a modified DE (Phenotypic Impact Factor or PIF) which after ranking can then be assessed for functional gene set enrichment using hyper-geometric statistics reported as FDR corrected q values.
We used GOrilla software to perform all the functional enrichments [37,38]. When using the two list approach in GOrilla, we established a target list using a nominal threshold (extreme 1% or 5%) and used a background list taken from the entire genome-wide data. We also used multiple criteria thresholding to generate target gene lists that satisfied strict requirements such they were unlikely to be derived from muscle contaminating the IMF samples and then independently assessed functional enrichment using GOrilla. The estimate of deviation from equilibrium of the cohort of mRNA encoding the mitoproteome was performed in Excel using the BINOM.DIST function. The oxysterol metabolite quantities were related to 15 cattle phenotypes for each animal using a correlation analysis performed in SAS with a P < 0.05 considered statistically significant. The qRT-PCR data was analysed using a 2-tailed Mann Whitney U test.
depots. Significance was established from z scores processed through an inverse normal distribution to produce 1 tailed P values. Additional file 7. An excel spreadsheet containing the Minus and Average (MA) data to recreate the IMF versus SC MA plot, with values for 14,476 probes (one probe per gene). The extreme 1 and 5% by PIF are also listed here.
Additional file 8. The carcass phenotypes for those animals whose plasma was quantitated for oxysterols.
Additional file 9. The oxysterol quantitation data for the 8 selected animals.