Variation in messenger RNA expression can be divided into two broad categories: differences in the overall mRNA level (due to transcriptional changes or mRNA stability) and alterations in the ratios of alternative transcripts. Variation in gene expression can arise from a number of different factors, including genetic variation, epigenetic variation, environmental variation (which could include, for example, the hormonal differences between males and females) and interactions between these factors. In this paper, we first quantified the contributions to differential expression from two sources of variation - strain and sex - and second, provided evidence suggesting that variation in transcript structure contributes significantly to mRNA expression variation.
When we clustered the dataset using hierarchical methods, we found that mice of the same sex but different strains were more similar in terms of gene expression than mice of the same strain but different sex. Furthermore, after this initial subdivision into males and females, the tree obtained matches that shown in previous SNP-based genealogy studies [19, 20]. Thus, we conclude that (a) the natural variation of gene and exon expression is smaller between mice of the same sex but different strains than between mice of the same strain but different sexes, and (b) to the extent we have examined, gene and exon expression captures the differences identified by genealogy. Based on the genealogy of mouse strains, we believe more differences in gene expression and splicing could have been observed if the strain selection had been even more diverse, for example through the inclusion of strains 129S1/Svlm and SWR/J [19, 21].
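The clustering result described above can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering on Euclidean distances, which is one common choice and not necessarily the linkage or distance used in this study; the sample names and expression values are hypothetical toy data in which one gene differs by strain and two genes differ by sex.

```python
from itertools import combinations
from math import dist

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with average linkage (illustrative choice).

    profiles maps a sample name to a tuple of expression values.
    Returns a list of sets of sample names.
    """
    clusters = [{name} for name in profiles]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(dist(profiles[x], profiles[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a | b)
    return clusters

# Hypothetical log-scale expression of three genes per sample:
# gene 1 differs slightly by strain, genes 2 and 3 differ strongly by sex.
profiles = {
    "strainA_male": (1.0, 5.0, 2.0),
    "strainB_male": (1.5, 5.2, 2.1),
    "strainA_female": (1.1, 1.0, 6.0),
    "strainB_female": (1.6, 1.1, 6.2),
}

result = average_linkage(profiles, n_clusters=2)
for cluster in result:
    print(sorted(cluster))
```

In this toy setting the first split separates the samples by sex rather than by strain, mirroring the pattern reported above; with real data the outcome depends on the genes included and on the distance and linkage chosen.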
If we compare our estimate of strain-biased genes from 3' gene expression profiling to those found in the literature, our estimate of 23% falls within the range that has been documented by others. For example, Nadler et al. (2006) found that 57% of genes exhibited strain biases at the level of gene expression, whereas Pavlidis and Noble (2001) and Sandberg et al. (2000) found that only 1% to 2% of genes show inter-strain variation [4, 22, 23]. As pointed out by Nadler et al. (2006), the higher estimates are likely due to the inclusion of a larger, more diverse set of strains, i.e., 10 strains in the Nadler et al. (2006) study vs. 2 strains in Sandberg et al. (2000) [4, 23]. In addition, differences in the tissues examined are also likely to significantly influence variance estimates, since the studies mentioned above used brain tissue while our study was performed using liver. In terms of sex, we find that our estimate, at 17%, is markedly smaller than that of Yang et al. This difference likely reflects the dramatic difference in power between the two studies to detect expression differences, given that only 11 or 12 animals were profiled from each sex for each strain in our study, versus the more than 150 individuals per sex profiled in the study by Yang et al.
We detected significant variation in exon expression with regard to genetic background and sex. While the probes used for exon expression profiling may be more susceptible to cross-hybridization and higher background levels given the smaller target regions, the use of cDNA amplification products partially mitigates this effect. Furthermore, by averaging multiple exon and junction probes, we increased the reliability of each measurement and reduced the impact of individual SNPs. However, averaging exon and junction probes makes it more difficult to distinguish different types of splicing events. Nevertheless, through the use of exon expression profiling technologies, we were able to detect 234 and 90 genes with strain- and sex-bias effects, respectively, that were not detected in the 3' gene expression or whole-transcript gene expression profiling analyses. These numbers suggest that many alternative splicing events are differentially expressed but go undetected by current gene expression profiling technologies.
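The trade-off in probe averaging can be made concrete with a small sketch. It assumes a plain arithmetic mean of log-scale probe intensities; the optional exclusion of SNP-overlapping probes is an assumption added for illustration and is not a step described in this study. Probe names and values are hypothetical.

```python
from statistics import mean

def probeset_summary(probe_intensities, snp_probes=frozenset()):
    """Average log-scale intensities of the probes for one gene.

    probe_intensities maps a probe name to its log2 intensity.
    snp_probes is an optional set of probe names to skip (an illustrative
    assumption, not part of the procedure described in the text).
    """
    kept = [value for name, value in probe_intensities.items()
            if name not in snp_probes]
    if not kept:
        raise ValueError("no probes left after SNP filtering")
    return mean(kept)

# Hypothetical log2 intensities for one gene in one sample; the junction
# probe reads low, for example because it overlaps a SNP.
probes = {"exon1": 8.0, "exon2": 8.5, "junction1_2": 6.0}

print(probeset_summary(probes))                               # all probes
print(probeset_summary(probes, snp_probes={"junction1_2"}))   # SNP probe skipped
```

Averaging dampens the effect of a single aberrant probe on the gene-level value, but the same pooling hides which exon or junction drives a difference, which is why individual splicing event types become harder to distinguish.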
Splicing differences between groups can be attributed to genetic or epigenetic variation. For example, variations in cis-acting regulatory elements, such as SNPs within promoter sequences, splicing enhancers or splicing silencers, can alter transcriptional initiation rates and splicing patterns. Structural variations in trans-acting splice regulatory proteins may affect global splicing patterns, and nucleotide variation in mRNA transcripts can influence translational efficiency (as shown with apolipoprotein A-II in mice) and/or mRNA decay rates. Expression and splicing differences observed between the sexes, however, showcase how many underlying biological mechanisms have yet to be elucidated. With the exception of the sex chromosomes, the genome is essentially identical between the males and females of an inbred mouse strain; hence, the possible mechanisms that give rise to gene expression and/or splicing variation include trans-acting factors on the sex chromosomes (such as SRY or the Sox proteins), epigenetic variations, and/or hormonal differences.
Oligonucleotide probes overlapping SNPs are biased towards differential expression, leading to overestimation of differential expression. A study using probes from the Affymetrix platform recently demonstrated the susceptibility of single probes to SNPs and highlighted the impact of natural variation on hybridization-based methods. We made similar observations with longer 36 nt and 60 nt probes. For example, we found that 36 nt junction probes overlapping a SNP show higher sensitivity towards differential expression, possibly because SNPs within splice sites bring about alternative splicing or because the SNP alters probe binding affinity.
We have confirmed that gene expression is significantly affected by strain and sex and provided evidence suggesting that this effect extends to alternative splicing, which, to our knowledge, had not been shown in mammals. Given that variations in alternative splicing patterns lead to a wide variety of downstream biological effects, our results provide further justification for investigations of alternative splicing variation in genetically segregating populations.