Natural genetic variation in transcriptome reflects network structure inferred with major effect mutations: insulin/TOR and associated phenotypes in Drosophila melanogaster
© Nuzhdin et al; licensee BioMed Central Ltd. 2009
Received: 19 December 2008
Accepted: 24 March 2009
Published: 24 March 2009
A molecular process based genotype-to-phenotype map will ultimately enable us to predict how genetic variation among individuals results in phenotypic alterations. Building such a map is, however, far from straightforward. It requires understanding how molecular variation re-shapes developmental and metabolic networks, and how the functional state of these networks modifies phenotypes in genotype specific way. We focus on the latter problem by describing genetic variation in transcript levels of genes in the InR/TOR pathway among 72 Drosophila melanogaster genotypes.
We observe tight co-variance in transcript levels of genes not known to influence each other through direct transcriptional control. We summarize transcriptome variation with factor analyses, and observe strong co-variance of gene expression within the dFOXO-branch and within the TOR-branch of the pathway. Finally, we investigate whether major axes of transcriptome variation shape phenotypes expected to be influenced through the InR/TOR pathway. We find limited evidence that transcript levels of individual upstream genes in the InR/TOR pathway predict fly phenotypes in expected ways. However, there is no evidence that these effects are mediated through the major axes of downstream transcriptome variation.
In summary, our results question the assertion of the 'sparse' nature of genetic networks, while validating and extending candidate gene approaches in the analyses of complex traits.
One of the frontiers of current day genomics is to build predictive models for how molecular variation in known pathways affects their performance. Here, we approach this question by focusing on one of the best mechanistically described processes. The insulin receptor/TOR kinase (InR/TOR) pathway underlies many physiological processes that redirect organismal resources into activity, maintenance, and survival, depending on dietary energy intake. Experiments performed on rats towards the start of the twentieth century identified dramatic increases in lifespan associated with a restricted diet . Similar extensions in lifespan were subsequently identified in relation to dietary restriction in a broad range of other organisms, including the model organisms Caenorhabditis elegans and Drosophila melanogaster . C. elegans has since served as a primary model for molecular descriptions of the InR/TOR pathway. Specifically, the tyrosine kinase receptor DAF-2 induces phosphorylation of the phosphoinositide 3-kinase (PI3K) catalytic subunit AGE-1 . AGE-1 then phosphorylates the serine/threonine kinases AKT-1 and AKT-2, which form a complex with the serine/threonine protein kinase SGK-1 and subsequently phosphorylate the transcription factor DAF-16 [4–6]. The phosphorylated DAF-16 cannot enter the nucleus. Under normal dietary conditions, DAF-16 is retained in the cytoplasm in a phosphorylated state. However, if the receptor DAF-2 is inhibited, DAF-16 is not phosphorylated and so is capable of diffusing into the nucleus, where it regulates a range of genes that induce the starvation phenotype .
In model species for genetic research, the InR/TOR signaling network structure and its phenotypic effects were studied via analyses of major effect mutations. It is unknown if smaller genetic alterations affect gene expression and phenotypes in an analogous manner. One type of smaller effect perturbation is genetic variation among natural genotypes, which in turn results in variation among gene transcript levels . The most comprehensive recent data set of this type is by Wayne et al.  describing whole genome expression levels in 72 genotypes of males and females. Here we use this data set to investigate whether we can detect covariance in transcript variation in InR/TOR pathway genes. Because the high dimensionality of genome-wide expression data presents multiple challenges [17–21], we focused our analyses on three a priori defined groups of co-regulated genes. Gershman et al.  identified 3519 genes involved in the transcriptional response to nutrition in Drosophila by assaying RNA of adults that had been fed yeast following a period of starvation. They also identified a 995 gene subset of the nutrition-affected genes using microarray analysis of Drosophila S2 cells constitutively expressing dFOXO. Likewise, a list of 1016 genes downstream of TOR has been inferred via rapamycin treatment of Drosophila S2 cells . Using the gene expression states across 72 genotypes described in Wayne et al , we examined how the patterns of variance/covariance in these gene expression levels reflect variation in the upstream InR/TOR network. Specifically, we wanted to determine whether or not gene expression levels showed tighter covariance among genes from the individual TOR or dFOXO branches of the pathways, relative to the pathway overall.
Given that thousands of genes can be perturbed at once by variation in only a few upstream genes, the question arises as to how to summarize variation of their expression levels into fewer gene clusters co-varying for expression among genotypes. Several dimensionality reduction approaches are available, including principal components, spectral map and correspondence analyses [22, 23]. We focused on factor analysis, which uses covariation among genes to identify factors affecting the transcription of multiple genes at once . One may interpret the factor as the mechanism, for example, a transcription factor, by which genes are co-regulated. Another example would be that of tissue-specific expression; genotypes with a larger volume of this particular tissue would have higher levels of all tissue-specific genes. Whatever the true underlying mechanism, the factor model represents sets of coordinately expressed genes. Taking this a step further, these covarying genes can be viewed as participating jointly in a network. Each gene has an estimate of its participation in the common network; this is called the factor load, where the strength of the load indicates how much that particular gene contributes to a given factor. The product of the factor load and the gene expression value for a given genotype can then be summed over the individual genes considered into a factor value. The factor value then represents the functional state of this network, i.e., how much the individual genes in the network are controlled by a common regulatory mechanism in each genotype. Our first goal was therefore to use factor analysis to investigate the variance/covariance structure of genes in the InR/TOR pathway.
Our second goal was to investigate the phenotypic consequences of natural genetic variation in the InR/TOR pathway, since major effect mutations in the pathway are known to affect lifespan, survival, and other life history characters [25–27]. Similar phenotypic effects of these mutations were previously assayed in flies with gross abnormalities of pathway functioning [e.g., [11, 27]]. We know that knocking out the gene function of dFOXO results in higher desiccation resistance . Will smaller, quantitative expression changes of dFOXO result in accompanying phenotypic modification or it will be buffered? Our goal was to determine whether natural quantitative variation in transcript levels of these genes would have quantitative effects on the phenotype resembling the effects of major mutations. We analyzed co-variation between transcript levels in the upstream InR/TOR pathway with candidate life-history associated phenotypes, and confirmed many patterns expected from previous analyses of major-effect mutations. Our findings extend the candidate gene approach as a helpful tool in the analysis of phenotypic variation.
The summary factor analysis of the downstream InR/TOR-affected transcriptome can also be examined for associations with phenotypic traits by correlating the estimated factors with the phenotypes . This correlation serves to infer which factors, i. e., major axes of transcriptome variation, contribute to the modification of which phenotypes. By studying these correlations, we were unable to connect major axes of transcriptional variation in the InR/TOR to phenotypic variation. This observation questions the utility of this type of 'perturbation screen' for identifying de-novo axes of expression variation and simultaneously accounting for phenotypic variation.
(i) Genetic variation-covariation between the upstream genes in the InR/TOR pathway
We have reanalyzed the data of Wayne et al.  to determine whether genetic variation and covariation in transcript abundance among genotypes reflect the structure of a known D. melanogaster network, the InR/TOR pathway. Wayne et al.  recorded microarray hybridization signals of whole body RNA extracted from heterozygous F1 male and female progeny obtained from all possible crosses between 9 genotypes from a single natural population, resulting in 72 replicated measurements of expression in each sex. Significant co-variation between gene expression levels of two or more genes in this dataset may be due not only to biology, but also to experimental artifacts. Linked alleles within a homozygous parental genome are always co-inherited in F1 progeny originating from this parent. F1 offspring sharing a parent may then exhibit patterns of association due only to the co-inheritance of the same block of alleles rather than true association among the loci in their underlying mechanism. In such a case, 9 genotypes out of 72 will have a high combination for their transcript levels, which is a pattern akin to pseudoreplication. To deal with this problem, instead of studying the patterns of covariance among the 72 genotypes, we first estimated breeding values for transcript levels in the 9 parental genotypes (Additional files 1 and 2) and determined the effect of breeding values on transcript levels in our subsequent analyses. We focused on the portion of the InR/TOR network that: i) is well established in flies, ii) genetically varies among the genotypes, and iii) is represented by high quality oligonucleotides on the custom microarray designed by McIntyre et al. . Some genes were tagged by several oligonucleotides and all of them were retained and their signals averaged in the subsequent analyses.
The InR/TOR signaling network combines regulation at the level of transcriptional regulation, phosphorylation, protein relocation and other molecular mechanisms. Genes that are in the InR/TOR pathway and have been analyzed by this study are represented on Figure 1. We determined the covariance of transcript levels of these genes. Despite the diversity of molecular functions and mechanisms of signaling, the transcripts of nearly all genes in the network strongly co-vary with each other (see numbers on arrows, Figure 1). For instance, the genetic correlation between the transcript levels of AKT and TOR (AKT phosphorylates the TOR protein) are 0.88 in females and 0.92 in males, both highly significant. Clearly, this co-variation is not necessarily due to causal effects of the gene's transcript levels on one another, but is most likely due to the influence of other genetic factors onto both of these genes. This might indicate a common regulatory mechanism for the pathway as a whole within a cell. Alternatively, as tissue representation might vary among the genotypes and InR/TOR network function varies between tissues, the correlations might be due to these organism-level differences among the genotypes. Note that environmental or developmental fluctuations may be ruled out as explaining these patterns of co-variation: transcript level measurements in each heterozygous genotype were biologically replicated to account for dye and environmental effects. Further, an effect of each genotype (breeding value, see  for definitions) was estimated from a joint analysis of 72 heterozygous genotypes, further reducing the opportunity for developmental and environmental noise to influence the analyses.
While we only present correlation coefficients for the genes known to signal each other on Figure 1, we also estimated correlations between all pairs of these genes (Additional file 3). Overall, they represent a highly connected, co-varying group from which it is difficult to recover any fine structure. We conclude that, as expected, the genes of the InR/TOR network are strongly co-regulated at the cellular or organismal level, but the strength of genetic co-variations among them does not mirror the expectations based on the mechanistic details known for this signal transduction pathway.
(ii) Genetic variation for the levels of message among feeding-affected genes
Gershman et al.  used changepoint analysis to identify 3519 Affymetrix probe-sets that changed their transcript level during 7 hours after feeding. Only 3270 of these probe-sets could be unambiguously identified in Drosophila genome version 5.1. For the remainder, at least one probe in the probe-set was similar to more than one genome position using a BLAST algorithm. These 3270 probe-sets correspond to 3171 genes. Oligonucleotides for 3126 out of 3171 genes were represented on the Agilent microarray. Of these, 3048 genes showed significant hybridization and genetic variation in breeding values in females and 3031 of these were identified in males.
To summarize transcript level variation of these approximately 3000 genes, we employed factor analysis. Each gene has 9 biologically independent measurements per sex per probe. Accordingly, we can infer up to 8 axes of transcriptome variation (factors). An individual gene might or might not vary in its expression level along a factor, which is indicated by the magnitude of its loading onto this factor (Additional file 4). In females, the first factor explained 44.35% of the total transcriptome variation with 2296 genes having bigger than 0.4 loading, but only 10 smaller than -0.4 loading. The second factor accounted for 22.17% of the variation contributed by substantial loading of 1306 genes. The third and all subsequent factors explained 12.78% (5.53%, 4.92%, 4.10%, 3.60%, 2.54%) of the transcriptome variation and included 908 (260, 228, 182, 148, 75) genes. In males, likewise, the factors accounted for 49.95, 12.52, 9.95, 7.45, 6.71, 5.47, 4.43, and 3.53 percent of the variation with 2633 (only four of them with negative loading), 874, 666, 438, 371, 275, 194, and 116 genes. The average loading onto the first factor was 0.59 in females and 0.67 in males. The factor loading was very consistent between males and females with most of the same genes loading on the same factors for the two sexes.
We investigated the biological processes and molecular functions of these genes using gene ontology (GO) enrichment analysis. For each of the first three factors in males and females separately, which together accounted for 72.42% and 79.30% of the transcriptome variation respectively, we report all of the GO terms that we found overrepresented in our gene lists using a 0.15 FDR  significance cutoff (Additional file 5). For both sexes, the first factor reflected terms related to metabolism, and females were further enriched for terms involved with translation and transport. The second and third factors for both sexes were enriched for terms related to signal transduction and electron transport, respectively.
(iii) Covariation of transcript levels in dFOXO and TOR branches of InR/TOR network
Out of the 995 dFOXO-affected genes reported by Gershman et al. , 911 genes were present on the Agilent array. 887 of these showed evidence of variation in females and 872 showed evidence of variation in males. These genes are a subset of their larger list of ~3000 feeding-affected genes described above. We first analyzed whether variation in these dFOXO genes was distinctly directed along the major axes of transcriptome variation. Factor 1 influences the subset of dFOXO-affected genes somewhat more strongly than the rest of the genes in the InR/TOR pathway: the average loading of dFOXO affected genes on the first factor was 0.69 in females and 0.71 in males, while for the rest of the approximately 2000 genes it was somewhat smaller at 0.55 in females and 0.65 in males. dFOXO genes did not consistently co-vary for expression with the other factors. To further assess variation in these genes as a subgroup, we conducted a factor analysis for the dFOXO affected genes separately. The factors from the initial inclusive analysis and the limited analysis of dFOXO targets were largely collinear (data not shown).
Direct targets of dFOXO have been identified using ChIP-chip technology and the binding motifs identified by these assays were found in hundreds of genes . We asked whether or not genotypes with naturally higher levels of dFOXO were more correlated with their dFOXO binding targets. This permits us to test whether or not the dFOXO to downstream target co-variation is explained by the number or quality of binding sites in the proximity of the target gene. We determined the correlation between dFOXO transcript level and the transcript level for the bound gene among the 9 genotypes (Additional file 6). We then compared the correlation of transcript levels of dFOXO and the targets of dFOXO (and their absolute values) to the intensities of binding of the target gene as defined by ChIP-chip analyses. The correlation across all genes was not significant. While the intensity of binding might not be a good predictor of the quality of the downstream gene regulation by dFOXO, we hypothesize that dFOXO transcript level variation is not mechanistically responsible for the variation in the transcript level of its downstream genes among the 9 genotypes. We also made an analogous analysis with the AKT transcript levels. Our rationale is that AKT activity, which is potentially dependent on the AKT transcript level, might be responsible for a varying degree of dFOXO exclusion from the nucleus among genotypes. This would result in varying degrees of the dFOXO binding to the downstream genes. Again, we did not observe this effect (data not shown).
Guertin et al.  reported approximately 1200 genes that changed their expression level in response to rapamycin treatment and thus are TOR affected genes. However, only a fraction of these genes were reported in the Supplementary Tables available for that manuscript. From this paper, we combined gene lists from Tables S1-S3. Among them, 54 genes had significant transcript level variation in males and females, and 46 were represented in the gene set reported by Gershman et al. . Similar to the pattern detected for the dFOXO affected genes, TOR affected genes were tightly coregulated, with the average loading onto the first factor being 0.79 for females and 0.77 for males. Limiting the factor analysis to only the TOR affected genes did not change the direction of axes of transcriptome variation (data not shown), again resembling the pattern detected in the dFOXO branch of InR/TOR pathway.
These factor loadings suggest that the dFOXO affected genes and the TOR affected genes are more tightly covarying than the overall set of feeding-affected genes. To test this hypothesis explicitly we examined the distribution of the pairwise correlation co-efficient in the feeding-affected genes in females and males (average of ρ = 0.35, ρ = 0.38 for females, males, respectively) to the dFOXO subset (ρ = 0.46, ρ = 0.44) and the TOR subgroup (ρ = 0.78, ρ = 0.77). Random samples of the feeding-affected gene list of the size of the dFOXO and TOR subgroups were taken 1000 times and the average pairwise correlation calculated. We conclude that at P < 0.001 dFOXO affected genes  and TOR affected genes  are more tightly co-varying sub-groups of a larger group of 3500 feeding-affected genes. However, the major axes of transcript level variation in these two subsets of genes do not seem to differ from those of the larger group of feeding-affected genes.
(iv) Partial regression analyses of multiply connected genes
Linear models with dependent effects given to the left of the equals sign and independent effects to the right of the equals sign.
P-values* in the order given by the model
Myc = TOR + AKT + Rheb + TSC1 + slif + e
0.0064, 0.0555, 0.0905, 0.7840, 0.3521
TOR = Rheb + TSC1 + slif + e
0.0002, 0.0001, 0.0099
TOR = Rheb + AKT1+TSC1+ slif +e
0.0001, 0.0001, 0.002, 0.0003
TOR = AKT + Pdk1+ S6K+ e
0.0017, 0.5166, 0.0724
(v) Genotype-to-phenotype map exhibits some effects expected from the analyses of major effect mutations
Correlations of individual genes with each of five phenotypes in females.
- 0.93 , 6.2 × 10 - 5
Correlations of factors with each of five phenotypes in females.
A common method of reconstructing and characterizing gene regulatory networks is to individually analyze the transcriptional profiles of a collection of single gene knockouts and infer regulatory relationships based on gene expression changes [for recent applications and a review, see [34–37]]. While these approaches have yielded important information regarding gene regulatory networks, the methods are limited given the cost and labor required to determine the expression profile for a knockout in every gene. Furthermore, analyzing gene knockouts does not allow for an assessment of smaller effect perturbations in genes, which may be a more common type of genetic variation in natural populations. One way around these limitations is to examine and analyze gene expression from many individual genotypes that each contains multiple mutations and to study the covariance in phenotypes and multiple gene expression levels to hypothesize which of the gene expression changes might cause phenotypic deviations [38–41]. If successful, this would contribute to elucidating components of the genotype-to-phenotype map. It might also greatly aid our understanding of the genetic basis of human disease, given that most human disease is caused by small effect mutations in many genes [42–44].
If most genetic regulatory networks are sparse, meaning that there is not a substantial overlap in the genes regulated within different networks , it should be possible to analyze a relatively small number of genotypes and use mathematical models to infer regulatory relationships between many genes. Major effect mutations do affect some downstream genes more strongly than others, thus establishing main regulatory connections. However, these strong effects also mask numerous weaker connections. Indeed, a major effect mutation typically disturbs hundreds or thousands of the genes, albeit to different degrees [34–37], as revealed by whole genome analyses of transcriptome alterations. With more powerful experimental and statistical approaches, these effects can likely be detected on many thousands of genes, or even onto all of the transcriptome. While with single major effect mutation per genotype, the main regulatory connections are easy to recognize, would this also hold true for natural genetic variation? More specifically, are genetic networks sparse enough for naturally segregating smaller-effect mutations to reveal individual network connections? From the analyses of transcriptome data presented here, it appears that the collective effects of numerous natural mutations might override a sparse nature of major effect mutations and dominate the collective properties of molecular networks.
In unrelated wild type flies, every genotype contains numerous sequence alterations. With approximately 1/100 bases different between two random unrelated flies, the total number of whole-genome differences is in the millions. How many of these differences represent regulatory mutations is not known, but some estimates may be offered as follows. Genissel et al.  investigated whole genome expression from an oligonucleotide microarray in two extensively studied genotypes of Drosophila melanogaster, Ore and 2b3, and six recombinant inbred lines derived from these parents. Approximately 10% of the transcriptome was differentially regulated among the lines. Regulatory effects in cis (regulatory mutations in the gene locus itself or nearby) appeared present in up to 1218 genes. 123 genes were affected by trans mutations, but the vast majority of trans effects probably remained undetected because there was not enough consistency among different analytical procedures. Wang et al.  assayed a full set of chromosome substitution lines between the two behavioral races of D. melanogaster Z and M. Only about 3% of the genes with an expression difference between races were purely cis regulated, while transcript levels of 80% of the genes were controlled by at least two different chromosomes. From these two studies, we can conclude that hundreds to thousands of genes in each genotype contain regulatory alterations. Therefore, regulatory mutations likely affect hundreds of networks in each genotype, and quite a few of them many times. These mutations in turn affect thousands or tens of thousands of downstream targets. If we now recall that our data set combines the analysis of variation over 9 genotypes, it becomes clear that the patterns of covariance we observe represents not individual gene-to-gene connections, but rather system-wide effects of massive genetic variation onto transcriptome architecture.
Why, then, have some analyses [see  for summary, ] been able to reconstruct quite a few network connections? We believe that there are two main reasons for this, both stemming from the fact that typical analyses have been based on Recombinant Inbred Lines (RILs). First, researchers mapped factors affecting transcription of focal genes to cis (position of the focal gene) and other regions of the genome (trans). With approximately a hundred RILs, only the strongest effects could have been unambiguously mapped, especially after corrections for the number of tested genes. Most regulatory connections must have been missed, although their composite effects might be overwhelming. Contrary to this supposition, the factors mapped typically accounted for an appreciable portion of the focal gene's transcript variation. We argue, though, that the Beavis effect, rather than the true Mendelian nature of the mapped factors, is a likely reason for such conclusions, meaning that when the power of QTL mapping is low, the effects of detected QTLs are strongly overestimated . The requirement to correct for multiple tested genes in the whole transcriptome eQTL mapping makes the power very low even in the best eQTL mappings. Accordingly, these experiments are likely to i) detect only the strongest effects and ii) overestimate their effects. This should result in a substantially oversimplified picture of genetic network connectedness. Additionally, a set of RILs is typically built from just two parental genotypes. We argue that both alleles sampled per gene (one from the first and another from the second parent) are likely to be functionally equivalent. Whenever the genetic network is connected through the gene in which there is no variation in the mapping RIL population, the network will appear 'broken' at this gene. This will contribute to apparent 'resolution' of the network connections. Overall, we feel that the eQTL based networks are likely to be oversimplified. We believe that higher power and larger genetic base network analyses will recover, in the future, much more connected and complex genetic networks.
We report one of the first attempts to use a population with a broad genetic basis to decipher the nature of genetic variation in the transcriptome and to link it with phenotypic variation. While the genetic architecture of transcriptome variation appears very complex, the use of prior information about the InR/TOR signaling cascade allows several interesting inferences. First, the transcript levels of the genes connected within the network appear coordinated. Second, transcript level variation is reflected in phenotypic variation in expected ways predicted from mutational analyses. Third, new connections are proposed that will help to focus further molecular analyses of InR/TOR network.
Genetic material and expression measurements
The microarray data acquisition and analysis were described by Wayne et al. . Briefly, nine isogenic lines of D. melanogaster used as parents were originally captured in an orchard in Winters, CA, and subjected to > 20 generations of full sibling inbreeding. Lines were crossed in a full diallel design with reciprocals, but without homozygous parents (72 F1 progeny). RNA was extracted from 20 whole 3-day post eclosion flies, snap-frozen in liquid nitrogen using Trizol reagent (Invitrogen, Carlsbad, CA). The chip was synthesized on an Agilent platform (http://www.genomics.purdue.edu/services/droschip, AMADID 012798; 3). Hybridizations were performed with males and females of the same genotype, labeled in contrasting dyes, hybridized to the same chip. We analyzed two independent biological replicates for each genotype and sex combination. Intensity values were normalized using the natural log transformation. The chip design contained 503 negative control sequences designed from human sequences with similar GC content to Drosophila, but with no known sequence homology. For each slide and dye combination the 90% value of the negative controls was used as the detection threshold. For a particular probe to be used both replicates needed to be detected. Probes that were not detected in at least one genotype were eliminated from further consideration for that sex. Additionally, the probe needed to show significant variation for expression for either genotype or sex to be considered further. For the probes that were detected, those corresponding to the list of genes identified as part of the dFOXO regulatory cascade (n = 1301 probes) were identified as described above. We found 1281 probes representing 887 genes in females and 1262 probes representing 872 genes in males.
Factor analytical techniques
Initial factor analysis was conducted by determining the eigenvalues for the matrix of genes, first for the feeding-affected genes and second for the subgroup of genes in the dFOXO pathway. Principle components factor analysis was used to estimate factor loads for 8 factors for both males and females. A gene was considered to contribute significantly to the factor if the estimated loading value had an absolute value of 0.4 or higher.
Gene Ontology Analysis
Genes with load scores greater than 0.4 or less than -0.4 were used to define gene lists for over-representation analysis of gene ontology (GO) terms implemented in DAVID (found at http://david.abcc.ncifcrf.gov). The P-value reported is produced from a modified Fisher's Exact Test called the EASE Score  computed in DAVID.
For the stress assays, mortality was recorded multiple times each day and for the life span assay mortality was recorded at three-day intervals. Oxidative stress survival was determined for flies held on methyl viologen as an oxidant. Starvation survival was measured in the absence of food at high humidity and with access to water. Desiccation survival was measured in a container with a desiccant. Life span was determined in small population cages within which flies had continuous access to food. Development time was the time between placing eggs on Drosophila food and the time of adult emergence from pupae (L. Harshman, unpublished data). Ovariole number per ovary was scored from three females from each of two replicate vials. Body size was estimated by measuring thorax length on ten males and ten females from each of two replicate vials .
For three expression profiles, G1, G2, and G3, consider the model where G1 affects G2, and G2 affects G3. In this case, G2 = α + β1 G1+ ε and G3 = α + β2 G2 + ε, thus, G3 = α + β2 G2 + β1G1 + ε. In the first two models, we expect the effects β1 and β2 to be significant, while in the last model the partial regression coefficient of β1should not be significant since the effect of G2 has been already accounted for in the model [33, 53]. Using this logic, the order of the pathway in Figure 1 can be tested. For example, to test the hypothesis that Rheb only affects myc via TOR, the model myc = α + β TOR + β Rheb + e is fit and the partial regression coefficient for Rheb is examined. If the hypothesis is false, then β Rheb will be different from zero. In this way, a series of models examining the relationships proposed by the known pathway can be tested.
This research was supported by 1R24GM065513 and NIH RGM076643 to SVN, DAAD 19-03-1-0152 to LH, K99ES017367 to JAB, and 5R01GM077618 to LMM and SVN. Thanks to Xiting Yan for analysis help and thanks to two anonymous reviewers for comments.
- McCay CM, Crowell MF, Maynard LA: The effects of retarded growth upon the length of life span and upon the ultimate body size. J Nutrition. 1935, 10: 63-79.Google Scholar
- Klass MR: Aging in the nematode Caenorhabditis elegans: major biological and environmental factors influencing life span. Mech Ageing Dev. 1977, 6 (6): 413-429.View ArticlePubMedGoogle Scholar
- Morris JZ, Tissenbaum HA, Ruvkun G: A phosphatidylinositol-3-OH kinase family member regulating longevity and diapause in Caenorhabditis elegans. Nature. 1996, 382 (6591): 536-539.View ArticlePubMedGoogle Scholar
- Paradis S, Ruvkun G: Caenorhabditis elegans Akt/PKB transduces insulin receptor-like signals from AGE-1 PI3 kinase to the DAF-16 transcription factor. Genes Dev. 1998, 12 (16): 2488-2498.PubMed CentralView ArticlePubMedGoogle Scholar
- Hertweck M, Gobel C, Baumeister R: C. elegans SGK-1 is the critical component in the Akt/PKB kinase complex to control stress response and life span. Dev Cell. 2004, 6 (4): 577-588.View ArticlePubMedGoogle Scholar
- Kenyon C, Chang J, Gensch E, Rudner A, Tabtiang R: A C. elegans mutant that lives twice as long as wild type. Nature. 1993, 366 (6454): 461-464.View ArticlePubMedGoogle Scholar
- Lin K, Hsin H, Libina N, Kenyon C: Regulation of the Caenorhabditis elegans longevity protein DAF-16 by insulin/IGF-1 and germline signaling. Nat Genet. 2001, 28 (2): 139-145.View ArticlePubMedGoogle Scholar
- Gershman B, Puig O, Hang L, Peitzsch RM, Tatar M, Garofalo RS: High-resolution dynamics of the transcriptional response to nutrition in Drosophila: a key role for dFOXO. Physiol Genomics. 2007, 29 (1): 24-34.View ArticlePubMedGoogle Scholar
- Guertin DA, Guntur KV, Bell GW, Thoreen CC, Sabatini DM: Functional genomics identifies TOR-regulated genes that control growth and division. Curr Biol. 2006, 16 (10): 958-970.View ArticlePubMedGoogle Scholar
- Teleman AA, Hietakangas V, Sayadian AC, Cohen SM: Nutritional control of protein biosynthetic capacity by insulin via Myc in Drosophila. Cell Metab. 2008, 7 (1): 21-32.View ArticlePubMedGoogle Scholar
- Tatar M, Kopelman A, Epstein D, Tu MP, Yin CM, Garofalo RS: A mutant Drosophila insulin receptor homolog that extends life-span and impairs neuroendocrine function. Science. 2001, 292 (5514): 107-110.View ArticlePubMedGoogle Scholar
- Hwangbo DS, Gershman B, Tu MP, Palmer M, Tatar M: Drosophila dFOXO controls lifespan and regulates insulin signalling in brain and fat body. Nature. 2004, 429 (6991): 562-566.View ArticlePubMedGoogle Scholar
- Accili D, Arden KC: FoxOs at the crossroads of cellular metabolism, differentiation, and transformation. Cell. 2004, 117 (4): 421-426.View ArticlePubMedGoogle Scholar
- Puig O, Tjian R: Nutrient availability and growth: regulation of insulin signaling by dFOXO/FOXO1. Cell Cycle. 2006, 5 (5): 503-505.View ArticlePubMedGoogle Scholar
- Nuzhdin SV, Wayne ML, Harmon KL, McIntyre LM: Common pattern of evolution of gene expression level and protein sequence in Drosophila. Mol Biol Evol. 2004, 21: 1308-1317.View ArticlePubMedGoogle Scholar
- Wayne ML, Telonis-Scott M, Bono LM, Harshman L, Kopp A, Nuzhdin SV, McIntyre LM: Simpler mode of inheritance of transcriptional variation in male Drosophila melanogaster. Proc Natl Acad Sci USA. 2007, 104 (47): 18577-18582.PubMed CentralView ArticlePubMedGoogle Scholar
- Dudoit S, Fridlyand J: A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biol. 2002, 3 (7): RESEARCH0036-PubMed CentralView ArticlePubMedGoogle Scholar
- Tibshirani R, Hastie T, Narasimhan B, Eisen MB, Sherlock G, Brown PO, Botstein D: Exploratory screening of genes and cluster from microarray experiments. Stat Sinica. 2002, 12: 47-59.Google Scholar
- Parmigiani G, Garrett E, Irizarry RA, Zeger S: The analysis of gene expression data: an overview of methods and software. The Analysis of Gene Expression Data. Edited by: Parmigiani G, Garrett E, Irizarry RA, Zeger S. 2003, New York, NY: Springer, 1-36.View ArticleGoogle Scholar
- Fabrigar L, MacCallum R, Wegener D, Stahan E: Evaluating the use of exploratory factor analysis in psychological research. Psychol Methods. 1999, 4: 272-299.View ArticleGoogle Scholar
- Preacher KJ, MacCallum RC: Exploratory factor analysis in behavior genetics research: factor recovery with small sample sizes. Behav Genet. 2002, 32 (2): 153-161.View ArticlePubMedGoogle Scholar
- Fellenberg K, Hauser NC, Brors B, Neutzner A, Hoheisel JD, Vingron M: Correspondence analysis applied to microarray data. Proc Natl Acad Sci USA. 2001, 98 (19): 10781-10786.PubMed CentralView ArticlePubMedGoogle Scholar
- Wouters L, Gohlmann HW, Bijnens L, Kass SU, Molenberghs G, Lewi PJ: Graphical exploration of gene expression data: a comparative study of three multivariate methods. Biometrics. 2003, 59 (4): 1131-1139.View ArticlePubMedGoogle Scholar
- Coffman CJ, Wayne ML, Nuzhdin SV, Higgins LA, McIntyre LM: Identification of co-regulated transcripts affecting male body size in Drosophila. Genome Biol. 2005, 6 (6): R53-PubMed CentralView ArticlePubMedGoogle Scholar
- Vellai T, Takacs-Vellai K, Zhang Y, Kovacs AL, Orosz L, Muller F: Genetics: influence of TOR kinase on lifespan in C. elegans. Nature. 2003, 426 (6967): 620-View ArticlePubMedGoogle Scholar
- Jia K, Chen D, Riddle DL: The TOR pathway interacts with the insulin signaling pathway to regulate C. elegans larval development, metabolism and life span. Development. 2004, 131 (16): 3897-3906.View ArticlePubMedGoogle Scholar
- Kapahi P, Zid BM, Harper T, Koslover D, Sapin V, Benzer S: Regulation of lifespan in Drosophila by modulation of genes in the TOR signaling pathway. Curr Biol. 2004, 14: 885-890.PubMed CentralView ArticlePubMedGoogle Scholar
- Curtis C, Landis GN, Folk D, Wehr NB, Hoe N, Waskar M, Abdueva D, Skvortsov D, Ford D, Luu A, et al: Transcriptional profiling of MnSOD-mediated lifespan extension in Drosophila reveals a species-general network of aging and metabolic genes. Genome Biol. 2007, 8 (12): R262-PubMed CentralView ArticlePubMedGoogle Scholar
- Hatcher L: A Step-by-Step Approach to Using the SAS System for Factor Analysis and Structural Equation Modeling. 1994, Cary, NC: SAS PublishingGoogle Scholar
- McIntyre LM, Bono LM, Genissel A, Westerman R, Junk D, Telonis-Scott M, Harshman L, Wayne ML, Kopp A, Nuzhdin SV: Sex-specific expression of alternative transcripts in Drosophila. Genome Biol. 2006, 7 (8): R79-PubMed CentralView ArticlePubMedGoogle Scholar
- Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. 1997, Sunderland, MA: Sinauer AssociatesGoogle Scholar
- Storey JD: A direct approach to false discovery rates. J Roy Stat Soc Ser B. 2002, 64: 479-498.View ArticleGoogle Scholar
- Neto EC, Ferrara CT, Attie AD, Yandell BS: Inferring causal phenotype networks from segregating populations. Genetics. 2008, 179: 1089-1100.View ArticleGoogle Scholar
- Landry CR, Castillo-Davis CI, Ogura A, Liu JS, Hartl DL: Systems-level analysis and evolution of the phototransduction network in Drosophila. Proc Natl Acad Sci USA. 2007, 104 (9): 3283-3288.PubMed CentralView ArticlePubMedGoogle Scholar
- Hersh BM, Nelson CE, Stoll SJ, Norton JE, Albert TJ, Carroll SB: The UBX-regulated network in the haltere imaginal disc of D. melanogaster. Dev Biol. 2007, 302 (2): 717-727.PubMed CentralView ArticlePubMedGoogle Scholar
- Hansen D, Schedl T: The regulatory network controlling the proliferation-meiotic entry decision in the Caenorhabditis elegans germ line. Curr Top Dev Biol. 2006, 76: 185-215.View ArticlePubMedGoogle Scholar
- Friedman A, Perrimon N: Genetic screening for signal transduction in the era of network biology. Cell. 2007, 128 (2): 225-231.View ArticlePubMedGoogle Scholar
- Sieberts SK, Schadt EE: Moving toward a system genetics view of disease. Mamm Genome. 2007, 18 (6–7): 389-401.PubMed CentralView ArticlePubMedGoogle Scholar
- Roberts A, Pardo-Manuel de Villena F, Wang W, McMillan L, Threadgill DW: The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics. Mamm Genome. 2007, 18 (6–7): 473-481.PubMed CentralView ArticlePubMedGoogle Scholar
- Ghazalpour A, Doss S, Zhang B, Wang S, Plaisier C, Castellanos R, Brozell A, Schadt EE, Drake TA, Lusis AJ, et al: Integrating genetic and network analysis to characterize genes related to mouse weight. PLoS Genet. 2006, 2 (8): e130-PubMed CentralView ArticlePubMedGoogle Scholar
- Schadt EE, Lum PY: Thematic review series: systems biology approahces to metabolic and cardiovascular disorders. Reverse engineering gene networks to identify key drivers of complex disease phenotypes. J Lipid Res. 2006, 74: 2601-2613.View ArticleGoogle Scholar
- Schadt EE: Exploiting naturally occurring DNA variation and molecular profiling data to dissect disease and drug response traits. Curr Opin Biotechnol. 2005, 16 (6): 647-654.View ArticlePubMedGoogle Scholar
- Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, Guhathakurta D, Sieberts SK, Monks S, Reitman M, Zhang C, et al: An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet. 2005, 37 (7): 710-717.PubMed CentralView ArticlePubMedGoogle Scholar
- Zhu J, Lum PY, Lamb J, GuhaThakurta D, Edwards SW, Thieringer R, Berger JP, Wu MS, Thompson J, Sachs AB, et al: An integrative genomics approach to the reconstruction of gene networks in segregating populations. Cytogenet Genome Res. 2004, 105 (2–4): 363-374.View ArticlePubMedGoogle Scholar
- Schadt EE: Novel integrative genomics strategies to identify genes for complex traits. Anim Genet. 2006, 37 (Suppl 1): 18-23.PubMed CentralView ArticlePubMedGoogle Scholar
- Genissel A, McIntyre LM, Wayne ML, Nuzhdin SV: Cis and trans regulatory effects contribute to natural variation in transcriptome of Drosophila melanogaster. Mol Biol Evol. 2008, 25 (1): 101-110.View ArticlePubMedGoogle Scholar
- Wang HY, Fu YX, McPeek MS, Lu X, Nuzhdin S, Xu A, Lu J, Wu M-L, Wu C-I: Complex genetic interactions underlying expression differences between Drosophila races: analysis of chromosome substitutions. Proc Nat Acad Sci USA. 2008, 105: 6362-6367.PubMed CentralView ArticlePubMedGoogle Scholar
- Kruglyak L: The road to genome-wide association studies. Nat Reviews Genet. 2008, 9: 314-318.View ArticleGoogle Scholar
- Beavis WD: The power and deceit of QTL experiments: lessons from comparitive QTL studies. Proceedings of the Forty-Ninth Annual Corn & Sorghum Industry Research Conference. 1994, Washington, DC: American Seed Trade AssociationGoogle Scholar
- Dennis G, Sherman BT, Hosack DA, Yang J, Baseler MW, Lane HC, Lempicki RA: DAVID: Database for annotation, visualization, and integrated discovery. Genome Biology. 2003, 2003 (4): P3-View ArticleGoogle Scholar
- Hosack DA, Dennis G, Sherman BT, Lane HC, Lempicki RA: Identifying biological themes within lists of genes with EASE. Genome Biology. 2003, 4: R70-PubMed CentralView ArticlePubMedGoogle Scholar
- Telonis-Scott M: Genetic architecture of two fitness-related traits in Drosophila melanogaster: ovariole number and thorax length. Genetica. 2005, 125: 211-222.View ArticlePubMedGoogle Scholar
- Zhu J, Zhang B, Smith EN, Drees B, Brem RB, Kruglyak L, Bumgarner RE, Schadt EE: Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks. Nat Genet. 2008, 40: 854-861.PubMed CentralView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.