Population transcriptomics of Drosophila melanogaster females
© Müller et al; licensee BioMed Central Ltd. 2011
Received: 5 November 2010
Accepted: 28 January 2011
Published: 28 January 2011
Skip to main content
© Müller et al; licensee BioMed Central Ltd. 2011
Received: 5 November 2010
Accepted: 28 January 2011
Published: 28 January 2011
Variation at the level of gene expression is abundant in natural populations and is thought to contribute to the adaptive divergence of populations and species. Gene expression also differs considerably between males and females. Here we report a microarray analysis of gene expression variation among females of 16 Drosophila melanogaster strains derived from natural populations, including eight strains from the putative ancestral range in sub-Saharan Africa and eight strains from Europe. Gene expression variation among males of the same strains was reported previously.
We detected relatively low levels of expression polymorphism within populations, but much higher expression divergence between populations. A total of 569 genes showed a significant expression difference between the African and European populations at a false discovery rate of 5%. Genes with significant over-expression in Europe included the insecticide resistance gene Cyp6g1, as well as genes involved in proteolysis and olfaction. Genes with functions in carbohydrate metabolism and vision were significantly over-expressed in the African population. There was little overlap between genes expressed differently between populations in females and males.
Our results suggest that adaptive changes in gene expression have accompanied the out-of-Africa migration of D. melanogaster. Comparison of female and male expression data indicates that the vast majority of genes differing in expression between populations do so in only one sex and suggests that most regulatory adaptation has been sex-specific.
Over the past decade, microarray studies have shown that variation at the level of gene expression is abundant within natural populations [1, 2]. Similar studies have also revealed extensive differences in gene expression between males and females . Indeed, in the well-studied model organism Drosophila melanogaster, genes that differ in expression between the sexes (sex-biased genes) greatly outnumber those that differ in expression between individuals of the same sex [4–6]. Thus, it is important to account for sex when characterizing gene expression variation within species.
To date, most studies of gene expression variation within Drosophila species have been limited to a small number of laboratory strains, or to strains derived from a single non-African population [4–8]. These studies are useful for determining the amount and underlying genetic architecture of gene expression variation among individuals, but reveal little about the potential for gene expression levels to evolve adaptively in response to local environmental conditions. Studies of genomic and mitochondrial DNA variation suggest that D. melanogaster expanded from its ancestral range in sub-Saharan Africa and began to colonize Europe about 15,000 years ago [9–13], with a subsequent colonization of North America occurring within the past 500 years . Presumably, the out-of-Africa expansion was accompanied by adaptation to the new, temperate environment, and several studies have provided evidence for genetic adaptation in derived D. melanogaster populations [11, 15–17].
A previous microarray analysis of male gene expression variation in eight D. melanogaster strains from the ancestral species range (Zimbabwe, Africa) and eight strains from Europe (the Netherlands) identified 153 genes with a significant expression difference between the populations . These genes represent candidates for those having undergone adaptive regulatory evolution in response to the local environment and were enriched for genes with functions in insecticide resistance, fatty acid metabolism, and flight . The male expression data, however, provide only half of the story. Given the extent of sex-biased gene expression in D. melanogaster [19, 20], the potential for differences in the mode of inheritance of gene expression between males and females , the impact of the Y chromosome on gene expression variation [22, 23], and the proposed differences in effective population size between males and females of the African and European populations [24, 25], it is desirable to investigate expression variation among females of the same populations.
Here we report a microarray survey of gene expression variation in adult females of the African and European D. melanogaster populations. Our analyses are performed on three levels. First, we use the new microarray data to determine levels of gene expression polymorphism among females of each population, as well as gene expression divergence between populations. Second, we examine the contribution of sex-biased genes to the observed patterns of expression polymorphism and divergence. Third, we compare the female results with previously published results from males in order to detect differences in expression variation between the sexes. We find that, in females, there is little gene expression polymorphism within populations, but a relatively large number of genes with a significant expression difference between populations. The latter represent candidates for population-specific gene regulatory evolution and several of these genes show evidence that positive selection has acted on linked, cis -regulatory sequences. We find that sex-biased genes do not make a disproportionate contribution to expression variation among females. A comparison of the female and male results suggests that substantial sex-specific adaptation of gene expression levels has occurred following the out-of-Africa migration of D. melanogaster.
Expression polymorphism within and between populations
Number of polymorphic genes
Mean differences per pairwise comparison
Mean pairwise differences per gene (in %)
Among all strains
Across all 16 D. melanogaster strains, we found significantly less expression polymorphism in females than what was previously reported for males of the same strains , with females having 1.7-fold fewer polymorphic genes (24% vs. 38%; χ2 = 230, P < 0.0001), and 3.7-fold fewer significant pairwise differences per gene as males (0.89 vs. 3.28; Mann-Whitney test, P < 0.0001). These comparisons are conservative, because they use a common P -value of 0.001 for both sexes, which corresponds to a FDR of 30% in females, but only 7% in males. Reducing the FDR in females would reduce the number of polymorphic genes even further. However, even using the minimal P -value possible in our analysis (P = 0.0001), the FDR does not drop below 20%. A contributing factor to the observed difference between the sexes may be that there is less statistical power to detect expression polymorphism in the female experiment. Townsend  proposed the statistic GEL50, which is the fold-change difference at which there is a 50% chance of detecting a significant difference with P < 0.05, as a standard for comparing the power of microarray experiments. For the female experiment, the GEL50 was 1.85. This is higher than the GEL50 of 1.51 reported for the male experiment , but still within the range reported for similar surveys of expression polymorphism in Drosophila and other species . However, it is possible that small differences in GEL50 can lead to large differences in the percentage of genes detected as differentially expressed .
Expression polymorphism in sex-biased genes
Number of genes on array
Percentage of genes detected as expressed
Percentage of expressed genes:
Polymorphic in Europe
Polymorphic in Africa
Differentially expressed between populations
Average percentage of pairwise differences:
It was previously found that, among males, genes residing on the X chromosome show less expression polymorphism than those residing on the autosomes . This was attributed to the paucity of male-biased genes, which are the most polymorphic class in males, on the X chromosome . Consistent with this interpretation, we found no significant difference in the level of expression polymorphism between X-linked and autosomal genes in females, where many fewer male-biased genes are expressed. The proportions of polymorphic X-linked and autosomal genes were 25.3% and 23.9%, respectively (χ2 = 0.97, P = 0.33). The ratio of X-linked to autosomal significant pairwise differences per gene was 0.96.
The above results suggest that the difference in expression polymorphism between males and females can be explained partly by sex-biased gene expression, as male-biased genes tend to show the greatest expression polymorphism whether assayed in males or in females [8, 28] (Table 2) and make up a much greater proportion of the genes detected as expressed in males. However, when considering only unbiased genes (those expressed nearly equally in males and females), the percentage of polymorphic genes is still 1.6-fold lower in females than in males (24.7% vs. 39.2%; χ2 = 230, P < 0.0001). Similarly, unbiased genes show 3.9-fold fewer pairwise differences per gene in females than in males (0.95 vs. 3.74; Mann-Whitney test, P < 0.0001). This suggests that there are general differences between the sexes with respect to the regulation of gene expression and/or the level of purifying selection that restricts gene expression variation.
It has been observed that infection with sigma virus alters the expression of many more genes in males than in females , which is consistent with male gene expression being more sensitive to genetic and/or environmental perturbations than female gene expression. It has also been shown that genetic variation on the Y chromosome can affect expression levels of many X-linked and autosomal genes [22, 23]. Thus, one would expect there to be more expression variation among males, as this Y-linked source of expression variation is absent in females. Because our experiments used inbred strains that are homozygous over most of the genome, we are not able to detect gene expression variation caused by non-additive interactions between alleles in heterozygotes. Thus, the level of expression variation measured in our sample may be less than that observed among individuals sampled directly from natural populations. However, since the same inbred lines were used for both the male and female experiments, non-additivity cannot explain the difference observed between the sexes. Previous studies have shown, however, that non-additive interactions are more prevalent in females than in males [5, 21], which suggests that the difference between male and female expression polymorphism might be smaller in natural populations than in comparisons of inbred lines.
There was not an overrepresentation of sex-biased genes among those showing a significant expression difference between the African and European populations. In fact, there was a slight (but significant) under-representation of female-biased genes among the genes showing differential expression between the populations in females (Table 2). There was also no significant difference in the proportions of X-linked (10.0%) and autosomal (10.3%) genes that showed differential expression between the populations (χ2 = 0.10, P = 0.76).
GO-term enrichment of genes over-expressed in the European population
Olfactory receptor activity
Serine-type endopeptidase activity
Nucleoside transmembrane transporter activity
Eye-antennal disc development
Sensory organ boundary specification
Detection of chemical stimulus
Protein-DNA complex assembly
Sensory perception of smell
Regulation of action potential
Glycine metabolic process
GO-term enrichment of genes over-expressed in the African population
Triglyceride lipase activity
Single-stranded DNA binding
Nucleotide kinase activity
Cytochrome-c oxidase activity
Hexose metabolic process
Rhodopsin mediated phototransduction
Regulation of dendrite morphogenesis
Skeletal myofibril assembly
Female germ-line cyst encapsulation
Response to ecdysone
Induction of programmed cell death
Fatty acid metabolic process
Actin filament organization
Ovarian follicle cell stalk formation
Genes involved in carbohydrate metabolism were also enriched among the genes over-expressed in the African population (Table 4) and several of these genes were among the most over-expressed, including the maltase CG30360, and two α - glucosidases LvpH and tobi (Figure 3). tobi has been shown to be a target of the insulin- and glucagon-like signaling system . In this respect, it is noteworthy that the highly over-expressed gene Nlaz, which plays a role in stress response and determination of adult lifespan, also functions in carbohydrate homeostasis and has been suggested to interfere with insulin signaling .
Other enriched functions among the Africa over-expressed genes included oxidative phosphorylation and muscle formation (Table 4). However, many of the Africa over-expressed genes are of unknown function, including six of the 15 genes with the greatest over-expression in Africa and the gene showing the highest overall difference in expression between the African and European populations, CG8997 (Figure 3).
There were many more genes that differed significantly in expression between the European and African populations in females than in males. In females, 10.6% (569/5370) of the genes analyzed showed a significant inter-population difference with a FDR of 5%. In males, 3.4% (153/4528) of the genes analyzed showed a significant inter-population difference with a FDR of 8.7% (χ2 = 189, P < 0.0001). The lower FDR of the female experiment indicates that this is a conservative comparison. Furthermore, the GEL50 values for the female and male experiments were 1.22 and 1.18, respectively, indicating that the female experiment had slightly less statistical power to detect differences. This suggests that the different amounts of inter-population gene expression divergence observed between females and males have a biological basis. At the protein level, it has been reported that autosomal female-biased genes show evidence for greater adaptive evolution in the European population than in the African population . If this is indicative of a general pattern of stronger selection on females to adapt to the European environment, it could explain the excess of between-population expression differences in females relative to males. A possible reason for this is that females may be under greater selection to survive through the winter, while males that do not survive the winter may still contribute genes to future generations if their sperm is stored in a surviving female . The above hypothesis predicts that most expression differences between populations should be the result of changes occurring within the European population during colonization. At present, we do not have data that would allow us to infer the direction of inter-population expression changes and test this prediction.
Genes with a significant inter-population expression difference in both females and males
Unfolded protein binding; response to heat
Unknown; expressed in eye
Adenylate kinase; ADP biosynthesis
Actin filament; indirect flight muscle; immune response
Cytochrome P450; insecticide resistance
Unknown; upregulated under anoxia
Triglyceride lipase; lipid metabolism
Acyl-CoA dehydrogenase; fatty acid beta-oxidation
cGMP-dependent protein kinase; feeding behavior
The vast majority of genes detected as being differentially expressed between populations showed this pattern in only one sex. Of the 569 genes that differed in expression between females of the European and African populations, 557 showed this difference only in females. Of these, 310 genes were not detected as expressed in males, while 245 were detected as expressed but their expression did not differ significantly between the populations. Two other genes showed a significant expression difference between populations, but in opposite directions in the two sexes. The first, CG11395, is a gene of unknown function that had significant Africa over-expression in females, but significant Europe over-expression in males. The second is the foraging (for) gene, which encodes a cGMP-dependent protein kinase that influences larval and adult feeding behavior [37–39]. In females, for is significantly over-expressed in Africa, while in males it is significantly over-expressed in Europe.
Results of McDonald-Kreitman (MK) tests
MK-test P -value
Previous behavioral studies have shown that there is uni-directional mate-choice preference between D. melanogaster strains from Zimbabwe (Z) and cosmopolitan (M) strains, with Z females showing a preference for Z males [43, 44]. Michalak et al.  investigated gene expression in female heads from flies of the two mating types and identified 45 candidate genes that might be involved in the behavioral difference. Only one of these genes (CG7530) was significant in our experiment using whole females. This gene had higher M expression in the experiment of Michalak et al. , but higher Zimbabwe expression in our experiment. Thus, there is no concordance between the putative mating-behavior genes expressed differently in Z and M female heads and the genes expressed differently between the Zimbabwe and European populations in whole females. Two of the top candidate genes from Michalak et al. , desaturase2 and Odorant receptor 63a were not detected as being consistently expressed in our experiment and, thus, were excluded from the analysis. However, desaturase2 expression was detected in a higher proportion of Zimbabwe strains (7/8) than European strains (4/8) and, on average, showed two-fold higher expression in Zimbabwe strains in the hybridizations where it could be detected. This is consistent with the finding of Michalak et al.  that desaturase2 shows over-expression in Z strains. A comparison of gene expression in male heads between a single Z strain and a single M strain uncovered 1216 genes that differed in expression between the mating types . Although only 77 of these genes were detected as differentially expressed between the populations in our analysis, several of the overlapping genes were among those showing the greatest expression difference between Europe and Africa in both sexes, including Cyp6g1, Act88F, and the clustered genes CG8997, CG7916, and CG7953.
Our microarray survey identified over 500 genes showing low within-population expression polymorphism, but high between-population expression divergence in female D. melanogaster from Europe and Africa. The combination of low polymorphism and high divergence is a hallmark of positive selection and suggests that adaptive evolution at the gene regulatory level has occurred in conjunction with the recent colonization of non-African habitats. This is supported by the finding that Cyp6g1, whose expression is known to play an ecologically relevant role in insecticide resistance, was among the genes with the greatest inter-population expression difference. The functional basis for the inter-population divergence of the other genes is unknown, however there was an over-representation of genes involved in proteolysis, carbohydrate metabolism and sensory perception (both vision and olfaction). There was very little overlap between genes showing a significant expression difference between populations in females and in males. This suggests that most adaptive changes in gene expression are sex-specific and highlights the need for both sexes to be considered in studies of gene regulatory evolution.
Because our study focused on only one population from the ancestral species range (Zimbabwe) and one from the derived range (the Netherlands), it is not possible to distinguish global "out-of-Africa" adaptations from those that are specific to a local population. Surveys of nuclear DNA polymorphism indicate that there is little population structure within Europe, but more differentiation among some American, Asian, and African populations [47, 48]. There is also evidence for adaptive evolution of pigmentation, a trait known to be influenced by gene-regulatory variation, among African populations . Thus, there is likely to be gene expression divergence among various African and non-African populations. Further expression studies are needed to investigate this possibility.
Expression variation was surveyed for eight isofemale strains of both a European (Leiden, the Netherlands) and an African (Lake Kariba, Zimbabwe) population of D. melanogaster. The populations are as described in Glinka et al. . The fly strains were the same as those used in the expression analysis of adult male flies by Hutter et al. . Flies were maintained on standard cornmeal-molasses medium at 22° and constant lighting.
The CDMC 14kv1 microarray (Canadian Drosophila Microarray Centre, Mississauga, Canada) was used for all hybridizations. This platform features a total of 32,448 oligonucleotide probes (65-69 bases), each spotted in duplicate. The probes represent 13,688 unique genes, which correspond to 92% of those in the current D. melanogaster genome annotation (FlyBase release 5.27). Since the transcript-specific probes were designed to release 4.1 of the genome, some genes in the current annotation are not represented on the array, whereas others are represented by more than one probe.
For each strain, total RNA of 40 mated female flies, four-to-six days of age, was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) and samples were stored at -80°. Reverse transcription was conducted using 50 μg of total RNA per strain and anchored oligo(dT) primers. cDNA samples were labeled with Alexa Flour 555 and 647 dyes using the SuperScript Plus Indirect cDNA Labeling System (Invitrogen) and following the manufacturer's protocol.
To compare expression levels of all fly strains to each other, the hybridization scheme developed by Hutter et al.  was followed. This approach allows expression levels of all strains to be compared, while keeping the number of hybridizations at a practical level (Figure 1). Six or eight replicate hybridizations per strain were performed on a total 56 microarrays. For each strain, three or four competitive hybridizations with other strains, plus their respective dye-swap hybridizations were performed. For technical replicates (dye-swaps), RNA from the same extractions was used, whereas for biological replicates (different pairwise hybridizations of strains), RNA extracted from a new set of flies was used. Arrays were pre-hybridized and washed using the Pronto! Universal Microarray Kit (Corning, Lowell, MA, USA) according to the manufacturer's protocol. Otherwise, hybridizations were conducted following the CDMC protocol. Arrays were scanned with an aQuire 2-laser microarray scanner and Qscan software (Genetix, New Milton, UK). All microarray data have been submitted to the Gene Expression Omnibus database under the accession numbers GSM580470-GSM580525 (platform GPL3603, series GSE23662).
Raw fluorescence intensities were normalized using CARMAweb , which is a web-based interface to the 'limma' package of Bioconductor . The default settings of 'minimum', 'printtiploess', and 'quantile' were used for background correction, within-array normalization, and between-array normalization, respectively. Between-array normalization was done using pairs of dye-swap hybridizations. As a quality control step to eliminate background noise from genes that are not expressed (or expressed only at very low levels) in adult females, we required that a spot have mean signal intensity at least one SD above local background in both channels to be included in the analysis. In cases where both replicate spots of a probe passed quality control, the arithmetic mean of their log2(red/green) intensities was used. Otherwise, only the red/green intensity of the spot passing quality control was used.
The resulting normalized red/green-ratios were used as input for BAGEL , a program that estimates relative expression levels for each gene in each of the 16 strains using a Bayesian framework. To determine the experiment-wide false discovery rate (FDR), we repeated the BAGEL analysis on a randomized version of our final data set. Randomization was performed by sampling- with-replacement within each hybridization (i.e., randomizing within a column), thereby maintaining the underlying data structure (e.g., excluded genes) within each hybridization. The resulting output was used to determine the FDR corresponding to a given P -value.
To identify genes that differ in expression between Africa and Europe on a population level, we repeated the BAGEL analysis using only the 16 hybridizations in which an African strain was compared directly to a European strain (black lines in Figure 1). All strains from the same population were combined into a single node and, thus, treated as biological replicates from within a population. To determine the FDR, BAGEL was run on a randomized data set that was created by permuting the expression ratios of the replicate hybridizations within each gene (i.e., randomizing within a row). As an additional quality control step, we required that each gene be detected as expressed (by the criteria described above) in at least nine of the 16 replicate hybridizations.
For each strain, qRT-PCR of two biological replicates, representing two separate RNA extractions of 20 four-to-six day-old mated female flies, was performed. Following DNase I digestion, 5 μg of total RNA was reverse transcribed using Superscript II reverse transcriptase and random hexamer primers (Invitrogen). The resulting cDNA was diluted 1:40 and used for qRT-PCR with TaqMan probes and TaqMan Gene Expression Master Mix (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's protocol. To validate expression differences detected by our microarray analysis, qRT-PCR was performed on a Bio-Rad Real-Time thermal cycler CFX96 (Bio-Rad, Hercules, CA, USA) for the following target genes (TaqMan IDs are given in parentheses): Cyp6g1 (Dm01819889_g1), Jon99Fi (Dm02146518_s1), CG18180 (Dm01801887_s1), CG14227 (Dm01845429_g1), m2 (Dm02151465_s1), CG8997 (Dm01791303_g1), CG7916 (Dm01791305_g1), Nlaz (Dm01844577_g1), CG7409 (Dm01840751_s1), Act88F (Dm02362815_s1), CG18179 (Dm01801878_s1), slif (Dm01792789_g1), and Socs16D (Dm01813854_g1). Expression levels of all target genes were normalized to Actin 5C (Dm02361909_s1), which was used as an internal control. All assays were performed in three technical replicates, and for each gene the average threshold cycle (Ct) value over all biological and technical replicates was determined. ΔCt values were calculated by subtracting the control Ct from the target Ct value. The fold-change in expression between two samples was calculated as 2(- ΔCt1- ΔCt2). To determine the fold-change between the African and the European population, ΔCt values were averaged among strains within each population and the European value was used as ΔCt2.
Enriched GO terms within the lists of differentially expressed genes were identified using the GOEAST web server . Prior to analysis, the annotation of the CDMC microarrays was updated to match FlyBase release 5.27 by performing a BLAT search of all probe sequences with the UCSC genome browser . Probes giving a unique hit to an annotated transcript were matched with their release 5.27 GO terms. Significant GO term enrichment was determined by the hypergeometric method with Hochberg FDR multiple-test correction , with the FDR set to 0.05. As a background for GO enrichment tests, we used all genes on the CDMC microarray that were detected as expressed in our experiments (i.e., those passing the quality control steps described above).
DNA sequence polymorphism in the genomic region encompassing the genes CG8997 and CG7916 was previously reported . These authors directly sequenced PCR-amplified genomic DNA from the same strains used in our microarray analysis, plus an additional four strains each from the Zimbabwe and Netherlands populations . We used all of the available sequences for McDonald-Kreitman tests  of selection on nonsynonymous and intergenic sites using the DnaSP (v5) software . The test compares ratios of divergence-to-polymorphism at the test sites (nonsynonymous or intergenic) to those at synonymous sites and provides evidence for adaptive evolution when there is a relative excess of divergence at the test sites, which is consistent with recurrent selective sweeps since the time of speciation.
To analyze sequence variation in the Cyp6g1 region, we performed diagnostic PCR on the 16 strains used in our microarray analysis. The primers used to detect the Accord insertion were 5'-GAAAGCCGGTTGTGTTTAATTAT-3' and 5'-CTTTTTGTGTGCTATGGTTTAGTTAG-3', which flank the insertion site. An additional forward primer complementary to the Accord insertion (5'-GGGTGCAACAGAGTTTCAGGTA-3') was used to confirm its presence . The primers used to detect tandem duplication of the Cyp6g1 locus were 5'-CGAGTACGAGAGCGTGGAG-3' and 5'-ATTAAACACAACCGGCTTTCTCG-3' . Following PCR, the products were sequenced to confirm that the expected target sequence was amplified.
For comparisons of categorical data (e.g., numbers of polymorphic and non-polymorphic genes in males and females) we used standard 2 × 2 contingency table analyses. P -values were determined by Fisher's exact test or, when the sample sizes were large, by a chi-squared approximation. To test for differences between two samples (e.g., Cyp6g1 expression between strains with and without the Accord insertion) we used the non-parametric Mann-Whitney U test, which compares the rank-sums of the observed values of two samples. This approach was used to avoid making assumptions about the underlying distribution of gene expression levels among individuals or classes of genes. All tests were performed using R (version 2.10.1) .
We thank Hedwig Gebhart and Carmen Iannitti for technical assistance and all members of the Munich population genetics group for comments and discussion. This work was carried out as part of the research unit "Natural selection in structured populations" (FOR 1078) funded by Deutsche Forschungsgemeinschaft grants STE 325/12-1 to WS and PA 903/5-1 to JP. RS was supported by a fellowship from the international research training program "RECESS: Regulation and evolution of cellular systems" (Deutsche Forschungsgemeinschaft GRK 1563).
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.