Immunity related genes in dipterans share common enrichment of AT-rich motifs in their 5' regulatory regions that are potentially involved in nucleosome formation
© Hernandez-Romano et al; licensee BioMed Central Ltd. 2008
Received: 11 March 2008
Accepted: 09 July 2008
Published: 09 July 2008
Understanding the transcriptional regulation mechanisms in response to environmental challenges is of fundamental importance in biology. Transcription factors associated with response elements and the chromatin structure have proven to play important roles in the regulation of gene expression. We analyzed promoter regions of dipteran genes induced in response to immune challenge, in search of particular sequence patterns involved in their transcriptional regulation.
5' upstream regions of D. melanogaster and A. gambiae immunity-induced genes and their corresponding orthologous genes in 11 non-melanogaster drosophilid species and Ae. aegypti share enrichment in AT-rich short motifs. AT-rich motifs are associated with nucleosome formation as predicted by two different algorithms. In A. gambiae and D. melanogaster, many immunity genes 5' upstream sequences also showed NFκB response elements, located within 500 bp from the transcription start site. In A. gambiae, the frequency of ATAA motif near the NFκB response elements was increased, suggesting a functional link between nucleosome formation/remodelling and NFκB regulation of transcription.
AT-rich motif enrichment in 5' upstream sequences in A. gambiae, Ae. aegypti and the Drosophila genus immunity genes suggests a particular pattern of nucleosome formation/chromatin organization. The co-occurrence of such motifs with the NFκB response elements suggests that these sequence signatures may be functionally involved in transcriptional activation during dipteran immune response. AT-rich motif enrichment in regulatory regions in this group of co-regulated genes could represent an evolutionarily constrained signature in dipterans and perhaps other distantly related species.
Organismal complexity is dependent on the network that regulates gene expression, rather than the number of genes in its genome [1–3]. Thus, one of the biggest challenges in postgenomic research is to understand the regulatory mechanisms controlling location, timing and intensity of gene expression.
Organisms are permanently sensing changes in their environment. Environmental agents activate cellular signaling pathways that lead to a rapid expression of specific genes to respond to changes. These pathways transmit their signal to specific transcription factors (TFs), which gain access to response elements (REs) located in promoter and enhancer regions of the corresponding gene, resulting in transcriptional activation. In eukaryotes these protein-DNA interactions occur in the context of a chromatin template within the cell nucleus. The fundamental unit of chromatin is the nucleosome, composed of a segment of 146 base pairs of double-stranded DNA wrapped around a core of histone proteins. Initially, nucleosomes were regarded as structures required for the packing of long DNA molecules into the cell nucleus, but it is now clear that chromatin structure plays a central role in the regulation of gene expression [6–9]. At least three mechanisms have been proposed for the active role of chromatin in transcriptional regulation. First, nucleosomes may prevent TF binding to its cognate RE, as revealed by the pioneering studies on the expression of the PHO5 gene in response to phosphate starvation. Second, wrapping DNA in nucleosomes may promote transcription by allowing closely adjacent REs access to their cognate TFs [11, 12]. Third, nucleosomes may bring distant regulatory elements closer together, as occurs in the alcohol-dehydrogenase (Adh) promoter region of Drosophila.
Nucleosomes are located in preferred positions with respect to DNA sequence [14–21]. It has been shown that, on a statistical level, groups of experimentally obtained nucleosomal sequences display periodicity in the occurrence of dinucleotides such as GG, TA, TG, and TT [14, 15, 20] or trinucleotides such as VWG ([G/C/A][A/T]G). This periodicity tends to occur approximately every 10 bp, coinciding with one turn of the DNA helix, and confers the better bending properties required for wrapping DNA around the histone core. However, this periodicity is difficult to identify in individual nucleosomal sequences due to a low signal-to-noise ratio. The non-random distribution of nucleosomes suggests that some DNA sequences are more likely to form stable nucleosomes, and therefore nucleosome-forming sequences could be predicted using computational methods based on the sequence features identified so far [20, 22].
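The roughly 10 bp periodicity described above can be illustrated with a small distance histogram. The following is a minimal sketch, not the method used in the cited studies: the function name, the toy sequence and the choice of the TA dinucleotide are assumptions made only for illustration.

```python
from collections import Counter

def dinucleotide_distance_counts(seq, dinuc="TA", max_dist=30):
    """Count pairs of `dinuc` occurrences separated by each distance 1..max_dist."""
    seq = seq.upper()
    # Start positions of every occurrence of the dinucleotide.
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for idx, p in enumerate(positions):
        for q in positions[idx + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later ones are even farther
            counts[d] += 1
    return counts

# Hypothetical toy sequence: TA placed every 10 bp in a G/C filler.
toy = ("TA" + "GCGCGCGC") * 6
counts = dinucleotide_distance_counts(toy)
print(sorted(counts.items()))
```

In real nucleosomal sequences the signal is far weaker than in this toy example, which is why the text notes that periodicity is only visible on groups of sequences rather than on individual ones.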
Immune responses are inducible phenomena resulting from a close relationship between the environment, pathogen signal detection systems and the gene expression machinery. Upon pathogen recognition, several transduction pathways are activated, leading to the activation of TFs that induce gene expression. In Drosophila melanogaster, the Toll and Imd pathways converge in the activation of the NFκB/Rel-related TFs, Dif and Relish, respectively, which bind to NFκB REs located in the 5' upstream regions of antimicrobial peptide genes, thus promoting their transcription.
Understanding the transcriptional regulation mechanisms during insect immune response is of fundamental interest in biology, and could also provide the rational basis for developing strategies to control vector-borne diseases. In this work, we describe that immunity genes induced upon immune challenge in D. melanogaster and Anopheles gambiae, the main African malaria vector, share an enrichment of AT-rich motifs in their 5' regulatory regions. Enrichment of AT-rich motifs was also observed in 10 additional non-melanogaster Drosophila species and Aedes aegypti immunity orthologs. These motifs differ from REs in terms of statistical frequency and length. Their occurrence correlates with predicted nucleosomal positions [18, 20, 22], suggesting that AT-rich motifs may be involved in chromatinization and transcriptional regulation of immunity-related transcriptional gene modules in these insects.
Regulatory regions of immunity-related co-expressed genes of Anopheles gambiae and Drosophila melanogaster induced upon immune challenge are enriched in AT-rich specific DNA motifs
We used publicly available and author-provided microarray databases [26, 27], coupled to bioinformatics analysis tools for regulatory sequences, to identify sequence patterns potentially involved in transcriptional regulation operating during immune response in D. melanogaster and A. gambiae.
Number of 5' upstream sequences analyzed according to expression pattern and functional class.
Immunity-related genes from A. gambiae.
Serine protease domain (IPR001314)
Antimicrobial peptide Cecropin (IPR000875)
Peptidoglycan recognition protein long class (PGRP-LB)
Chitin binding domain, Glycoside hydrolase (IPR002557, IPR001579)
Serine protease domain, Gastrulation defective precursor (IPR001314, IPR001254)
Gram negative binding protein subgroup B (GNBPB1)
Fibrinogen domain (IPR002181)
Leucine-rich repeat (IPR001611)
Phosphotyrosine interaction region (IPR006020)
Putative infection responsive (Gambicin)
Dopa Decarboxylase isoform 1
Scavenger receptor class B Croquemort type
Thioester-Containing Protein (TEP3)
CLIP-domain serine protease subfamily B (CLIPB14) (IPR001254, IPR006604)
Immunity-related genes from D. melanogaster.
Ensembl Gene ID
Peptidoglycan-recognition protein-SA precursor (Protein semmelweis)
Nuclear factor NF-kappa-B p110 subunit (Relish protein) (Rel-p110)
Peptidoglycan-recognition protein-LB precursor
Peptidoglycan-recognition protein-SC2 precursor
Immune-induced peptide 23 precursor (DIM-23)
Immune-induced peptide 2 precursor (DIM-2)
Immune-induced peptide 1 precursor (DIM-1)
Immune-induced peptides precursor (DIM-10; DIM-12; DIM-13; DIM-24)
Peptidoglycan-recognition protein-LF (PGRP-like protein)
Protein toll precursor
NF-kappa-B inhibitor cactus
Probable serine/threonine-protein kinase pelle
Protein spaetzle precursor
Embryonic polarity protein dorsal
Dorsal-related immunity factor Dif
Peptidoglycan-recognition protein-SD precursor
Peptidoglycan-recognition protein-SB1 precursor
Non-modified genes from A. gambiae.
Cytoskeleton-associated proteins (CAP-Gly)
HMG-I and HMG-Y, DNA-binding Basic-leucine zipper (bZIP) transcription factor
Dehydrogenase, E1 component, Transketolase, central region
Fumarate lyase, Delta crystallin
Translation initiation factor 4C (1A)
Peptidase M16, C-terminal
Antifreeze protein, type I, Ubiquitin system component Cue
Domain of unknown function DUF1907
Endoplasmic reticulum targeting sequence, Torsin
Neutral zinc metallopeptidases, zinc-binding region signature
Leucine aminopeptidase-related (PTHR11963)
RNA 3'-terminal phosphate cyclase, insert region
Non-modified genes from D. melanogaster
Ensembl Gene ID
Eggshell protein, Zinc finger, CCHC-type, RNA-binding region RNP-1 (RNA recognition motif)
Serine/threonine protein kinase, active site
Histone H3, Dopamine D4 receptor
HAD-superfamily subfamily IB hydrolase, hypothetical 1
Zinc finger, C2H2-type
Probable mitochondrial 28S ribosomal protein S6
Putative 60S ribosomal protein L33
BAX inhibitor related (PTHR23291)
Peptidase, cysteine peptidase active site, Dynein heavy chain, N-terminal, ATPase associated with various cellular activities
Zinc finger, C2H2-type, HMG-I and HMG-Y, DNA-binding
Opsin, KiSS-1 peptide receptor
Calcium-binding EF-hand, Antifreeze protein, type I
Cyclin-dependent kinase 9
mitochondrial ribosomal protein L40
Aminoacyl-tRNA synthetase, class I, Phosphotyrosine interaction region
60S ribosomal protein L14
tRNA pseudouridine synthase D, TruD
Claudin tight junction protein, Voltage-dependent calcium channel gamma
Protease inhibitor, Kazal-type
60S ribosomal protein L10a-2
Orphan nuclear receptor, HMR type
Probable 60S ribosomal protein L37-A
Down-regulated genes from A. gambiae.
Ribosomal protein S30, ubiquitin
Ribosomal protein S11
Putative Tyr/Ser/Thr phosphatase
Probable methylmalonate-semialdehyde dehydrogenase, Mitochondrial precursor
Alpha-2-macroglobulin RAP, C-terminal
ADP, ATP carrier protein 1 (ADP/ATP translocase 1)
Elongation factor 1 alpha
Down-regulated genes from D. melanogaster.
Ensembl Gene ID
Peptidase S1 and S6, chymotrypsin
D-amino acid oxidase
Hormone binding, cysteine peptidase active site
Trypsin delta/gamma precursor
Trypsin theta precursor
Trypsin zeta precursor
Sugar transporter superfamily
Immune-induced peptide 4 precursor (DIM-4)
Peptidase S1 and S6, chymotrypsin/Hap
Peptidase S1 and S6, chymotrypsin/Hap
Peptidase S1 and S6, chymotrypsin/Hap
Stretchin-Mlck, isoform E
Larval serum protein 1 beta chain precursor (Hexamerin 1 beta)
Peptidase S1 and S6, chymotrypsin/Hap
Peptidase S1 and S6, chymotrypsin/Hap
Adult cuticle protein 1 precursor (dACP-1)
Copper transporter 1B
Peptidase M14, carboxypeptidase A
Peptidase S1 and S6, chymotrypsin/Hap
Peptidase S1 and S6, chymotrypsin/Hap
Peptidase S1 and S6, chymotrypsin/Hap
Glycoside hydrolase, family 38
Glycoside hydrolase, family 38
To investigate whether the 5' regulatory regions of immunity-related genes shared common DNA motifs, 2500 bp 5' upstream sequences (5'-US) were recovered using Biomart from Ensembl and analyzed for statistically overrepresented motifs of 2 to 8 nucleotides in length, using Oligo-Analysis, which is based on the binomial distribution. The background oligonucleotide frequencies were estimated by calculating the relative frequencies of all possible oligonucleotides (ranging from 2 to 8 bp) within the 2500 bp 5'-US of 13,166 A. gambiae or 13,172 D. melanogaster genes. Oligonucleotide occurrences were counted for each group of 5'-US and their statistical significance was estimated on the basis of the background frequencies. The significance index (sigocc) reflects the degree of overrepresentation of each motif on a logarithmic scale.
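The background-frequency step described above can be sketched as follows. This is an illustrative approximation rather than the Oligo-Analysis implementation: the function names are hypothetical, counting is done on a single strand with overlapping windows, and windows containing ambiguous bases are skipped, all of which are assumptions not stated in the text.

```python
from collections import Counter
from itertools import product

def count_kmers(seq, k):
    """Count overlapping k-mers on one strand, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def background_frequencies(sequences, k):
    """Relative frequency of every possible k-mer across a set of upstream sequences."""
    total = Counter()
    for s in sequences:
        total.update(count_kmers(s, k))
    n = sum(total.values())
    if n == 0:
        raise ValueError("no countable k-mers in the input sequences")
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: total[kmer] / n for kmer in all_kmers}

# Hypothetical toy input standing in for the genome-wide set of 5'-US.
background = background_frequencies(["ATAAATAA", "GGCC"], 2)
print(background["AA"], background["TA"], background["AT"])
```

In the analysis described here the input would be the full set of upstream sequences for a genome, and one table would be built per motif length from 2 to 8.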
Motif overrepresentation in 5' upstream sequences (2,500 bp) of immunity-induced genes.
Motif length (bp)
Motifs of 4 bp reported by Oligo-Analysis in 5'-US (2500 bp) of A. gambiae genes.
Over-expressed immunity genes
Motifs of 4 bp reported by Oligo-Analysis in 5'-US (2500 bp) of D. melanogaster genes.
Motifs of 3 bp reported by Oligo-Analysis in 5'-US (2500 bp) of A. gambiae genes.
Over-expressed immunity genes
Motifs of 3 bp reported by Oligo-Analysis in 5'-US (2500 bp) of D. melanogaster genes.
Motifs of 2 bp reported by Oligo-Analysis in 5'-US (2500 bp) of A. gambiae genes.
Over-expressed immunity genes
Motifs of 2 bp reported by Oligo-Analysis in 5'-US (2500 bp) of D. melanogaster genes.
Overrepresentation of AT-rich motifs is specific to immunity-induced genes
Group of genes
5' upstream regions of immunity-related orthologous genes in the genus Drosophila and in Aedes aegypti are also enriched with AT-rich motifs
List of Drosophilid orthologous immunity genes used for over-representation analysis of AT-rich tetrads in 5' upstream regions (2000 bp)
Aedes aegypti orthologous immunity genes used for over-representation analysis of AT-rich motifs in 5'-US (2500 bp)
Ensembl Gene ID
antibacterial peptide, putative
developmental protein cactus
brain chitinase and chia
gram-negative bacteria binding protein
Ortholog of putative infection responsive short peptide
serine protease, putative
serine protease inhibitor 4, serpin-4
fibrinogen and fibronectin
Leucine rich domain
peptidoglycan recognition protein sb2
aromatic amino acid decarboxylase
aromatic amino acid decarboxylase
AT-rich tetrads are associated with high nucleosomal potential
Having observed that AT-rich motif enrichment was a general feature of 5'-US of immunity genes in several dipteran species, we evaluated the association of some of these motifs with predicted nucleosomal sites. Experimentally stable nucleosomes in mouse have AT-rich motifs, including the AA, TA [14, 20], TATA and ATAA motifs. The ATAA motif includes the three 2 bp motifs statistically over-represented in immunity genes (AA, TA and AT) of A. gambiae and D. melanogaster. Taking into account the highly conserved nucleosomal structure, and given that the ATAA motif was enriched in both D. melanogaster and A. gambiae, as well as in other Drosophila species, we hypothesized that the ATAA motif could also participate in nucleosome formation in dipteran immune response genes. Some algorithms have been developed to predict the chromatin structure from sequence [20, 22, 34]. The RECON algorithm uses experimentally determined nucleosomal sequences coupled to Monte Carlo methods and discriminant analysis of dinucleotide frequencies. It searches for a partition of non-overlapping regions in the nucleosomal sequences that provides the maximal value of the Mahalanobis distance that discriminates between nucleosomal and non-nucleosomal sequences. In this way, RECON determines the probability that a sequence forms nucleosomes and assigns a nucleosomal potential value to each nucleotide according to the context of the sequence in which the nucleotide is embedded. Positive values of nucleosomal potential correspond to reliable predictions of nucleosome formation sites with a confidence level of p < 0.05 (α = 0.05); a nucleosomal potential of +1 corresponds to the best predictions.
Additional information derived from Figure 5 is that the combination of RECON with Oligo-Analysis results allows detection of a property inherent to biological sequences. The ATAA distribution with respect to nucleosomal potential values was markedly different between biological and non-biological (artificial) sequences for both insects. First, biological sequences had more ATAA than non-biological sequences (1065 ATAA motifs in 30 random biological sequences versus 613 in 30 random non-biological sequences). Second, the majority of the biological ATAA motifs were associated with positive nucleosomal potential (872/1065 or 81.9% motifs in 30 random biological sequences versus 208/613 or 33.9% in 30 random non-biological sequences). Additionally, the non-biological sequences presented an inverse distribution of ATAA motifs, with more ATAA motifs associated with negative values of nucleosomal potential (289/613 or 47.2% ATAA negatives versus 208/613 or 33.9% ATAA positives in 30 random non-biological sequences).
ATAA motifs correlate with high Nucleosomal Occupancy p values (pNO)
Segal et al. recently reported an algorithm to predict nucleosome positions that takes into account sequence composition and thermodynamic properties. Using a collection of nucleosome-bound DNA sequences from yeast, chicken or human, they constructed probabilistic models that represent the DNA sequence preferences for nucleosome formation and assign a p value to each nucleotide of the analyzed sequence; this value indicates the probability that the position is occupied by a nucleosome (p of Nucleosomal Occupancy, pNO).
More than 85% of all ATAA motifs found in 5'-US of A. gambiae and D. melanogaster were associated with pNO > 0.8 (p < 0.001). For A. gambiae immunity genes, 88% (476/541) of ATAA motifs were associated with pNO > 0.8; a similar distribution was obtained for the other A. gambiae gene groups (Figure 7A). The difference between ATAA associated with pNO > 0.8 and ATAA associated with pNO < 0.5 or undefined values was statistically significant (p < 0.001), showing a clear correlation between ATAA and high values of pNO. Similarly, 85.9% (1340/1560) of ATAA motifs in 5'-US of D. melanogaster immunity genes had pNO > 0.8, with a significant difference with regard to ATAA with pNO < 0.5 or undefined values (p < 0.001). The distribution of ATAA in the other D. melanogaster groups of genes was also statistically significant (p < 0.001) (Figure 7B).
The combination of Oligo-Analysis and pNO results also revealed a difference between biological and non-biological sequences. Thirty randomly selected biological sequences of D. melanogaster had 90% (959/1065) of ATAA motifs associated with pNO > 0.8 versus 55.1% (338/613) of ATAA motifs in 30 non-biological sequences, showing again a non-random distribution of biological ATAA motifs and tagging the motif as part of a potential nucleosomal code (Figure 7B).
Surprisingly, no ATAA motifs were found with pNO values between 0.8 and 0.5 (values that define the "medium p value" range, see methods) in any of the other gene groups analyzed in A. gambiae and D. melanogaster (Figure 7).
Taken together, we found a consistent tendency, demonstrated by two independent methods, showing that the AT-rich motif enrichment within a specific sequence context might favour nucleosome formation in immune genes of a wide variety of dipteran species.
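The association between motif positions and per-nucleotide occupancy values used in this section can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, each ATAA occurrence is scored by the value at its first nucleotide (the text does not say which position was used), and undefined values are represented as None.

```python
import re

def find_motif_positions(seq, motif="ATAA"):
    """Start positions of all (possibly overlapping) motif occurrences."""
    pattern = f"(?={re.escape(motif)})"
    return [m.start() for m in re.finditer(pattern, seq.upper())]

def bin_motifs_by_score(seq, scores, motif="ATAA", high=0.8, low=0.5):
    """Classify each motif occurrence by the score at its first nucleotide.

    `scores` holds one value per nucleotide; None marks an undefined value.
    The 0.8 and 0.5 cutoffs follow the ranges named in the text.
    """
    bins = {"high": 0, "medium": 0, "low_or_undefined": 0}
    for pos in find_motif_positions(seq, motif):
        s = scores[pos]
        if s is None or s < low:
            bins["low_or_undefined"] += 1
        elif s > high:
            bins["high"] += 1
        else:
            bins["medium"] += 1
    return bins

# Hypothetical toy sequence and scores, one score per nucleotide.
toy_seq = "GGATAAGGATAACCATAA"
toy_scores = [0.9] * 6 + [0.6] * 6 + [None] * 6
print(bin_motifs_by_score(toy_seq, toy_scores))
```

The same binning applies to RECON output if the cutoffs are replaced by the sign of the nucleosomal potential.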
A. gambiae and D. melanogaster 5'-US of immunity genes have NFκB response elements located in the first 500 bp of their 5'-US
In this work, we have documented that AT-rich motifs are over-represented in 5' upstream regions of immunity genes of mosquitoes and drosophilids. We also documented that the position of the AT-rich motifs is associated with nucleosomes as predicted by two different algorithms for nucleosome positioning, pointing to a possible role of this motif in the transcriptional regulation of these functionally related genes through modification of chromatin structure involving nucleosome positioning.
Previous reports have found that sequences that form extremely stable nucleosomes are enriched with AT motifs referred to as TATA boxes, which in many cases include the ATAA motif. When we correlated the positions of this motif with the output of two different algorithms that predict nucleosome positions [20, 22], we found that it correlates almost exclusively with positions with a high probability of forming nucleosomes, suggesting that ATAA motif enrichment is a DNA sequence pattern associated with nucleosome formation in these functionally related immunity genes. The conservation of AT-rich motif enrichment in 5'-US of immunity-related genes of Drosophilidae and Culicidae, which diverged 250 million years ago, suggests that this common feature may be the result of an evolutionarily constrained epigenetic mechanism of transcriptional regulation in immune-responsive genes in dipterans. More studies are required to determine whether this could be part of a more general mechanism of regulation in metazoans. The case of D. persimilis represents a caveat for our attempt to generalize the implications of our findings; however, we cannot exclude that the current status of the annotation of such genomes may affect the results.
There are two conflicting views about nucleosome formation. One holds that nucleosomes can potentially be formed anywhere in the genome regardless of the sequence and that it is therefore not possible to predict sites for nucleosome formation. The other proposes that nucleosomes are associated with certain DNA sequences or sequence patterns that affect the bending properties of DNA during nucleosome formation [14, 18]. The latter view has been gaining support in recent years due to the documentation of a great variability in the bending potential of DNA sequences [14, 15, 37] and therefore in their capacity to form nucleosomes [18, 38, 39].
Two of the three two-letter motifs statistically over-represented in immunity promoters of A. gambiae and D. melanogaster, TA and AA, have been previously associated with nucleosome formation in human, yeast, chicken and mouse [14, 15, 20, 21]. Additionally, the ATAA motif, which is associated with nucleosome positions and includes the three two-letter motifs with the highest scores in both organisms (AA, TA and AT), has been found in sequences that form stable nucleosomes. Thus, AT-rich motifs in 5'-US regions of mosquito and drosophilid immunity genes could participate in the transcriptional regulation of genes induced by immune challenges in a different way from the typical response elements. In contrast to response elements, which can be functional singly or in pairs in a promoter region, the AT-rich motifs are statistically enriched, with several copies distributed in a diffuse pattern through the promoter regions, suggesting their involvement in nucleosome formation. This diffuse sequence pattern of AT-rich motifs, different from the discrete patterns displayed by response elements, represents a new insight into the role of DNA sequence context in transcriptional regulation.
Several reports have documented that genes with similar functions share similar nucleosomal occupancy patterns. Levitsky et al., using the RECON algorithm to analyze distinct functional types of human promoters, found that tissue-specific gene promoters present higher nucleosomal potential than genes commonly expressed in many tissues (housekeeping genes). Segal et al., using the Nucleosome Position Prediction algorithm to analyze different kinds of genomic sequences and biologically related gene sets, found that nucleosome occupancy varies depending on the type of genomic location analyzed, and that groups of functionally related genes can be classified on the basis of their profiles of nucleosome occupancy in the open reading frames and intergenic regions. Recently, Lee et al., using Hidden Markov Models to analyze experimentally obtained nucleosomes, also found a correlation between function and nucleosome occupancy. Each of these reports used a different method to analyze sequence data, and all found that nucleosomal sequences follow a distinctive pattern associated with the functionality of the genes.
In relation to RECON and Nucleosome Positioning Prediction, it is important to note that neither of these programs searches for a priori defined motifs; the input for both programs consists of biological nucleosomal sequences from which information is extracted.
It has been shown that gene expression co-regulation is highly conserved in eukaryotes. For example, Saccharomyces cerevisiae and Caenorhabditis elegans, which diverged 1500 million years ago, still share a group of co-regulated genes, so it is plausible that Drosophilidae and Culicidae, which diverged only 250 million years ago, also share groups of functionally related co-regulated genes. The enrichment of AT-rich motifs in groups of co-regulated genes involved in immune response could provide the basis for developing new tools for the identification of different functional gene modules based on the compositional context of non-coding regulatory DNA. However, the high sigocc observed in manually curated datasets compared to the low sigocc observed in automatically annotated datasets highlights the importance of accurate transcription initiation sites (TIS) for regulatory region analysis.
Insect immunity relies on innate defense mechanisms to combat pathogens. In D. melanogaster, the Imd and Toll pathways lead to the activation of Rel/NFκB transcription factors that control a substantial proportion of the transcriptionally modified genes in response to pathogen infection. Many components of these pathways are conserved in A. gambiae and Ae. aegypti and are also remarkably conserved in innate immunity signaling pathways in mammals (TLR and TNF-R signaling pathways, respectively) [33, 44–47]. The induced genes described here for both insects belong to the same functional group, and many of them have NFκB response elements within 500 bp upstream from the predicted transcription start site, the same location where functionally important NFκB REs have been found in these and other insects [48–52]. Our findings indicate that, in addition to these genes being regulated by NFκB, the enrichment with the ATAA motif constitutes a particular pattern of chromatin structure involved in their transcriptional regulation.
Interestingly, NFκB transcription factors bind to their response elements even if they are packaged in a nucleosome. Once bound to their response elements, NFκB transcription factors can recruit chromatin remodeling complexes to expose other response elements and allow the formation of the initiation complex. In the vertebrate immune system, chromatin structure is critical to establish Th1-Th2 differentiation through the action of specific transcription factors such as GATA-3 and T-bet, and several cytokines possess nucleosomes located in their promoters which need to be removed to allow gene expression [56–59]. Thus, epigenetic phenomena such as histone modification (altered nucleosome conformation) or remodeling of chromatin (change of nucleosome position) are commonly a required step to achieve gene expression in response to external stimuli.
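A scan for candidate response elements within 500 bp of the transcription start site can be sketched as follows. The consensus GGGRNNYYCC used as the default is the commonly cited κB consensus and is an assumption here, since the text does not state which pattern or tool was used; the function names are likewise hypothetical, and the input is assumed to be an upstream sequence ending immediately 5' of the start site.

```python
import re

# IUPAC nucleotide codes needed for simple consensus patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]",
         "W": "[AT]", "S": "[CG]", "K": "[GT]", "M": "[AC]", "N": "[ACGT]"}

def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[c] for c in consensus.upper())

def revcomp(seq):
    """Reverse complement; symbols other than ACGT are left unchanged."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def sites_near_tss(upstream, consensus="GGGRNNYYCC", window=500):
    """Distances (bp upstream of the TSS) of consensus matches on either strand.

    The distance is measured from the TSS to the 5' end of the site on the
    forward strand; only the last `window` bp of `upstream` are scanned.
    """
    region = upstream.upper()[-window:]
    pattern = re.compile(f"(?=({iupac_to_regex(consensus)}))")
    hits = set()
    for m in pattern.finditer(region):
        hits.add(len(region) - m.start())
    for m in pattern.finditer(revcomp(region)):
        # A reverse-strand match starting at m.start() ends at this
        # distance from the TSS in forward-strand coordinates.
        hits.add(m.start() + len(consensus))
    return sorted(hits)

# Hypothetical toy upstream sequence with one matching site.
toy_upstream = "A" * 20 + "GGGACTTTCC" + "A" * 30
print(sites_near_tss(toy_upstream))
```

A consensus match of this kind only flags candidate sites; whether a site is functional would still need experimental support such as that cited for these insects.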
Based on the obtained results and previously reported information, we propose a model in which a subgroup of insect immunity genes remains silent in absence of an immune challenge due to nucleosome formation in their 5'-US regions. The presence of these nucleosomes occludes the access of transcription factors to REs involved in gene expression. After an immune challenge, the Toll and/or Imd pathways are activated which in turn lead to activation of Rel/NFκB transcription factors, which are translocated to the nucleus and bind to their NFκB REs and recruit chromatin modifying/remodeling factors that release DNA from nucleosomes allowing its interaction with the transcriptional machinery.
Functionally related genes could harbor in their regulatory region a regulatory code represented by the combination of REs plus, in some cases, particular short motifs associated with chromatin structure. This regulatory code functions like a lock: genes that need to be co-expressed will share the same lock, represented by REs organized in a similar way, or by specific REs associated with motifs that confer a distinctive chromatin structure. Cells are continuously sensing their environment and responding in order to adapt. The regulatory state of the cell, defined by the presence and state of activity of transcription factors, also changes continuously; this regulatory state represents the "key" needed to open the proposed lock. The active transcription factors present at a given time in the cell determine the form of the "key" for the lock and, therefore, the class of promoters that will be open or closed. In the case of the immune genes studied here, we have identified evidence that is compatible with a potential regulatory unit involving chromatin structure (associated with ATAA), Rel/NFκB transcription factors and NFκB response elements. Other regulatory codes could exist involving any one of these components, in addition to others.
The role of chromatin structure in gene expression regulation during immune response of insects remains poorly explored. This work provides a first insight into this complex regulatory mechanism potentially shared by immune genes of Drosophilidae and Culicidae.
Immunity genes of A. gambiae, Ae. aegypti, D. melanogaster and many other drosophilid species share a common enrichment of AT-rich motifs in their 5'-US regions. AT-rich motifs are frequently associated with bioinformatic nucleosome positioning predictions, suggesting their participation in a particular nucleosome organization involved in transcriptional regulation of an immunity co-regulated module. Many of these regulatory regions also have NFκB response elements within the first 500 bp 5' from the transcription start site. These two features suggest that the mechanisms of transcriptional regulation of immune response genes in dipterans are conserved and might occur through modifications in chromatin structure of promoter regions mediated by NFκB-dependent recruitment of remodeling factors. Our findings suggest that AT-rich motif enrichment in regulatory regions in this group of co-regulated genes could represent an evolutionarily constrained signature in dipterans and perhaps other species, despite their evolutionary distance.
Gene selection criteria
Microarray data for A. gambiae immune response was kindly provided by the author. Microarray data from D. melanogaster immune response was downloaded from . Analysis of expression profiles was conducted using the TMEV version 3.1 module of the TM4 microarray software suite.
Three gene clusters were selected for each species by hierarchical clustering from the microarray databases. For A. gambiae, gene selection criteria were as follows: Induced immunity-related genes: genes associated with immunity based on the protein structural features, and with expression values of log2(f2/f1) > 0 in at least 7/10 immunological challenges and 3/6 points for each challenge. Non-modified genes: genes with an expression mean of log2(f2/f1) ± 0.10 and a standard deviation of ± 0.15 in 9/10 immunological challenges. Down-regulated genes: genes with an expression level of log2(f2/f1) ≤ -0.5855 (a repression level of at least 1.5 times with respect to control cells) in at least 4/10 immune challenges and in 4/6 time points for each challenge, and with a maximal expression level of log2(f2/f1) < 0.6785 in only one point per challenge (a maximal expression level less than 1.6 times with respect to control cells in only one point). For D. melanogaster, gene definitions were as follows: Immunity-related genes: genes associated with immunity based on Gene Ontology classification (GO:0006952, defense response, biological process), and with expression values of log2(f2/f1) > 0 in 6/6 time points of bacterial challenge. Non-modified genes: genes with an expression mean of log2(f2/f1) ± 0.1 and a standard deviation of ± 0.07 in 6/6 time points of bacterial challenge. Down-regulated genes: genes with an expression level of log2(f2/f1) ≤ -0.5855 (a repression level of at least 1.5 times with respect to control) in 6/6 time points of the bacterial challenge. Two additional groups were included in the analysis: Random genes: two groups of 20 and 30 genes were randomly selected from the A. gambiae and D. melanogaster genomes, respectively, using the "Random Gene Selection" tool of RSA-Tools.
Random sequences (artificial): two groups of 20 and 30 random non-biological sequences were generated using the "Random DNA sequence" tool of Sequence Manipulation Suite, version 2. With this tool we generated random sequences with equal proportions of each nucleotide (~0.25).
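The D. melanogaster selection rules above can be expressed as a small classifier. This is a sketch under stated assumptions rather than the procedure actually run in TMEV: the function name is hypothetical, "mean ± 0.1" is read as the absolute mean being at most 0.1, the sample standard deviation is used, and the Gene Ontology annotation is passed in as a boolean flag.

```python
from statistics import mean, stdev

# log2 ratio cutoff from the text; 2**0.5855 is approximately 1.5-fold.
REPRESSION_CUTOFF = -0.5855

def classify_profile(log_ratios, is_immunity_annotated=False):
    """Assign one six-point log2(f2/f1) profile to a gene group."""
    if all(x <= REPRESSION_CUTOFF for x in log_ratios):
        return "down-regulated"
    if is_immunity_annotated and all(x > 0 for x in log_ratios):
        return "induced immunity"
    # Assumed reading of the non-modified criterion (see lead-in).
    if abs(mean(log_ratios)) <= 0.1 and stdev(log_ratios) <= 0.07:
        return "non-modified"
    return "unclassified"

# Hypothetical profiles, one value per time point.
print(classify_profile([0.5, 1.2, 2.0, 1.1, 0.8, 0.3], is_immunity_annotated=True))
print(classify_profile([-0.6, -0.9, -1.2, -0.7, -0.6, -1.0]))
print(classify_profile([0.02, -0.03, 0.05, 0.0, -0.01, 0.04]))
```

The A. gambiae rules would need an extra layer, since they are applied per challenge (e.g. 7/10 challenges and 3/6 points within each) rather than to a single time series.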
Using these definitions, for A. gambiae, we first selected the expression profiles and then identified the associated gene. To verify the annotated transcription initiation site, each gene prediction was manually curated by two approaches. The first was to align the corresponding EST clusters obtained from AnoEST and UniGene to the A. gambiae genome (AgamP3, Ensembl release 45, Jun 2007) using BLAST in the Ensembl genome browser. The second was manual verification of the presence of a TATA-box, Initiator sequence (Inr) or downstream promoter element (DPE). For Drosophila, the gene ID was included in the microarray database.
Once the gene associated with each profile was identified, the 5' regulatory regions were recovered for D. melanogaster (BDGP4.3) and A. gambiae (AgamP3, Ensembl release 45, Jun 2007) genes using Ensembl's data mining tool BioMart.
An additional set of 2000 bp 5' upstream sequences from the 12 species of the 12 Drosophila genomes project was also included for motif over-representation analysis: release R1.1 for D. virilis (23 sequences); R1.2 for D. ananassae (23 sequences), D. erecta (27 sequences), D. grimshawi (21 sequences), D. mojavensis (21 sequences), D. persimilis (20 sequences), D. sechellia (28 sequences), D. simulans (29 sequences), D. yakuba (27 sequences) and D. willistoni (24 sequences); R2.2 for D. pseudoobscura (27 sequences); and R5.7 for D. melanogaster (35 sequences). Fifteen Aedes aegypti 2500 bp sequences retrieved from BioMart (AAEGL1) were included as well. The Drosophila sequences were selected on the basis of orthology to the D. melanogaster immunity gene data set described in Table 16, according to FlyBase annotations, and not on expression data. For Ae. aegypti, genes were likewise selected on the basis of one-to-one orthology to immunity genes in A. gambiae according to Table 17.
Statistically overrepresented DNA motifs
To identify statistically overrepresented DNA motifs in the 5' regulatory regions of selected genes, we used the Oligo-Analysis program, searching for DNA motifs of 2 to 8 nucleotides in length in 5' upstream regions of 2500 or 2000 nucleotides. For the analysis, we created our own expected frequency tables for each motif length, using 5' upstream regions of 2500 nucleotides corresponding to 13,172 genes of D. melanogaster, 13,166 genes of A. gambiae and 16,691 genes of Ae. aegypti. A similar approach was used for the 11 additional Drosophila species using pre-computed 2000 bp 5' upstream sequences. The resulting expected frequency tables were used to estimate the expected number of occurrences of each oligonucleotide in the induced, down-regulated, non-modified, random biological and random non-biological sets of sequences. The analyzed sequences were aligned to detect and avoid duplication between sequences, and duplicated regions larger than 40 nucleotides inside a sequence were removed. Also, to prevent a bias due to self-overlapping, a non-overlapping counting mode was adopted. The detection of overrepresented oligonucleotides was based on an estimation of the significance of the observed occurrences (Oocc). For each oligonucleotide, the p value (Pocc) was calculated on the basis of the binomial distribution. Because the analysis comprises multiple tests (256 in the case of tetranucleotides), even low p values may appear by chance. To correct for this multitesting effect, the p values were multiplied by the number of oligonucleotides tested, giving an expected value (Eocc). The significance index [sigocc = -log(Eocc)] reflects the degree of overrepresentation of each oligonucleotide on a logarithmic scale.
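The chain from observed occurrences to significance index described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the Oligo-Analysis code itself: the function names are hypothetical, the logarithm is assumed to be base 10, and the number of candidate positions (`n_positions`) and the expected frequency are taken as inputs that the caller derives from the sequence set and the background table.

```python
import math


def binomial_upper_tail(n_trials, k_observed, p):
    """Return P(X >= k_observed) for X ~ Binomial(n_trials, p).

    Terms are evaluated in log space so that large n_trials do not overflow.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    total = 0.0
    for i in range(k_observed, n_trials + 1):
        log_term = (math.lgamma(n_trials + 1) - math.lgamma(i + 1)
                    - math.lgamma(n_trials - i + 1)
                    + i * math.log(p) + (n_trials - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def oligo_significance(observed, n_positions, expected_freq, n_oligos):
    """Return (Pocc, Eocc, sigocc) for one oligonucleotide.

    Eocc = Pocc * number of oligonucleotides tested (multitesting correction);
    sigocc = -log10(Eocc), where base 10 is an assumption of this sketch.
    """
    p_occ = binomial_upper_tail(n_positions, observed, expected_freq)
    e_occ = p_occ * n_oligos
    sig_occ = math.inf if e_occ == 0.0 else -math.log10(e_occ)
    return p_occ, e_occ, sig_occ


if __name__ == "__main__":
    # Made-up numbers: a tetranucleotide seen 30 times in 1000 positions,
    # with a background frequency of 0.01 and 256 tetranucleotides tested.
    print(oligo_significance(30, 1000, 0.01, 256))
```

Under this formulation a positive significance index corresponds to an expected value below 1, that is, fewer than one such oligonucleotide expected by chance across all those tested.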
The motif overrepresentation identified with Oligo-Analysis was verified using POBO, which uses bootstrapping to verify the statistical overrepresentation of a given motif.
Nucleosome positioning prediction
To predict regions of nucleosomal occupancy in the sequences of the distinct groups of A. gambiae and D. melanogaster 5'-US, we used two programs: RECON, which assigns a nucleosomal potential value to each position in a sequence using sliding windows of 160 bp based on the statistical distribution of dinucleotide frequencies, and Nucleosome Position Prediction, which predicts nucleosomal positioning using probabilistic and thermodynamic models, assigning a p value of nucleosomal occupancy (pNO) to each position of a sequence. For this last program, we analyzed 5'-US using the yeast, chicken and human models, and both the published and working versions of the program. The length of the 5'-US analyzed was 2500 bp for A. gambiae, D. melanogaster and Ae. aegypti and 2000 bp for non-melanogaster Drosophila species. For RECON, we used 2660 bp comprising the 2500 bp promoter flanked by 80 bp on each side, in order to recover nucleosomal potential values for all promoter positions.
Analysis of motifs position regarding predicted nucleosomal regions
To associate the results obtained with Oligo-Analysis with those obtained with RECON and Nucleosome Position Prediction, Perl scripts were written to automatically associate motif coordinates (obtained with DNA-Pattern) with tables containing "nucleosomal potential" values (obtained with RECON) or "p of nucleosomal occupancy (pNO)" values (obtained with Nucleosome Position Prediction).
To determine whether there was a correlation between the findings of Oligo-Analysis and RECON, Perl scripts were used to assign the nucleosomal potential value obtained with RECON to each position of the ATAA motif, giving four values per motif. On the basis of these four values, each ATAA motif was classified as: 1) positive, if the four positions of ATAA were positive; 2) negative, if the four positions were negative; 3) mixed, if at least one position was of opposite sign to the others; and 4) undefined, if at least one position was an N. Once classified, the distribution of ATAA motifs among these four categories was statistically evaluated.
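The four-way classification above can be sketched in Python (the original analysis used Perl scripts that are not reproduced here). The function names are hypothetical, positions without a usable value are represented as `None`, overlapping motif occurrences are all reported, and a value of exactly zero is treated as "mixed"; each of these is an assumption of the sketch rather than a documented detail of the original scripts.

```python
def find_motif_starts(sequence, motif="ATAA"):
    """Return 0-based start positions of every occurrence of motif."""
    starts = []
    position = sequence.find(motif)
    while position != -1:
        starts.append(position)
        position = sequence.find(motif, position + 1)
    return starts


def classify_by_potential(values):
    """Classify one motif from the per-position nucleosomal potential values."""
    if any(value is None for value in values):
        return "undefined"          # at least one position has no value
    if all(value > 0 for value in values):
        return "positive"
    if all(value < 0 for value in values):
        return "negative"
    return "mixed"                  # opposite signs (or an exact zero)


def tally_motif_classes(sequence, potentials, motif="ATAA"):
    """Count motifs per category; potentials holds one value per base."""
    counts = {"positive": 0, "negative": 0, "mixed": 0, "undefined": 0}
    for start in find_motif_starts(sequence.upper(), motif):
        window = potentials[start:start + len(motif)]
        counts[classify_by_potential(window)] += 1
    return counts
```

The resulting per-category counts are the quantities whose distribution would then be compared between gene groups.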
Similarly, to determine whether there was a correlation between the findings of Oligo-Analysis and Nucleosome Position Prediction, Perl scripts were used to assign the pNO value obtained with the Nucleosome Position Prediction algorithm to each position of the ATAA motif, again giving four values per motif, one for each position. Based on these four values, each ATAA motif was classified as: 1) high p value motif, if the four positions had occupancy p values higher than 0.8; 2) medium p value motif, if the four positions had occupancy p values between 0.8 and 0.5; 3) low p value motif, if the four positions had occupancy p values below 0.5; and 4) undefined, if the four values did not all fall within the same range. Additionally, given that this software offers yeast, chicken and human models and has both a working and a published version, the data generated with each model and each version were analyzed. This program does not accept sequences containing Ns; therefore, in some cases the number of sequences analyzed per group of 5' upstream sequences was slightly smaller. For A. gambiae, the groups analyzed comprised 16 immunity, 12 down-regulated, 17 random and 17 non-modified gene sequences. For D. melanogaster, only the group of non-modified genes changed, from 32 to 31 sequences. For the other Drosophila species and Ae. aegypti, the number of analysed sequences was the same for both programs.
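The pNO-based rule can be expressed in the same way. This is a minimal sketch with a hypothetical function name; the text does not say how values of exactly 0.8 or 0.5 were handled, so assigning both boundaries to the medium class is an assumption.

```python
def classify_by_pno(pno_values, high_cutoff=0.8, low_cutoff=0.5):
    """Classify one ATAA motif from its four per-position pNO values.

    Boundary handling (exactly 0.8 or 0.5 counted as medium) is an
    assumption of this sketch.
    """
    if all(value > high_cutoff for value in pno_values):
        return "high"
    if all(low_cutoff <= value <= high_cutoff for value in pno_values):
        return "medium"
    if all(value < low_cutoff for value in pno_values):
        return "low"
    return "undefined"   # the four positions fall in different ranges


if __name__ == "__main__":
    print(classify_by_pno([0.91, 0.88, 0.85, 0.83]))
    print(classify_by_pno([0.91, 0.70, 0.85, 0.83]))
```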
Analysis of 5'-US with alignment matrices
To compare the distribution of AT-rich motifs associated with positive nucleosomal potential values among groups of genes, we fitted ordinary least squares regression models with robust standard errors, with the number of positive AT-rich motifs as the dependent variable and dummy variables for the corresponding group of genes as predictors, for all dipteran species. We fitted similar models to compare, among groups of genes, the distribution of AT-rich motifs associated with pNO > 0.8 and the difference in AT% between groups of genes. Additionally, we compared the number of AT-rich motifs within all types of nucleotide sequences, associated either with positive vs. negative values of nucleosomal potential or with pNO > 0.8 vs. pNO < 0.5, using paired Student's t tests.
To evaluate the distribution of NFκB REs throughout the promoter regions, comparisons were made using Poisson regression with the count of NFκB REs as the response variable and the group of genes as the independent variable. Model fit was assessed with the χ2 goodness-of-fit test; when the model was not supported, a Kruskal-Wallis non-parametric test was used instead. Ordinary least squares regression analysis was performed to compare the counts of ATAA in the vicinity (± 200 bp) of NFκB motifs by type of gene group (immunity, non-modified and down-regulated), in both A. gambiae and D. melanogaster.
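The per-site ATAA counts that feed the regression above could be obtained along the following lines. This is a sketch under assumptions: the function name is hypothetical, the NFκB RE positions are supplied by the caller (how they were located is not modelled), distance is measured between motif start coordinates, and overlapping ATAA occurrences are counted individually.

```python
import re


def count_motif_near_sites(sequence, site_positions, motif="ATAA", flank=200):
    """Count motif occurrences starting within +/- flank bp of each site.

    site_positions are 0-based start coordinates of the response elements
    (hypothetical input for this sketch).
    """
    # A lookahead pattern reports overlapping occurrences as well.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    motif_starts = [match.start() for match in pattern.finditer(sequence.upper())]
    return {
        site: sum(1 for start in motif_starts if abs(start - site) <= flank)
        for site in site_positions
    }


if __name__ == "__main__":
    # Toy sequence with ATAA at positions 0, 300 and 400.
    toy = "ATAA" + "C" * 296 + "ATAA" + "C" * 96 + "ATAA"
    print(count_motif_near_sites(toy, [100, 350]))
```

Counts obtained this way for each gene group would then serve as the response values compared between the immunity, non-modified and down-regulated groups.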
We acknowledge Conacyt PhD scholarship 159376 to JHR and the SEP-Conacyt P44797 Young Researcher award to JMB. We thank Maritza Solano for technical assistance, and Mara Tellez Rojo, Julio Collado Vides, Mario Zurita, Jacques van Helden and the two reviewers for helpful discussions and suggestions.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.