- Research article
- Open Access
Meta-analysis of heterogeneous Down Syndrome data reveals consistent genome-wide dosage effects related to neurological processes
BMC Genomics volume 12, Article number: 229 (2011)
Down syndrome (DS; trisomy 21) is the most common genetic cause of mental retardation in the human population and key molecular networks dysregulated in DS are still unknown. Many different experimental techniques have been applied to analyse the effects of dosage imbalance at the molecular and phenotypical level; however, no integrative approach currently exists that attempts to extract the common information.
We have performed a statistical meta-analysis of 45 heterogeneous publicly available DS data sets in order to identify consistent dosage effects from these studies. We identified 324 genes with significant genome-wide dosage effects, including well investigated genes like SOD1, APP, RUNX1 and DYRK1A as well as a large proportion of novel genes (N = 62). Furthermore, we characterized these genes using gene ontology, molecular interactions and promoter sequence analysis. In order to judge the relevance of the 324 genes for more general cerebral pathologies we used independent publicly available microarray data from brain studies unrelated to DS and identified a subset of 79 genes with potential impact on neurocognitive processes. All results have been made available through a web server at http://ds-geneminer.molgen.mpg.de/.
Our study represents a comprehensive integrative analysis of heterogeneous data including genome-wide transcript levels in the domain of trisomy 21. The detected dosage effects build a resource for further studies of DS pathology and the development of new therapies.
Down syndrome (DS) is the most frequent genomic aneuploidy, with an incidence of approximately 1 in 700 live births, and results from the presence of an extra copy of human chromosome 21 (HSA21). DS is characterized by a complex phenotype with features that are not fully penetrant. The most frequent manifestations, which are virtually always present, include mental retardation, morphological abnormalities of the head and limbs, short stature, hypotonia and hyperlaxity of ligaments. Other features occur with less frequency, such as organ malformations, particularly of the heart (50% of DS newborns), several types of gastrointestinal tract obstructions or dysfunctions (4-5% of DS newborns), increased risk of leukaemia (20 × higher compared to the normal population), and early occurrence of an Alzheimer-like neuropathology [2, 3]. DS has been investigated with multiple functional genomics studies aiming to understand the molecular basis underlying the various aspects of the disease [4–7].
The most commonly accepted pathogenetic hypothesis is that the dosage imbalance of genes on HSA21 is responsible for the molecular dysfunctions in DS, meaning that genes on the triplicated chromosome are overexpressed due to an extra chromosome 21, as demonstrated for selected genes like SOD1 and DYRK1A. Recent global transcriptome studies with microarrays, however, have generated a more complex picture in the sense that not all HSA21 genes have an elevated expression level as expected [9, 10]. An alternative hypothesis is that the phenotype is due to an unstable environment resulting from the dosage imbalance of the hundreds of genes on HSA21, which causes a non-specific disturbance of genomic regulation and expression. The significantly higher inter-individual variability in DS individuals, as compared to euploid individuals, supports this hypothesis. Moreover, the two hypotheses could coexist. In both hypotheses it is understood that besides alterations of gene expression of HSA21 genes there are numerous genome-wide effects that lead to the dysregulation of many non-HSA21 genes through molecular pathways and interactions.
Many studies on the transcriptome and proteome levels have been conducted to understand the causal relationship between genes at dosage imbalance and DS phenotypes. Gene expression profiles have been analysed from DS fetal and adult human tissues. Additionally, two classes of mouse models have been developed for investigating the molecular genetics of DS: either mouse models with partial trisomies of the syntenic regions of HSA21 in mouse chromosomes 10, 16 or 17, such as Ts16, Ts65Dn and Ts1Cje mice, or transgenic mice for specific genes such as SOD1. Studies of gene expression profiles in human DS samples and mouse models have shown high genome-wide variability [11, 19–22]. Furthermore, differences due to the applied experimental platforms, specific tissues, developmental stages or the triplicated segments under study introduce a high variation to the assessment of genome-wide effects of DS. Here, integrative and comparative studies are pivotal for the analysis of the complex nature of gene expression and regulation in DS at a more general level [2, 23].
Meta-analysis has proven to be a valid strategy to extract consistent information from heterogeneous data, in particular with respect to complex phenotypes, for example cancer, Alzheimer's disease and type-2 diabetes mellitus. The purpose of meta-analysis is to compensate for experiment-specific variations and to reveal consistent information across a wide range of experiments. To date, such a meta-analysis of DS data is missing.
In this paper we describe a comprehensive meta-analysis of 45 different DS studies on human and mouse on the transcriptome and proteome level, including quantitative data such as Affymetrix microarrays, RT-PCR and MALDI studies as well as qualitative data such as SAGE and Western blot analyses. We applied an established computational framework and identified 324 genes with consistent dosage effects in many of these studies. As expected, we observed a high fraction of HSA21 genes (N = 77) but also a large number of non-HSA21 genes (N = 247). Besides well investigated genes in the context of DS we detected a significant proportion of novel ones (N = 62). The 324 genes were further investigated using functional information, molecular interactions and promoter analysis, revealing over-represented motifs of four transcription factors: RUNX1, E2F1, STAF/PAX2 and STAT3. In order to test the relevance of the 324 genes for more general brain phenotypes we used independent publicly available data on cerebral pathologies not related to DS and identified a subset of 79 DS genes that were differentially expressed in these studies. The detected dosage effects can be used as a resource for further studies of DS pathology, functional experiments and the development of therapies. All data have been agglomerated and made available through a web server that tracks results of the meta-analysis (http://ds-geneminer.molgen.mpg.de/) and that enables the community to validate any gene of interest in the light of the experimental data.
Genome-Wide Dosage Effects
Genome-wide dosage effects were computed with the numerical scoring method described in Material and Methods. In total, 45 case-control experiments were interrogated (Additional file 1, Table S1). The alteration for each gene between the trisomic and normal states was scored in each experiment, gene scores were summarised across all experiments, and the significance of the summarised scores was judged with a bootstrap approach. This procedure resulted in a cut-off score value of 3.67 and identified 324 genes as being predominantly affected by DS. The thirty genes with the highest dosage effects, either on HSA21 or on other chromosomes, are listed in Table 1. The entire gene list is given in Additional file 1, Table S2.
The meta-analysis identified genes that showed consistent changes in many of the different experiments rather than genes that were affected by a single (or few) experiment(s) (Figure 1A). This is an important fact since, for example, different mouse models have different coverage of triplicated HSA21 genes and, thus, might introduce model-specific bias. The consistency of the dosage effect was measured for each gene with an entropy criterion (see Material and Methods) and Figure 1A reveals a strong preference for the selection of high-entropy genes. The highest scores were assigned to HSA21 genes (Figure 1B), which indicates that the meta-analysis scores reflect the effect of an extra chromosome 21 on gene expression (Table 1). While proportionally most dosage effects were identified for HSA21 genes (77 out of 324), the majority of genes (247 out of 324) were located on other chromosomes, highlighting the genome-wide impact of DS (Figure 1C).
Genome-wide dosage effects underlined the severe phenotypic consequences of DS caused by genes with a major role in human development (Additional file 2, Figure S1). Of the 247 non-HSA21 genes, 72 were associated with development, in particular with respect to organ development (62 genes, GO:0048513), tissue development (34 genes GO:0009888) and cell development (30 genes, GO:0048468). Amongst these genes were known interactors of HSA21 genes, for example REST (RE1-silencing transcription factor). REST modulates expression of genes encoding fundamental neuronal functions including ion channels, synaptic proteins and neurotransmitter receptors and has been linked to an inherited form of mental retardation. Recently, Canzonetta et al.  demonstrated that the region capable of affecting REST levels, in both mouse and human cells, could be assigned to the DYRK1A locus on HSA21 which was found among the top-scoring HSA21 genes (Table 1).
TXNIP (thioredoxin interacting protein) had the highest dosage effect (8.79) of all non-HSA21 genes. So far it has only a weak association with DS (through S100B), but it could play a major role in several DS phenotypes. It is a key signalling molecule involved in glucose homeostasis, cardiovascular homeostasis and leukaemia.
Enrichment of genomic location with respect to the 324 genes was observed in regions of HSA21 and the respective syntenic regions on mouse chromosomes 16, 17 and 10 (Additional file 3, Figure S2). Moreover, in the human genome, additional enrichment was found on chr3q24, which contains the genes GYG1 (glycogenin), PLOD2 (involved in bone morphogenesis), PLSCR4 and CHST2 (involved in inflammatory response in vascular endothelial cells).
Dosage Effects on HSA21
Proportionally, HSA21 contributed most to the detected dosage effects (Figure 1C). On the other hand, it is remarkable that only a third of all HSA21 genes (77 out of 255 studied here using the Ensembl genome annotation) showed consistent effects across the different experiments (see also Discussion). While 57 genes had a positive score below the significance threshold of 3.67, indicating relevance with respect to specific experiments only, 121 genes had a score near zero, indicating that dosage effects were either compensated or not detected with the selected experimental data (Figure 1B).
HSA21 dosage effects included, for example, APP (beta-amyloid precursor protein), involved in senile plaque formation in DS and Alzheimer's disease; SOD1 (superoxide dismutase 1), a key enzyme in the metabolism of oxygen-derived free radicals; DYRK1A (dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 1A), involved in neuroblast proliferation and crucial for brain function, learning and memory; RUNX1 (runt-related transcription factor 1), which plays a critical role in normal hematopoiesis; and GABPA (GA binding protein transcription factor, alpha subunit 60 kDa), encoding a DNA binding domain with a huge variety of targets including genes from different cell/tissue specificities and functions. HSA21 genes were mostly up-regulated in gene expression studies (69 out of 77) with the exception of eight genes that were either variable or down-regulated (SLC5A3, MRPS6, B3GALT6, CBS, KCNJ6, KCNJ15, CLDN14, COL18A1). Possible explanations for this observation might be tissue-specificity of gene expression, as in the case of MRPS6, which was mostly up-regulated in brain samples and down-regulated in other tissues like heart or kidney, or differences in human and mouse gene expression, as in the case of CBS, which was up-regulated in human but down-regulated in mouse experiments, which might be caused by differential tissue specificity of the orthologous mouse gene.
Three genomic regions on HSA21 were enriched with the significant genes using the MSigDB_c1 positional database: chr21q22, chr21q21 and chr21q11, located on the q-terminal arm (Figure 1D). This contradicts the hypothesis that a single region on HSA21 could be responsible for the molecular and phenotypic consequences of DS with only a few responsive genes [36, 37]. Rather our findings support studies that identified more than one HSA21 region causative for DS phenotypes so that the dosage effects were not uniformly distributed along the chromosome but rather enriched in certain regions on HSA21 similar to the results in [38, 39].
Functional Annotation Using Gene Enrichment Analysis
Functional annotation of biological pathways was retrieved from the ConsensusPathDB, a meta-database that summarizes the content of 22 human interaction databases. A total of 1,695 pre-defined pathways were screened with the 324 genes using gene set enrichment analysis. A total of 277 pathways were found significantly enriched (family-wise error rate (FWER) < 0.01), of which several pathways were associated with neurological and neuropathological processes (Table 2). These pathways referred mainly to (i) neurodegeneration (e.g. Huntington's disease, Alzheimer's disease or Parkinson's disease) and (ii) defects in synapsis (e.g. Axon guidance, NGF signaling). Furthermore, the results emphasized the role of tyrosine-kinase receptors in DS pathology (for example P75(NTR)-mediated signalling or NGF signalling via TRKA) which interact directly with BDNF (brain-derived neurotrophic factor). Moreover, our results showed gene dosage effects caused either directly by genes located on HSA21 (e.g. SOD1, APP, DONSON, TIAM1, COL6A2, ITSN1 and BACE2) or indirectly by HSA21 interactors, highlighting the intrinsic complexity of the DS pathology. For example, PIK3R1 de-regulation impacts on many of these pathways and PIK3R1 is a direct interactor of IFNAR1, a significant DS gene. A similar effect can be observed for TJP1, which has interactions with the HSA21 genes JAM2 and CLDN8, both showing consistent dosage effects (cf. Figure 2A).
Dosage Effects on Transcriptional Regulation
Dysregulation of transcriptional regulation is widely reported in DS. Among the 324 significant genes were 13 transcription factors (TFs) (PSIP1, RBPJ, TCF4, HES1, ETS2, BACH1, RUNX1, GABPA, SNAI2, REST, LITAF, EGR1, FOS); 6 TFs (PSIP1, HOXC8, DLX5, HIVEP3, ZNF187, ATF6) had significant enrichment of their targets as retrieved from the TRANSFAC database. Additionally, 57 TFs had significant enrichment of their interacting proteins when judged with physical interactions retrieved from the ConsensusPathDB. In total, 70 different TFs were identified as being (directly or indirectly) affected by dosage imbalances. The list of TFs and their associated functional categories is given in Additional file 1, Table S3. GO categories indicate a broad impact of transcriptional regulation on neurological development: central nervous system development (RUNX1 and TP53), nervous system development (DLX5, FOS, HES1, STAT3 and EP300), axonogenesis (DLX5, NOTCH1 and CREB1), neuron differentiation (HOXC8, NOTCH1 and RUNX1), negative regulation of neuron differentiation (HES1, NOTCH1 and REST) and regulation of long-term neuronal synaptic plasticity and learning or memory (EGR1 and JUN). Other prominent categories refer to organ development (RBPJ, ETS2, GABPA and SNAI2) and stress response (ATF6, FOS and RELA).
We further analyzed the promoter sequences of the 324 genes for enrichment of transcription factor binding sites using the AMADEUS software . Significant enrichment was computed for 4 TF motifs, E2F1, RUNX1, STAF/PAX2 and STAT3 (Table 3). Enrichment was evident for RUNX1, which is among the most studied genes implicated in DS. The implication of E2F1 in DS was also previously reported  and could be responsible for impaired cell proliferation documented for hippocampus, cerebellum and astrocytes of DS mouse models.
Dosage Effects and Molecular Interactions
Molecular interactions among the 324 significant genes on HSA21 and on other chromosomes exhibited a complex network, supporting the important role of physical interactions as transmitters of dosage effects (Figure 2A). The consequences of HSA21 triplication on the interacting genes were fairly stable, as Figure 2B demonstrates. For example, DNAJB1 (DnaJ (Hsp40) homolog, subfamily B, member 1) and PPP3CA (protein phosphatase 3, catalytic subunit, alpha isozyme, data not shown), both interacting with SOD1, were consistently and significantly down-regulated in the human microarray experiments, as the fold-changes and P-values indicate. Opposite trends were observed for TJP1 and RHOQ.
Assessing General Relevance of DS Dosage Effects for Neurological Processes
We were further interested in identifying, among the 324 genes, those which were relevant for other brain disorders. To achieve this, we interrogated 19 independent data sets derived from publicly available microarray data (Additional file 1, Table S4). These studies followed heterogeneous research questions on different cerebral pathologies and identified a total of 623 differentially expressed genes. Gene set enrichment analyses  with the 324 genes and the corresponding lists of differentially expressed genes were significant for 10 of these 19 studies with 79 overlapping genes (Figure 3A). Furthermore, we used the HSA21 database http://chr21.molgen.mpg.de/hsa21 , a resource of RNA in situ hybridizations in postnatal mouse brain sections, in order to provide independent supporting evidence of brain expression of these 79 genes as shown for example for BACH1 (basic leucine zipper transcription factor 1) and TTC3 (tetratricopeptide repeat domain 3) (Figure 3B and 3C).
Additionally, we investigated the expression patterns of the 79 genes across the DS microarray experiments used for this meta-analysis and could identify brain-related signatures, for example, a clear up-regulation in brain tissues for the cluster containing C14orf147, IVNS1ABP, B2M, TJP1, SPARC, CTGF, COL4A1 and FSTL1 (Figure 3D).
Novel Dosage Effects
To identify DS-relevant "novel" dosage effects we excluded from the 324 genes (i) HSA21 genes, (ii) genes that interacted with HSA21 genes, as well as (iii) genes that were associated with DS in the literature (Table 4). Remaining candidates (N = 62) comprised BDNF-related genes (SST), MAPK-pathway genes (KRAS, IGF1R, GNG11 and RAP1A), genes related to leukemia (SFRP1) and Rho-Proteins (DHCR7 and RAB21). SST was found to be co-expressed with TAC1 in previous studies; TAC1 is also significant in our meta-analysis, and both genes showed a strong correlation across DS studies (Figure 4A).
Novel candidates are associated with neurodegenerative disorders including Alzheimer's disease (VSNL1), prion disease (SCRG1, HSPH1, HSPA5 (Figure 4B) and CTR9) and age-related degeneration (GAS6 and GNG11). Moreover, candidates could explain evident DS features (Additional file 1, Table S5): (i) genes related to neurogenesis and neurite outgrowth (LPAR1, LIN7C, JARID2, GREM1, SERPINE2, IGF1R and SPOCK1) that could be related to mental retardation or cognitive impairment, (ii) genes involved in synapsis (AGT, KRAS, ATP1A2, GNAI2, SST and LIN7C), (iii) cytoskeleton-related proteins (KANK1 (Figure 4C), CKAP2, CKAP4, HAT1, NEK7 and VAMP3), (iv) macular degeneration genes or genes (HTRA1 and EFEMP1) associated with age-related visual problems, (v) genes (AGT, CNN3, FBN1, RBPJ, PON2, POSTN, RAP1A, WNK1 and STK39) that were related to cardiac impairments and could be candidates to explain this DS characteristic, and (vi) genes related to cancer (BTBD3, DNAJB4, FIBP and GSTZ1).
These examples show that the meta-analysis approach identified multiple additional genes that might be involved in DS pathology. In order to enable the community to check any particular gene of interest for DS relevance in the studies under analysis, we have agglomerated all information of the meta-analysis into a web interface (http://ds-geneminer.molgen.mpg.de/). Examples of possible views and information are shown in Figure 4.
The statistical meta-analysis approach was described previously by Rasche et al. The score computed with meta-analysis correlates with entropy (Figure 1A), indicating the ability to identify general dosage effects across many experiments that might be of more phenotypic relevance than very specific ones. Additional file 4, Figures S3A and B provide an overview of the different sources of data, including two organisms (human and mouse), different tissues (brain, heart and others), different stages of development (adult, postnatal, embryonic) and different mouse models (Ts65Dn, Ts1Cje, Tc1). It is per se interesting that, in spite of such heterogeneity, common dosage effects could be identified at all, and it should be highlighted that whole-genome data were fairly robust across experiments. Additional file 4, Figure S3D shows the overall correlation of the quantitative PCR and microarray values averaged from all experiments, with only a few genes in the non-concordant sectors of the graph (red points).
The score used in this analysis allows detecting genes that could be either up- or down-regulated in different studies. An overview of the fold-changes for the genes across the different experiments is given in Additional file 1, Table S6. Because genes might change their expression level depending on the developmental state, tissue or other variables, we expected that this flexibility allows checking the hypothesis of random disturbances as well as the hypothesis of increased expression of HSA21 genes. We detected a clear enrichment of up-regulated genes on the q-terminal part of HSA21 (Figure 1D and Additional file 3, Figure S2). However, not a single region was identified but rather several smaller regions on HSA21 that agglomerate a large number of significant dosage effects. This finding was also reported before (Korbel et al. and Lyle et al.) using two independent data sets to characterize the molecular HSA21 regions in a set of DS patients with partial duplications.
We studied 255 HSA21 genes matched with the probe sets from the microarrays. Of these, only 77 showed consistent dosage effects (Figure 1). While 165 HSA21 genes had score values different from zero, indicating a response in some of the microarray studies, 90 HSA21 genes were not responsive at all and provide evidence for a strong mechanism of dosage compensation. On the other hand, these figures could also reflect the limitation of detecting reliable fold-changes of low magnitude with microarray technology. Furthermore, experiments covered only a limited number of tissues, so it is likely that some genes were missed simply because they were not responsive in the tissues under analysis. However, with brain as the dominant whole-genome sample source, most of the genes should be expressed. Microarray data were focused on the Affymetrix platform in order to reduce variance arising from platform inconsistencies. We have also compared our results with additional studies, including our own previous research and others, and found relevance of selected dosage effects with respect to other tissues as well (data not shown). Additional cross-validation was performed with an independent microarray data set. These authors compared human lymphoblastoid cell lines derived from DS patients and normal controls with a custom-made HSA21 array. Yahya-Graison et al. divided the expression ratios into four classes: class I and class II genes were significantly up-regulated, while class III and class IV genes were either compensated or showed variable response. Our meta-analysis revealed a high degree of concordance, taking into account that the cell model, platform and methodology used were completely different. The meta-analysis scores were significantly higher for class I and II genes than for class III and IV genes (P-value <0.01, Additional file 5, Figure S4). 25 out of 39 class I-II genes revealed a significant score in our meta-analysis (75%).
In this study we monitored molecular interactions of HSA21 genes that might function as drivers of dosage effects (Figure 2A). For example, (i) TJP1 (Tight junction protein ZO-1) interacts with two HSA21 genes, JAM2 and CLDN8, (ii) FOS (FBJ murine osteosarcoma viral oncogene homolog) interacts with HSA21 genes ETS2, SUMO3, RUNX1 and indirectly with ERG, (iii) RHOQ (ras homolog gene family, member Q) interacts directly with ITSN1 and TIAM1 and indirectly with SYNJ, and (iv) PIK3R1 interacts directly with IFNAR1 and indirectly with IFNAR2. It should be emphasized that current information on molecular interactions is far from complete, thus we either might miss important interactions not yet detected and/or we might count false positive interactions due to the high error rates of current annotations of interactions.
Several of the DS genes (N = 79) extrapolated to more general neurological phenotypes (Figure 3A). The dendrogram (Figure 3D) shows further interesting profiles of these genes in the DS samples under analysis: (i) differential gene expression in the cerebellum region versus whole "brain" or cerebrum areas, which has been reported in other studies (e.g. Moldrich et al.), (ii) different patterns of gene expression associated with particular developmental stages (P0, P15 and P30), changes that were reported before by Dauphinot et al., and (iii) differences in ES studies.
We further analyzed human and mouse studies separately and found 182 significant dosage effects using only human and 107 dosage effects using only mouse data. The Venn diagram in Additional file 4, Figure S3C clearly shows the benefit in detecting additional dosage effects when mixing the two species. Overlapping dosage effects were detected for 29 genes with both analyses (Additional file 1, Table S7). Results for the human and mouse specific analyses can be found in Additional file 1, Tables S8 and S9. It should be noted here that comparisons between human and mouse using microarrays are inherently difficult and have limitations since the probes for the orthologous mouse and human genes do not correspond well. Furthermore, gene expression variation is generally higher in human individuals compared to mouse inbred strains. Nonetheless, the 107 genes found in the analysis of mouse data (derived from the different mouse models for trisomy 21) represent a core set of genes responsive across different DS mouse models and, thus, could be highly relevant for DS pathogenesis.
In addition to genes commonly related to DS, we have identified novel genes that can be associated with DS phenotypes, in particular with neural development and neurodegeneration. To the best of our knowledge, this study is the first meta-analysis of genome-wide transcript levels along with other data domains in DS research. The agglomerated data can be accessed through the web server at http://ds-geneminer.molgen.mpg.de and the identified dosage effects are a resource for further functional testing and therapeutic development.
We have identified a set of 324 genes with consistent dosage effects from 45 different experiments related to DS. Since the meta-analysis was enriched with brain experiments, we were able to detect a high fraction of genes related to neuro-development, synapsis and neuro-degeneration. Moreover, our results give more information about known and new pathways related to DS and also about 62 novel candidates. The results of the meta-analysis as well as the source data have been made accessible for the community through a WEB interface.
Material and methods
Selection and integration of DS resources
Data sets were selected from heterogeneous technical platforms, different model systems (human cell lines, human tissues, mouse models) and different developmental stages (Additional file 1, Table S1). For each gene and for each source we computed a numerical value that measures its dosage effect. Data categories were either qualitative or quantitative. Qualitative data incorporated a total of 30 published manuscripts including reviews and semi-quantitative studies as well as two SAGE studies [21, 58] and were summarised to one score point in order to avoid over-scoring. Here, a "1" referred to the case that the gene was found as DS relevant in one (or more) studies. Quantitative data from differential gene expression studies such as Affymetrix microarrays, RT-PCR, MALDI and other quantitative techniques were evaluated in order to extract comparable information across the different studies. We considered Affymetrix studies that provided the raw data (CEL file level). Raw data were extracted from Gene Expression Omnibus (GEO) and ArrayExpress or were retrieved from the authors' web pages (in total 16 data sets including human tissues and four different mouse models: Ts65Dn, Ts1Cje, Tc1 and Ts + HSA21). Furthermore, we incorporated 18 RT-PCR and MALDI data sets for which the authors displayed the information for all genes under study (either significant or not).
Mapping of gene IDs
A central prerequisite of any meta-analysis approach is the consolidation of the different ID types, for example coming from different organisms and from different versions of chips. We used the Ensembl database (version 56) as the backbone annotation for all studies. IDs were mapped to human Ensembl gene IDs. Mapping and merging of the information was done within R using Bioconductor packages. In total, information on 19,388 Ensembl genes was mapped.
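The consolidation step can be illustrated with a minimal Python sketch (the original analysis was done in R). The lookup table, the placeholder identifiers and the function name `map_to_ensembl` are hypothetical, and averaging several source IDs that map to the same gene is an assumption made for illustration only, not a rule stated in this paper.

```python
from collections import defaultdict

# Hypothetical lookup: platform- or species-specific ID -> human Ensembl gene ID.
# The identifiers below are placeholders, not real probe set or Ensembl IDs.
ID_TO_ENSEMBL = {
    "human_probe_1": "ENSG_PLACEHOLDER_A",
    "mouse_probe_7": "ENSG_PLACEHOLDER_A",
    "human_probe_2": "ENSG_PLACEHOLDER_B",
}


def map_to_ensembl(study_values, id_map):
    """Collapse the per-ID values of one study onto human Ensembl gene IDs.

    IDs without a mapping are dropped. When several source IDs map to the
    same gene, their values are averaged (an assumption of this sketch).
    """
    collected = defaultdict(list)
    for source_id, value in study_values.items():
        gene = id_map.get(source_id)
        if gene is not None:
            collected[gene].append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in collected.items()}
```

Once every study is expressed in the same gene ID space, per-gene values from different chips and organisms can be merged by a simple key lookup.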
Mapping SAGE IDs
Differentially expressed tags were extracted from additional files of the studies. Identifiers (based on sequences) were cross-tagged with the information displayed in the updating tables (SAGEmap_Hs and SAGEmap_Mn) from the SAGE site ftp://ftp.ncbi.nlm.nih.gov/pub/sage/mappings.
Transcriptome data pre-processing and normalization
We incorporated only case-control studies in the meta-analysis in order to derive expression fold-changes. Affymetrix gene chip annotations were adapted from the latest genome annotation (version 12). Affymetrix data were normalized with GC-RMA. For transcriptome case-control studies three pieces of information were stored for each gene: (i) the fold-change (DS vs. controls), (ii) the standard error of the fold-change from the replicated experiments in that study and (iii) the expression p-value (presence-call) that indicates whether or not the gene is expressed in the target samples under study. For RT-PCR and MALDI experiments we computed the fold-change of the mean expression (DS vs. controls) as well as the reported standard error of the ratio. When the mean and standard deviation for each group (DS and controls) were provided we calculated the ratios as well as their associated standard errors.
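The text does not state how the standard error of a ratio was derived from group means and standard deviations. One common choice, shown here purely as an assumption, is first-order error propagation for the ratio of two independent means. The function name `ratio_with_se` and its parameters are hypothetical.

```python
import math


def ratio_with_se(mean_ds, sd_ds, n_ds, mean_ctrl, sd_ctrl, n_ctrl):
    """Fold-change (DS vs. controls) and an approximate standard error.

    Assumes independent groups and uses first-order error propagation:
    SE(ratio) ~ |ratio| * sqrt((SEM_ds / mean_ds)^2 + (SEM_ctrl / mean_ctrl)^2).
    """
    sem_ds = sd_ds / math.sqrt(n_ds)        # standard error of the DS mean
    sem_ctrl = sd_ctrl / math.sqrt(n_ctrl)  # standard error of the control mean
    ratio = mean_ds / mean_ctrl
    rel_var = (sem_ds / mean_ds) ** 2 + (sem_ctrl / mean_ctrl) ** 2
    return ratio, abs(ratio) * math.sqrt(rel_var)
```

For example, `ratio_with_se(3.0, 0.6, 4, 2.0, 0.4, 4)` gives a fold-change of 1.5 with a standard error of about 0.21.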
Scoring DS dosage effects across studies
In order to score the different categories of information such as binary counts and quantitative gene expression values, we summarized the scores of the individual experiments for each category. For microarray studies, the score of the i-th gene in the j-th study, sij, was computed as described in Rasche et al. :
Here rij is the fold-change, pij is the average detection p-value and eij is the standard error of the ratio derived from the experimental replicates of the study. Thus, the fold-change is weighted with its reproducibility across the experimental replicates and with the likelihood of the gene being expressed in the study's case or control samples.
For RT-PCR and MALDI studies we applied the following equation:
Here rij is the fold-change and eij is the standard error of the ratio.
The total score of the gene was computed as the sum across all individual study scores.
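The scoring equations themselves are not reproduced in this text. As a hedged illustration only, the sketch below uses one functional form that is consistent with the verbal description (absolute log2 fold-change, down-weighted by the detection p-value and by the relative standard error). The exact form is an assumption and should be checked against Rasche et al.; the function names are hypothetical.

```python
import math

def microarray_score(ratio, detection_p, std_err):
    """Score for one gene in one microarray study.

    ASSUMED form: |log2(r)| * (1 - p) * (1 - min(1, e / r)).
    """
    reproducibility = 1.0 - min(1.0, std_err / ratio)
    return abs(math.log2(ratio)) * (1.0 - detection_p) * reproducibility

def targeted_score(ratio, std_err):
    """Score for RT-PCR or MALDI studies, which report no detection p-value.

    ASSUMED form: |log2(r)| * (1 - min(1, e / r)).
    """
    return abs(math.log2(ratio)) * (1.0 - min(1.0, std_err / ratio))

def total_score(study_scores):
    """Total gene score: the sum across all individual study scores."""
    return sum(study_scores)

# Hypothetical gene measured in one microarray and one RT-PCR study.
s_array = microarray_score(ratio=2.0, detection_p=0.05, std_err=0.4)
s_pcr = targeted_score(ratio=0.5, std_err=0.1)
s_total = total_score([s_array, s_pcr])
```

With this form, up- and down-regulation of the same magnitude contribute equally, and a study whose standard error reaches the size of the ratio itself contributes nothing to the total.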
Sampling for significance
In order to assess the significance of the overall gene scores we generated random scores by re-sampling the scores 50,000 times with replacement within the same study. Using this random distribution as background, we called significant those genes whose total score exceeded the 99.9th percentile of the background distribution.
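The re-sampling step can be sketched as follows. This is one reading of the procedure, not the authors' code: each random total is built by drawing one score per study from that study's own scores, and the cut-off is taken with a nearest-rank percentile. The function names and the toy data are hypothetical.

```python
import math
import random

def background_totals(scores_by_study, n_draws=50_000, seed=0):
    """Random total scores: one score drawn per study, summed over studies.

    scores_by_study holds one list of observed gene scores per study.
    """
    rng = random.Random(seed)
    return [
        sum(rng.choice(study) for study in scores_by_study)
        for _ in range(n_draws)
    ]

def percentile_threshold(values, percentile=99.9):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(percentile / 100.0 * len(ordered))
    return ordered[min(max(rank, 1), len(ordered)) - 1]

def significant_genes(total_scores, threshold):
    """Genes whose total score lies strictly above the threshold."""
    return sorted(g for g, s in total_scores.items() if s > threshold)

# Toy data: two studies with three and two gene scores, respectively.
toy_studies = [[0.0, 0.1, 0.2], [0.0, 0.3]]
background = background_totals(toy_studies, n_draws=5_000)
threshold = percentile_threshold(background)
hits = significant_genes({"geneA": 0.9, "geneB": 0.2}, threshold)
```

Sampling within each study keeps the study-specific score distributions intact, so a study with generally high scores does not by itself push genes over the threshold.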
Judging consistency of dosage effects
For each gene, the entropy of the score distribution was computed in order to quantify the relevance of the gene across many experiments. Let sij be the score of the i-th gene in the j-th study, then Ei is a measure for the uniformity of the score distribution over the individual experiments:
High entropy is assigned to a gene if many experiments contribute to the overall score whereas low entropy is assigned if only a few experiments contribute to the overall score.
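The entropy equation is not reproduced in this text. A minimal sketch, assuming the standard Shannon entropy of the per-study scores normalized to sum to one (the normalization and the base-2 logarithm are assumptions), illustrates the behaviour described above; `score_entropy` is a hypothetical name.

```python
import math

def score_entropy(study_scores):
    """Shannon entropy (in bits) of a gene's scores across studies.

    Assumption: scores are non-negative and are normalized by their sum.
    """
    total = sum(study_scores)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for score in study_scores:
        if score > 0:
            share = score / total
            entropy -= share * math.log2(share)
    return entropy

# Four studies contribute equally: maximal entropy for four studies.
even = score_entropy([1.0, 1.0, 1.0, 1.0])
# A single study carries the whole score: minimal entropy.
single = score_entropy([4.0, 0.0, 0.0, 0.0])
```

Both toy genes have the same total score of 4, yet the first is supported by all four studies and the second by only one, which is the distinction the entropy is meant to capture.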
Gene Set Enrichment Analysis (GSEA) of the 324 genes was performed with respect to pre-defined human pathways agglomerated from 22 pathway resources in the ConsensusPathDB (http://cpdb.molgen.mpg.de). Over-representation analysis of TF target sets was performed with Fisher's test based on annotation from TRANSFAC. Motif enrichment analyses were performed using AMADEUS with the significant genes as target sets and all genes considered in the meta-analysis as the background set.
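For the over-representation analysis, a one-sided Fisher's exact test is equivalent to the upper tail of the hypergeometric distribution. The sketch below assumes the one-sided variant (the text only says "Fisher's test"); the function name and the toy numbers are hypothetical.

```python
from math import comb

def fisher_overrepresentation_p(hits, list_size, annotated, background):
    """P(X >= hits) for a hypergeometric draw.

    hits       -- significant genes that are targets of the TF
    list_size  -- number of significant genes
    annotated  -- TF targets among all background genes
    background -- all genes considered in the analysis
    """
    max_hits = min(list_size, annotated)
    denominator = comb(background, list_size)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / denominator

# Toy example: 10 background genes, 5 TF targets, 5 significant genes,
# all of which are TF targets.
p_all = fisher_overrepresentation_p(5, 5, 5, 10)
# Asking for "at least zero hits" must give probability one.
p_none = fisher_overrepresentation_p(0, 5, 5, 10)
```

Exact integer arithmetic via `math.comb` keeps the tail sum free of rounding error until the final division, which matters when the background contains many thousands of genes.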
Selection of independent brain experiments
To assess the general brain relevance of the 324 genes, we collected DS-independent gene expression studies of brain features that were performed with Affymetrix technology and deposited in GEO or ArrayExpress (Additional file 1, Table S4). Most of these experiments were performed in mouse tissues. For each study we collected one or more resulting gene lists, which were evaluated using Gene Set Enrichment Analysis (GSEA) against the complete list of 19,388 genes ranked by score.
HSA21: human chromosome 21
PCR: Polymerase Chain Reaction
RT-PCR: real-time Polymerase Chain Reaction
MALDI: Matrix-Assisted Laser Desorption/Ionization
SAGE: Serial Analysis of Gene Expression
ESC: Embryonic Stem Cells
GEO: Gene Expression Omnibus
GSEA: Gene Set Enrichment Analysis
CNV: Copy Number Variation
TFBS: Transcription Factor Binding Site
Patterson D: Genetic mechanisms involved in the phenotype of Down syndrome. Ment Retard Dev Disabil Res Rev. 2007, 13 (3): 199-206. 10.1002/mrdd.20162.
Antonarakis SE, Epstein CJ: The challenge of Down syndrome. Trends Mol Med. 2006, 12 (10): 473-479. 10.1016/j.molmed.2006.08.005.
Rachidi M, Lopes C: Mental retardation in Down syndrome: from gene dosage imbalance to molecular and cellular mechanisms. Neurosci Res. 2007, 59 (4): 349-369. 10.1016/j.neures.2007.08.007.
Gitton Y, Dahmane N, Baik S, Ruiz i Altaba A, Neidhardt L, Scholze M, Herrmann BG, Kahlem P, Benkahla A, Schrinner S, Yildirimman R, Herwig R, Lehrach H, Yaspo ML: A gene expression map of human chromosome 21 orthologues in the mouse. Nature. 2002, 420 (6915): 586-590. 10.1038/nature01270.
Canzonetta C, Mulligan C, Deutsch S, Ruf S, O'Doherty A, Lyle R, Borel C, Lin-Marq N, Delom F, Groet J, Schnappauf F, De Vita S, Averill S, Priestley JV, Martin JE, Shipley J, Denyer G, Epstein CJ, Fillat C, Estivill X, Tybulewicz VL, Fisher EM, Antonarakis SE, Nizetic D: DYRK1A-dosage imbalance perturbs NRSF/REST levels, deregulating pluripotency and embryonic stem cell fate in Down syndrome. Am J Hum Genet. 2008, 83 (3): 388-400. 10.1016/j.ajhg.2008.08.012.
Lockstone HE, Harris LW, Swatton JE, Wayland MT, Holland AJ, Bahn S: Gene expression profiling in the adult Down syndrome brain. Genomics. 2007, 90 (6): 647-660. 10.1016/j.ygeno.2007.08.005.
Antonarakis SE, Lyle R, Dermitzakis ET, Reymond A, Deutsch S: Chromosome 21 and down syndrome: from genomics to pathophysiology. Nat Rev Genet. 2004, 5 (10): 725-38. 10.1038/nrg1448.
Wiseman FK, Alford KA, Tybulewicz VL, Fisher EM: Down syndrome--recent progress and future prospects. Hum Mol Genet. 2009, 18 (R1): R75-83. 10.1093/hmg/ddp010.
Kahlem P, Sultan M, Herwig R, Steinfath M, Balzereit D, Eppens B, Saran NG, Pletcher MT, South ST, Stetten G, Lehrach H, Reeves RH, Yaspo ML: Transcript level alterations reflect gene dosage effects across multiple tissues in a mouse model of down syndrome. Genome Res. 2004, 14 (7): 1258-1267. 10.1101/gr.1951304.
Aït Yahya-Graison E, Aubert J, Dauphinot L, Rivals I, Prieur M, Golfier G, Rossier J, Personnaz L, Creau N, Bléhaut H, Robin S, Delabar JM, Potier MC: Classification of human chromosome 21 gene-expression variations in Down syndrome: impact on disease phenotypes. Am J Hum Genet. 2007, 81 (3): 475-91. 10.1086/520000.
Chou CY, Liu LY, Chen CY, Tsai CH, Hwa HL, Chang LY, Lin YS, Hsieh FJ: Gene expression variation increase in trisomy 21 tissues. Mamm Genome. 2008, 19 (6): 398-405. 10.1007/s00335-008-9121-1.
Gardiner K, Davisson MT, Crnic LS: Building protein interaction maps for Down's syndrome. Brief Funct Genomic Proteomic. 2004, 3 (2): 142-156. 10.1093/bfgp/3.2.142.
Shin JH, Gulesserian T, Weitzdoerfer R, Fountoulakis M, Lubec G: Derangement of hypothetical proteins in fetal Down's syndrome brain. Neurochem Res. 2004, 29 (6): 1307-1316.
Seregaza Z, Roubertoux PL, Jamon M, Soumireu-Mourat B: Mouse models of cognitive disorders in trisomy 21: a review. Behav Genet. 2006, 36 (3): 387-404. 10.1007/s10519-006-9056-9.
Gropp A, Kolbus U, Giers D: Systematic approach to the study of trisomy in the mouse. II. Cytogenet Cell Genet. 1975, 14 (1): 42-62. 10.1159/000130318.
Reeves RH, Irving NG, Moran TH, Wohn A, Kitt C, Sisodia SS, Schmidt C, Bronson RT, Davisson MT: A mouse model for Down syndrome exhibits learning and behaviour deficits. Nat Genet. 1995, 11 (2): 177-184. 10.1038/ng1095-177.
Sago H, Carlson EJ, Smith DJ, Kilbridge J, Rubin EM, Mobley WC, Epstein CJ, Huang TT: Ts1Cje, a partial trisomy 16 mouse model for Down syndrome, exhibits learning and behavioral abnormalities. Proc Natl Acad Sci USA. 1998, 95 (11): 6256-6261. 10.1073/pnas.95.11.6256.
Gahtan E, Auerbach JM, Groner Y, Segal M: Reversible impairment of long-term potentiation in transgenic Cu/Zn-SOD mice. Eur J Neurosci. 1998, 10 (2): 538-544. 10.1046/j.1460-9568.1998.00058.x.
FitzPatrick DR, Ramsay J, McGill NI, Shade M, Carothers AD, Hastie ND: Transcriptome analysis of human autosomal trisomy. Hum Mol Genet. 2002, 11 (26): 3249-3256. 10.1093/hmg/11.26.3249.
Mao R, Zielke CL, Zielke HR, Pevsner J: Global up-regulation of chromosome 21 gene expression in the developing Down syndrome brain. Genomics. 2003, 81 (5): 457-467. 10.1016/S0888-7543(03)00035-1.
Chrast R, Scott HS, Papasavvas MP, Rossier C, Antonarakis ES, Barras C, Davisson MT, Schmidt C, Estivill X, Dierssen M, Pritchard M, Antonarakis SE: The mouse brain transcriptome by SAGE: differences in gene expression between P30 brains of the partial trisomy 16 mouse model of Down syndrome (Ts65Dn) and normals. Genome Res. 2000, 10 (12): 2006-2021. 10.1101/gr.10.12.2006.
Saran NG, Pletcher MT, Natale JE, Cheng Y, Reeves RH: Global disruption of the cerebellar transcriptome in a Down syndrome mouse model. Hum Mol Genet. 2003, 12 (16): 2013-2019. 10.1093/hmg/ddg217.
Amano K, Sago H, Uchikawa C, Suzuki T, Kotliarova SE, Nukina N, Epstein CJ, Yamakawa K: Dosage-dependent over-expression of genes in the trisomic region of Ts1Cje mouse model for Down syndrome. Hum Mol Genet. 2004, 13 (13): 1333-1340. 10.1093/hmg/ddh154.
Rhodes DR, Chinnaiyan AM: Integrative analysis of the cancer transcriptome. Nat Genet. 2005, 37 (Suppl): S31-37.
Bertram L, McQueen MB, Mullin K, Blacker D, Tanzi RE: Systematic meta-analyses of Alzheimer disease genetic association studies: the AlzGene database. Nat Genet. 2007, 39 (1): 17-23. 10.1038/ng1934.
Rasche A, Al-Hasani H, Herwig R: Meta-analysis approach identifies candidate genes and associated molecular networks for type-2 diabetes mellitus. BMC Genomics. 2008, 9: 310. 10.1186/1471-2164-9-310.
Sbai O, Devi TS, Melone MA, Feron F, Khrestchatisky M, Singh LP, Perrone L: RAGE-TXNIP axis is required for S100B-promoted Schwann cell migration, fibronectin expression and cytokine secretion. J Cell Sci. 2010, 123 (Pt24): 4332-9.
Parikh H, Carlsson E, Chutkow WA, Johansson LE, Storgaard H, Poulsen P, Saxena R, Ladd C, Schulze PC, Mazzini MJ, Jensen CB, Krook A, Björnholm M, Tornqvist H, Zierath JR, Ridderstråle M, Altshuler D, Lee RT, Vaag A, Groop LC, Mootha VK: TXNIP regulates peripheral glucose metabolism in humans. PLoS Med. 2007, 4 (5): e158.
Yamawaki H, Haendeler J, Berk BC: Thioredoxin: a key regulator of cardiovascular homeostasis. Circ Res. 2003, 93 (11): 1029-1033. 10.1161/01.RES.0000102869.39150.23.
Austin C: Does oxidative damage contribute to the generation of leukemia?. Leuk Res. 2009, 33 (10): 1297. 10.1016/j.leukres.2009.04.038.
Flicek P, Aken BL, Ballester B, Beal K, Bragin E, Brent S, Chen Y, Clapham P, Coates G, Fairley S, Fitzgerald S, Fernandez-Banet J, Gordon L, Gräf S, Haider S, Hammond M, Howe K, Jenkinson A, Johnson N, Kähäri A, Keefe D, Keenan S, Kinsella R, Kokocinski F, Koscielny G, Kulesha E, Lawson D, Longden I, Massingham T, McLaren W: Ensembl's 10th year. Nucleic Acids Res. 38 (Database): D557-562.
Dierssen M, de Lagran MM: DYRK1A (dual-specificity tyrosine-phosphorylated and -regulated kinase 1A): a gene with dosage effect during development and neurogenesis. ScientificWorldJournal. 2006, 6: 1911-1922.
Edwards H, Xie C, LaFiura KM, Dombkowski AA, Buck SA, Boerner JL, Taub JW, Matherly LH, Ge Y: RUNX1 regulates phosphoinositide 3-kinase/AKT pathway: role in chemotherapy sensitivity in acute megakaryocytic leukemia. Blood. 2009, 114 (13): 2744-2752.
Gardiner K: Transcriptional dysregulation in Down syndrome: predictions for altered protein complex stoichiometries and post-translational modifications, and consequences for learning/behavior genes ELK, CREB, and the estrogen and glucocorticoid receptors. Behav Genet. 2006, 36 (3): 439-453. 10.1007/s10519-006-9051-1.
Butler C, Knox AJ, Bowersox J, Forbes S, Patterson D: The production of transgenic mice expressing human cystathionine beta-synthase to study Down syndrome. Behav Genet. 2006, 36 (3): 429-38. 10.1007/s10519-006-9046-y.
Belichenko PV, Kleschevnikov AM, Salehi A, Epstein CJ, Mobley WC: Synaptic and cognitive abnormalities in mouse models of Down syndrome: exploring genotype-phenotype relationships. J Comp Neurol. 2007, 504 (4): 329-345. 10.1002/cne.21433.
Ronan A, Fagan K, Christie L, Conroy J, Nowak NJ, Turner G: Familial 4.3 Mb duplication of 21q22 sheds new light on the Down syndrome critical region. J Med Genet. 2007, 44 (7): 448-451. 10.1136/jmg.2006.047373.
Korbel JO, Tirosh-Wagner T, Urban AE, Chen XN, Kasowski M, Dai L, Grubert F, Erdman C, Gao MC, Lange K, Sobel EM, Barlow GM, Aylsworth AS, Carpenter NJ, Clark RD, Cohen MY, Doran E, Falik-Zaccai T, Lewin SO, Lott IT, McGillivray BC, Moeschler JB, Pettenati MJ, Pueschel SM, Rao KW, Shaffer LG, Shohat M, Van Riper AJ, Warburton D, Weissman S, et al: The genetic architecture of Down syndrome phenotypes revealed by high-resolution analysis of human segmental trisomies. Proc Natl Acad Sci USA. 2009, 106 (29): 12031-12036. 10.1073/pnas.0813248106.
Lyle R, Béna F, Gagos S, Gehrig C, Lopez G, Schinzel A, Lespinasse J, Bottani A, Dahoun S, Taine L, Doco-Fenzy M, Cornillet-Lefèbvre P, Pelet A, Lyonnet S, Toutain A, Colleaux L, Horst J, Kennerknecht I, Wakamatsu N, Descartes M, Franklin JC, Florentin-Arar L, Kitsiou S, Aït Yahya-Graison E, Costantine M, Sinet PM, Delabar JM, Antonarakis SE: Genotype-phenotype correlations in Down syndrome identified by array CGH in 30 cases of partial trisomy and partial monosomy chromosome 21. Eur J Hum Genet. 2009, 17 (4): 454-66. 10.1038/ejhg.2008.214.
Kamburov A, Wierling C, Lehrach H, Herwig R: ConsensusPathDB--a database for integrating human functional interaction networks. Nucleic Acids Res. 2009, 37 (Database): D623-628.
Subramanian A, Kuehn H, Gould J, Tamayo P, Mesirov JP: GSEA-P: a desktop application for Gene Set Enrichment Analysis. Bioinformatics. 2007, 23 (23): 3251-3253. 10.1093/bioinformatics/btm369.
Wingender E, Kel AE, Kel OV, Karas H, Heinemeyer T, Dietze P, Knuppel R, Romaschenko AG, Kolchanov NA: TRANSFAC, TRRD and COMPEL: towards a federated database system on transcriptional regulation. Nucleic Acids Res. 1997, 25 (1): 265-268. 10.1093/nar/25.1.265.
Linhart C, Halperin Y, Shamir R: Transcription factor and microRNA motif discovery: the Amadeus platform and a compendium of metazoan target sets. Genome Res. 2008, 18 (7): 1180-1189. 10.1101/gr.076117.108.
Choi KH, Elashoff M, Higgs BW, Song J, Kim S, Sabunciyan S, Diglisic S, Yolken RH, Knable MB, Torrey EF, Webster MJ: Putative psychosis genes in the prefrontal cortex: combined analysis of gene expression microarrays. BMC Psychiatry. 2008, 8: 87. 10.1186/1471-244X-8-87.
Choi JW, Herr DR, Noguchi K, Yung YC, Lee CW, Mutoh T, Lin ME, Teo ST, Park KE, Mosley AN, Chun J: LPA receptors: subtypes and biological actions. Annu Rev Pharmacol Toxicol. 50: 157-186.
Kakinuma N, Zhu Y, Wang Y, Roy BC, Kiyama R: Kank proteins: structure, functions and diseases. Cell Mol Life Sci. 2009, 66 (16): 2651-2659. 10.1007/s00018-009-0038-y.
Allikmets R, Dean M: Bringing age-related macular degeneration into focus. Nat Genet. 2008, 40 (7): 820-821. 10.1038/ng0708-820.
Esbensen AJ: Health conditions associated with aging and end of life of adults with Down syndrome. Int Rev Res Ment Retard. 2010, 39 (C): 107-126.
Vis JC, Duffels MG, Winter MM, Weijerman ME, Cobben JM, Huisman SA, Mulder BJ: Down syndrome: a cardiovascular perspective. J Intellect Disabil Res. 2009, 53 (5): 419-25. 10.1111/j.1365-2788.2009.01158.x.
Damgaard T, Knudsen LM, Dahl IM, Gimsing P, Lodahl M, Rasmussen T: Regulation of the CD56 promoter and its association with proliferation, anti-apoptosis and clinical factors in multiple myeloma. Leuk Lymphoma. 2009, 50 (2): 236-246. 10.1080/10428190802699332.
Wang CC, Tsai MF, Dai TH, Hong TM, Chan WK, Chen JJ, Yang PC: Synergistic activation of the tumor suppressor, HLJ1, by the transcription factors YY1 and activator protein 1. Cancer Res. 2007, 67 (10): 4816-4826. 10.1158/0008-5472.CAN-07-0504.
Li W, Wang C, Juhn SK, Ondrey FG, Lin J: Expression of fibroblast growth factor binding protein in head and neck cancer. Arch Otolaryngol Head Neck Surg. 2009, 135 (9): 896-901. 10.1001/archoto.2009.121.
Hayes JD, Flanagan JU, Jowsey IR: Glutathione transferases. Annu Rev Pharmacol Toxicol. 2005, 45: 51-88. 10.1146/annurev.pharmtox.45.120403.095857.
Xavier AC, Ge Y, Taub JW: Down syndrome and malignancies: a unique clinical relationship: a paper from the 2008 william beaumont hospital symposium on molecular pathology. J Mol Diagn. 2009, 11 (5): 371-80. 10.2353/jmoldx.2009.080132.
Reymond A, Marigo V, Yaylaoglu MB, Leoni A, Ucla C, Scamuffa N, Caccioppoli C, Dermitzakis ET, Lyle R, Banfi S, Eichele G, Antonarakis SE, Ballabio A: Human chromosome 21 gene expression atlas in the mouse. Nature. 2002, 420 (6915): 582-586.
Moldrich RX, Dauphinot L, Laffaire J, Rossier J, Potier MC: Down syndrome gene dosage imbalance on cerebellum development. Prog Neurobiol. 2007, 82 (2): 87-94. 10.1016/j.pneurobio.2007.02.006.
Dauphinot L, Lyle R, Rivals I, Dang MT, Moldrich RX, Golfier G, Ettwiller L, Toyama K, Rossier J, Personnaz L, Antonarakis SE, Epstein CJ, Sinet PM, Potier MC: The cerebellar transcriptome during postnatal development of the Ts1Cje mouse, a segmental trisomy model for Down syndrome. Hum Mol Genet. 2005, 14 (3): 373-384.
Sommer CA, Pavarino-Bertelli EC, Goloni-Bertollo EM, Henrique-Silva F: Identification of dysregulated genes in lymphocytes from children with Down syndrome. Genome. 2008, 51 (1): 19-29. 10.1139/G07-100.
Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Muertter RN, Holko M, Ayanbule O, Yefanov A, Soboleva A: NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2009, 37 (Database): D885-890.
Parkinson H, Kapushesky M, Kolesnikov N, Rustici G, Shojatalab M, Abeygunawardena N, Berube H, Dylag M, Emam I, Farne A, Holloway E, Lukk M, Malone J, Mani R, Pilicheva E, Rayner TF, Rezwan F, Sharma A, Williams E, Bradley XZ, Adamusiak T, Brandizi M, Burdett T, Coulson R, Krestyaninova M, Kurnosov P, Maguire E, Neogi SG, Rocca-Serra P, Sansone SA: ArrayExpress update--from an archive of functional genomics experiments to the atlas of gene expression. Nucleic Acids Res. 2009, 37 (Database): D868-872.
We want to express our gratitude to all researchers who made DS data available to the community. Free access to high-quality experimental data is a necessary prerequisite for all integrative studies. Furthermore, we apologize for all data sets that could not be integrated into the analysis because of specific constraints such as chip platforms, access to raw data etc. We thank Bernhard Herrmann for giving access to the in situ mouse brain images shown in Figure 3. We thank Marie-Laure Yaspo for discussions, James Adjaye for proof-reading of the manuscript and Reha Yildirimman and Atanas Kamburov for computational support. This work was funded by the European Commission within its 6th Framework Programme with the grant AnEUploidy (LSHG-CT-2006-037627), by the Max Planck Society and the Beatriu de Pinos postdoctoral fellowship (2008 BP-A 00184).
MV carried out the systematic revisions and collected the data for the meta-analysis and for the related studies. AR wrote the general code for the meta-analysis. AR and MV adjusted the code for the DS study. AT created the browser that allows visualization of the results. EMD carried out the transcription factor analysis. MV performed the promoter sequence analysis and the further statistical analysis. RH conceived of the study and participated in its design and coordination. MV, RH, HL and LAPJ contributed to the data interpretation and wrote the manuscript. All authors read and approved the final manuscript.