C6-amino-linker oligonucleotides (50 nucleotides in length) were obtained from MWG oligoset (MWG-Biotech AG, Ebersberg, Germany), and spike-in control oligonucleotides were purchased from Ambion (ArrayControl Spots; Ambion Inc., Austin, TX). Oligonucleotides were printed on e-Surf Activated Slides (Life Line Lab S.r.l., Italy) with a SpottingArray 24 (PerkinElmer, Wellesley, MA) using 4 Stealth Micro Spotting Pins (Telechem International, Inc., Sunnyvale, CA) in 150 mM phosphate buffer, pH 8.5, at 40% humidity. e-Surf Activated Slides are obtained by adsorption on glass of a hydrophilic polymer containing N-acryloyloxysuccinimide (NAS). Oligonucleotides were printed at a final concentration of 10 pmol/μl. The coupling reaction was performed overnight in a chamber containing a saturated NaCl solution, at 75% relative humidity. All oligonucleotides were printed in quadruplicate over 4 subarrays with a 2 × 2 print head. Spike-in control oligonucleotides, negative control and buffer were printed in quadruplicate onto each subarray. The scheme was repeated 3 times over the entire slide surface, resulting in 12 replicates for each gene element and 48 replicates for each control element.
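The replicate counts quoted above follow from simple arithmetic on the print layout. The short sketch below spells it out; it assumes that the four replicates of a gene are distributed over the four subarrays of one scheme, whereas each control is printed four times in every subarray. The variable names are illustrative only.

```python
# Illustrative replicate arithmetic for the print layout (names are hypothetical).
SUBARRAYS = 4        # 2 x 2 print head
QUADRUPLICATE = 4    # spots printed in quadruplicate
SCHEME_REPEATS = 3   # the scheme is repeated 3 times per slide

# Assumption: a gene's quadruplicate is spread over the 4 subarrays of one scheme.
gene_replicates = QUADRUPLICATE * SCHEME_REPEATS

# Controls are printed in quadruplicate onto each subarray.
control_replicates = QUADRUPLICATE * SUBARRAYS * SCHEME_REPEATS

print(gene_replicates, control_replicates)
```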
The ANA-1 cell line was cultured and maintained at 37°C in a humidified incubator containing 20% O2, 5% CO2 and 75% N2. For hypoxic conditions, cells were incubated in a humidified anaerobic workstation incubator (BUG BOX, Ruskinn, UK) flushed with a mixture of 94% N2, 5% CO2 and 1% O2. Total RNA was extracted from ANA-1 cells grown under normoxic or hypoxic conditions for 18 hours, using Trizol (Invitrogen Life Technologies, Irvine, CA) according to the manufacturer's protocol. RNA integrity was checked by electrophoresis using an Agilent Bioanalyzer 2100 (Agilent Technologies, Waldbronn, Germany), and RNA was quantified with a NanoDrop (NanoDrop Technologies, Wilmington, DE, USA). Spike-in control RNAs were purchased from Ambion (ArrayControl RNA Spikes). The RNA Spikes are a set of 8 purified RNA transcripts with sequence homology to the corresponding ArrayControl Spot. The ArrayControl sequences were selected from Escherichia coli genes that show no sequence similarity to mammalian genomes.
Sample labeling and microarray hybridization
15 μg of total RNA was converted into either Cy3- or Cy5-labeled cDNA probe using the SuperScript indirect cDNA labelling kit (Invitrogen Life Technologies, Irvine, CA). Spike RNAs were added as 2 μl of an appropriately diluted mixture to the total RNA and oligo(dT) primer, RNase-free water was used to bring the volume to 18 μl, and the reaction was denatured at 70°C for 5 min and then chilled on ice. Aminoallyl-modified cDNA was generated in the presence of 5× first-strand buffer, 0.1 M DTT, dNTP mix (including amino-modified nucleotides), RNaseOUT™ (40 U/μl) and SuperScript™ III RT (400 U/μl) in a final volume of 30 μl at 46°C for 3 hours. The RNA template was hydrolyzed by the addition of 15 μl of 1 N NaOH followed by heating at 70°C for 10 min. Reactions were neutralized with 15 μl of 1 N HCl, and cDNA was purified on S.N.A.P. columns according to the manufacturer's instructions, followed by ethanol precipitation. cDNA was lyophilized to dryness and resuspended in 5 μl of 2× coupling buffer. The NHS ester of Cy3 or Cy5 dye (Amersham Pharmacia, GE Healthcare, Little Chalfont, UK) in DMSO (dye from one tube was dissolved in 5 μl of DMSO) was added, and reactions were incubated at room temperature in the dark for 1 h. Coupling reactions were quenched by the addition of 20 μl of 3 M sodium acetate, pH 5.2, and unincorporated dye was removed using S.N.A.P. columns. The combined Cy3 and Cy5 probes were dried down in a speed-vac and then dissolved in 6 μl of RNase-free water. 10 μg of Cot-1 DNA, 10 μg of poly(A) and 4 μg of yeast tRNA were added, and the mixture was denatured at 95°C for 3 min and then cooled on ice for 1 min. 35% formamide, 3.5× SSC, 0.3% SDS and 2.5× Denhardt's were added to a final volume of 90 μl. Slides were blocked in blocking solution (100 mM ethanolamine, 0.2 M Tris, pH 9.0) at 50°C for 20 min and then washed in 4× SSC, 0.1% SDS for 20 min.
Blocked slides were pre-hybridized at 42°C for 45 minutes with a pre-hybridization mixture (35% formamide, 4× SSC, 0.5% SDS, 2.5× Denhardt's, 20 ng/μl salmon sperm DNA) in the HS 400 hybridization station (Tecan Austria GmbH, Salzburg, Austria). Hybridizations were carried out at 42°C for 16 h with automatic agitation every 5 min, followed by washes (3 min each) in 2× SSC with 0.1% SDS, 1× SSC, and 0.5× SSC at room temperature.
Data acquisition, normalization and analysis
Arrays were scanned using a GenePix 4000B dual-color confocal laser scanner (Axon Instruments, Union City, CA) at 10-micron resolution. Images were processed, and signals from spotted arrays were quantitated using GenePix Pro 5.1 software (Axon Instruments). Array images that did not pass minimal quality control (median signal-to-background >3; median signal-to-noise >3; mean of median background signal <200) were discarded. Technically imperfect spots were removed either automatically by the GenePix software or through manual inspection of the array images. Such spots were flagged as 'absent' in the GenePix results files and were not included in the analysis. To discard data from weak signals, spots with <50% of pixels >2 SD above the median local background signal were also flagged 'absent'. Data from spots that did not pass this criterion in one channel but had >95% of pixels >2 SD above the median local background signal in the other channel were kept. GenePix result files, including signal, background, standard deviation, pixel statistics and quality parameters for both channels, were imported into the statistical environment R using Bioconductor software for the subsequent normalization process. Background-subtracted fluorescence log-ratios were normalized within each array using the composite loess normalization available in the Bioconductor package limma [23, 24]. Composite loess normalization corrects the expression log-ratios for intensity-based trends by subtracting from each expression log-ratio the corresponding value of the loess curve. The loess curve is constructed by performing a series of local regressions, one local regression for each spike-in control spot on the corresponding MA-plot.
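The array-level and spot-level filters described above can be written as two small predicates. This is a minimal sketch under stated assumptions: the function and argument names are hypothetical, the percentages stand for the per-channel fraction of pixels more than 2 SD above the median local background, and the rescue rule is read as "one channel below 50%, the other above 95%".

```python
def array_passes_qc(median_sbr, median_snr, mean_median_background):
    """Array-level check: signal-to-background > 3, signal-to-noise > 3,
    mean of median background signal < 200."""
    return median_sbr > 3 and median_snr > 3 and mean_median_background < 200


def spot_is_present(pct_red, pct_green, weak=50.0, strong=95.0):
    """Spot-level check on the percent of pixels > 2 SD above local background.

    A spot is kept when both channels reach the `weak` cutoff, or when one
    channel fails it but the other exceeds the `strong` cutoff.
    """
    red_ok = pct_red >= weak
    green_ok = pct_green >= weak
    if red_ok and green_ok:
        return True
    if not red_ok and pct_green > strong:
        return True
    if not green_ok and pct_red > strong:
        return True
    return False  # flagged 'absent'
```

For example, a spot with 10% of pixels above background in one channel and 97% in the other is kept, whereas 10% and 80% is flagged 'absent'.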
With R and G denoting the background-corrected red and green intensities of a spot, the expression log-ratio (M-value) of the spot is M = log2R - log2G, and its log-intensity (A-value) is A = (log2R+log2G)/2, a measure of the overall brightness of the spot. All spike-in control spots (after filtering) were included in each local estimate of the loess curve (this corresponds to setting the parameter span to 1 in the implementation of the loess normalization algorithm in the package limma). Because of the low number of genes spotted on the array, this choice avoids an unreliable representation of the overall trend within the sliding windows used for the local regressions. Median percentile normalization was performed using the "normalize to median or percentile" option in GeneSpring GX 7.3 (Silicon Genetics, Redwood City, CA).
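The M and A definitions and the control-based loess correction can be illustrated with a short pure-Python sketch. It is a simplified stand-in, not the limma implementation: it fits a degree-1 local regression with tricube weights over all control spots (the span = 1 case) and subtracts the fitted value from each spot's M-value. All function names are hypothetical.

```python
import math


def ma_values(red, green):
    """Return (M, A) for background-corrected red and green intensities."""
    log_r, log_g = math.log2(red), math.log2(green)
    return log_r - log_g, (log_r + log_g) / 2


def loess_fit(a0, a_ctrl, m_ctrl):
    """Tricube-weighted linear fit at a0 using every control spot (span = 1)."""
    dists = [abs(a - a0) for a in a_ctrl]
    h = max(dists) or 1.0  # bandwidth = distance to the farthest control
    weights = [(1 - (d / h) ** 3) ** 3 for d in dists]
    sw = sum(weights)
    if sw == 0:
        raise ValueError("need control spots at more than one distance from a0")
    mean_a = sum(w * a for w, a in zip(weights, a_ctrl)) / sw
    mean_m = sum(w * m for w, m in zip(weights, m_ctrl)) / sw
    sxx = sum(w * (a - mean_a) ** 2 for w, a in zip(weights, a_ctrl))
    sxy = sum(w * (a - mean_a) * (m - mean_m)
              for w, a, m in zip(weights, a_ctrl, m_ctrl))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_m + slope * (a0 - mean_a)


def normalize_m(spots, controls):
    """spots, controls: lists of (red, green). Returns normalized M per spot."""
    ctrl = [ma_values(r, g) for r, g in controls]
    m_ctrl = [m for m, _ in ctrl]
    a_ctrl = [a for _, a in ctrl]
    normalized = []
    for red, green in spots:
        m, a = ma_values(red, green)
        normalized.append(m - loess_fit(a, a_ctrl, m_ctrl))
    return normalized
```

With span = 1 every control contributes to every local fit, so the curve is smooth and close to a global trend, which is the intended behaviour when only a few control spots are available.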
In some cases, loess-normalized M-values were also scaled across a series of arrays. The need for scaling across arrays was determined empirically in each instance, according to the experimental evidence on the different classes of spots (essentially, spike and non-spike genes). We used two different methods for scaling, both implemented in the package limma: the scale method [3, 5, 14], whose basic idea is simply to scale the M-values to have the same median absolute deviation (MAD) across arrays; and the quantile method, which ensures that the M-values have the same empirical distribution across arrays and across channels.
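The two between-array scaling ideas can be sketched as follows. This is an illustration of the principles only, not the limma code: the scale variant rescales each array to a common MAD (the geometric mean of the per-array MADs is an assumed target), and the quantile variant replaces each value by the mean of the values sharing its rank across arrays. It assumes arrays of equal length without missing values and does not average ties.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation around the median."""
    med = statistics.median(values)
    return statistics.median([abs(x - med) for x in values])


def scale_normalize(arrays):
    """Rescale each array's M-values so that all arrays share the same MAD."""
    mads = [mad(m) for m in arrays]  # a MAD of zero would need special handling
    target = math.exp(sum(math.log(v) for v in mads) / len(mads))
    return [[x * target / v for x in m] for m, v in zip(arrays, mads)]


def quantile_normalize(arrays):
    """Give every array the same empirical distribution of M-values."""
    n = len(arrays[0])
    sorted_arrays = [sorted(m) for m in arrays]
    rank_means = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    result = []
    for m in arrays:
        order = sorted(range(n), key=lambda i: m[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        result.append(normalized)
    return result
```

The scale method changes only the spread of each array, whereas the quantile method forces identical distributions and is therefore the stronger correction.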
Analysis of variance (ANOVA)
Analysis of variance (ANOVA) is a procedure for constructing statistical tests by partitioning the total variance into different sources; the ANOVA model separates a complex variance term into its components. We built a fixed-effect model with interaction terms to evaluate the main effects of the potential sources of variance. To confirm the loess normalization, we performed ANOVA on the data before and after the normalization process. The ANOVA model is:
log R(g, d, s) = μ + G(g) + D(d) + S(s) + GD(gd) + ε(g, d, s)
where log R(g, d, s) is the measured log ratio for spike g, concentration d, and array position s; μ is the average log ratio over the whole array; G(g) is the main effect for spike characteristics; D(d) is the main effect for the spike RNA amount (concentration); S(s) is the main effect for position on the array; GD(gd) is a term accounting for the interaction between the spike characteristics and the concentration; and ε(g, d, s) is the stochastic error. The error is assumed to be independent and of zero mean. To verify these assumptions, the homogeneity of the variances was visually inspected by graphical analysis of the residuals. Statistical analysis was performed with SPSS 13.0 (SPSS Inc., Chicago, IL).
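The idea of partitioning variance into sources can be shown on a reduced case. The sketch below handles a single factor only (for instance the concentration effect D), not the full model with three main effects and an interaction that was fitted in SPSS. The function name and the log-ratio values are made up for illustration.

```python
def one_way_partition(groups):
    """Split the total sum of squares into between- and within-group parts.

    groups: dict mapping a factor level to a list of measured log ratios.
    Returns (ss_between, ss_within, f_statistic).
    """
    values = [x for g in groups.values() for x in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic


# Made-up log ratios for two concentration levels (illustrative only).
example = {"250 pg": [0.1, -0.1, 0.0], "750 pg": [0.3, 0.1, 0.2]}
ss_b, ss_w, f_stat = one_way_partition(example)
```

A small F statistic for a factor after normalization, compared with before, would indicate that the normalization removed variance attributable to that factor.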
System specificity and sensitivity in detecting differential gene expression were evaluated using receiver operating characteristic (ROC) curves. A ROC curve shows the relationship between the proportion of true positive (sensitivity) and false positive (1 - specificity) classifications resulting from each possible decision threshold value in a two-class classification task. The area under the curve is a measure of test accuracy and, when applied to a gene expression profile, provides an estimate of the probability that a gene is up- or down-regulated in a given group. The spike RNAs were added to the hybridization mixture of the arrays at pre-determined concentrations ranging from 500 to 1500 pg. Test sensitivity was calculated as the number of regulated genes correctly classified by the test divided by the total number of regulated genes. The false-positive rate was defined as the number of false-positive genes among the non-regulated genes divided by the total number of non-regulated genes. Statistical analysis was performed with STATA 8.0 (StataCorp LP, College Station, TX).
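The sensitivity and false-positive rate definitions above translate directly into a ROC computation. The following sketch is illustrative and is not the STATA procedure used in the study: it sweeps a threshold over a per-spot score (for example the absolute M-value), records one (false-positive rate, sensitivity) point per threshold, and integrates the curve with the trapezoidal rule. Function names and the example values are hypothetical.

```python
def roc_points(scores, regulated):
    """Return ROC points (false-positive rate, sensitivity), one per threshold.

    scores: per-spot score, higher meaning "more likely regulated".
    regulated: True for spots that are truly regulated.
    """
    n_pos = sum(1 for r in regulated if r)
    n_neg = len(regulated) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, r in zip(scores, regulated) if s >= threshold and r)
        fp = sum(1 for s, r in zip(scores, regulated) if s >= threshold and not r)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a ROC curve given as sorted (fpr, tpr) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up scores: three regulated and three non-regulated spots.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
regulated = [True, True, False, True, False, False]
auc = area_under_curve(roc_points(scores, regulated))
```

In this toy example 8 of the 9 regulated/non-regulated pairs are ranked correctly, which is what the area under the curve measures.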
Description of experiments
All microarray raw data are provided as additional files. We performed three types of microarray experiments. (i) In dilution experiments all the spike RNAs were added in the same quantity in both channels. We set up four different dilutions: 10 pg (additional file 1), 250 pg, 750 pg (additional file 2) and 1000 pg (additional file 3). The experiment at 250 pg was performed in quadruplicate (additional files 4, 5, 6 and 7). Data from these dilution experiments were used for Figure 1, Figure 2, Figure 3, Figure 4, Figure 7, Figure 8, Table 2, Table 3, Table 4 and Table 5. (ii) In range experiments two different mixtures were set up to cover a wide range of signal intensity. Every spike RNA was added in the same quantity in both channels to give a final ratio of 1, but within the same mixture the spikes were present at increasing concentrations. Mix 1 contained spikes at 5 pg, 10 pg, 50 pg, 100 pg, 500 pg and 1000 pg (additional file 8). Mix 2 contained spikes at 250 pg, 500 pg, 1000 pg, 1500 pg, 3000 pg and 5000 pg (additional file 9). Data from range experiments were used in Figure 5. (iii) ROC experiments were planned to compare expected with measured signal ratios. The spike RNAs were added in defined quantities to obtain ratios of 1 (500/500 pg), 1.5 (750/500 pg), 2 (1000/500 pg) and 3 (1500/500 pg) (additional file 10). A dye swap was performed to obtain the reverse ratios (additional file 11). Data from ROC experiments were used in Figure 6.
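For the ROC experiments, the expected M-values follow from the spike amounts. The sketch below computes them; it assumes the varied amount is in the numerator channel and that the dye swap simply inverts the sign of the log ratio. The variable names are illustrative.

```python
import math

BASE_PG = 500                        # amount in the reference channel
AMOUNTS_PG = [500, 750, 1000, 1500]  # amounts in the other channel

# Expected log2 ratio for each spike amount (ratios 1, 1.5, 2 and 3).
expected_m = {amount: math.log2(amount / BASE_PG) for amount in AMOUNTS_PG}

# Assumption: swapping the dyes reverses the ratio, i.e. negates M.
dye_swap_m = {amount: -m for amount, m in expected_m.items()}

for amount in AMOUNTS_PG:
    print(amount, round(expected_m[amount], 3), round(dye_swap_m[amount], 3))
```

Spikes at a ratio of 1 have an expected M of 0 and serve as the non-regulated class, while the other ratios define the regulated class.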