Animals and RNA sample preparation
Male mice of the inbred strain C57BL/6JOxjr were bred locally using stocks from the Jackson Laboratory. They are referred to as C57BL/6 throughout the text. Mice were fed a normal carbohydrate diet (CHD) (S&K Universal Ltd, Hull, UK) or a 40% high fat diet (HFD) (Special Diet Services, Witham, UK) ad libitum as previously described . For the rat data set, we used biopsies of male inbred rats of the diabetic Goto-Kakizaki (GK) strain , the normoglycaemic Brown-Norway (BN) and Wistar-Kyoto (WKY, also referred to as W in this study) strains, and WKY rats made diabetic (glycaemia of approximately 16 mM) by intravenous injection of a solution of Streptozotocin (Sigma-Aldrich, Poole, UK) at 75 mg/kg in citrate buffer (WKY-STZ rats, also referred to as STZ throughout the article). Rats of the GK and BN strains were from the Oxford colony (GK/Ox, BN/Ox) and WKY rats were purchased from a commercial supplier (Harlan, UK). All animals were kept under identical standard maintenance conditions on a 12 h light/dark cycle. All experiments were carried out in accordance with national guidelines.
Organs used for gene expression profiling were chosen for their role in glucose homeostasis (liver) and diabetic complications (kidney). Total RNA was prepared from liver (mice and rats) and kidney (rat) biopsies as previously described .
Targets prepared from C57BL/6 mice (n = 5 per diet group) were individually hybridised to both Affymetrix® GeneChip® Mouse Genome 430 2.0 arrays, which contain 45,101 probe sets, and Illumina® Sentrix® BeadChip Mouse-6 Expression arrays (Beta Version 1), which contain 46,120 distinct oligonucleotide sequences.
Three separate types of microarray were used to carry out rat gene expression profiling:
- Operon Rat v1.0 OpArray™ is a two-colour gene expression system containing 5,717 oligonucleotides, almost all of which represent unique, well-documented genes. A full factorial dye-swap design was implemented: the target prepared from each animal's mRNA was co-hybridised, on a separate array, with the target from each animal of the different strains, and each array was repeated with the corresponding dyes switched to correct for dye biases. With three animals per group, a total of 108 slides were used for the liver experiment and 54 for the kidney. RNA from WKY-STZ kidneys was not used on this array type.
- Affymetrix® GeneChip® Rat Expression Set 230A arrays contain 15,923 probe sets, corresponding to over 10,000 distinct annotated genes. A total of twelve microarrays were hybridised using samples from three animals for each of the four groups.
- The beta test version of the Illumina® Sentrix® BeadChip RatRef-12_V1_Eval Expression microarray carries twelve arrays per slide, and each array contains 22,612 distinct oligonucleotides. Technical replicates were used for all samples.
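The slide counts quoted for the Operon design can be checked with a short calculation. The sketch below assumes that "full factorial" means every animal is paired once with every animal from a different group, and that each pairing is then repeated with the dyes swapped; the function name is hypothetical and the reading of the design is an assumption rather than a statement from the protocol.

```python
from itertools import combinations

def dye_swap_slide_count(animals_per_group, n_groups):
    """Count slides when every between-group animal pair is run in both dye orientations."""
    animals = [(g, a) for g in range(n_groups) for a in range(animals_per_group)]
    # Keep only pairs of animals drawn from different groups.
    pairs = [(x, y) for x, y in combinations(animals, 2) if x[0] != y[0]]
    return 2 * len(pairs)  # each pair is hybridised once per dye orientation

print(dye_swap_slide_count(3, 4))  # liver: four groups of three animals -> 108
print(dye_swap_slide_count(3, 3))  # kidney: three groups (no WKY-STZ) -> 54
```

Under this reading the counts agree with the 108 liver slides and 54 kidney slides reported for the Operon arrays.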
All experiments are MIAME compliant. Protocols and data are available through ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) under the accession numbers E-MEXP-1195 (Rat kidney transcriptome on Affymetrix), E-MEXP-889 (Rat liver transcriptome on Affymetrix), E-TABM-500 (Rat kidney transcriptome on Operon), E-TABM-502 (Rat kidney transcriptome on Illumina) and E-MEXP-1755 (C57BL/6J mouse liver transcriptome on Affymetrix).
Background removal and signal extraction techniques
These techniques are highly platform-dependent because background measurement techniques and scanning technologies differ between platforms. Different signal extraction techniques, especially for the Affymetrix platform, are expected to be less comparable than different between-array normalisations applied to the same signal extraction: the extraction models glean distinct information from the raw foreground (and background) intensities, whereas the normalisation methods seek to minimise distribution differences through global corrections. For Operon arrays we used background subtraction, the Normal and Exponential Convolution model (normexp) , and methods proposed by Kooperberg et al.  and Edwards . For Affymetrix arrays, we applied the Microarray Suite version 5.0 (MAS 5.0) , the model-based expression indexes (MBEI) [31, 32] (Li-Wong method), the Robust Multi-Array Analysis (RMA) , the GC-RMA method , and the raw Perfect Match values with no background correction . Finally, the Illumina platform does not directly measure any background or non-specific hybridisation control. However, this platform has been shown to have high precision (Kuhn et al. 2004) and is analysed, by default, with no background removal technique.
The variance stabilisation (vsn) method  was used for both within- and between-array calibration of data generated by all three platforms. We also used scale transformation , quantile normalisation , local regression (loess) , including print-tip loess normalisation for Operon , and cubic spline fitting . All calculations were conducted either in the R Language and Environment for Statistical Computing (R)  or in the Illumina® BeadStudio® software. All the R normalisations were those implemented in the "LIMMA" , "affy"  or "vsn"  R packages.
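As an illustration of what a between-array normalisation does, the following is a minimal sketch of quantile normalisation in plain Python. It is not the implementation in the R packages named above: the function name is hypothetical, ties are broken by position rather than averaged, and all arrays are assumed to have the same number of values with none missing.

```python
def quantile_normalise(arrays):
    """arrays: list of equal-length lists of log2 intensities, one list per array."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across all arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalised = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices from lowest to highest value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace each value by the reference quantile of its rank
        normalised.append(out)
    return normalised

print(quantile_normalise([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
# [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalisation every array has the same empirical distribution, while the rank order of features within each array is preserved.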
Data validation by quantitative RT-PCR
Quantitative Real-Time Polymerase Chain Reaction (qRT-PCR) was performed for a total of 17 genes with kidney samples from all four rat models. Genes were selected primarily because they were differentially expressed in either the Affymetrix or the Operon-generated rat kidney microarray data sets. Experiments were conducted using samples from the same animals as profiled with microarrays, and technical triplicates were used for all genes. Actin was used as the control "house-keeping" gene.
Gene annotations in the array systems
We labelled as cross-platform "target matches" those sets in which the oligonucleotide sequence of a probe aligned identically to the sequence spanned by all probes of an Affymetrix probe set. Dai et al. observed that the Affymetrix probe set design for each microarray could not evolve as the genome and transcriptome information improved over time, and that, in some cases, probes in the same probe set derived from distinct genes . They created new annotation files based on public annotations, and reported a 30–50% discrepancy in the genes previously identified as differentially expressed. Due to the redundancy in most other databases, alignments to Ensembl genes were used. The new probe sets are formed by identifying all perfect match (PM) probes which have a unique identical match amongst Ensembl genes, and requiring each new probe set to contain at least three probes.
For Affymetrix, these publicly available gene annotations were used (http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF). For Illumina and Operon, no such annotations existed in the literature. However, as the oligonucleotides in a set are identical on both these platforms, we aligned all probes against Ensembl genes (http://oct2005.archive.ensembl.org), using the same Ensembl version as for the Affymetrix probe sets, to ensure that the same genome builds were used consistently (Mouse NCBI m34 and Rat RGSC 3.4).
Prior to analysis, the data were filtered by removing a user-specified proportion of oligonucleotides with the lowest mean log2 intensity in the comparison of interest. Only matches where at least one oligonucleotide from each platform under investigation remained after filtering were used.
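The rule for forming the redefined probe sets can be sketched as follows. This is an illustrative reconstruction, not the code used to build the published annotation files: the function name, the input structure (a mapping from each PM probe to the set of Ensembl genes it matches identically) and the identifiers in the example are all hypothetical.

```python
from collections import defaultdict

def rebuild_probe_sets(probe_hits, min_probes=3):
    """probe_hits: dict mapping probe id -> set of gene ids with an identical match."""
    by_gene = defaultdict(list)
    for probe, genes in probe_hits.items():
        if len(genes) == 1:  # keep only probes that match exactly one gene
            by_gene[next(iter(genes))].append(probe)
    # Discard genes supported by fewer than min_probes probes.
    return {g: sorted(p) for g, p in by_gene.items() if len(p) >= min_probes}

hits = {
    "pm1": {"geneA"}, "pm2": {"geneA"}, "pm3": {"geneA"},
    "pm4": {"geneA", "geneB"},          # ambiguous probe, dropped
    "pm5": {"geneB"}, "pm6": {"geneB"}, # only two unique probes, set dropped
}
print(rebuild_probe_sets(hits))  # {'geneA': ['pm1', 'pm2', 'pm3']}
```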
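The filtering step can be sketched in two parts: removing the lowest-intensity oligonucleotides on each platform, then keeping only the cross-platform matches that still have a representative on every platform. The function names and data structures below are hypothetical, and rounding the number of removed oligonucleotides down to an integer is an assumption.

```python
def filter_low_intensity(log2_values, proportion):
    """log2_values: dict oligo id -> list of log2 intensities in the comparison of interest."""
    means = {oligo: sum(v) / len(v) for oligo, v in log2_values.items()}
    n_drop = int(len(means) * proportion)  # number of lowest-mean oligos to remove (rounded down)
    ranked = sorted(means, key=means.get)  # lowest mean first
    return set(ranked[n_drop:])

def retained_matches(matches, kept_by_platform):
    """matches: list of dicts mapping platform -> set of oligo ids for one cross-platform match."""
    # A match survives only if every platform still has at least one oligo after filtering.
    return [m for m in matches if all(m[p] & kept_by_platform[p] for p in m)]

kept = filter_low_intensity({"a": [1, 1], "b": [5, 5], "c": [3, 3], "d": [8, 8]}, 0.25)
print(sorted(kept))  # ['b', 'c', 'd']
```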
As different signal extractions and normalisations produce outputs in absolute or logarithmic values (in varying bases), all expression intensities and fold changes from all platforms were converted into logarithmic (base 2) values after normalisation. This transformation converts multiplicative effects (such as fold changes in gene expression) into additive effects, which makes both analysis and interpretation easier . Pearson correlations were used throughout this study. The correlation in intensity is less informative than the correlation in fold change: different platforms have differing hybridisation efficiencies, which may be sample dependent or vary with the normalisation technique. However, the intensity should be proportional to the abundance of mRNA in the sample, so fold changes should be conserved.
We also investigated whether the platforms identify the same genes as most worthy of investigation when only the "top" genes, ranked by either most significant p value or highest absolute fold change, are considered. Although any overlap is likely to be very highly pointwise significant when compared to the null hypothesis of no relation between the platforms, whether the agreement is sufficient to be biologically useful is difficult to assess.
When comparing all three platforms, three pairwise tables would have to be created. In order to illustrate all inter-platform concordances simultaneously, a new plot, which we labelled a "dartboard plot", was specifically designed (Figures 3 and 4). The platforms in large block capitals have their lists entered radially from the circumference, and apply until the thick black lines. The other platforms are to either side of the dotted line, and their lists move out from the dotted line. The segments, corresponding to using all or part of the dataset, are colour-coded so that red marks the maximum possible value for that segment, i.e. the minimum of the two list sizes, and white represents 50% or fewer matches. Note that segments on either side of a thick black line use the same data.
It must be emphasised that when compiling these concordance lists, only oligonucleotides or genes which match to the other platform or platforms are included. There are therefore no missing values, so the above statistics can be applied and the lists directly compared for their technical performance. However, coverage remains an important concern for experimental reasons: if two platforms show high concordance for genes assayed on both, but one measures the expression of many more genes, the two will not be of equivalent biological value. For the mouse experiment, the Affymetrix and Illumina microarrays had very similar oligonucleotide numbers, while in the rat the Operon platform had fewer oligonucleotides, and the Affymetrix GeneChip utilised belonged to an earlier generation of feature size than the Illumina BeadChip, so fewer probe sets were present. Hence, for a valid comparison, only matching oligonucleotides were used. In general, Affymetrix and Illumina microarrays of similar generation contain similar oligonucleotide numbers.
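The argument that fold changes should be conserved even when intensities are not can be made concrete. In the sketch below, a platform-specific hybridisation efficiency acts as a multiplicative factor on raw intensity, which becomes an additive offset on the log2 scale and cancels in the fold change. The helper names and the numerical values are purely illustrative.

```python
import math

def log2_fold_change(group_a, group_b):
    """Difference of mean log2 intensities (group_a minus group_b) for one gene."""
    return sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# One gene measured on two platforms; platform 2 is assumed to be half as efficient.
treated_1, control_1 = [math.log2(400)], [math.log2(100)]
treated_2, control_2 = [math.log2(200)], [math.log2(50)]
fc_1 = log2_fold_change(treated_1, control_1)
fc_2 = log2_fold_change(treated_2, control_2)
print(round(fc_1, 6), round(fc_2, 6))  # both equal 2.0, a four-fold change
```

Applied across all matched genes, `pearson` would then be computed on the two vectors of log2 fold changes, one per platform.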
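The "top" list comparison amounts to ranking the matched genes on each platform and counting how many appear in both lists. A minimal sketch, with hypothetical function names and made-up scores, is given below; for p values the ranking would be ascending (`largest=False`), for absolute fold changes descending.

```python
def top_genes(scores, k, largest=True):
    """Return the k gene ids with the largest (or smallest) score."""
    ranked = sorted(scores, key=scores.get, reverse=largest)
    return set(ranked[:k])

def top_list_overlap(scores_a, scores_b, k, largest=True):
    """Number of genes shared by the top-k lists of two platforms."""
    return len(top_genes(scores_a, k, largest) & top_genes(scores_b, k, largest))

# Absolute log2 fold changes for four matched genes on two platforms (illustrative values).
abs_fc_a = {"g1": 3.0, "g2": 2.0, "g3": 0.5, "g4": 0.1}
abs_fc_b = {"g1": 2.5, "g2": 0.2, "g3": 1.9, "g4": 0.3}
print(top_list_overlap(abs_fc_a, abs_fc_b, 2))  # 1 (only g1 is in both top-2 lists)
```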
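The colour coding of a dartboard segment can be expressed as a simple scaling. The text fixes only the two ends of the scale (red at the maximum possible number of matches, white at 50% or fewer); the linear ramp between them in the sketch below is an assumption, and the function name is hypothetical.

```python
def segment_shade(n_matches, size_a, size_b):
    """Return a shade from 0.0 (white) to 1.0 (full red) for one dartboard segment."""
    max_possible = min(size_a, size_b)  # the overlap cannot exceed the shorter list
    fraction = n_matches / max_possible
    # 50% or fewer matches is white; the linear ramp above 50% is an assumption.
    return max(0.0, (fraction - 0.5) / 0.5)

print(segment_shade(50, 50, 100))   # 1.0, all 50 possible matches found
print(segment_shade(75, 100, 100))  # 0.5
print(segment_shade(10, 50, 100))   # 0.0, fewer than half match
```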