Quality control in microarray assessment of gene expression in human airway epithelium
© Raman et al; licensee BioMed Central Ltd. 2009
Received: 13 May 2009
Accepted: 24 October 2009
Published: 24 October 2009
Microarray technology provides a powerful tool for defining gene expression profiles of airway epithelium that lend insight into the pathogenesis of human airway disorders. The focus of this study was to establish rigorous quality control parameters to ensure that microarray assessment of the airway epithelium is not confounded by experimental artifact. Samples (total n = 223) of trachea, large and small airway epithelium were collected by fiberoptic bronchoscopy of 144 individuals and hybridized to Affymetrix microarrays. The pre- and post-chip quality control (QC) criteria established included: (1) RNA quality, assessed by RNA Integrity Number (RIN) ≥ 7.0; (2) cRNA transcript integrity, assessed by signal intensity ratio of GAPDH 3' to 5' probe sets ≤ 3.0; and (3) the multi-chip normalization scaling factor ≤ 10.0.
Of the 223 samples, all three criteria were assessed in 191; of these 184 (96.3%) passed all three criteria. For the remaining 32 samples, the RIN was not available, and only the other two criteria were used; of these 29 (90.6%) passed these two criteria. Correlation coefficients for pairwise comparisons of expression levels for 100 maintenance genes in which at least one array failed the QC criteria (average Pearson r = 0.90 ± 0.04) were significantly lower (p < 0.0001) than correlation coefficients for pairwise comparisons between arrays that passed the QC criteria (average Pearson r = 0.97 ± 0.01). Inter-array variability was significantly decreased (p < 0.0001) among samples passing the QC criteria compared with samples failing the QC criteria.
Based on the aberrant maintenance gene data generated from samples failing the established QC criteria, we propose that the QC criteria outlined in this study can accurately distinguish high quality from low quality data, and can be used to delete poor quality microarray samples before proceeding to higher-order biological analyses and interpretation.
The assessment of gene expression of the human transcriptome using microarray technology is a powerful tool for identifying genes and gene expression patterns involved in mechanisms of normal organ function and the pathogenesis of disease [1–3]. Microarray technology is ideal for studies of the human airway epithelium in health and disease in that the airway is one of the few internal organs where it is possible to repetitively sample sufficient quantities of pure populations of parenchymal cells from healthy individuals as well as individuals with lung disease [4–11]. In this regard, we and several other groups have used human gene expression microarrays to assess the expression of genes in the human airway epithelium, cell populations easily attainable via fiberoptic bronchoscopy [4, 9, 12–15].
While it is easy to obtain the cells, the output from microarray data critically depends on the quality of the RNA and the cRNA derivatives hybridized to the microarray [16–27]. Although several different cutoff criteria for RNA integrity and microarray data quality have been proposed, they are not consistently applied. In this context, the focus of this study is to establish rigorous quality control (QC) criteria to ensure high quality data from arrays that is comparable and reproducible among different investigators and laboratories. Our strategy is based on the concept that the quality of expression data can be efficiently assessed using three discrete QC metrics computed on the sample and chip level, and that application of these metrics can ensure uniformly high quality microarray data. Using Affymetrix Human Genome U133 Plus 2.0 arrays to sample a total of 223 samples of tracheal, and large and small airway epithelium from 144 individuals [healthy non-smokers, healthy smokers, symptomatic smokers, smokers with lone emphysema with normal spirometry, and smokers with COPD (GOLD I - III)], we have established pre- and post-chip QC criteria based on empirical observations of our data in conjunction with published suggestions that include: (1) RNA quality, assessed by RNA Integrity Number; (2) cRNA transcript integrity, assessed by signal intensity ratio of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) 3' to 5' probe sets; and (3) a defined upper limit for the multi-chip normalization scaling factor. Of the 223 samples, all three criteria were assessed in 191; of these 184 (96.3%) passed all three criteria. For the remaining 32 samples, the RIN was not available, and only the other two criteria were used; of these 29 (90.6%) passed these two criteria.
Expression data for 100 maintenance gene probe sets on the array demonstrates that among the samples failing QC criteria, there is greater variability among reported expression levels for maintenance genes compared to randomly selected samples passing the QC criteria. The QC criteria proposed in this study should provide a useful guideline for future studies using microarrays to assess mRNA levels in human airway epithelial samples, and should be adaptable to assessment of microarray data from other cell populations.
Some of the results of these studies have been previously reported in the form of an abstract.
Demographics of the Study Population and Biologic Samples
Lone emphysema with normal spirometry
42 ± 7; 42 ± 9; 43 ± 10; 43 ± 7; 44 ± 6; 44 ± 6; 36 ± 6; 39 ± 6; 41 ± 10; 49 ± 7; 52 ± 8
28 ± 16; 28 ± 18; 28 ± 16; 14 ± 4; 16 ± 9; 21 ± 13; 31 ± 18; 38 ± 23
Pulmonary function parameters
111 ± 16; 105 ± 11; 109 ± 11; 108 ± 11; 109 ± 12; 109 ± 12; 116 ± 6; 113 ± 7; 110 ± 109; 102 ± 11; 93 ± 23
111 ± 18; 101 ± 26; 105 ± 21; 108 ± 13; 109 ± 13; 109 ± 14; 112 ± 16; 112 ± 13; 108 ± 20; 97 ± 12; 72 ± 22
83 ± 7; 82 ± 6; 80 ± 7; 82 ± 6; 81 ± 5; 81 ± 5; 80 ± 6; 81 ± 3; 81 ± 13; 79 ± 4; 61 ± 9
106 ± 17; 99 ± 14; 104 ± 13; 100 ± 8; 102 ± 12; 100 ± 12; 106 ± 4; 108 ± 4; 104 ± 19; 93 ± 13; 105 ± 22
110 ± 9; 101 ± 18; 101 ± 17; 94 ± 7; 96 ± 11; 96 ± 11; 92 ± 14; 95 ± 13; 94 ± 18; 65 ± 8; 73 ± 19
Average # of cells recovered (×10^6)
100 ± 0.2; 100 ± 0.7; 100 ± 0.6; 100 ± 0.2; 100 ± 0.7; 100 ± 0.5; 100 ± 0.0; 100 ± 0.6; 100 ± 0.4; 99 ± 0.8; 99 ± 1.7
0.1 ± 0.2; 0.3 ± 0.7; 0.3 ± 0.6; 0.1 ± 0.2; 0.3 ± 0.7; 0.2 ± 0.5; 0.4 ± 0.6; 0.4 ± 0.4; 0.6 ± 0.8; 1.5 ± 1.7
49 ± 7.1; 55 ± 3.9; 77 ± 5.6; 27 ± 8.2; 49 ± 9.0; 72 ± 6.7; 26 ± 2.8; 47 ± 16; 76 ± 5.4; 73 ± 8.7; 69 ± 2.8
6.6 ± 4.0; 12 ± 4.0; 6.8 ± 3.5; 8.8 ± 4.4; 11 ± 4.1; 7.1 ± 3.0; 12 ± 4.6; 14 ± 1.2; 5.9 ± 3.0; 9.7 ± 7.2; 12 ± 2.9
29 ± 8.6; 20 ± 3.4; 9.1 ± 3.4; 39 ± 5.2; 24 ± 5.6; 9.9 ± 3.3; 37 ± 5.4; 17 ± 10; 10 ± 2.5; 9.9 ± 4.8; 8.2 ± 2.3
15 ± 6.0; 13 ± 3.8; 7.3 ± 3.6; 25 ± 11; 15 ± 7.7; 11 ± 5.6; 25 ± 1.0; 22 ± 6.0; 7.8 ± 1.1; 7.3 ± 2.4; 9.6 ± 1.8
Establishment and Testing of Quality Control Criteria
The overall strategy was to utilize the data on 223 samples to establish prospectively applicable QC criteria that would ensure high quality expression microarray data for biological interpretation in our ongoing studies. The QC criteria were selected as rigorous and objective quality control metrics at three distinct stages of the microarray workflow, and were applied to all 223 samples hybridized to microarray in this study. For the RIN assessment, n = 191 (32 samples were unavailable for RIN analysis because they were hybridized to microarray prior to the development of the Bioanalyzer RIN software). For the GAPDH 3'/5' signal intensity ratio and scaling factor criteria, all 223 samples were included.
Of the 223 samples, all three criteria were assessed in 191; of these 184 (96.3%) passed all three criteria. For the remaining 32 samples, the RIN was not available, and only the other two criteria were used; of these 29 (90.6%) passed these two criteria. Only 10 (4.5%) failed at least one QC criterion, and were therefore considered to have failed QC. The overall breakdown of samples failing QC was: 2 large airway samples (1 healthy non-smoker and 1 healthy smoker) and 8 small airway samples (1 healthy smoker, 4 symptomatic smokers, and 3 smokers with COPD). The greatest source of failure was the scaling factor criterion, which contributed to 70% of the overall failures. All of the 10 samples failing the QC criteria failed the RIN and/or scaling factor criterion, indicating that these metrics may be the most sensitive to technical variance, and therefore are central to assessing overall array quality. While 7 samples failed by one criterion each, 1 sample failed by both the RIN and GAPDH 3'/5' ratio criteria, and 2 samples failed by both the RIN and scaling factor criteria, suggesting that the quality control parameters exert correlated effects on array performance.
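As a minimal sketch of how the three cutoffs combine into a single pass/fail call, the Python below applies RIN ≥ 7.0, GAPDH 3'/5' ratio ≤ 3.0, and scaling factor ≤ 10.0 to one sample, and skips the RIN test when no RIN is available. The function name, the sample names, and the example values are hypothetical and are not taken from the study's data or software.

```python
from typing import List, Optional

# Cutoffs stated in the text.
RIN_MIN = 7.0
GAPDH_RATIO_MAX = 3.0
SCALING_FACTOR_MAX = 10.0


def failed_criteria(rin: Optional[float], gapdh_ratio: float,
                    scaling_factor: float) -> List[str]:
    """Return the names of the QC criteria a sample fails (empty list = pass)."""
    failures = []
    # RIN is tested only when it was measured.
    if rin is not None and rin < RIN_MIN:
        failures.append("RIN")
    if gapdh_ratio > GAPDH_RATIO_MAX:
        failures.append("GAPDH 3'/5'")
    if scaling_factor > SCALING_FACTOR_MAX:
        failures.append("scaling factor")
    return failures


# Hypothetical samples: (RIN or None, GAPDH 3'/5' ratio, scaling factor).
examples = {
    "sample_a": (8.4, 1.2, 3.5),
    "sample_b": (6.1, 1.4, 12.0),
    "sample_c": (None, 3.6, 2.0),
}
for name, values in examples.items():
    failures = failed_criteria(*values)
    print(name, "PASS" if not failures else "FAIL: " + ", ".join(failures))

# Pass rates recomputed from the counts reported in the text.
print(round(100 * 184 / 191, 1), round(100 * 29 / 32, 1))
```

A sample is scored as a QC failure as soon as the returned list is non-empty, which mirrors the rule that failing any one criterion is sufficient.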
GAPDH 3'/5' Signal Intensity Ratio
Multi-chip Normalization Scaling Factor
The scaling factor was used as an overall index of the microarray hybridization, washing, and scanning process. Scaling factor values for all 223 samples computed at a target intensity value of 500 were examined. The criterion of scaling factor values ≤ 10.0 was established (Figure 2). Seven out of the 223 samples (3.1%) had scaling factor values above the acceptable cutoff. The scaling factor values were not significantly dependent upon the phenotype or biologic origin of the sample (p > 0.1 by ANOVA), with n = 5 small airway samples (1 healthy smoker, 2 symptomatic smokers, 2 smokers with COPD) and n = 2 large airway samples (1 healthy nonsmoker, 1 healthy smoker) failing on the basis of scaling factor >10.0.
Classification of Quality Control Failures by Criterion
+ GAPDH 5'/3'
+ Scaling factor
Maintenance Gene Expression Levels
Similarly, the coefficient of variation for all probe sets was greater for microarrays that failed QC than for microarrays that passed. Two datasets of 9 microarrays each were compared, giving a mean coefficient of variation of 34 ± 0.1% for the arrays that passed QC and 43 ± 0.1% for the arrays that failed QC. The impact on discovery of biological differences (for example, the impact of smoking on gene expression profile) was assessed by power calculations. If two groups of 15 smokers and 15 non-smokers were compared, the required true difference of means for detection with p < 0.05 and power of 0.95 rises from 0.46 with arrays that pass QC to 0.58 with arrays that failed QC (i.e., small biological effects become more difficult to detect).
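The text does not state how the power calculation was carried out. The sketch below is one plausible reading, not the authors' method: it uses the normal approximation for a two-sided, two-group comparison with equal group sizes, and it assumes that the mean coefficients of variation (34% and 43%) can stand in for the standard deviation on a scale where the mean is 1. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_difference(sd: float, n_per_group: int,
                              alpha: float = 0.05, power: float = 0.95) -> float:
    """Smallest true difference of means detectable in a two-group comparison.

    Normal approximation for a two-sided test with equal group sizes and a
    common standard deviation `sd`.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


# Assumption: CV of 34% (pass) and 43% (fail) used as relative standard deviations.
d_pass = min_detectable_difference(0.34, 15)
d_fail = min_detectable_difference(0.43, 15)
print(round(d_pass, 2), round(d_fail, 2))
```

Under these assumptions the detectable difference scales linearly with the standard deviation, so the ratio of the two results equals the ratio of the two coefficients of variation. The absolute values come out slightly below the reported 0.46 and 0.58, as expected when a normal approximation replaces a t-based calculation.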
To examine potential causes of the variation in maintenance gene expression levels unrelated to the QC criteria, differences among the subjects were assessed. The 223 airway epithelial samples acquired for this study were derived from 144 individuals, as it was possible for a single individual to undergo bronchial brushing at one or more of the three target sites: trachea, large airway, and small airway. By independent linear regression, there was no correlation of gene expression level for the 100 maintenance genes (r² < 0.05 for all genes) with age (average 45 ± 8.8) across the 144 individuals from whom airway epithelium was derived. None of the genes showed strong correlation (r² < 0.15) with smoking history (average pack-yr 30 ± 18). Correlation analysis of expression levels with pulmonary function parameters showed no relationship (r² < 0.09 for all genes with all parameters).
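The per-gene regression check described above can be sketched as follows. This is an illustration only: the function name and the example numbers are hypothetical, and for a simple linear regression with one predictor the coefficient of determination equals the squared Pearson correlation, which is what the code computes.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no linear relationship can be estimated
    return (sxy * sxy) / (sxx * syy)


# Hypothetical ages and expression levels of one maintenance gene.
ages = [31.0, 38.0, 44.0, 47.0, 52.0, 58.0]
expression = [1.02, 0.97, 1.05, 0.99, 1.01, 0.98]
print(round(r_squared(ages, expression), 3))
```

Repeating the call for each gene against age, pack-years, or a pulmonary function parameter gives the set of r² values that the thresholds in the text refer to.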
Impact of QC Failures on Global Lung Biology
Epithelial samples (n = 223 total) of trachea, large airway and small airway were obtained from healthy subjects and from subjects with lung disease, including smokers and non-smokers, to assess quality control criteria for microarray analysis. Using Affymetrix Human Genome U133 Plus 2.0 arrays, a tripartite QC cutoff was established consisting of: (1) RNA quality, assessed by RNA Integrity Number (RIN) ≥ 7.0 using Agilent 2100 Bioanalyzer software; (2) cRNA transcript integrity, assessed by signal intensity ratio ≤ 3.0 of GAPDH 3' to 5' probe sets; and (3) the multi-chip normalization scaling factor ≤ 10.0. Of the 223 samples, 10 failed one or more of the QC criteria in a way that did not depend on phenotype of the subject or location of sampling. By using the QC cutoff criteria, the inter-array variability, as assessed by the coefficient of variation in the expression levels for 100 maintenance genes, decreased significantly. These QC criteria should be applicable to minimize experimental variation in gene expression microarray experiments.
RNA Quality as Assessed by RIN
We have previously utilized the 28s/18s rRNA peak ratio, as calculated by electropherogram, to verify quality of RNA samples prior to microarray hybridization. However, the 28s/18s ratio does not always provide a sufficient basis for distinguishing high quality from low quality RNA for microarray experiments [21, 26, 27, 32, 39–41]. For example, in an analysis of the effects of technical variability on gene expression in unfixed snap frozen vs formalin-fixed paraffin-embedded (FFPE) pelleted human bone marrow stromal cells, despite all RNA samples having equivalent and comparable 28s/18s ratios as visualized by computerized gel electrophoresis, more than twice as many genes were identified as expressed in snap frozen cells than in formalin-fixed paraffin-embedded cells, reflecting possible RNA quality effects in play that were not captured by quantitative assessment of the rRNA subunit peak heights.
Since the implementation of the Agilent Bioanalyzer RIN software, we have relied on the RIN as the primary indicator of RNA integrity, based on published data showing that the RIN accounts for numerous properties of the RNA degradation process to provide an unambiguous and comprehensive index of the overall quality of the starting material [21, 41, 43, 44]. We found that the RNA quality in this study, as assessed by RIN, was generally good with a failure rate of 2.8% based on RIN ≥ 7.0. The low percentage of failures probably reflects rigorous training and standard operating procedures that ensure that epithelial cells are homogenized in Trizol in less than 60 minutes from the time of bronchial brushing. Using a single technician for this process with space, equipment and reagents that are not used for other purposes is also critical. The increased interest in using clinical specimens for research has led to widespread establishment of human tissue banks. In many cases, the RNA for microarray studies is extracted from tissue samples that may have been kept at room temperature and/or undergone repeated thawing and freezing, thereby affecting the quality of the RNA [24, 32, 45–47]. For example, microarray experiments involving pancreatic tumor tissue have had to discard the majority of the extracted RNA samples, due to the RNase-rich content of the organ and the rapid degradation of the RNA material [48, 49]. For those types of samples with possible RNA degradation, consistent application of the RIN ≥ 7.0 cutoff is useful for obtaining high quality gene expression microarray data.
Illustrating the predictive power of the RIN as a pre-chip criterion, linear regression modeling and ordinary least squares linear regression have shown that the scaling factor and GAPDH 3'/5' signal intensity ratio are negatively correlated with the RIN value. Interestingly, the tandem failures in the present study, by two samples on the RIN and scaling factor criteria and by one sample on the RIN and GAPDH 3'/5' signal intensity ratio criteria, are in concordance with the concept that poor RNA quality adversely affects synthesis of full-length cRNA as well as the hybridization efficiency of probe-target binding [19, 33, 34, 39, 50–52]. Since failure of the RIN test predicts failure at downstream steps, the application of this cutoff prior to in vitro transcription reactions and hybridization has the potential to save substantial costs in wasted reagents and technical time.
Published recommendations for an acceptable range of scaling factors computed at the same target intensity value vary in numerical fold cutoffs, or alternately, suggest all values within 2 standard deviations from the mean in either direction [16, 34, 35]. However, because the GeneChip Scanner 3000 7G used in this study, and generally employed by most institutional microarray core facilities, can resolve 65,535 levels of fluorescence in 16 bits of resolution (allowing for detection of very low levels of fluorescence), scaling factors for arrays can theoretically, and in practice, range well into the hundreds. Since a mutable scaling factor range can be continually subject to fluctuation as new samples are added to ongoing studies, and skewed by the presence of even one or two outlying chips with extremely high scaling factors, we chose a finite upper limit of 10.0 for the scaling factor criterion. As 97% of the scaling factor values for the samples examined in this study were ≤ 10.0, this is a practical and attainable cutoff that can accurately identify outlying poor quality samples.
In gene expression profiling studies of samples obtained from biopsies, cell sorting, or laser capture microdissection, yields of cellular RNA are often small quantities (e.g., ng) and require specialized amplification methods to generate sufficient biotinylated cRNA for array hybridization [53, 54]. In these types of studies and others examining in vivo tissue from which sample RNA is limiting and alternate technical procedures are utilized, the scaling factor metric can be useful to assess the impact of technical artifact, and the quality of the expression data. For example, in an analysis of small sample RNAs from rat liver, significantly increased scaling factor values indicated that the amplification technique used contributed to technical variability in the form of a substantial decrease in the percent of transcripts detected on the array. In contrast, in a study of small amounts of RNA derived from breast cancer tissue from mastectomy specimens, consistent scaling factor values across all amplified samples confirmed the validity and comparability of the expression data.
Quality Control for Expression Microarray Analysis
Despite large amounts of published lung gene expression data, there is often little attention focused on microarray quality control, with the consequent risk of skewing the data by including poor quality arrays in the analysis [35, 57–63]. Further, different effects of RNA quality on specific ontological categories can complicate the extraction of biological information from microarrays of varying quality. For example, in an analysis of the effects of RNA integrity on gene expression in breast cancer samples, it was found that specific categories of genes such as those related to deoxyribonuclease activity, regulation of cell adhesion, and NADH dehydrogenase activity, were most affected by RNA quality.
One methodology for testing data integrity is unsupervised hierarchical sample clustering based on a Spearman correlation-based distance metric [64, 65]. The resulting clusters are inspected manually for clustering of samples by non-biological parameters, such as the dates of sample collection and RNA extraction, the batch of in vitro transcription and amplification reagent used, and the date of array hybridization. These factors may contribute to batch effects, where the overall intensity of a batch of microarrays more closely resembles the batch than the rest of the group of arrays [60, 66]. While these clustering methods provide insight into experimental variability, they provide no quantitative guidelines for eliminating microarrays from analysis and it is sometimes difficult to determine if the clusters have any relationship to biological variability.
Another strategy often used for differentiating high quality from low quality microarray data is based on outlier status of any given sample in an experiment. Software packages such as dChip (DNA-chip analyzer) and Probe Profiler can identify intensity outliers of a sample in a group of microarrays, and take into account features of the array hybridization such as brightness, saturation, dynamic range, and background [65–67]. The caveat is that all chips must be from biologically comparable origins and that only a small number of experimental outliers must exist.
QC Criteria Presented in this Study
The current study provides an efficient and simple approach for quality assessment of gene expression microarray data. It emphasizes good experimental execution and discarding unsatisfactory microarrays rather than salvaging data through complex statistical analyses of array data of variable quality. We provide a standardized set of three criteria specifically addressing starting RNA quality, integrity of the cRNA transcript, and hybridization efficiency. Each parameter has been assigned a threshold value, outside of which samples are readily identifiable as being low quality and can be eliminated or re-hybridized before proceeding to analysis. All measures are available through Agilent Bioanalyzer software and the Affymetrix GCOS report automatically generated after array washing and scanning. Although the Agilent Bioanalyzer and Affymetrix platforms are widely used, analogous criteria may be applied for alternate methodologies. For example, assessment of the relative signal for probes representing the 3' and 5' ends of any mRNA could be included as QC for any microarray platform.
In the context of data sharing via public repositories, the criteria presented in this study have the benefit of including two parameters that are guaranteed to be available for any Affymetrix data deposited in GEO. The initial processing by GCOS of CEL files produces a Quality Report containing the 3'/5' GAPDH signal intensity ratio and a multi-chip normalization scaling factor for the array. The GCOS software is available for free download from Affymetrix and can be applied by all investigators. In this way, two of the three QC criteria discussed in this paper provide a consistent quality control approach to not only current data, but also to previously published, archived data. Even though the RIN criterion as applied here requires specialized equipment and software, the RIN can be indirectly predicted from the 3'/5' ratio which is extracted from the CEL files deposited in GEO.
In the context that minimizing undesirable technical variation allows for more accurate analysis of gene expression and increased power for significance testing, we propose that the simple method described here, consisting of a universally available set of three criteria, can ensure that microarray data reflects biological differences as opposed to experimental variability.
After signing informed consent, subjects were evaluated in the Weill Cornell NIH Clinical and Translational Science Center and Department of Genetic Medicine Clinical Research Facility under protocols approved by the Weill Cornell Medical College Institutional Review Board. All individuals were assessed by standard history, physical exam, complete blood count, coagulation studies, liver function tests, HIV-1 test, urine studies, chest X-ray, EKG, and pulmonary function tests. All individuals were assessed for smoking status with urine nicotine and cotinine levels, and blood carboxyhemoglobin levels. A total of 223 airway epithelial samples in this study were derived from three sites: trachea, large airway (2nd-3rd order bronchi) and small airway (10th-12th order bronchi) in five phenotypic groups (Table 1): healthy non-smokers (trachea n = 17, large airway n = 21, small airway n = 35), healthy smokers (trachea n = 15, large airway n = 32, small airway n = 44), symptomatic smokers (trachea n = 3, large airway n = 4, small airway n = 10), smokers with lone emphysema with normal spirometry (small airway n = 22) and smokers with COPD GOLD stages I-III (small airway n = 20). Healthy non-smokers had no symptoms referable to the lungs, normal lung function and normal chest X-ray, and all laboratory tests within normal limits. The criteria for healthy smokers were identical to those for healthy non-smokers, except that urine nicotine and cotinine and blood carboxyhemoglobin levels confirmed current smoking status. Symptomatic smokers were similar to healthy smokers except that they had a cough or sputum score of 3 or greater, or a dyspnea score on the Modified Medical Research Council (MMRC) dyspnea scale of 2 or greater [68–71]. The lone emphysema with normal spirometry phenotype was defined by normal FEV1/FVC, reduced DLCO, and evidence of emphysema on quantitative CT scan (>1% of lung with < -950 Hounsfield units).
Smokers with established COPD included current smokers who met the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria for GOLD I, II, and III. An independent data set from n = 11 individuals with COPD was available from a technician training program in the Weill Cornell Medical College Department of Genetic Medicine. The small airway epithelium samples from these subjects, which failed QC criteria, were compared to 11 matched small airway epithelium samples from individuals with COPD that passed QC criteria (see Additional file 1).
Sampling of Airway Epithelium and RNA extraction
Fiberoptic bronchoscopy was performed to obtain pure populations of tracheal, large and small airway epithelium by using methods previously described [4, 29, 73]. Briefly, after mild sedation with meperidine and midazolam and routine anesthesia of the vocal cords and bronchial airways with topical lidocaine, a fiberoptic bronchoscope (Pentax, EB-1530T3) was taken proximal to the desired collection location. A 2.0 mm disposable brush was used for brushing immediately distal to the location of the bronchoscope (for trachea or large airway) or by advancing 7 to 10 cm further into the 10th to 12th generation branching for small airway. Epithelium was collected by gently gliding the brush back and forth 5 to 10 times in 8 to 10 different locations in the same general area. Cells were detached by immersing the brush into 5 ml of ice-cold bronchial epithelial basal medium (BEBM, Clonetics, Walkersville, MD) and flicking five to ten times. An aliquot of 0.5 ml was used for differential cell count and the remainder (4.5 ml) was centrifuged at 6,000 rpm for 10 minutes in less than 60 minutes from the time of bronchial brushing. Pelleted airway epithelial cells were lysed with the TRIzol reagent (Invitrogen, Carlsbad, CA), and after chloroform extraction the RNA was purified directly from the aqueous phase using the RNeasy MinElute RNA isolation kit (Qiagen, Valencia, CA). For each sample, 1 μl of RNA was used for quantification of yield by NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE) and quality assessment by Agilent 2100 Bioanalyzer software. The samples were stored in RNA Secure (Ambion, Austin, TX) at -80°C until time of biotin-labeled cRNA preparation.
Double stranded cDNA was synthesized from 1.0 to 2.0 μg of total RNA using the GeneChip One-Cycle cDNA Synthesis Kit, followed by cleanup of the double stranded product with the GeneChip Sample Cleanup Module. The GeneChip IVT Labeling Kit was used for the 16 hr in vitro transcription reaction and the GeneChip Sample Cleanup Module was used for cleanup of the biotin-labeled cRNA (all kits from Affymetrix, Santa Clara, CA). Final yield of biotin-labeled cRNA was confirmed by NanoDrop spectrophotometric analysis. For each sample, 10 μg of biotin-labeled cRNA was fragmented and hybridized to the Human Genome U133 Plus 2.0 array (54,675 probe sets) according to Affymetrix protocols, processed by the Affymetrix GeneChip Fluidics Station 450 and scanned with an Affymetrix GeneChip Scanner 3000 7G (http://www.affymetrix.com/support/technical/manual/expression_manual.affx), as previously described. Captured images were analyzed using the Microarray Suite version 5.0 (MAS 5.0) algorithm (Affymetrix). The data was normalized per array using GeneSpring version 7.3 software (Agilent Technologies, Palo Alto, CA), by dividing the raw data by the 50th percentile of all measurements on that array. All microarray data has been deposited at the Gene Expression Omnibus (GEO) site (http://www.ncbi.nlm.nih.gov/geo/; accession number GSE11906).
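The per-array normalization step (dividing each raw value by the 50th percentile of all measurements on that array) can be illustrated with a few lines of Python. This is a sketch of the arithmetic only and does not reproduce GeneSpring itself; the function name and the raw values are hypothetical, and the median from the standard library is assumed to be an adequate stand-in for the 50th percentile.

```python
from statistics import median
from typing import List


def normalize_per_array(signals: List[float]) -> List[float]:
    """Divide every measurement on one array by that array's 50th percentile."""
    p50 = median(signals)
    if p50 <= 0:
        raise ValueError("median signal must be positive")
    return [s / p50 for s in signals]


raw = [120.0, 480.0, 960.0, 240.0, 60.0]  # hypothetical raw signals from one array
normalized = normalize_per_array(raw)
print(normalized)
```

After this step the median of every array is 1, so values from different arrays are expressed on a common relative scale.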
Quality Control Parameters
The selection of the three QC criteria was targeted towards addressing quality control in the three integral stages of the microarray process: (1) extraction of the starting RNA material; (2) synthesis of cDNA and antisense biotin-labeled cRNA target; and (3) the array hybridization efficiency.
An RNA Integrity Number (RIN) for each RNA sample in this study was generated by an Agilent Bioanalyzer algorithm that uses a Bayesian approach to train and select a prediction model incorporating features extracted from an electropherogram including pre-region, 5S-region, fast-region, 18S-fragment, inter-region, 28S-fraction, precursor-region, and post-region [41, 44]. RIN values range from 1 to 10, with 1 indicating a high level of degradation and 10 indicating fully intact RNA. RIN was assessed on 180 of 223 epithelial RNA samples. The 43 RNA samples not assessed by RIN had been processed and hybridized to microarray before the development of the RIN software and residual RNA was unavailable for testing. Published suggestions of a RIN cutoff value to distinguish poor quality from good quality RNA samples vary from 3.9 to 7.8 [19, 26, 32, 40, 50, 74]. Based on literature indicating a substantial increase in the rate of false positives on the array when the starting RNA had a RIN value of <7.0, an acceptance criterion of RIN ≥ 7.0 was established. Available RIN values for the 180 RNA preparations were assessed by this criterion; passing and failing samples were grouped by phenotype, and each phenotype was separated by biologic origin.
GAPDH 3'/5' signal intensity ratio
Per Affymetrix guidelines, the ratio of the 3' to 5' signal intensity values can be used as a method of quality control for the array data [23, 35, 36, 75]. As the GeneChip system utilizes polyadenylation complementary oligonucleotides as a primer for reverse transcription of the starting RNA template, inefficiency of first strand cDNA synthesis and/or in vitro transcription of cRNA can result in underrepresentation of the 5' moiety of the transcript [34, 52]. In accordance with recommendations by Affymetrix and others, an acceptance criterion of GAPDH 3'/5' ratio ≤ 3.0 was established [16, 34, 35]. To accomplish this, for each sample hybridized to microarray, a GeneChip Operating Software report file was generated using the Affymetrix GeneChip Operating Software (GCOS), a software system that automates the acquisition of data by GeneChip fluidics stations and scanners, and provides workflow tracking of experiment, image and analysis data. Among the QC metrics summarized in the report file are the signal intensity values for the 3' and 5' probe sets for the GAPDH gene. The ratio of 3' to 5' signal intensities for the GAPDH probe set was extracted from the GCOS report file for each of 223 samples and those with GAPDH 3'/5' ratio > 3.0 were scored as failures.
Multi-chip normalization scaling factor
According to Affymetrix microarray guidelines, comparable scaling factors between arrays in a given experiment are critical to minimizing differences in overall signal intensities, thereby allowing for more reliable detection of biologically relevant changes [35, 52]. Based on the distribution of data for 223 samples, a criterion of scaling factor ≤ 10.0 was established, above which samples were considered to demonstrate poor hybridization and labeling efficiency. Scaling factor values for all 223 samples were assessed against this acceptable level. To accomplish this, for each sample hybridized to a microarray, the Affymetrix GeneChip Scanner 3000 7G was set to a target intensity value of 500 and the GCOS image analysis software extracted pixel values from the raw image file, producing a CEL file containing fluorescence intensities for each probe. A CHP file was then generated from each CEL file through GCOS consolidation of all probe pairs interrogating a gene into a single signal value and an Absent/Marginal/Present call for the probe set. The creation of CHP files from CEL files generated a scaling factor for each array which was applied to normalize signal intensity thereby permitting comparisons among arrays. The scaling factor is the multiplication factor applied to the trimmed mean of probe set intensities to equalize this value to the target intensity value. Scaling factor values were extracted from the GCOS report file for all 223 samples.
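The definition of the scaling factor given above (the multiplier that brings the trimmed mean of probe set intensities to the target intensity) can be sketched as follows. This is not the GCOS implementation: the function names and the signal values are hypothetical, and the trimming fraction of 2% from each end is an assumption used here for illustration.

```python
from typing import List


def trimmed_mean(values: List[float], trim_fraction: float = 0.02) -> float:
    """Mean after dropping the lowest and highest `trim_fraction` of values."""
    if not values:
        raise ValueError("no signal values supplied")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values dropped at each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def scaling_factor(signals: List[float], target: float = 500.0,
                   trim_fraction: float = 0.02) -> float:
    """Factor that brings the trimmed mean signal of one array to `target`."""
    return target / trimmed_mean(signals, trim_fraction)


# Hypothetical arrays of 50 probe set signals, each with one low and one high outlier.
bright_array = [1.0] + [125.0] * 48 + [9000.0]
dim_array = [1.0] + [25.0] * 48 + [9000.0]
for name, signals in (("bright", bright_array), ("dim", dim_array)):
    sf = scaling_factor(signals)
    print(name, sf, "pass" if sf <= 10.0 else "fail")
```

A dim array needs a larger multiplier to reach the target intensity of 500, which is why a high scaling factor serves as an indicator of poor labeling or hybridization.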
Analysis of Maintenance Gene Expression Levels
Samples that failed any one of the three criteria described were considered to have failed the QC criteria, while samples that passed all three criteria were considered to have passed the QC criteria. To confirm the validity of this quality assessment strategy, expression levels were determined for a set of 100 constitutively expressed maintenance genes, and the expression profiles of these genes were compared between the samples failing the QC criteria and the samples passing the QC criteria. The set of control genes was selected by Affymetrix using the a priori knowledge that they exhibit relatively low signal variation over different sample types and are consistently called Present in a large number of different tissues and cell lines (list available at the NetAffx Analysis Center, http://www.affymetrix.com/support/technical/technotes/hgu133_p2_technote.pdf). In the present study, for notational convenience, we use the term "gene" in place of "probe set", as each of the 100 probe sets represents a different gene.
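The pass/fail rule can be written as a short sketch. The three thresholds are those stated in this study; the function name and argument names are hypothetical. Treating a missing RIN as "not assessed", so that only the two remaining criteria apply, follows the handling described for samples without an available RIN.

```python
from typing import Optional

RIN_MIN = 7.0               # RNA Integrity Number criterion
GAPDH_RATIO_MAX = 3.0       # GAPDH 3'/5' ratio criterion
SCALING_FACTOR_MAX = 10.0   # multi-chip normalization scaling factor criterion


def passes_qc(rin: Optional[float], gapdh_ratio: float, scaling_factor: float) -> bool:
    """Return True only if every assessed criterion is met.

    A sample fails as soon as any one criterion fails. When the RIN is
    unavailable (None), only the two post-chip criteria are evaluated.
    """
    checks = [
        gapdh_ratio <= GAPDH_RATIO_MAX,
        scaling_factor <= SCALING_FACTOR_MAX,
    ]
    if rin is not None:
        checks.append(rin >= RIN_MIN)
    return all(checks)
```

For example, `passes_qc(8.2, 1.4, 3.5)` is true, while `passes_qc(8.2, 1.4, 12.0)` is false because the scaling factor alone exceeds its threshold.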
To examine potential causes for variation in QC criterion values between samples, the effects of differences in phenotype or biologic origin of the sample were assessed by ANOVA.
For the maintenance genes, we used Pearson's correlation to assess agreement in expression levels of the 100 genes among the 10 samples that failed the QC criteria (2 large airway epithelium, 8 small airway epithelium) and 24 randomly selected samples that passed the QC criteria (see Results). The 24 passing samples were chosen with a random number generator, with sample origin evenly distributed among trachea (n = 8), large airway (n = 8), and small airway (n = 8). The significance of the difference between correlation coefficients for pairwise comparisons in which at least one sample failed the QC criteria and those in which both samples passed was assessed by nonparametric analysis. Coefficient of variation analysis was used to determine variability in expression levels for each of the 100 maintenance genes across the 10 samples failing the QC criteria and across 10 samples passing the QC criteria. For this analysis, the 10 passing samples were randomly selected using a random number generator, with the stipulation that they were matched in origin with the 10 failing samples (i.e., 2 large airway epithelium, 8 small airway epithelium). For the purposes of the coefficient of variation analysis, the data set of 10 samples passing the QC criteria was termed "pass", and the data set of 10 samples failing the QC criteria was termed "fail". The significance of the difference in coefficients of variation of gene expression levels between the "pass" and "fail" data sets was determined by Mann-Whitney U test.
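The two comparisons described above can be sketched on toy data as follows. The expression values, sample names, and helper functions are hypothetical and use four genes rather than 100. Only the Mann-Whitney U statistic is computed here; a p-value would normally be obtained from a statistics package.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(expr, failed):
    """Split pairwise r values into pass/pass pairs and pairs with a failed array."""
    both_pass, any_fail = [], []
    for a, b in combinations(sorted(expr), 2):
        r = pearson_r(expr[a], expr[b])
        (any_fail if (a in failed or b in failed) else both_pass).append(r)
    return both_pass, any_fail


def per_gene_cv(expr, samples):
    """Coefficient of variation (SD / mean) of each gene across the given samples."""
    n_genes = len(expr[samples[0]])
    cvs = []
    for g in range(n_genes):
        values = [expr[s][g] for s in samples]
        cvs.append(stdev(values) / mean(values))
    return cvs


def mann_whitney_u(x, y):
    """U statistic for x: pairs with x_i > y_j count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Hypothetical maintenance gene signals: sample -> one value per gene.
expr = {
    "pass_1": [10.0, 20.0, 30.0, 40.0],
    "pass_2": [11.0, 21.0, 29.0, 41.0],
    "pass_3": [9.0, 19.0, 31.0, 39.0],
    "fail_1": [10.0, 35.0, 12.0, 40.0],
    "fail_2": [25.0, 18.0, 30.0, 15.0],
}
failed = {"fail_1", "fail_2"}

both_pass, any_fail = pairwise_correlations(expr, failed)
cv_pass = per_gene_cv(expr, ["pass_1", "pass_2", "pass_3"])
cv_fail = per_gene_cv(expr, ["fail_1", "fail_2"])
u_stat = mann_whitney_u(cv_fail, cv_pass)
```

In this toy data set every pass/pass correlation is higher than every correlation involving a failed array, and every gene is more variable in the "fail" set, which mirrors the direction of the differences the analysis is designed to detect.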
To compare gene expression profiles in samples that passed QC to those that failed QC, principal components analysis was carried out using Partek® Genomics Suite software (version 6.8, Copyright © 2008) for 11 COPD subjects whose samples failed chip quality control and 11 COPD subjects whose samples passed (matched for gender, age, ethnicity, and smoking history). Affymetrix HG-U133 Plus 2.0 CEL files were imported into Partek using the Robust Multi-chip Average (RMA) method. Log2-transformed small airway expression data for all 54,675 probe sets were mapped to principal components to preserve the variation in the data, projected in 3 dimensions, and plotted. To identify the specific probe sets that were differentially expressed between the two groups, microarray data were processed using the MAS5 algorithm (Affymetrix Microarray Suite Version 5 software), which takes into account the perfect match and mismatch probes. MAS5-processed data were normalized using GeneSpring by setting measurements < 0.01 to 0.01, normalizing per chip to the median expression value on that array, and normalizing per gene to the median expression value for that gene across all arrays. Genes that were significantly differentially expressed between the two groups were selected according to the following criteria: (1) P call of "Present" in 20% of samples; (2) magnitude of fold change in average expression value for pass QC vs fail QC of > 1.5; and (3) p < 0.01 using a t test with a Benjamini-Hochberg correction to control the false discovery rate.
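The normalization and selection steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the GeneSpring implementation: the function names are hypothetical, the "Present" criterion is read as at least 20% of samples, and the fold-change magnitude is taken in either direction. The input p-values are assumed to come from a t test computed elsewhere.

```python
from statistics import median

FLOOR = 0.01  # measurements below this value are set to it (from the text)


def normalize(expr):
    """expr: sample -> list of MAS5 signals, same gene order in every sample."""
    floored = {s: [max(v, FLOOR) for v in vals] for s, vals in expr.items()}
    # Per chip: divide each signal by the median signal on that array.
    per_chip = {}
    for s, vals in floored.items():
        m = median(vals)
        per_chip[s] = [v / m for v in vals]
    # Per gene: divide by the median of that gene across all arrays.
    n_genes = len(next(iter(per_chip.values())))
    gene_medians = [median(per_chip[s][g] for s in per_chip) for g in range(n_genes)]
    return {
        s: [v / gene_medians[g] for g, v in enumerate(vals)]
        for s, vals in per_chip.items()
    }


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(fold_changes, p_values, present_fraction,
                 min_fold=1.5, alpha=0.01, min_present=0.20):
    """Indices of genes meeting all three selection criteria."""
    adjusted = benjamini_hochberg(p_values)
    selected = []
    for g, (fc, q, pf) in enumerate(zip(fold_changes, adjusted, present_fraction)):
        magnitude = max(fc, 1.0 / fc)  # assumption: up or down regulation
        if pf >= min_present and magnitude > min_fold and q < alpha:
            selected.append(g)
    return selected


# Hypothetical inputs for five genes.
fold_changes = [2.0, 0.5, 1.2, 3.0, 1.8]
p_values = [0.0001, 0.0005, 0.0002, 0.5, 0.001]
present_fraction = [1.0, 0.5, 1.0, 1.0, 0.1]
selected = select_genes(fold_changes, p_values, present_fraction)
```

In this example genes 0 and 1 are retained; gene 2 is excluded by fold change, gene 3 by the adjusted p-value, and gene 4 by the Present call criterion.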
We thank Angeliki Kazeros, Renat Shaykhiev, and Ralf Hubner for helpful discussions; Jenny Xiang in the Microarray Core Facility for help with the chip evaluation; and N. Mohamed for help in preparing this manuscript. These studies were supported, in part, by R01 HL074326; and P50 HL084936.
- Lockhart DJ, Barlow C: Expressing what's on your mind: DNA arrays and the brain. Nat Rev Neurosci. 2001, 2: 63-68. 10.1038/35049070.
- Schulze A, Downward J: Navigating gene expression using microarrays--a technology review. Nat Cell Biol. 2001, 3: E190-E195. 10.1038/35087138.
- Stears RL, Martinsky T, Schena M: Trends in microarray analysis. Nat Med. 2003, 9: 140-145. 10.1038/nm0103-140.
- Harvey BG, Heguy A, Leopold PL, Carolan BJ, Ferris B, Crystal RG: Modification of gene expression of the small airway epithelium in response to cigarette smoking. J Mol Med. 2007, 85: 39-53. 10.1007/s00109-006-0103-z.
- Meyer KC: Bronchoalveolar lavage as a diagnostic tool. Semin Respir Crit Care Med. 2007, 28: 546-560. 10.1055/s-2007-991527.
- Ning W, Li CJ, Kaminski N, Feghali-Bostwick CA, Alber SM, Di YP, Otterbein SL, Song R, Hayashi S, Zhou Z: Comprehensive gene expression profiles reveal pathways related to the pathogenesis of chronic obstructive pulmonary disease. Proc Natl Acad Sci USA. 2004, 101: 14895-14900. 10.1073/pnas.0401168101.
- Ning W, Lee J, Kaminski N, Feghali-Bostwick CA, Watkins SC, Pilewski JM, Peters DG, Hogg JC, Choi AM: Comprehensive analysis of gene expression on GOLD-2 Versus GOLD-0 smokers reveals novel genes important in the pathogenesis of COPD. Proc Am Thorac Soc. 2006, 3: 466. 10.1513/pats.200603-031MS.
- Reynolds HY: Use of bronchoalveolar lavage in humans--past necessity and future imperative. Lung. 2000, 178: 271-293. 10.1007/s004080000032.
- Spira A, Beane JE, Shah V, Steiling K, Liu G, Schembri F, Gilman S, Dumas YM, Calner P, Sebastiani P: Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer. Nat Med. 2007, 13: 361-366. 10.1038/nm1556.
- Walters EH, Gardiner PV: Bronchoalveolar lavage as a research tool. Thorax. 1991, 46: 613-618. 10.1136/thx.46.9.613.
- Wang IM, Stepaniants S, Boie Y, Mortimer JR, Kennedy B, Elliott M, Hayashi S, Loy L, Coulter S, Cervino S: Gene expression profiling in patients with chronic obstructive pulmonary disease and lung cancer. Am J Respir Crit Care Med. 2008, 177: 402-411. 10.1164/rccm.200703-390OC.
- Beane J, Sebastiani P, Liu G, Brody JS, Lenburg ME, Spira A: Reversible and permanent effects of tobacco smoke exposure on airway epithelial gene expression. Genome Biol. 2007, 8: R201. 10.1186/gb-2007-8-9-r201.
- Pierrou S, Broberg P, O'Donnell RA, Pawlowski K, Virtala R, Lindqvist E, Richter A, Wilson SJ, Angco G, Moller S: Expression of genes involved in oxidative stress responses in airway epithelial cells of smokers with chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2007, 175: 577-586. 10.1164/rccm.200607-931OC.
- Spira A, Beane J, Shah V, Liu G, Schembri F, Yang X, Palma J, Brody JS: Effects of cigarette smoke on the human airway epithelial cell transcriptome. Proc Natl Acad Sci USA. 2004, 101: 10143-10148. 10.1073/pnas.0401422101.
- Ammous Z, Hackett NR, Butler MW, Raman T, Dolgalev I, O'Connor TP, Harvey BG, Crystal RG: Variability in Small Airway Epithelial Gene Expression Among Normal Smokers. Chest. 2008, 6: 1344-1353.
- Expression profiling--best practices for data generation and interpretation in clinical trials. Nat Rev Genet. 2004, 5: 229-237. 10.1038/nrg1297.
- Auer H, Lyianarachchi S, Newsom D, Klisovic MI, Marcucci G, Kornacker K: Chipping away at the chip bias: RNA degradation in microarray analysis. Nat Genet. 2003, 35: 292-293. 10.1038/ng1203-292.
- Carter DE, Robinson JF, Allister EM, Huff MW, Hegele RA: Quality assessment of microarray experiments. Clin Biochem. 2005, 38: 639-642. 10.1016/j.clinbiochem.2005.04.010.
- Copois V, Bibeau F, Bascoul-Mollevi C, Salvetat N, Chalbos P, Bareil C, Candeil L, Fraslon C, Conseiller E, Granci V: Impact of RNA degradation on gene expression profiles: assessment of different methods to reliably determine RNA quality. J Biotechnol. 2007, 127: 549-559. 10.1016/j.jbiotec.2006.07.032.
- Cronin M, Ghosh K, Sistare F, Quackenbush J, Vilker V, O'Connell C: Universal RNA reference materials for gene expression. Clin Chem. 2004, 50: 1464-1471. 10.1373/clinchem.2004.035675.
- Imbeaud S, Graudens E, Boulanger V, Barlet X, Zaborski P, Eveno E, Mueller O, Schroeder A, Auffray C: Towards standardization of RNA quality assessment using user-independent classifiers of microcapillary electrophoresis traces. Nucleic Acids Res. 2005, 33: e56. 10.1093/nar/gni054.
- Lee J, Hever A, Willhite D, Zlotnik A, Hevezi P: Effects of RNA degradation on gene expression analysis of human postmortem tissues. FASEB J. 2005, 19: 1356-1358. 10.1096/fj.04-2591hyp.
- Popova T, Mennerich D, Weith A, Quast K: Effect of RNA quality on transcript intensity levels in microarray analysis of human post-mortem brain tissues. BMC Genomics. 2008, 9: 91. 10.1186/1471-2164-9-91.
- Reis-Filho JS, Westbury C, Pierga JY: The impact of expression profiling on prognostic and predictive testing in breast cancer. J Clin Pathol. 2006, 59: 225-231. 10.1136/jcp.2005.028324.
- Shi L, Tong W, Goodsaid F, Frueh FW, Fang H, Han T, Fuscoe JC, Casciano DA: QA/QC: challenges and pitfalls facing the microarray community and regulatory agencies. Expert Rev Mol Diagn. 2004, 4: 761-777. 10.1586/14737220.127.116.111.
- Strand C, Enell J, Hedenfalk I, Ferno M: RNA quality in frozen breast cancer samples and the influence on gene expression analysis--a comparison of three evaluation methods using microcapillary electrophoresis traces. BMC Mol Biol. 2007, 8: 38. 10.1186/1471-2199-8-38.
- Wilkes T, Laux H, Foy CA: Microarray data quality - review of current developments. OMICS. 2007, 11: 1-13. 10.1089/omi.2006.0001.
- Raman T, O'Connor TP, Hackett NR, Wang W, Harvey B-G, Crystal RG: Establishment of quality control criteria to minimize experimental variability in microarray assessment. Am J Respir Crit Care Med. 2008, 177: A205.
- Danel C, Erzurum SC, McElvaney NG, Crystal RG: Quantitative assessment of the epithelial and inflammatory cell populations in large airways of normals and individuals with cystic fibrosis. Am J Respir Crit Care Med. 1996, 153: 362-368.
- Harvey BG, O'Connor TP, Salit J, Raman T, Crystal RG: Differences in gene expression of upper vs lower lobe small airway epithelium in individuals with an early emphysema phenotype and predominant upper lobe emphysema. Am J Respir Crit Care Med. 2008, 177: A960.
- Madabusi LV, Latham GJ, Andruss BF: RNA extraction for arrays. Methods Enzymol. 2006, 411: 1-14. 10.1016/S0076-6879(06)11001-0.
- Ribeiro-Silva A, Zhang H, Jeffrey SS: RNA extraction from ten year old formalin-fixed paraffin-embedded breast cancer samples: a comparison of column purification and magnetic bead-based technologies. BMC Mol Biol. 2007, 8: 118. 10.1186/1471-2199-8-118.
- Thompson KL, Pine PS, Rosenzweig BA, Turpaz Y, Retief J: Characterization of the effect of sample quality on high density oligonucleotide microarray data using progressively degraded rat liver RNA. BMC Biotechnol. 2007, 7: 57. 10.1186/1472-6750-7-57.
- Kohlmann A, Schoch C, Dugas M, Rauhut S, Weninger F, Schnittger S, Kern W, Haferlach T: Pattern robustness of diagnostic gene expression signatures in leukemia. Genes Chromosomes Cancer. 2005, 42: 299-307. 10.1002/gcc.20126.
- Larsson O, Sandberg R: Lack of correct data format and comparability limits future integrative microarray research. Nat Biotechnol. 2006, 24: 1322-1323. 10.1038/nbt1106-1322.
- Staal FJ, Cario G, Cazzaniga G, Haferlach T, Heuser M, Hofmann WK, Mills K, Schrappe M, Stanulla M, Wingen LU: Consensus guidelines for microarray gene expression analyses in leukemia from three European leukemia networks. Leukemia. 2006, 20: 1385-1392. 10.1038/sj.leu.2404274.
- Kriegova E, Arakelyan A, Fillerova R, Zatloukal J, Mrazek F, Navratilova Z, Kolek V, du Bois RM, Petrek M: PSMB2 and RPL32 are suitable denominators to normalize gene expression profiles in bronchoalveolar cells. BMC Mol Biol. 2008, 9: 69. 10.1186/1471-2199-9-69.
- Skrypina NA, Timofeeva AV, Khaspekov GL, Savochkina LP, Beabealashvilli RS: Total RNA suitable for molecular biology analysis. J Biotechnol. 2003, 105: 1-9. 10.1016/S0168-1656(03)00140-8.
- Dumur CI, Nasim S, Best AM, Archer KJ, Ladd AC, Mas VR, Wilkinson DS, Garrett CT, Ferreira-Gonzalez A: Evaluation of quality-control criteria for microarray gene expression analysis. Clin Chem. 2004, 50: 1994-2002. 10.1373/clinchem.2004.033225.
- Fleige S, Pfaffl MW: RNA integrity and the effect on the real-time qRT-PCR performance. Mol Aspects Med. 2006, 27: 126-139. 10.1016/j.mam.2005.12.003.
- Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, Gassmann M, Lightfoot S, Menzel W, Granzow M, Ragg T: The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol. 2006, 7: 3. 10.1186/1471-2199-7-3.
- Scicchitano MS, Dalmas DA, Bertiaux MA, Anderson SM, Turner LR, Thomas RA, Mirable R, Boyce RW: Preliminary comparison of quantity, quality, and microarray performance of RNA extracted from formalin-fixed, paraffin-embedded, and unfixed frozen tissue samples. J Histochem Cytochem. 2006, 54: 1229-1237. 10.1369/jhc.6A6999.2006.
- Hawtin P, Hardern I, Wittig R, Mollenhauer J, Poustka A, Salowsky R, Wulff T, Rizzo C, Wilson B: Utility of lab-on-a-chip technology for high-throughput nucleic acid and protein analysis. Electrophoresis. 2005, 26: 3674-3681. 10.1002/elps.200500166.
- Mueller O, Lightfoot S, Schröder A: RNA Integrity Number (RIN) Standardization of RNA Quality Control. Tech. Rep. 5989-1165EN, Agilent Technologies, Application Note. 2004, last accessed September 25, 2009, [http://www.chem.agilent.com/en-us/Search/Library/_layouts/Agilent/PrimaryDocumentViewer.ashx?whid=37507]
- Breit S, Nees M, Schaefer U, Pfoersich M, Hagemeier C, Muckenthaler M, Kulozik AE: Impact of pre-analytical handling on bone marrow mRNA gene expression. Br J Haematol. 2004, 126: 231-243. 10.1111/j.1365-2141.2004.05017.x.
- Huang J, Qi R, Quackenbush J, Dauway E, Lazaridis E, Yeatman T: Effects of ischemia on gene expression. J Surg Res. 2001, 99: 222-227. 10.1006/jsre.2001.6195.
- Russo G, Zegar C, Giordano A: Advantages and limitations of microarray technology in human cancer. Oncogene. 2003, 22: 6497-6507. 10.1038/sj.onc.1206865.
- Frazier ML, Mars W, Florine DL, Montagna RA, Saunders GF: Efficient extraction of RNA from mammalian tissue. Mol Cell Biochem. 1983, 56: 113-122. 10.1007/BF00227211.
- Hembree MJ, Prasadan K, Manna P, Preuett B, Spilde T, Bhatia A, Kobayashi H, Buckingham B, Snyder CL, Gittes GK: Semiquantitative polymerase chain reaction in RNase-producing tissues: Analysis of the developing pancreas. J Pediatr Surg. 2001, 36: 1629-1632. 10.1053/jpsu.2001.27934.
- Jones L, Goldstein DR, Hughes G, Strand AD, Collin F, Dunnett SB, Kooperberg C, Aragaki A, Olson JM, Augood SJ: Assessment of the relationship between pre-chip and post-chip quality measures for Affymetrix GeneChip expression data. BMC Bioinformatics. 2006, 7: 211. 10.1186/1471-2105-7-211.
- Atz M, Walsh D, Cartagena P, Li J, Evans S, Choudary P, Overman K, Stein R, Tomita H, Potkin S: Methodological considerations for gene expression profiling of human brain. J Neurosci Methods. 2007, 163: 295-309. 10.1016/j.jneumeth.2007.03.022.
- Wilson CL, Miller CJ: Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis. Bioinformatics. 2005, 21: 3683-3685. 10.1093/bioinformatics/bti605.
- Affymetrix Technical note: GeneChip Eukaryotic Small Sample Preparation Technical Note. last accessed September 25, 2009, [http://jcp.bmj.com/cgi/data/57/12/1278/DC1/1]
- Luzzi V, Mahadevappa M, Raja R, Warrington JA, Watson MA: Accurate and reproducible gene expression profiles from laser capture microdissection, transcript amplification, and high density oligonucleotide microarray analysis. J Mol Diagn. 2003, 5: 9-14.
- McClintick JN, Jerome RE, Nicholson CR, Crabb DW, Edenberg HJ: Reproducibility of oligonucleotide arrays using small samples. BMC Genomics. 2003, 4: 4. 10.1186/1471-2164-4-4.
- King C, Guo N, Frampton GM, Gerry NP, Lenburg ME, Rosenberg CL: Reliability and reproducibility of gene expression measurements using amplified RNA from laser-microdissected primary breast tissue with oligonucleotide arrays. J Mol Diagn. 2005, 7: 57-64.
- Becker KG: The sharing of cDNA microarray data. Nat Rev Neurosci. 2001, 2: 438-440. 10.1038/35077580.
- Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC: Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet. 2001, 29: 365-371. 10.1038/ng1201-365.
- Frueh FW: Impact of microarray data quality on genomic data submissions to the FDA. Nat Biotechnol. 2006, 24: 1105-1107. 10.1038/nbt0906-1105.
- Heber S, Sick B: Quality assessment of Affymetrix GeneChip data. OMICS. 2006, 10: 358-368. 10.1089/omi.2006.10.358.
- Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JG, Geoghegan J, Germino G: Multiple-laboratory comparison of microarray platforms. Nat Methods. 2005, 2: 345-350. 10.1038/nmeth756.
- Ji H, Davis RW: Data quality in genomics and microarrays. Nat Biotechnol. 2006, 24: 1112-1113. 10.1038/nbt0906-1112.
- Wennmalm K, Wahlestedt C, Larsson O: The expression signature of in vitro senescence resembles mouse but not human aging. Genome Biol. 2005, 6: R109. 10.1186/gb-2005-6-13-r109.
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998, 95: 14863-14868. 10.1073/pnas.95.25.14863.
- Konradi C: Gene expression microarray studies in polygenic psychiatric disorders: applications and data analysis. Brain Res Brain Res Rev. 2005, 50: 142-155. 10.1016/j.brainresrev.2005.05.004.
- Johnson WE, Li C, Rabinovic A: Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007, 8: 118-127. 10.1093/biostatistics/kxj037.
- Hershey AD, Burdine D, Liu C, Nick TG, Gilbert DL, Glauser TA: Assessing quality and normalization of microarrays: case studies using neurological genomic data. Acta Neurol Scand. 2008, 118: 29-41. 10.1111/j.1600-0404.2007.00979.x.
- Mahler DA, Wells CK: Evaluation of clinical methods for rating dyspnea. Chest. 1988, 93: 580-586. 10.1378/chest.93.3.580.
- Jones PW, Quirk FH, Baveystock CM: The St George's Respiratory Questionnaire. Respir Med. 1991, 85 (Suppl B): 25-31. 10.1016/S0954-6111(06)80166-6.
- Jones PW, Quirk FH, Baveystock CM, Littlejohns P: A self-complete measure of health status for chronic airflow limitation. The St. George's Respiratory Questionnaire. Am Rev Respir Dis. 1992, 145: 1321-1327.
- Heijdra YF, Pinto-Plata VM, Kenney LA, Rassulo J, Celli BR: Cough and phlegm are important predictors of health status in smokers without COPD. Chest. 2002, 121: 1427-1433. 10.1378/chest.121.5.1427.
- Pauwels RA, Buist AS, Calverley PM, Jenkins CR, Hurd SS: Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease. NHLBI/WHO Global Initiative for Chronic Obstructive Lung Disease (GOLD) Workshop summary. Am J Respir Crit Care Med. 2001, 163: 1256-1276.
- Hackett NR, Heguy A, Harvey BG, O'Connor TP, Luettich K, Flieder DB, Kaplan R, Crystal RG: Variability of antioxidant-related gene expression in the airway epithelium of cigarette smokers. Am J Respir Cell Mol Biol. 2003, 29: 331-343. 10.1165/rcmb.2002-0321OC.
- Weis S, Llenos IC, Dulay JR, Elashoff M, Martinez-Murillo F, Miller CL: Quality control for microarray analysis of human brain samples: The impact of postmortem factors, RNA characteristics, and histopathology. J Neurosci Methods. 2007, 165: 198-209. 10.1016/j.jneumeth.2007.06.001.
- Mills JC, Gordon JI: A new approach for filtering noise from high-density oligonucleotide microarray datasets. Nucleic Acids Res. 2001, 29: E72. 10.1093/nar/29.15.e72.
- Tomita H, Vawter MP, Walsh DM, Evans SJ, Choudary PV, Li J, Overman KM, Atz ME, Myers RM, Jones EG: Effect of agonal and postmortem factors on gene expression profile: quality control in microarray analyses of postmortem human brain. Biol Psychiatry. 2004, 55: 346-352. 10.1016/j.biopsych.2003.10.013.
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc. 1995, B57: 289-300.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.