Volume 13 Supplement 8
The International Conference on Intelligent Biology and Medicine (ICIBM) Genomics
Literature aided determination of data quality and statistical significance threshold for gene expression studies
 Lijing Xu^{1},
 Cheng Cheng^{2},
 E Olusegun George^{1, 3} and
 Ramin Homayouni^{1, 4}
DOI: 10.1186/1471-2164-13-S8-S23
© Xu et al.; licensee BioMed Central Ltd. 2012
Published: 17 December 2012
Abstract
Background
Gene expression data are noisy due to technical and biological variability. Consequently, analysis of gene expression data is complex. Different statistical methods produce distinct sets of genes. In addition, selection of the expression p-value (EPv) threshold is somewhat arbitrary. In this study, we aimed to develop novel literature-based approaches to integrate functional information into the analysis of gene expression data.
Methods
Functional relationships between genes were derived by Latent Semantic Indexing (LSI) of Medline abstracts and used to calculate the functional cohesion of gene sets. In this study, literature cohesion was applied in two ways. First, the Literature-Based Functional Significance (LBFS) method was developed to calculate a p-value for the cohesion of differentially expressed genes (DEGs) in order to objectively evaluate the overall biological significance of a gene expression experiment. Second, the Literature Aided Statistical Significance Threshold (LASST) method was developed to determine the appropriate expression p-value threshold for a given experiment.
Results
We tested our methods on three different publicly available datasets. LBFS analysis demonstrated that only two of the experiments were significantly cohesive. For each experiment, we also compared the LBFS values of DEGs generated by four different statistical methods. We found that some statistical tests produced more functionally cohesive gene sets than others. However, no statistical test was consistently better for all experiments. This re-emphasizes that a statistical test must be carefully selected for each expression study. Moreover, LASST analysis demonstrated that the appropriate expression p-value thresholds for some experiments were considerably lower (p < 0.02 and p < 0.01), suggesting that the arbitrary p-value and false discovery rate thresholds that are commonly used in expression studies may not be biologically sound.
Conclusions
We have developed robust and objective literature-based methods to evaluate the biological support for gene expression experiments and to determine the appropriate statistical significance threshold. These methods will assist investigators in more efficiently extracting biologically meaningful insights from high-throughput gene expression experiments.
Background
Gene expression data are complex, noisy, and subject to inter- and intra-laboratory variability [1, 2]. Moreover, because tens of thousands of measurements are made in a typical experiment, the likelihood of false positives (type I errors) is high. One way to address these issues is to increase the number of replicates in an experiment; however, this is generally cost prohibitive. Therefore, quality control of gene expression experiments with limited sample size is important for identification of true DEGs. Although the completion of the Microarray Quality Control (MAQC) project provides a framework for assessing microarray technologies, others have pointed out that it does not sufficiently address inter- and intra-platform comparability and reproducibility [3–5].
Even with reliable gene expression data, statistical analysis of microarray experiments remains challenging. Jeffery and coworkers found large discrepancies between gene lists generated by 10 different feature selection methods, including significance analysis of microarrays (SAM), analysis of variance (ANOVA), Empirical Bayes, and t-statistics [6]. Several studies have focused on finding robust methods for identification of DEGs [7–15]. However, as more methods become available, it is increasingly difficult to determine which method is most appropriate for a given experiment. Hence, it is necessary to objectively compare and evaluate different gene selection methods [6, 16–18], which produce different numbers of DEGs and different false discovery rate (FDR) estimates [19].
FDR is determined by several factors, such as the proportion of DEGs, gene expression variability, and sample size [20]. Controlling the FDR can be too stringent, resulting in a large number of false negatives [21–23]. Therefore, determination of an appropriate threshold is critical for effectively identifying truly differentially expressed genes while minimizing both false positives and false negatives. A recent study using a cross-validation approach showed that optimal selection of the FDR threshold could provide good performance in model selection and prediction [24]. Although many researchers have made considerable progress in improving FDR estimation and control [25–27], as well as other significance criteria [28–31], the instability resulting from the high level of noise in microarray gene expression experiments cannot be completely eliminated. There is therefore a great need to establish meaningful statistical significance and FDR thresholds by incorporating biological function.
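To make the FDR-control discussion concrete, the standard Benjamini-Hochberg step-up procedure (the "BH FDR" correction applied later in Table 2) can be sketched in a few lines of Python; this is an illustrative implementation for exposition, not code from the study:

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.1):
    """Benjamini-Hochberg step-up procedure: return a boolean mask of
    hypotheses declared significant at FDR level alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # find the largest k with p_(k) <= (k/m) * alpha
    thresholds = (np.arange(1, m + 1) / m) * alpha
    below = ranked <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # index of largest qualifying p-value
        reject[order[: k + 1]] = True      # reject that one and all smaller
    return reject

# five strong signals among eight tests: all five survive at FDR 0.1
pvals = [0.001, 0.001, 0.001, 0.001, 0.001, 0.2, 0.5, 0.9]
print(benjamini_hochberg(pvals, alpha=0.1).sum())  # -> 5
```

Whether any given gene survives depends on the whole p-value distribution, which is one reason the same nominal FDR level can behave very differently across experiments.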
Recently, Chuchana et al. integrated gene pathway information into microarray data analysis to determine the threshold for identification of DEGs [32]. By comparing a few biological parameters, such as the total number of networks and the number of genes shared among pathways, they set the statistical threshold according to the amount of biological information obtained from the DEGs [32]. This study appears to be the first attempt to objectively determine the DEG threshold based on biological function. However, it has several limitations. First, the method relied on Ingenuity pathway analysis, which may be biased toward well-studied genes and limited by human curation. Second, the threshold selection is defined iteratively. Finally, the approach is manual, which is not realistic for large-scale genome-wide applications.
A number of groups have developed computational methods to measure functional similarities among genes using annotations in Gene Ontology (GO) and other curated databases [33–38]. For example, Chabalier et al. showed that each gene can be represented as a vector containing a set of GO terms [34]. Each term was assigned a different weight according to the number of genes annotated by that term and the total number of annotated genes in the collection. Thus, GO-based similarity of gene pairs was calculated using a vector space model. Other studies have focused not only on using GO annotations to calculate gene-gene functional similarities but also on determining the functional coherence of a gene set. Recently, Richards et al. utilized the topological properties of a GO-based graph to estimate the functional coherence of gene sets [38]. They developed a set of metrics that considers both the enrichment of GO terms and their semantic relationships. This method was shown to be robust in identifying coherent gene sets compared with random sets obtained from microarray datasets.
Previously, we developed a method that utilizes Latent Semantic Indexing (LSI), a variant of the vector space model of information retrieval, to determine functional relationships between genes from Medline abstracts [39]. This method was shown to be robust and accurate in identifying both explicit and implicit gene relationships using a hand-curated set of genes. More recently, we applied this approach to determine the functional cohesion of gene sets using the biomedical literature [40]. We showed that the LSI-derived gene set cohesion was consistent across more than 6000 GO categories. We also showed that this literature-based method could be used to compare the cohesion of gene sets obtained from microarray experiments [40]. Subsequently, we applied this method to evaluate various microarray normalization procedures [41]. In the present study, we aimed to develop and test a robust literature-based method for evaluating the overall quality, as determined by functional cohesion, of microarray experiments. In addition, we describe a novel method that uses literature-derived functional cohesion to determine the threshold for expression p-value and FDR cutoffs in microarray analysis.
Methods
Gene-document collection and similarity matrix generation
All titles and abstracts of the Medline citations cross-referenced in the mouse, rat and human Entrez Gene entries as of 2010 were concatenated to construct gene-documents, and gene-gene similarity scores were calculated by LSI, as previously described [39, 40, 42]. Briefly, a term-by-gene matrix was created for mouse and human genes, where the entries of the matrix were the log-entropy weights of terms in the document collection. Then, a truncated singular value decomposition (SVD) of that matrix was performed to create a lower-dimension (reduced-rank) matrix. Genes were then represented as vectors in the reduced-rank matrix, and the similarity between genes was calculated as the cosine of the angle between their vectors. Gene-to-gene similarity was calculated using the first 300 factors, which has been shown to perform well for large document collections [43].
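The pipeline above can be sketched with numpy; the function names and the toy term-by-gene counts are ours for illustration (the actual study used Medline-scale gene-documents and the first 300 SVD factors):

```python
import numpy as np

def log_entropy_weight(counts):
    """Log-entropy weighting of a term-by-gene count matrix (terms x genes):
    local weight log(1 + tf), global weight 1 + sum_j p_ij log(p_ij) / log(n)."""
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[1]
    p = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1e-12)
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    global_w = 1.0 + plogp.sum(axis=1) / np.log(n_docs)  # entropy weight per term
    return np.log(1.0 + counts) * global_w[:, None]

def lsi_gene_similarity(term_gene_counts, k=300):
    """Cosine similarity between genes in a rank-k LSI space."""
    A = log_entropy_weight(term_gene_counts)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, s.size)
    genes = Vt[:k].T * s[:k]                      # gene vectors in reduced space
    norms = np.linalg.norm(genes, axis=1, keepdims=True)
    genes = genes / np.maximum(norms, 1e-12)
    return genes @ genes.T                        # gene-by-gene cosine matrix

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(200, 10))         # toy term-by-gene counts
S = lsi_gene_similarity(counts, k=5)
print(S.shape)
```

In practice, the truncated SVD of a Medline-scale matrix would be computed with a sparse solver (e.g., scipy.sparse.linalg.svds) rather than a dense decomposition.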
Calculation of literature-based functional significance (LBFS)
Literature aided statistical significance threshold (LASST)
(1) Specify an increasing sequence of EPv statistical significance thresholds α_{1}, ⋯, α_{m} and generate DEG sets at these specified significance levels.
(2) For each DEG set generated in (1), estimate the LCI using the subsampling procedure described above, to obtain pairs (α_{i}, L_{i}), i = 1, 2, ⋯, m.
(3) Choose an integer m_{0} (3 by default) and perform two-piece linear fits to the curve as follows: for k = m_{0}, m_{0}+1, ⋯, m−m_{0}, fit a straight line by least squares to the points (α_{j}, L_{j}), j = 1, 2, ⋯, k (the left piece) to obtain intercept and slope ${\widehat{\beta}}_{0k}^{L}$, ${\widehat{\beta}}_{1k}^{L}$. Similarly, fit a straight line to the points (α_{j}, L_{j}), j = k+1, k+2, ⋯, m (the right piece) to obtain intercept and slope ${\widehat{\beta}}_{0k}^{R}$, ${\widehat{\beta}}_{1k}^{R}$. Compute ${V}_{k}=\left({\widehat{\beta}}_{0k}^{L}-{\widehat{\beta}}_{0k}^{R}\right)+\left({\widehat{\beta}}_{1k}^{R}-{\widehat{\beta}}_{1k}^{L}\right)$.
(4) Let k* be the first local maximum of V_{k} (k = m_{0}, m_{0}+1, ⋯, m−m_{0}), that is, ${k}^{*}=\text{min}\left\{j:{V}_{j}\ge {V}_{j+1}\right\}$.
(5) Take the k*-th entry of the α sequence specified in (1) as the EPv significance cutoff.
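Assuming the (α_i, L_i) pairs have already been computed, steps (3)-(5) can be sketched as follows; the toy curve is ours for illustration, and the V_k expression follows our reading of the formula in step (3):

```python
import numpy as np

def lasst_cutoff(alphas, lcis, m0=3):
    """Pick the EPv cutoff by two-piece linear fits to the (alpha, LCI) curve:
    compute V_k = (b0L - b0R) + (b1R - b1L) for each split point k and
    take the alpha at the first local maximum of V_k (steps 3-5)."""
    a = np.asarray(alphas, dtype=float)
    L = np.asarray(lcis, dtype=float)
    m = a.size
    ks, Vs = [], []
    for k in range(m0, m - m0 + 1):
        # left piece: points 1..k ; right piece: points k+1..m
        b1L, b0L = np.polyfit(a[:k], L[:k], 1)
        b1R, b0R = np.polyfit(a[k:], L[k:], 1)
        ks.append(k)
        Vs.append((b0L - b0R) + (b1R - b1L))
    # first local maximum: smallest j with V_j >= V_{j+1}
    for j in range(len(Vs) - 1):
        if Vs[j] >= Vs[j + 1]:
            return a[ks[j] - 1]
    return a[ks[-1] - 1]

# toy LCI curve that rises steeply and then flattens near alpha = 0.01
alphas = np.array([0.001, 0.003, 0.005, 0.008, 0.01, 0.02, 0.03, 0.04, 0.05])
lcis = np.array([0.10, 0.25, 0.40, 0.50, 0.55, 0.56, 0.565, 0.57, 0.572])
print(lasst_cutoff(alphas, lcis))
```

The two-piece fit locates the "knee" of the LCI curve, past which relaxing the EPv threshold no longer buys additional literature cohesion.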
Microarray data analysis
To test the performance of our approach, we randomly chose three publicly available microarray datasets from Gene Expression Omnibus (GEO): 1) interleukin-2 responsive (IL2) genes [44]; 2) PGC-1beta related (PGC1beta) genes [45]; 3) Endothelin-1 responsive (ET1) genes [46]. To be able to compare across these datasets, we focused only on experiments using the Affymetrix Mouse 430 2.0 platform. All datasets (.cel files) were imported into GeneSpring GX 11 and processed using MAS5 summarization and quantile normalization. Probes with all absent calls were removed from subsequent analysis. As discussed earlier, the content and literature cohesion of a DEG set can depend largely on the statistical test. For this reason, four popular statistical tests, the empirical Bayes approach [47], Student t-test, Welch t-test, and Mann-Whitney test, were performed to identify DEGs at a statistical significance level of 0.05.
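As a sketch of this comparison, three of the four tests can be run per gene with scipy on toy data (the empirical Bayes approach requires a dedicated package such as limma and is omitted; the group sizes and effect sizes below are ours for illustration):

```python
import numpy as np
from scipy import stats

def degs_per_test(expr, group_a, group_b, alpha=0.05):
    """Per-gene p-values for three tests; expr is a genes x samples matrix,
    group_a/group_b are sample (column) indices. Returns indices of DEGs."""
    A, B = expr[:, group_a], expr[:, group_b]
    pvals = {}
    pvals["student_t"] = stats.ttest_ind(A, B, axis=1, equal_var=True).pvalue
    pvals["welch_t"] = stats.ttest_ind(A, B, axis=1, equal_var=False).pvalue
    pvals["mann_whitney"] = np.array(
        [stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
         for a, b in zip(A, B)]
    )
    return {name: np.flatnonzero(p < alpha) for name, p in pvals.items()}

rng = np.random.default_rng(1)
expr = rng.normal(size=(500, 6))      # 500 genes, 3 samples per group
expr[:50, 3:] += 2.0                  # 50 genes shifted upward in group B
degs = degs_per_test(expr, [0, 1, 2], [3, 4, 5])
print({name: len(idx) for name, idx in degs.items()})
```

Notably, with only three samples per group the exact two-sided Mann-Whitney test cannot produce a p-value below 0.1, so it identifies no DEGs at the 0.05 level at all; this illustrates the small-sample power limitation of nonparametric tests discussed later.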
Results
Comparison of various statistical tests using LBFS
Table 1. Literature-based functional significance (LBFS) of gene sets generated by four statistical tests for three different microarray experiments.

                 |          LCI           |            LBFS
Gene list        | PGC1beta  IL2    ET1   | PGC1beta   IL2      ET1
Welch t-test     |   0.34    0.34   0.17  | 7.08E-06   0.0004   0.45
Mann-Whitney     |   0.20    0.20   0.13  | 0.118      0.0075   1
Student t-test   |   0.38    0.38   0.10  | 1.24E-07   0.071    1
Empirical Bayes  |   0.40    0.19   0.05  | 1.36E-08   0.11     1
Determination of EPv threshold using LASST
In the above analysis, DEGs were selected using an arbitrary statistical threshold of p < 0.05, as is the case for many published expression studies. In reality, however, there is no biological reason why this threshold is selected for an experiment. Once the appropriate statistical test was chosen by application of LBFS above, we tested whether literature cohesion could be applied to determine the EPv cutoff. We developed another method, called Literature Aided Statistical Significance Threshold (LASST), which determines the EPv by a two-piece linear fit of the LCI curve as a function of EPv, as described in Methods. LASST was applied to p-values produced by empirical Bayes for the PGC1beta experiment and by the Welch t-test for the IL2 and ET1 experiments. DEGs were produced at each point on a grid of unequally spaced statistical significance levels (α = 0.001, 0.003, 0.005, ⋯). In computing the LCI, the LPv level was set to 0.05, and the size of the gene subsets drawn from the DEG pool was set to 50 in the subsampling procedure, as described in Methods. The LCI of each DEG set was plotted against the various α levels of the EPv (Figure 2). Interestingly, application of LASST determined an EPv significance threshold of 0.01 (corresponding LCI 0.55) for the PGC1beta dataset and 0.02 (LCI 0.315) for the IL2 dataset. None of the DEG sets from the ET1 experiment had appreciable LCI, which remained consistently low across the α levels (Figure 2). Thus, an EPv threshold could not be determined using the LCI approach for the ET1 dataset. These results are consistent with what we observed above (Table 1).
Table 2. Number of significant genes identified by Student t-test after correction for multiple hypothesis testing.

         | # of tests | # of genes, p < 0.05 | Storey pFDR q < 0.1 | BH FDR < 0.1 | Bonferroni FWER < 0.1 | Westfall-Young permutation
IL2      |   20558    |        5001          |        5955         |     3827     |          32           |            95
PGC1beta |   17633    |        2618          |           1         |        1     |           1           |             1
ET1      |   20477    |        1559          |           0         |        0     |           0           |             0
Discussion
Although microarray technology has become common and affordable, analysis and interpretation of microarray data remain challenging. Experimental design and data quality can severely affect the results and conclusions drawn from a microarray experiment. Using our approach, we found that some datasets (e.g., PGC1beta) produced more functionally cohesive gene sets than others (e.g., ET1). There can be many biological or technological reasons for the lack of cohesion in any microarray dataset. For instance, it is possible that the experimental perturbation (or signaling pathway) simply did not alter mRNA expression levels in that system as hypothesized. It is also possible that the data are noisy due to technical or biological variation, resulting in false differential expression. Although our method will not identify the causes of this variation, it can help assess the overall quality of the experiment and provide feedback to investigators so they can adjust their experimental procedures. For example, after observing a low LBFS value, an investigator may choose to remove outlier samples or add more replicates to the study design.
It is important to note that a low cohesion value could be due to a lack of information in the biomedical literature. In other words, it is possible that the microarray experiment has uncovered new gene associations that have not been previously reported in the literature. This issue would affect any method that relies on human-curated databases or natural language processing of the biomedical literature. However, our LSI method presents a unique advantage over other approaches because it extracts both explicit and implicit gene associations, based on weighted term usage patterns in the literature. Consequently, gene associations are ranked based on their conceptual relationships and not on specific interactions documented in the literature. Thus, we posit that LSI is particularly suited for analysis of discovery-oriented genomic studies, which are geared toward identifying new gene associations. Further work is necessary to determine exactly how (whether explicitly or implicitly) a subset of functionally cohesive genes are related to one another in the LSI model.
A major challenge in microarray analysis involves selection of the appropriate statistical test; different tests have different assumptions about the data distribution and result in different DEG sets. For instance, parametric methods are based on the assumption that the observations follow a normal distribution, an assumption that is rarely satisfied in microarray data even after normalization. Nonparametric methods are distribution free and do not make any assumptions about the population from which the samples are drawn. However, nonparametric tests lack statistical power with small samples, which is often the case in microarray studies. In this study, we found that although the Mann-Whitney nonparametric test identified the largest number of DEGs for the PGC1beta experiment, the DEGs were not functionally significant (Table 1). We also found that some tests were selectively better for some experiments. For example, the empirical Bayes method produced the best results for the PGC1beta experiment, while the Welch t-test produced the best results for the IL2 experiment. Taken together, these results demonstrate that our approach provides an objective, literature-based means to evaluate the appropriateness of different statistical tests for a given experiment.
Several groups have developed methods to assess functional cohesion or refine feature selection by incorporating biological information from either the primary literature or curated databases [38, 48–50]. To our knowledge, a literature-based approach to evaluate the overall quality of microarray experiments has not been reported. Although we did not extensively compare our approach with these methods, we performed a preliminary comparison with the well-known Gene Set Enrichment Analysis (GSEA) method [49]. GSEA calculates the enrichment p-value for biological pathways in curated databases for a given set of DEGs. Presumably, if a microarray experiment is biologically significant, then a higher number of relevant pathways should be enriched. Indeed, we found that GSEA identified 410, 309 and 283 enriched pathway gene sets with FDR < 0.25 for the PGC1beta, IL2 and ET1 experiments, respectively. These results correlated well with our LBFS findings, which showed that DEGs obtained from PGC1beta and IL2 were more functionally significant than those from ET1. However, GSEA identified a substantial number of enriched pathways for ET1. One issue is that GSEA only focuses on gene subsets and not the entire DEG list. Thus, it does not evaluate the overall cohesion or functional significance of the DEG list. In addition, since GSEA relies on human-curated databases such as GO and KEGG, it is susceptible to curation biases, which favor well-known genes and pathways and contain limited information on other genes.
Assuming that a microarray experiment is of high quality and an appropriate statistical test has been selected, selection of the expression p-value cutoff still remains arbitrary in nearly all published studies. In our work, we found a positive correlation between the literature cohesion index and EPv (Figure 2). Based on the distribution of LCI with respect to EPv, we devised a method (called LASST) that empirically determines the EPv cutoff value. Not surprisingly, we found that different EPv cutoffs should be used for the different microarray experiments that we examined. Indeed, we found that application of LASST resulted in a smaller p-value threshold and a substantially smaller number of DEGs for both the IL2 and PGC1beta experiments. Therefore, LASST enables researchers to narrow their gene lists and focus on biologically important genes for further experimentation.
Finally, another major challenge for microarray analysis is the propensity for a high false discovery rate (FDR) caused by multiple hypothesis testing. Corrections for multiple hypothesis testing, including family-wise error rate (FWER) control, are often too stringent and may lead to a large number of false negatives. As with the EPv cutoff concerns above, setting the FDR threshold at levels 0.01, 0.05, or 0.1 does not have any biological meaning [29]. For instance, no false positive error correction method produced adequate DEGs for the PGC1beta and ET1 experiments. However, our analysis showed that the PGC1beta dataset was biologically very cohesive (Table 1). This suggests that applying FDR correction to this dataset would produce a very large number of false negatives. Another important finding of our study is that the false positive error correction procedures appear to be sensitive to DEG set size. For instance, using the Student t-test, the IL2 dataset yielded 5001 DEGs with a p-value < 0.05, whereas the Storey FDR method produced 5955 at q < 0.1. However, our literature-based analysis revealed that the IL2 dataset produced less biologically cohesive DEGs than the PGC1beta dataset, which showed only one gene with q < 0.1. In the future, it will be important to expand these preliminary observations to a larger set of microarray experiments and to determine the precise relationships between false positive correction methods and biological significance.
Conclusions
In this study, we developed a robust methodology to evaluate the overall quality of microarray experiments, to compare the appropriateness of different statistical methods, and to determine expression p-value thresholds using functional information in the biomedical literature. Using our approach, we showed that the quality, as measured by the biological cohesion of DEGs, can vary greatly between microarray experiments. In addition, we demonstrated that the choice of statistical test should be carefully considered, because different tests produce different DEGs with varying degrees of biological significance. Importantly, we also demonstrated that procedures that control false positive rates are often too conservative and favor larger DEG sets without considering biological significance. The methods developed herein can facilitate analysis and interpretation of microarray experiments. Moreover, they provide a biological metric for filtering the vast number of publicly available microarray experiments for subsequent meta-analysis and systems biology research.
Abbreviations
ANOVA: analysis of variance
DEGs: differentially expressed genes
EPv: expression p-value
ET1: Endothelin-1 responsive
FDR: false discovery rate
GCA: gene-set cohesion analysis
GCAT: Gene-set Cohesion Analysis Tool
GEO: Gene Expression Omnibus
IL2: interleukin-2 responsive
LASST: Literature Aided Statistical Significance Threshold
LBFS: Literature-Based Functional Significance
LCI: literature cohesion index
LPv: literature cohesion p-value
LSI: Latent Semantic Indexing
MAQC: Microarray Quality Control
PGC1beta: PGC-1beta related
SAM: significance analysis of microarrays
SVD: singular value decomposition
Declarations
Acknowledgements
We thank Dr. Kevin Heinrich (Computable Genomix, Memphis, TN) for providing the genegene association data. This work was supported by The Assisi Foundation of Memphis and The University of Memphis Bioinformatics Program.
This article has been published as part of BMC Genomics Volume 13 Supplement 8, 2012: Proceedings of The International Conference on Intelligent Biology and Medicine (ICIBM): Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/13/S8.
References
 Luo J, Schumacher M, Scherer A, Sanoudou D, Megherbi D, Davison T, Shi T, Tong W, Shi L, Hong H, Zhao C, Elloumi F, Shi W, Thomas R, Lin S, Tillinghast G, Liu G, Zhou Y, Herman D, Li Y, Deng Y, Fang H, Bushel P, Woods M, Zhang J: A comparison of batch effect removal methods for enhancement of prediction performance using MAQCII microarray gene expression data. Pharmacogenomics J. 2010, 10: 278291. 10.1038/tpj.2010.57.PubMed CentralView ArticlePubMedGoogle Scholar
 Scherer A: Batch Effects and Noise in Microarray Experiments: Sources and Solutions. Wiley Series Probability Statistics. 2009View ArticleGoogle Scholar
 Chen JJ, Hsueh HM, Delongchamp RR, Lin CJ, Tsai CA: Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data. BMC Bioinformatics. 2007, 8: 41210.1186/147121058412.PubMed CentralView ArticlePubMedGoogle Scholar
 Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, Luo Y, Sun YA, Willey JC, Setterquist RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM, Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK, Schrf U, ThierryMieg J, Wang C, Wilson M, Wolber PK, et al: The MicroArray Quality Control (MAQC) project shows interand intraplatform reproducibility of gene expression measurements. Nat Biotechnol. 2006, 24: 11511161. 10.1038/nbt1239.View ArticlePubMedGoogle Scholar
 Shi L, Campbell G, Jones WD, Campagne F, Wen Z, Walker SJ, Su Z, Chu TM, Goodsaid FM, Pusztai L, Shaughnessy JD, Oberthuer A, Thomas RS, Paules RS, Fielden M, Barlogie B, Chen W, Du P, Fischer M, Furlanello C, Gallas BD, Ge X, Megherbi DB, Symmans WF, Wang MD, Zhang J, Bitter H, Brors B, Bushel PR, Bylesjo M, et al: The MicroArray Quality Control (MAQC)II study of common practices for the development and validation of microarraybased predictive models. Nat Biotechnol. 2010, 28: 827838. 10.1038/nbt.1665.View ArticlePubMedGoogle Scholar
 Jeffery IB, Higgins DG, Culhane AC: Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data. BMC Bioinformatics. 2006, 7: 35910.1186/147121057359.PubMed CentralView ArticlePubMedGoogle Scholar
 Kadota K, Konishi T, Shimizu K: Evaluation of two outlierdetectionbased methods for detecting tissueselective genes from microarray data. Gene Regul Syst Bio. 2007, 1: 915.PubMed CentralPubMedGoogle Scholar
 Kadota K, Nakai Y, Shimizu K: Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity. Algorithms Mol Biol. 2009, 4: 710.1186/1748718847.PubMed CentralView ArticlePubMedGoogle Scholar
 Pearson RD: A comprehensive reanalysis of the Golden Spike data: towards a benchmark for differential expression methods. BMC Bioinformatics. 2008, 9: 16410.1186/147121059164.PubMed CentralView ArticlePubMedGoogle Scholar
 Jung K, Friede T, Beiszbarth T: Reporting FDR analogous confidence intervals for the log fold change of differentially expressed genes. BMC Bioinformatics. 2011, 12: 28810.1186/1471210512288.PubMed CentralView ArticlePubMedGoogle Scholar
 Hu J, Xu J: Density based pruning for identification of differentially expressed genes from microarray data. BMC Genomics. 2010, 11 (Suppl 2): S310.1186/1471216411S2S3.PubMed CentralView ArticlePubMedGoogle Scholar
 Wille A, Gruissem W, Buhlmann P, Hennig L: EVE (external variance estimation) increases statistical power for detecting differentially expressed genes. Plant J. 2007, 52: 561569. 10.1111/j.1365313X.2007.03227.x.View ArticlePubMedGoogle Scholar
 Elo LL, Katajamaa M, Lund R, Oresic M, Lahesmaa R, Aittokallio T: Improving identification of differentially expressed genes by integrative analysis of Affymetrix and Illumina arrays. Omics. 2006, 10: 369380. 10.1089/omi.2006.10.369.View ArticlePubMedGoogle Scholar
 Lai Y: On the identification of differentially expressed genes: improving the generalized Fstatistics for Affymetrix microarray gene expression data. Comput Biol Chem. 2006, 30: 321326. 10.1016/j.compbiolchem.2006.06.002.View ArticlePubMedGoogle Scholar
 Kim RD, Park PJ: Improving identification of differentially expressed genes in microarray studies using information from public databases. Genome Biol. 2004, 5: R7010.1186/gb200459r70.PubMed CentralView ArticlePubMedGoogle Scholar
 Murie C, Woody O, Lee AY, Nadon R: Comparison of small n statistical tests of differential expression applied to microarrays. BMC Bioinformatics. 2009, 10: 4510.1186/147121051045.PubMed CentralView ArticlePubMedGoogle Scholar
 Bullard JH, Purdom E, Hansen KD, Dudoit S: Evaluation of statistical methods for normalization and differential expression in mRNASeq experiments. BMC Bioinformatics. 2010, 11: 9410.1186/147121051194.PubMed CentralView ArticlePubMedGoogle Scholar
 Dozmorov MG, Guthridge JM, Hurst RE, Dozmorov IM: A comprehensive and universal method for assessing the performance of differential gene expression analyses. PLoS One. 2010, 5:Google Scholar
 Slikker W: Of genomics and bioinformatics. Pharmacogenomics J. 2010, 10: 245246. 10.1038/tpj.2010.59.View ArticlePubMedGoogle Scholar
 Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A: False discovery rate, sensitivity and sample size for microarray studies. Bioinformatics. 2005, 21: 30173024. 10.1093/bioinformatics/bti448.View ArticlePubMedGoogle Scholar
 Ishwaran H, Rao JS, Kogalur UB: BAMarraytrade mark: Java software for Bayesian analysis of variance for microarray data. BMC Bioinformatics. 2006, 7: 5910.1186/14712105759.PubMed CentralView ArticlePubMedGoogle Scholar
 Ploner A, Calza S, Gusnanto A, Pawitan Y: Multidimensional local false discovery rate for microarray studies. Bioinformatics. 2006, 22: 556565. 10.1093/bioinformatics/btk013.View ArticlePubMedGoogle Scholar
 Jiao S, Zhang S: The tmixture model approach for detecting differentially expressed genes in microarrays. Funct Integr Genomics. 2008, 8: 181186. 10.1007/s1014200700716.View ArticlePubMedGoogle Scholar
 Graf AC, Bauer P: Model selection based on FDRthresholding optimizing the area under the ROCcurve. Stat Appl Genet Mol Biol. 2009, 8: Article31Google Scholar
 Lu X, Perkins DL: Resampling strategy to improve the estimation of number of null hypotheses in FDR control under strong correlation structures. BMC Bioinformatics. 2007, 8: 15710.1186/147121058157.PubMed CentralView ArticlePubMedGoogle Scholar
 Pounds S, Cheng C: Improving false discovery rate estimation. Bioinformatics. 2004, 20: 17371745. 10.1093/bioinformatics/bth160.View ArticlePubMedGoogle Scholar
 Xie Y, Pan W, Khodursky AB: A note on using permutationbased false discovery rate estimates to compare different analysis methods for microarray data. Bioinformatics. 2005, 21: 42804288. 10.1093/bioinformatics/bti685.View ArticlePubMedGoogle Scholar
 Cheng C: An adaptive significance threshold criterion for massive multiple hypothesis testing. Optimality: The Second Erich L. Lehmann Symposium, Institute of Mathematical Statistics, Beachwood, OH, USA. 2006, 49: 5176.View ArticleGoogle Scholar
 Cheng C, Pounds SB, Boyett JM, Pei D, Kuo ML, Roussel MF: Statistical significance threshold criteria for analysis of microarray gene expression data. Stat Appl Genet Mol Biol. 2004, 3: Article36Google Scholar
 Dudoit S, van der Laan MJ, Pollard KS: Multiple testing. Part I. Singlestep procedures for control of general type I error rates. Stat Appl Genet Mol Biol. 2004, 3: Article13Google Scholar
 Genovese CWL: Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society, Series B. 2002, 64: 499517. 10.1111/14679868.00347.View ArticleGoogle Scholar
 Chuchana P, Holzmuller P, Vezilier F, Berthier D, Chantal I, Severac D, Lemesre JL, Cuny G, Nirde P, Bucheton B: Intertwining threshold settings, biological data and database knowledge to optimize the selection of differentially expressed genes from microarray. PLoS One. 2010, 5: e13518. 10.1371/journal.pone.0013518.
 Wang JZ, Du Z, Payattakool R, Yu PS, Chen CF: A new method to measure the semantic similarity of GO terms. Bioinformatics. 2007, 23: 1274-1281. 10.1093/bioinformatics/btm087.
 Chabalier J, Mosser J, Burgun A: A transversal approach to predict gene product networks from ontology-based similarity. BMC Bioinformatics. 2007, 8: 235. 10.1186/1471-2105-8-235.
 Huang da W, Sherman BT, Tan Q, Collins JR, Alvord WG, Roayaei J, Stephens R, Baseler MW, Lane HC, Lempicki RA: The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol. 2007, 8: R183. 10.1186/gb-2007-8-9-r183.
 Schlicker A, Domingues FS, Rahnenfuhrer J, Lengauer T: A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics. 2006, 7: 302. 10.1186/1471-2105-7-302.
 Ruths T, Ruths D, Nakhleh L: GS2: an efficiently computable measure of GO-based similarity of gene sets. Bioinformatics. 2009, 25: 1178-1184. 10.1093/bioinformatics/btp128.
 Richards AJ, Muller B, Shotwell M, Cowart LA, Rohrer B, Lu X: Assessing the functional coherence of gene sets with metrics based on the Gene Ontology graph. Bioinformatics. 2010, 26: i79-i87. 10.1093/bioinformatics/btq203.
 Homayouni R, Heinrich K, Wei L, Berry MW: Gene clustering by latent semantic indexing of MEDLINE abstracts. Bioinformatics. 2005, 21: 104-115. 10.1093/bioinformatics/bth464.
 Xu L, Furlotte N, Lin Y, Heinrich K, Berry MW, George EO, Homayouni R: Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts. PLoS One. 2011, 6: e18851. 10.1371/journal.pone.0018851.
 Furlotte N, Xu L, Williams RW, Homayouni R: Literature-based evaluation of microarray normalization procedures. BIBM 2011. 2011, 608-612.
 Berry MW, Browne M: Understanding Search Engines: Mathematical Modeling and Text Retrieval. SIAM, Philadelphia. 1999.
 Landauer TK, Laham D, Derr M: From paragraph to graph: latent semantic analysis for information visualization. Proc Natl Acad Sci USA. 2004, 101 (Suppl 1): 5214-5219.
 Zhang Z, Martino A, Faulon JL: Identification of expression patterns of IL-2-responsive genes in the murine T cell line CTLL-2. J Interferon Cytokine Res. 2007, 27: 991-995. 10.1089/jir.2006.0169.
 Vianna CR, Huntgeburth M, Coppari R, Choi CS, Lin J, Krauss S, Barbatelli G, Tzameli I, Kim YB, Cinti S, Shulman GI, Spiegelman BM, Lowell BB: Hypomorphic mutation of PGC-1beta causes mitochondrial dysfunction and liver insulin resistance. Cell Metab. 2006, 4: 453-464. 10.1016/j.cmet.2006.11.003.
 Vallender TW, Lahn BT: Localized methylation in the key regulator gene endothelin-1 is associated with cell type-specific transcriptional silencing. FEBS Lett. 2006, 580: 4560-4566. 10.1016/j.febslet.2006.07.017.
 Smyth GK: Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004, 3: Article 3.
 Raychaudhuri S, Altman RB: A literature-based method for assessing the functional coherence of a gene group. Bioinformatics. 2003, 19: 396-401. 10.1093/bioinformatics/btg002.
 Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005, 102: 15545-15550. 10.1073/pnas.0506580102.
 Tian L, Greenberg SA, Kong SW, Altschuler J, Kohane IS, Park PJ: Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci USA. 2005, 102: 13544-13549. 10.1073/pnas.0506577102.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.