A simple optimization can improve the performance of single feature polymorphism detection by Affymetrix expression arrays
© Horiuchi et al; licensee BioMed Central Ltd. 2010
- Received: 18 February 2010
- Accepted: 20 May 2010
- Published: 20 May 2010
High-density oligonucleotide arrays are effective tools for genotyping numerous loci simultaneously. In small genome species (genome size: < ~300 Mb), whole-genome DNA hybridization to expression arrays has been used for various applications. In large genome species, transcript hybridization to expression arrays has been used for genotyping. Although rice is a fully sequenced model plant of medium genome size (~400 Mb), there are few examples of the use of rice oligonucleotide arrays as a genotyping tool.
We compared the single feature polymorphism (SFP) detection performance of whole-genome and transcript hybridizations using the Affymetrix GeneChip® Rice Genome Array, using the rice cultivars with full genome sequence, japonica cultivar Nipponbare and indica cultivar 93-11. Both genomes were surveyed for all probe target sequences. Only completely matched 25-mer single copy probes of the Nipponbare genome were extracted, and SFPs between them and 93-11 sequences were predicted. We investigated optimum conditions for SFP detection in both whole genome and transcript hybridization using differences between perfect match and mismatch probe intensities of non-polymorphic targets, assuming that these differences are representative of those between mismatch and perfect targets. Several statistical methods of SFP detection by whole-genome hybridization were compared under the optimized conditions. Causes of false positives and negatives in SFP detection in both types of hybridization were investigated.
The optimizations allowed a more than 20% increase in true SFP detection in whole-genome hybridization and a large improvement of SFP detection performance in transcript hybridization. Significance Analysis of Microarrays (SAM) applied to log-transformed raw intensities of PM probes gave the best performance in whole-genome hybridization, detecting 22,936 true SFPs with 23.58% false positives. For transcript hybridization, stable SFP detection was achieved for highly expressed genes, and about 3,500 SFPs were detected at a high sensitivity (> 50%) in both shoot and young panicle transcripts. The high SFP detection performances of both genome and transcript hybridizations indicate that microarrays of a complex genome (e.g., that of Oryza sativa) can be effectively utilized for whole-genome genotyping to conduct mutant mapping and analysis of quantitative traits such as gene expression levels.
- Perfect Match
- Affymetrix GeneChip
- Young Panicle
- Single Feature Polymorphism
- Nipponbare Genome
High-density oligonucleotide arrays are currently the most widely used high-throughput technology for whole-genome gene expression studies. Array technology also makes it possible to genotype thousands of nucleotide polymorphisms (NPs) efficiently, and NPs detected in this way are called single feature polymorphisms (SFPs). Beyond measuring gene expression levels, expression arrays have been used with whole-genome DNA hybridization for various applications in small genome species: mutant mapping in yeast (genome size: ~12 Mb) and Arabidopsis (genome size: ~125 Mb) by bulk segregant analysis [2–7]; quantitative trait loci extreme array mapping in Arabidopsis [4, 8, 9]; and comparative genomics in yeast, Arabidopsis, malaria mosquitoes (genome size: ~278 Mb), the human malarial parasite (genome size: ~23 Mb), and Mycobacterium tuberculosis (genome size: ~4.4 Mb). However, in large genome species, such as barley (genome size: ~5.2 Gb), Xenopus (genome size: ~3.1 Gb), and maize (genome size: ~2.5 Gb), whole-genome DNA hybridization to expression arrays has not worked well because of cross-hybridization. Although applications of oligonucleotide expression arrays have been limited in large genome species, complementary RNA (cRNA) from their transcripts has been used to detect SFPs in barley [15, 18, 19], maize, wheat (genome size: ~17 Gb), and cowpea (genome size: ~600 Mb).
Rice is important as both a food and a model plant for the grasses. The genome size of rice (389 Mb) is relatively small amongst crop species, but is larger than that of malaria mosquitoes, which have the largest genome used in successful studies of whole-genome hybridization. Affymetrix supplies a 3'-expression array for rice, the Affymetrix GeneChip® Rice Genome Array (Santa Clara, CA, USA). A trial SFP detection using whole-genome hybridization to the rice array was reported by Kumar et al., and more than 70% SFP detection sensitivity at about 10% estimated FDR (False Discovery Rate) was verified by sequencing probe targets. SFP detection using rice transcripts was reported by Kim et al.; they detected 1,208 SFPs, and 60 out of 62 predicted SFPs were verified by sequencing predicted SFP-containing amplicons. However, because the sequenced targets were biased toward SFP-predicted ones, the reported sensitivity was probably higher than the true value. Genomes of two rice strains have been fully sequenced. One is japonica cultivar Nipponbare, which has been sequenced by a BAC-by-BAC approach, and the other is indica cultivar 93-11, which has been sequenced by a shotgun approach. Nucleotide differences in coding and 3'-untranslated (UT) regions of genes between the two strains were 3.0 single nucleotide polymorphisms (SNPs)/kb and 4.5 SNPs/kb, respectively. Because most rice GeneChip® probes were designed to target the coding and 3'-UT regions of japonica transcripts, about 4 SNPs/kb were expected on average in the target regions of the 93-11 genome; given that each probe is a 25-mer, about one in 10 probes was expected to detect an SFP in the 93-11 genome. The two fully sequenced rice strains provided us with an opportunity for detailed analyses of SFP detection efficiency.
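The expectation of roughly one SFP-detecting probe in 10 follows from simple arithmetic, sketched below in Python. The function names are illustrative only, and the Poisson step is an added assumption (SNPs occurring independently at a uniform rate along the target), not something stated in the text.

```python
import math


def expected_snps_per_probe(probe_length_bp: int, snps_per_kb: float) -> float:
    """Expected number of SNPs falling inside one probe target sequence."""
    return probe_length_bp * snps_per_kb / 1000.0


def prob_probe_has_snp(probe_length_bp: int, snps_per_kb: float) -> float:
    """Probability that a probe target contains at least one SNP.

    Assumption (not from the article): SNP counts per target are Poisson
    distributed with the mean returned by expected_snps_per_probe().
    """
    mean = expected_snps_per_probe(probe_length_bp, snps_per_kb)
    return 1.0 - math.exp(-mean)


if __name__ == "__main__":
    # 25-mer probes and about 4 SNPs/kb, as quoted in the text.
    print(expected_snps_per_probe(25, 4.0))
    print(prob_probe_has_snp(25, 4.0))
```

With 25-mer probes and 4 SNPs/kb, the expected count is 0.1 SNPs per probe, which is the basis of the "one in 10" estimate; under the Poisson assumption the fraction of probes with at least one SNP is slightly lower.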
First, we searched all probe target sequences in Nipponbare and 93-11 genomes and predicted SFPs between them. Second, we investigated optimized experimental conditions to detect these SFPs by whole-genome hybridization. Several statistical methods for SFP detection were compared using whole-genome hybridization data of Nipponbare and 93-11 cultivars. Effects of several background corrections were also examined for maximum SFP detection performance. Third, SFP detection efficiency by cRNA hybridization was also investigated by applying our recently proposed method for simultaneously detecting nucleotide and expression polymorphisms (SNEP) using the Affymetrix GeneChip® array. Finally, a comparison of benefits and limitations of SFP detection by whole-genome and transcript hybridization approaches was made.
BLASTN analysis of Affymetrix GeneChip® array probes
Summary of the BLASTN search of probe target sequences on Nipponbare and 93-11 genomes. Row categories: no hit, or a single hit that is not a perfect match, to the Nipponbare genome; multiple hits to the Nipponbare genome; unique probes in the Nipponbare genome; SFP probes for the 93-11 genome among the unique probes for the Nipponbare genome; probe sets with more than six unique probes in the Nipponbare genome; SFP probes to the 93-11 genome in the selected sets.
SFP detection by whole genome DNA hybridization
The rice genome, which is about 389 Mb in size, may generate much more noise relative to the true signal intensity in genomic DNA (gDNA) hybridization than the less complex genomes of organisms such as yeast and Arabidopsis. Thus, it is important to investigate conditions that maximize hybridization signal intensity differences between completely matched and NP-containing targets of probes on the Affymetrix GeneChip® arrays. There are two types of probes on the Affymetrix GeneChip® arrays: perfect match (PM) and mismatch (MM). Although the MM probes were designed with a single complementary substitution at the 13th base (midpoint) of each PM probe to represent non-specific hybridization signal values, many studies have indicated that MM probe intensities are frequently higher than those of the corresponding PM probes (designated: MM > PM) [27, 28]. Relative to the completely matched target of a PM probe, the MM probe sequence is an ideal NP-containing sequence, so the MM signal intensity should approximate the intensity expected for an SFP at that PM probe. This study thus began as an effort to investigate conditions that maximize the difference between MM and PM intensities for a completely matched target.
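The relationship between a PM probe and its MM partner can be sketched in a few lines of Python. This is a minimal illustration of the design rule described above (complementary substitution at the 13th base of a 25-mer); the function name and the example sequence are hypothetical and do not come from the array annotation.

```python
# Watson-Crick complement of each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(pm_probe: str, position: int = 13) -> str:
    """Return the MM sequence for a 25-mer PM probe.

    The base at the given 1-based position (13 = midpoint of a 25-mer)
    is replaced by its complement; all other bases are unchanged.
    """
    pm_probe = pm_probe.upper()
    if len(pm_probe) != 25:
        raise ValueError("expected a 25-mer probe sequence")
    i = position - 1  # convert to 0-based index
    return pm_probe[:i] + COMPLEMENT[pm_probe[i]] + pm_probe[i + 1:]


if __name__ == "__main__":
    pm = "ACGTACGTACGTACGTACGTACGTA"  # made-up 25-mer, for illustration only
    mm = mismatch_probe(pm)
    print(pm)
    print(mm)
```

Because the MM probe differs from the PM probe at exactly one central base, its signal on a completely matched target serves here as a proxy for the signal of a PM probe on a target carrying a central NP.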
SFP detection by gDNA hybridization.
SFP detection using SNEP by rice transcript cRNA hybridization
Number of SFPs detected by transcript cRNA hybridization.
Effects of low expressed gene elimination on SFP detection performances in transcript hybridization by SNEP.
In both shoot and young panicle ROC curves, FPR seemed to have a minimum at around 0.1 (Figure 6). Numbers of falsely called SFPs in shoot and young panicle data were 931 and 954, respectively, at a significance level of 10⁻⁶ (Table 3). Even at a more stringent threshold, a significance level of 10⁻¹⁸, 447 and 520 falsely called SFPs were observed, and 352 probes were common to both the shoot and panicle. Because most of the decrease in intensity of falsely called SFPs appeared to be similar to that of real NPs, we investigated this in detail. One possible cause of a falsely called SFP was gene duplication in the 93-11 genome. In the 266 probe sets with common falsely called SFPs, more than half of the target genes appeared to be present as multiple copies in the 93-11 genome by BLASTN searching, and the false-positive probes did not perfectly hit all of the multiple targets. Another possible cause of falsely called SFPs is alternative splicing or structural differences between the expressed gene and the gene model used to design the probes. To confirm the influence of alternative splicing and structural differences on SFP detection, we performed SNEP analysis between Nipponbare shoot and young panicle data, and found a total of 92 falsely called SFPs at a significance level of 10⁻¹⁸ (data not shown). In other words, intron-targeting probes were called as SFPs by SNEP when the expression level of a gene differed. Although 3,393 and 3,513 SFPs were correctly called from shoot and young panicle data, respectively, 2,812 SFPs were identical due to the similar expression profiles of the two tissues (Table 3). A disadvantage of SFP detection by transcript hybridization is that the ability to detect SFPs from RNA data depends on the expression level of a gene in a particular tissue type. To obtain a large number of SFPs, RNA profiling data from various tissues would be required.
In this study, about 40% of correctly called SFPs were detected by transcript hybridization and the remaining were detected by whole-genome hybridization (Table 3).
The distribution of highly expressed genes and correctly called SFPs in the Nipponbare genome was investigated in a manner similar to that of whole-genome hybridization, as indicated in the previous section. The distribution of probes of genes expressed highly in both the shoot and young panicle was the same as that of the unique probes used in this study (Additional file 2). The distribution of correctly called SFPs in both tissues corresponded with that of expressed genes.
Analysis of false-negative and -positive SFPs by whole-genome hybridization
Many new applications of oligonucleotide arrays have been developed in recent years. In this study, we describe a method to seek optimal conditions for SFP detection by both genome and mRNA hybridizations using the differences between PM and MM probe signal intensities of completely matched targets. These optimizations greatly improve SFP detection performances in both whole-genome and transcript hybridizations. This simple method is applicable to Affymetrix arrays of any species. In particular, for large genome species, this method will be useful for evaluating the feasibility of SFP detection.
Under the optimized conditions, SFP detection performances in both genome and mRNA hybridizations were evaluated using the whole genome sequences of the two sequenced strains of O. sativa, Nipponbare and 93-11. SFP detection by whole-genome hybridization in rice (sensitivity 38.9%, FPR 22.4%) was inferior to the reported SFP detection by genome hybridization in Arabidopsis (sensitivity 57%, FPR 13%). Following the Arabidopsis protocols, 40 μg of labeled product was obtained from 300 ng of rice gDNA. The labeled product (40 μg) was used in Arabidopsis whole-genome hybridization experiments, and the rice optimal concentration was ~20 times lower than that of Arabidopsis, considering their genome sizes. Although a lower target concentration was used for rice whole-genome hybridization and single copy probes were selected for analysis, PM and MM probe intensities could not be sufficiently differentiated. These limitations affected SFP detection. The GC content of rice genes was found to be higher than that of Arabidopsis [38, 39]. Thus, binding affinities of many rice probes are stronger than those of Arabidopsis probes, making it more difficult to detect SFPs effectively. In fact, for rice whole-genome hybridization, it was difficult to separate MM probe intensities in the Affymetrix Rice Genome Array from PM probe intensities, although overall PM probe intensities were maintained at the same level as those in the GeneChip® Arabidopsis ATH1 Genome Array for Arabidopsis whole-genome hybridization (data not shown).
Our results from the whole-genome approach suggested that false-positive probes were clustered due to amplification polymorphism caused by nearby SNPs. Thus, these false positives can be used as genetic markers if there is a real NP within 200 bp around them.
Using shoot and young panicle transcripts, SFP detection sensitivities (51% and 54%) were higher than with whole-genome hybridization, at a similar FPR of 21%. Compared with our previous results (a sensitivity of 65% and FPR of 10% for 1,901 probe sets from the canonical rice data of the young panicle), the SFP detection performance in the present study appears inferior. The differences between the previous canonical data and the present data are as follows: all probe sets in the canonical data consisted of 11 probes (in the present case, probe sets with more than 6 probes were used); all probes in the canonical data set were single copies in both Nipponbare and 93-11 genomes (in the present case, single copy only in the Nipponbare genome); and probe sets that consisted entirely of probes polymorphic to the 93-11 genome were eliminated from the canonical data (such sets were included in this study). These differences made it more difficult for SNEP to detect SFPs, because SNEP detects SFPs as outliers.
Using sequence analysis, in the focused 41,525 probe sets, 17,912 probe sets were expected to have SFP probes in the 93-11 genome (Table 1). However, a substantial number of predicted SFPs were excluded from SNEP analysis of transcripts, because the intensity of a probe with an NP was affected only when the gene was expressed at a high level. Even among SFP-containing probe sets for highly expressed genes, in 6-7% of sets more than half of the probes were SFP probes, so SNEP failed to detect the SFPs as outliers within the set. This proportion of highly diverse genes was higher than that expected from the random occurrence of mutations. This is not an unusual phenomenon because differences in gene evolution rates are often observed. Most NPs in highly divergent genes were undetected by SNEP because of the difficulty in distinguishing outliers of log10-intensity differences in a set. On the other hand, several issues aside from NPs, such as alternative splicing or gene duplication, lead to SFP calls. Gene duplication can result from unequal crossing over in chromosomal duplication, the outcomes of which can be quite different. Differences in exon-intron structure between duplicated genes in the 93-11 genome could also lead to these misclassifications. Some false-positive SFPs detected by transcript hybridizations were attributed to various forms of genetic diversity such as the copy number of a gene and alternative splicing. In other words, SNEP detects differences not only in nucleotide sequences and expression levels of a gene, but also in the gross structure of a gene.
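Why outlier-based calling breaks down when most probes in a set are polymorphic can be illustrated with a toy example. The sketch below is not the SNEP algorithm; it is a deliberately simplified median-based stand-in, and the function name, the threshold of 0.3, and the log10-intensity differences are all hypothetical values chosen for illustration.

```python
from statistics import median


def call_outlier_probes(log10_diffs, threshold=0.3):
    """Return indices of probes flagged as outliers within one probe set.

    log10_diffs holds, per probe, the log10 intensity of the reference
    strain minus that of the other strain. A probe is flagged when its
    difference exceeds the probe-set median by more than the threshold.
    """
    centre = median(log10_diffs)
    return [i for i, d in enumerate(log10_diffs) if d - centre > threshold]


if __name__ == "__main__":
    # Hypothetical 11-probe sets: 0.0 = no polymorphism, 0.5 = signal drop from an NP.
    few_sfps = [0.0] * 9 + [0.5] * 2    # 2 of 11 probes polymorphic
    many_sfps = [0.0] * 4 + [0.5] * 7   # 7 of 11 probes polymorphic

    print(call_outlier_probes(few_sfps))   # the two polymorphic probes stand out
    print(call_outlier_probes(many_sfps))  # the median shifts; nothing is flagged
```

In the second set the polymorphic probes define the centre of the distribution, so none of them looks like an outlier. The same logic explains why probe sets that are polymorphic in more than half of their probes yield false negatives.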
More than 27,000 SFPs could be detected by whole-genome hybridization, and about 3,500 SFPs and precise expression polymorphisms could be simultaneously detected by SNEP using transcript hybridization. A technology for simultaneous genotyping by polymorphic markers densely covering the whole genome can be utilized for various applications. Analysis of gene expression levels as a quantitative trait (expression Quantitative Trait Locus: eQTL) is a promising application of SFPs. In eQTL studies of yeast, genotypes of segregants were determined by about 3,000 SFPs from gDNA hybridization, and gene expression levels were determined by another type of microarray with spotted PCR products of the genome [40–42]. The first global eQTL study in a plant was performed in Arabidopsis using 211 recombinant inbred lines, in which genotypes of 540 SFP markers and gene expression levels were evaluated by Affymetrix expression arrays. In barley, using Affymetrix expression arrays and 139 doubled haploid lines, more than 2,000 genetic markers were identified and underwent eQTL analysis with 512 unique segregation patterns in the population. Because these eQTL studies used experimental populations with a limited number of recombinations, the number of genetic markers required for eQTL analysis was not so large. To perform eQTL in rice using an experimental population with homozygous genotypes between japonica and indica, application of SNEP using transcript hybridization would provide a sufficient number of genetic markers and a robust estimation of gene expression levels.
Recent revolutionary developments in sequencing technologies have challenged microarray technologies [45, 46]. However, the Affymetrix GeneChip® array analysis by bulk gDNA hybridization is a cost-effective option for mapping a gene by bulk segregant analysis or QTL extreme mapping, because the rice genome is more than 2.5 times larger than that of Arabidopsis and the required number of reads by a sequencer should be proportional to the genome size.
In this study, we describe a method to seek optimal conditions for SFP detection by both genome and mRNA hybridizations using the differences between PM and MM probe signal intensities of completely matched targets. The optimizations allowed a more than 20% increase in true SFP detection in whole-genome hybridization and a large improvement of SFP detection performance in transcript hybridization. Significance Analysis of Microarrays (SAM) applied to log-transformed raw intensities of PM probes gave the best performance in whole-genome hybridization, detecting 22,936 true SFPs with 23.58% false positives. For transcript hybridization, stable SFP detection was achieved for highly expressed genes, and about 3,500 SFPs were detected at a high sensitivity (> 50%) in both shoot and young panicle transcripts. The high SFP detection performances of both genome and transcript hybridizations indicate that microarrays of a complex genome (e.g., that of Oryza sativa) can be effectively utilized for whole-genome genotyping to conduct mutant mapping and analysis of quantitative traits such as gene expression levels.
BLASTN analysis of Affymetrix GeneChip® array probes
To detect all SFPs with the Affymetrix GeneChip® array in Nipponbare and 93-11 genomes, target sequences for all 628,725 PM probes were searched in both genomes using BLASTN version 2.2.8 (total 371 Mb Nipponbare genome from GenBank/EMBL/DDBJ accession: AP008207 to AP008218 and total 479 Mb 93-11 genome including unmapped contigs of 105 Mb from version 2003-08-01 BGI) under the following conditions: expectation value was 20, match score was 1, mismatch score was -3, cost to open a gap was 5, and cost to extend a gap was 2. The score of a complete match to the probe target sequence is 25. The score of a single mismatch between the target sequence and the corresponding probe ranges from 24 to 21, depending on the mismatch position. When the mismatch lies within the three bases at either end of the probe, BLASTN reports only the longer continuous match (scores 24 to 22); at inner positions, the score is 21. In the same manner, the score of a single insertion in the target sequence ranges from 24 to 18. We summarized BLASTN search results with scores greater than or equal to 18. In this search, if a probe sequence hit only a single region in the genome, we considered the probe to be present as a single copy in the genome.
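The score ranges quoted above can be reproduced with a simplified local-alignment model, sketched below in Python. This is not BLASTN itself: it only compares the score of aligning through the mismatch or gap with the score of the longer uninterrupted match on either side. The function names are illustrative, and charging a one-base gap as open plus extend (5 + 2) is an assumption that is consistent with the stated minimum score of 18.

```python
PROBE_LENGTH = 25
MATCH, MISMATCH = 1, -3        # match and mismatch scores from the text
GAP_OPEN, GAP_EXTEND = 5, 2    # gap costs from the text


def single_mismatch_score(position: int) -> int:
    """Best local score for a 25-mer with one mismatch at a 1-based position."""
    left = (position - 1) * MATCH               # uninterrupted match left of the mismatch
    right = (PROBE_LENGTH - position) * MATCH   # uninterrupted match right of it
    through = (PROBE_LENGTH - 1) * MATCH + MISMATCH  # align across the mismatch
    return max(left, right, through)


def single_insertion_score(left_bases: int) -> int:
    """Best local score when the target has one extra base after `left_bases` probe bases."""
    left = left_bases * MATCH
    right = (PROBE_LENGTH - left_bases) * MATCH
    # Assumption: a gap of length 1 costs GAP_OPEN + GAP_EXTEND.
    through = PROBE_LENGTH * MATCH - (GAP_OPEN + GAP_EXTEND)
    return max(left, right, through)


if __name__ == "__main__":
    print([single_mismatch_score(p) for p in range(1, PROBE_LENGTH + 1)])
    print([single_insertion_score(k) for k in range(1, PROBE_LENGTH)])
```

Under this model a mismatch at positions 1 to 3 (or 23 to 25) scores 24, 23, and 22, any inner mismatch scores 21, and a single-base insertion scores between 24 and 18, which is why a cutoff of 18 retains all single-mismatch and single-insertion hits.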
DNA/RNA preparation and microarray experiments
gDNA from leaves of two rice subspecies, O. sativa L. ssp. japonica cv. Nipponbare and ssp. indica cv. 93-11, was isolated using the Qiagen DNeasy Plant Mini Kit (QIAGEN GmbH, Hilden, Germany). Purified DNA (300 ng) was labeled according to the Arabidopsis protocols, and reaction products were hybridized to the Affymetrix GeneChip® arrays according to the Affymetrix standard protocol for RNA. Total RNA was extracted from 3- to 4-week-old shoots of both Nipponbare and 93-11 cultivars using the QIAGEN RNeasy Plant Mini Kit according to the manufacturer's protocol. Labeled cRNA was prepared and hybridized to the Affymetrix GeneChip® arrays according to the manufacturer's guidelines. The Affymetrix GeneChip® arrays were scanned with an Affymetrix GeneChip® Scanner 3000, and raw CEL files were generated by the Affymetrix GeneChip® Operating Software version 1.3. To investigate SFP detection performances of the two rice cultivars, hybridization and data reading were carried out on four biological replicates of independent gDNA samples and five biological replicates of independent transcript samples. These array data were submitted to the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) under accession GSE16341. Gene expression data from 2-cm long young panicles of the two rice subspecies are available as GSE16265.
Statistical analysis of SFPs using array data
For SFP detection by whole-genome hybridization, all statistical analyses were performed with the freely available statistical package R. The raw CEL intensity files were analyzed using a series of methods implemented in Bioconductor. Background correction and normalization algorithms are available in the affy and gcrma packages. The log10-transformed intensity value of each feature was extracted and subjected to data analysis for SFP calls using ANOVA, SAM in the package "siggenes", and SNEP (http://www.ism.ac.jp/~fujisawa/SNEP/). SNEP was originally developed for transcript hybridization; however, it can also be used for SFP detection by whole-genome hybridization after modification of the SNEP script. PM probe intensities were transformed to log10 values, the probes were randomly divided into groups of 500 PM probes, and SFPs were called within each group. This script is available at the SNEP site. SFP detection by transcript hybridization was performed using SNEP, as described previously.
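The preprocessing step for applying SNEP to whole-genome data (log10 transformation followed by random grouping into sets of 500 PM probes) can be sketched as follows. This Python sketch is not the authors' R script; the function names, the fixed seed, and the handling of the final, smaller group are assumptions made for illustration.

```python
import math
import random


def log10_intensities(raw_intensities):
    """Log10-transform raw PM probe intensities (all values must be positive)."""
    return [math.log10(x) for x in raw_intensities]


def random_probe_groups(probe_ids, group_size=500, seed=0):
    """Shuffle probe identifiers and split them into groups of `group_size`.

    The last group holds the remainder and may be smaller; how the original
    script treats such a remainder is not stated in the text.
    """
    ids = list(probe_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility (assumption)
    return [ids[i:i + group_size] for i in range(0, len(ids), group_size)]


if __name__ == "__main__":
    groups = random_probe_groups(range(1234))
    print([len(g) for g in groups])
    print(log10_intensities([100.0, 1000.0]))
```

Each group then plays the role that a probe set plays in transcript hybridization: SFP calling is run within the group, so a polymorphic probe is judged against 499 randomly chosen companions rather than against probes of the same gene.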
Calculation of probe binding affinity on the Affymetrix GeneChip® array
The binding stability (ΔG) of a PM probe to a complete match target at a wash temperature of 50°C was calculated according to the values of the nearest-neighbor thermodynamic parameters for DNA. The binding stability was used for analysis of false-negative SFPs.
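For reference, the general form of a nearest-neighbor calculation for a 25-mer probe with bases b_1 ... b_25 is written out below. This is the standard textbook form, given as a sketch: which initiation or terminal correction terms the authors included, and whether any salt correction was applied, is not stated in the text, so those terms are grouped here into a generic initiation term.

```latex
\begin{align}
\Delta H^{\circ} &= \Delta H^{\circ}_{\mathrm{init}} + \sum_{i=1}^{24} \Delta H^{\circ}(b_i b_{i+1}), \\
\Delta S^{\circ} &= \Delta S^{\circ}_{\mathrm{init}} + \sum_{i=1}^{24} \Delta S^{\circ}(b_i b_{i+1}), \\
\Delta G^{\circ}_{T} &= \Delta H^{\circ} - T\,\Delta S^{\circ}, \qquad T = 323.15\ \mathrm{K}\ (50\,^{\circ}\mathrm{C}).
\end{align}
```

Each of the 24 dinucleotide steps of the probe contributes one enthalpy and one entropy term taken from the nearest-neighbor parameter table. A more negative ΔG indicates a more stable duplex, which is why GC-rich probes bind their targets more strongly.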
We thank Shinto Eguchi at the Institute of Statistical Mathematics for his valuable discussions and comments on this work. This work was supported by the Bio-diversity Research Project of the Transdisciplinary Research Integration Center, Research Organization of Information and Systems; and by a Grant-in-Aid for Scientific Research on Priority Area (Grant 18075009) from the Ministry of Education, Culture, Sports, Science and Technology.
- Winzeler E, Richards D, Conway A, Goldstein A, Kalman S, McCullough M, McCusker J, Stevens D, Wodicka L, Lockhart D, Davis R: Direct allelic variation scanning of the yeast genome. Science. 1998, 281: 1194-7. 10.1126/science.281.5380.1194.
- Borevitz J, Liang D, Plouffe D, Chang H, Zhu T, Weigel D, Berry C, Winzeler E, Chory J: Large-scale identification of single-feature polymorphisms in complex genomes. Genome Res. 2003, 13: 513-23. 10.1101/gr.541303.
- Gong J, Waner D, Horie T, Li S, Horie R, Abid K, Schroeder J: Microarray-based rapid cloning of an ion accumulation deletion mutant in Arabidopsis thaliana. Proc Natl Acad Sci USA. 2004, 101: 15404-9. 10.1073/pnas.0404780101.
- Hazen S, Borevitz J, Harmon F, Pruneda-Paz J, Schultz T, Yanovsky M, Liljegren S, Ecker J, Kay S: Rapid array mapping of circadian clock and developmental mutations in Arabidopsis. Plant Physiol. 2005, 138: 990-7. 10.1104/pp.105.061408.
- Hazen S, Schultz T, Pruneda-Paz J, Borevitz J, Ecker J, Kay S: LUX ARRHYTHMO encodes a Myb domain protein essential for circadian rhythms. Proc Natl Acad Sci USA. 2005, 102: 10387-92. 10.1073/pnas.0503029102.
- Brauer M, Christianson C, Pai D, Dunham M: Mapping novel traits by array-assisted bulk segregant analysis in Saccharomyces cerevisiae. Genetics. 2006, 173: 1813-6. 10.1534/genetics.106.057927.
- Demogines A, Smith E, Kruglyak L, Alani E: Identification and dissection of a complex DNA repair sensitivity phenotype in Baker's yeast. PLoS Genet. 2008, 4: e1000123. 10.1371/journal.pgen.1000123.
- Rus A, Baxter I, Muthukumar B, Gustin J, Lahner B, Yakubova E, Salt D: Natural variants of AtHKT1 enhance Na+ accumulation in two wild populations of Arabidopsis. PLoS Genet. 2006, 2: e210. 10.1371/journal.pgen.0020210.
- Wolyn D, Borevitz J, Loudet O, Schwartz C, Maloof J, Ecker J, Berry C, Chory J: Light-response quantitative trait loci identified with composite interval and eXtreme array mapping in Arabidopsis thaliana. Genetics. 2004, 167: 907-17. 10.1534/genetics.103.024810.
- Winzeler E, Castillo-Davis C, Oshiro G, Liang D, Richards D, Zhou Y, Hartl D: Genetic diversity in yeast assessed with whole-genome oligonucleotide arrays. Genetics. 2003, 163: 79-89.
- Borevitz J, Hazen S, Michael T, Morris G, Baxter I, Hu T, Chen H, Werner J, Nordborg M, Salt D: Genome-wide patterns of single-feature polymorphism in Arabidopsis thaliana. Proc Natl Acad Sci USA. 2007, 104: 12057-62. 10.1073/pnas.0705323104.
- Turner T, Hahn M, Nuzhdin S: Genomic islands of speciation in Anopheles gambiae. PLoS Biol. 2005, 3: e285. 10.1371/journal.pbio.0030285.
- Kidgell C, Volkman S, Daily J, Borevitz J, Plouffe D, Zhou Y, Johnson J, Le Roch K, Sarr O, Ndir O: A systematic map of genetic variation in Plasmodium falciparum. PLoS Pathog. 2006, 2: e57. 10.1371/journal.ppat.0020057.
- Tsolaki A, Hirsh A, DeRiemer K, Enciso J, Wong M, Hannan M, Goguet de la Salmoniere Y, Aman K, Kato-Maeda M, Small P: Functional and evolutionary genomics of Mycobacterium tuberculosis: insights from genomic deletions in 100 strains. Proc Natl Acad Sci USA. 2004, 101: 4865-70. 10.1073/pnas.0305634101.
- Rostoks N, Borevitz J, Hedley P, Russell J, Mudie S, Morris J, Cardle L, Marshall D, Waugh R: Single-feature polymorphism discovery in the barley transcriptome. Genome Biol. 2005, 6: R54. 10.1186/gb-2005-6-6-r54.
- Chain F, Ilieva D, Evans B: Single-species microarrays and comparative transcriptomics. PLoS ONE. 2008, 3: e3279. 10.1371/journal.pone.0003279.
- Gore M, Bradbury P, Hogers R, Kirst M, Verstege E, Van Oeveren J, Peleman J, Buckler E, Van Eijk M: Evaluation of target preparation methods for single-feature polymorphism detection in large complex plant genomes. Crop Sci. 2007, 47: S-148. 10.2135/cropsci2007.02.0085tpg.
- Cui X, Xu J, Asghar R, Condamine P, Svensson J, Wanamaker S, Stein N, Roose M, Close T: Detecting single-feature polymorphisms using oligonucleotide arrays and robustified projection pursuit. Bioinformatics. 2005, 21: 3852-8. 10.1093/bioinformatics/bti640.
- Luo Z, Potokina E, Druka A, Wise R, Waugh R, Kearsey M: SFP genotyping from affymetrix arrays is robust but largely detects cis-acting expression regulators. Genetics. 2007, 176: 789-800. 10.1534/genetics.106.067843.
- Bhat P, Lukaszewski A, Cui X, Xu J, Svensson J, Wanamaker S, Waines J, Close T: Mapping translocation breakpoints using a wheat microarray. Nucleic Acids Res. 2007, 35: 2936-43. 10.1093/nar/gkm148.
- Das S, Bhat P, Sudhakar C, Ehlers J, Wanamaker S, Roberts P, Cui X, Close T: Detection and validation of single feature polymorphisms in cowpea (Vigna unguiculata L. Walp) using a soybean genome array. BMC Genomics. 2008, 9: 107. 10.1186/1471-2164-9-107.
- Kumar R, Qiu J, Joshi T, Valliyodan B, Xu D, Nguyen H: Single feature polymorphism discovery in rice. PLoS ONE. 2007, 2: e284. 10.1371/journal.pone.0000284.
- Kim S, Bhat P, Cui X, Walia H, Xu J, Wanamaker S, Ismail A, Wilson C, Close T: Detection and validation of single feature polymorphisms using RNA expression data from a rice genome array. BMC Plant Biol. 2009, 9: 65. 10.1186/1471-2229-9-65.
- International Rice Genome Sequencing Project: The map-based sequence of the rice genome. Nature. 2005, 436: 793-800. 10.1038/nature03895.
- Yu J, Wang J, Lin W, Li S, Li H, Zhou J, Ni P, Dong W, Hu S, Zeng C: The genomes of Oryza sativa: A history of duplications. PLoS Biol. 2005, 3: e38. 10.1371/journal.pbio.0030038.
- Fujisawa H, Horiuchi Y, Harushima Y, Takada T, Eguchi S, Mochizuki T, Sakaguchi T, Shiroishi T, Kurata N: SNEP: Simultaneous detection of nucleotide and expression polymorphisms using Affymetrix GeneChip. BMC Bioinformatics. 2009, 10: 131. 10.1186/1471-2105-10-131.
- Irizarry R, Hobbs B, Collin F, Beazer-Barclay Y, Antonellis K, Scherf U, Speed T: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003, 4: 249-64. 10.1093/biostatistics/4.2.249.
- Naef F, Lim D, Patil N, Magnasco M: DNA hybridization to mismatched templates: a chip study. Phys Rev E Stat Nonlin Soft Matter Phys. 2002, 65: 040902.
- GeneChip Expression Analysis Technical Manual. [http://www.affymetrix.com/support/downloads/manuals/expression_analysis_technical_manual.pdf]
- Tusher V, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA. 2001, 98: 5116-21. 10.1073/pnas.091062498.
- Borevitz J: Genotyping and mapping with high-density oligonucleotide arrays. Methods Mol Biol. 2006, 323: 137-45.
- Statistical algorithms description document. [http://www.affymetrix.com/support/technical/whitepapers/sadd_whitepaper.pdf]
- Wu Z, Irizarry R, Gentleman R, Martinez-Murillo F, Spencer F: A model-based background adjustment for oligonucleotide expression arrays. J Am Stat Assoc. 2004, 99: 909-917. 10.1198/016214504000000683.
- Bolstad B, Irizarry R, Astrand M, Speed T: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003, 19: 185-93. 10.1093/bioinformatics/19.2.185.
- Tang T, Lu J, Huang J, He J, McCouch S, Shen Y, Kai Z, Purugganan M, Shi S, Wu C: Genomic variation in rice: genesis of highly polymorphic linkage blocks during domestication. PLoS Genet. 2006, 2: e199. 10.1371/journal.pgen.0020199.
- Feltus F, Wan J, Schulze S, Estill J, Jiang N, Paterson A: An SNP resource for rice genetics and breeding based on subspecies indica and japonica genome alignments. Genome Res. 2004, 14: 1812-9. 10.1101/gr.2479404.
- SantaLucia J, Hicks D: The thermodynamics of DNA structural motifs. Annu Rev Biophys Biomol Struct. 2004, 33: 415-40. 10.1146/annurev.biophys.32.110601.141800.
- Wong GK, Wang J, Tao L, Tan J, Zhang J, Passey DA, Yu J: Compositional gradients in Gramineae genes. Genome Res. 2002, 12: 851-856. 10.1101/gr.189102.
- Yu J, Hu S, Wang J, Wong G, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X: A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002, 296: 79-92. 10.1126/science.1068037.
- Brem R, Yvert G, Clinton R, Kruglyak L: Genetic dissection of transcriptional regulation in budding yeast. Science. 2002, 296: 752-5. 10.1126/science.1069516.
- Yvert G, Brem R, Whittle J, Akey J, Foss E, Smith E, Mackelprang R, Kruglyak L: Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 2003, 35: 57-64. 10.1038/ng1222.
- Brem R, Kruglyak L: The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci USA. 2005, 102: 1572-7. 10.1073/pnas.0408709102.
- West M, Kim K, Kliebenstein D, van Leeuwen H, Michelmore R, Doerge R, St Clair D: Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics. 2007, 175: 1441-50. 10.1534/genetics.106.064972.PubMed CentralPubMedView ArticleGoogle Scholar
- Potokina E, Druka A, Luo Z, Wise R, Waugh R, Kearsey M: Gene expression quantitative trait locus analysis of 16 000 barley genes reveals a complex pattern of genome-wide transcriptional regulation. Plant J. 2008, 53: 90-101. 10.1111/j.1365-313X.2007.03315.x.PubMedView ArticleGoogle Scholar
- Simon S, Zhai J, Nandety R, McCormick K, Zeng J, Mejia D, Meyers B: Short-read sequencing technologies for transcriptional analyses. Annu Rev Plant Biol. 2009, 60: 305-33. 10.1146/annurev.arplant.043008.092032.PubMedView ArticleGoogle Scholar
- Lister R, Gregory B, Ecker J: Next is now: new technologies for sequencing of genomes, transcriptomes, and beyond. Curr Opin Plant Biol. 2009, 12: 107-18. 10.1016/j.pbi.2008.11.004.PubMed CentralPubMedView ArticleGoogle Scholar
- Altschul S, Madden T, Schäffer A, Zhang J, Zhang Z, Miller W, Lipman D: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-402. 10.1093/nar/25.17.3389.PubMed CentralPubMedView ArticleGoogle Scholar
- Gentleman R, Carey V, Bates D, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5: R80-10.1186/gb-2004-5-10-r80.PubMed CentralPubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.