We found a highly significant overlap between cis-eQTL genes identified by microarray profiling and ASE genes identified by NSR-seq. The overlap improves with increasing additive QTL effect size and/or LBP, which demonstrates that, as expected, both approaches reliably detect allele-specific expression despite a number of key differences between the technologies. For example, the majority of microarray gene-expression platforms, including the platform used here, rely on 3'-biased amplification protocols and thus position microarray probes near the 3'-end of the gene. With NSR-seq we monitored the entire transcript, including introns, which we previously found to improve sensitivity, presumably because most intronic reads correspond to unprocessed pre-mRNA or degradation products. Any allele-specific events outside the microarray probe region will thus skew the comparison. Unknown SNPs can also cause disagreement: they can 1) lead to artifactual genetic associations if they lie within probe regions, 2) bias allelic representation in NSR-seq if they lie within priming sites, and 3) bias the alignment of NSR-seq reads toward the reference genome (though this last scenario is expected to be rare, since it requires both a known and an unknown SNP in the same read; consistent with this expectation, visual inspection of Figure 3 reveals little bias). False-positive SNPs accounted for 28% of extreme cases where NSR-seq disagreed with microarrays. Although this is a substantial overestimate of the total effect, since selection of these genes was biased toward strong disagreement, it highlights the importance of high-quality SNP maps for both methods.
While our results mostly agree with those from more traditional approaches that use hundreds of microarrays to measure gene expression among F2 mice, the NSR-seq approach has several distinct advantages. First, NSR-seq results are not limited by a fixed set of probes, and because of this we were able to find allele-specific instances of splicing and antisense transcription that were invisible to our microarrays. Second, identification of cis-eQTL with NSR-seq does not depend on arbitrary genomic distance cutoffs, in contrast to microarray studies, which will inevitably misclassify some trans-eQTL as cis, and vice versa. Third, the NSR-seq approach can be applied to any outbred diploid or polyploid species, even those for which microarrays are not readily available. Finally, NSR-seq can be applied to a single F1 individual, with a single sequencing run costing several thousand dollars, as opposed to applying microarrays to each of hundreds of F2 individuals, saving a great deal of time and expense (though pooling F2 individuals may agree more closely with cis-eQTL from microarrays, if genetic interactions in the F2 population are not captured in the F1).
On the other hand, the greatest disadvantage of NSR-seq is its inability to detect trans-eQTL. Measuring ASE as applied here, with animals pooled at the RNA level, cannot be used to map the genomic region that contains the causal effect. Sequencing or microarray profiling of many samples will always be required for mapping QTL. A second disadvantage is that in outbred species with short LD blocks (such as human), the causal cis-eQTL polymorphism will often not be in LD with any transcribed SNPs, and in these cases pooling will not reveal ASE. Third, pooling RNA from different individuals could introduce biases. These can be minimized by keeping track of the mass of RNA from each individual or by pooling by equal mass.
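The equal-mass pooling mentioned above comes down to simple arithmetic, sketched here in Python. This is only an illustration: the sample names, concentrations, and target mass are hypothetical and are not taken from this study, and the function names are our own.

```python
def equal_mass_volumes(conc_ng_per_ul, mass_per_sample_ng):
    """Volume (microliters) to draw from each sample so that every
    sample contributes the same RNA mass to the pool."""
    volumes = {}
    for sample, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {sample}")
        volumes[sample] = mass_per_sample_ng / conc
    return volumes


def mass_fractions(conc_ng_per_ul, volumes_ul):
    """Fraction of the pooled RNA mass contributed by each sample
    (the 'keeping track of the mass' alternative)."""
    masses = {s: conc_ng_per_ul[s] * volumes_ul[s] for s in volumes_ul}
    total = sum(masses.values())
    return {s: m / total for s, m in masses.items()}


# Hypothetical RNA concentrations in ng per microliter.
conc = {"animal_1": 250.0, "animal_2": 500.0, "animal_3": 125.0}

# Pooling by equal mass: 1000 ng from each animal.
vols = equal_mass_volumes(conc, mass_per_sample_ng=1000.0)
balanced = mass_fractions(conc, vols)

# Pooling by equal volume instead skews the pool toward concentrated samples.
skewed = mass_fractions(conc, {s: 2.0 for s in conc})
```

With equal-mass volumes every animal contributes one third of the pool, whereas drawing equal volumes lets the most concentrated sample dominate; recording the per-sample fractions makes any such imbalance explicit.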
It is also possible that one or a few individuals in the pool have significantly different expression levels and contribute disproportionately to the ASE signal. We show that this is very unlikely given the strong agreement on ASE between F1 biological replicates (Figure 1a), but ruling it out for the entire pool would require testing individuals. Nevertheless, pooling human samples has already been shown to reveal many cis-eQTL. To detect cis-eQTL where the causal polymorphism is not in LD with any transcribed SNPs, NSR-seq can be performed on individual samples, as long as genotype phasing is known.
A method able to quickly and efficiently identify cis-eQTL genome-wide will find applications in many areas. The dependence of cis-eQTL on environmental conditions, a subject not previously studied in large part because of its prohibitive cost, can now be studied efficiently and comprehensively. The method can be applied to hybrids between distinct species (as has already been done with low-throughput pyrosequencing for Drosophila hybrids) to reveal all cis-acting gene expression differences and to inform us of the importance of cis-regulation in evolution. Since the action of positive selection can now be inferred solely from cis-eQTL (H. Fraser; personal communication), selection on gene expression can be measured by NSR/RNA-seq in a wide range of species. Finally, cis-acting polymorphisms have been shown to be highly enriched for SNPs associated with human disease risk in genome-wide association studies. Compiling catalogs of genes affected by cis-eQTL in various tissues, populations, and disease states could therefore be extremely useful for inferring which disease associations are likely due to cis-acting effects on gene expression and, even more importantly, which genes are perturbed by the disease-associated variants (see Additional File 1, Supplement).