A tissue-specific landscape of sense/antisense transcription in the mouse intestine
- Ulrich C Klostermeier†1,
- Matthias Barann†1,
- Michael Wittig1,
- Robert Häsler1,
- Andre Franke1,
- Olga Gavrilova1,
- Benjamin Kreck1,
- Christian Sina1, 2,
- Markus B Schilhabel1,
- Stefan Schreiber1, 2 and
- Philip Rosenstiel1
© Klostermeier et al; licensee BioMed Central Ltd. 2011
Received: 28 October 2010
Accepted: 10 June 2011
Published: 10 June 2011
The intestinal mucosa is characterized by complex metabolic and immunological processes driven by highly dynamic gene expression programs. With the advent of next generation sequencing and its use for the analysis of the RNA sequence space, the level of detail on the global architecture of the transcriptome has reached a new order of magnitude compared to microarrays.
We report the ultra-deep characterization of the polyadenylated transcriptome in two closely related, yet distinct regions of the mouse intestinal tract (small intestine and colon). We assessed tissue-specific transcriptome architecture and the presence of novel transcriptionally active regions (nTARs). In the first step, signatures of 20,541 NCBI RefSeq transcripts could be identified in the intestine (74.1% of annotated genes), of which 16,742 are common to both tissues. Although the majority of reads could be linked to annotated genes, 27,543 nTARs not consistent with current gene annotations in RefSeq or ENSEMBL were identified. Using a second, independent strand-specific RNA-Seq protocol, 20,966 of these nTARs were confirmed, most of them in the vicinity of known genes. We further categorized our findings by their relative adjacency to described exonic elements and investigated regional differences of novel transcribed elements in small intestine and colon.
The current study demonstrates the complexity of an archetypal mammalian intestinal mRNA transcriptome in high resolution and identifies novel transcriptionally active regions at strand-specific, single base resolution. Our analysis for the first time shows a strand-specific comparative picture of nTARs in two tissues and represents a resource for further investigating the transcriptional processes that contribute to tissue identity.
A transcriptome is the complete set of transcripts in a cell, a tissue or a whole organism at a given point in time, and may be altered by developmental stage or environmental stimuli. Transcriptome plasticity is conferred not only by changes in transcript concentrations, but also by complex changes in the architecture of transcripts (splice isoforms, editing, transcription start and termination sites). Measuring the transcriptome is key to deciphering the molecular constituents and functional elements of the genome, and leads to better insight into cellular dynamics, for example during development or disease. In the past, various technologies have been reported to deduce and quantify the transcriptome, including hybridization- and sequence-based methods. Sequence-based data were used intensively in transcript annotation projects to gain insight into the complexity of the transcriptome, including expressed sequence tag (EST) projects, functional annotation of the mouse (FANTOM) [2–4] and the encyclopedia of DNA elements (ENCODE), which represent milestones in our understanding of the transcriptional landscape in humans and mammalian model organisms.
Emerging next generation sequencing (NGS) technologies allow ultra-deep and highly parallel sequencing of complete transcriptomes of individual cells or tissues and overcome several limitations of previous technologies. RNA-Seq has been applied to various organisms [7–10] including mouse, highlighting accurate detection of gene expression, observation of complex alternative splicing patterns [12, 13] and detection of novel transcriptionally active regions (nTARs) in the genome. Although the mouse has already been the subject of RNA-Seq studies, only a few individual tissues or cell types have been analyzed, including embryonic stem cells [15, 16], oocytes, myoblasts, brain [19, 20], muscle, liver and heart. Deep annotation of the intestinal mRNA sequence space is still missing, although microarray studies have suggested a high complexity of region-specific expression patterns [22, 23] and disturbances of intestinal homeostasis are linked to a broad variety of diseases (e.g. infections, idiopathic inflammatory bowel disease and intestinal malignancies) [24, 25]. In this study, we introduce a two-step RNA-Seq approach using two different library preparation protocols on the SOLiD platform to characterize the full complexity of an archetypal mammalian intestinal mRNA transcriptome. The method aims specifically to identify novel transcribed elements and to describe their orientation in relation to known transcripts.
Generation of RNA-Seq data
Table: Summary mapping statistics for small intestine and colon (columns: cDNA fragmented small intestine, cDNA fragmented colon, RNA fragmented small intestine, RNA fragmented colon; rows: mapped reads [%], uniquely mapped reads, uniquely mapped reads [%])
Distribution of reads along the 5'-3' axis
Detection and Quantification of RefSeq transcripts in the intestine
Many of the transcripts with the highest expression rates in the small intestine belong to immune-related processes (e.g. defensin α6, lysozyme 1) or nutrient function (e.g. fatty-acid binding protein 2, cysteine rich protein 1). In the colon, the most abundant transcripts include effectors of electrolyte transport (e.g. carbonic anhydrase 1) and mucosal protection (anterior gradient 2, serine protease inhibitor Kazal-type 4); a complete list of expression levels is provided in Additional file 2. In total, the detection level of observed transcripts spans several orders of magnitude. A large fraction of genes show detection levels between 1 and 10 FPKM (small intestine: 37.20%, colon: 38.87%), and about 85% (small intestine: 84.72%, colon: 85.83%) of all detected genes showed expression levels between 0.1 FPKM and 100 FPKM (Figure 2C).
For a more general view of the differences between the investigated tissues, we performed gene ontology (GO) analysis on subsets of tissue-specific transcripts (i.e. not supported by at least 5 reads and > 2 start points (SPs) out of approx. 28 million uniquely mapped reads in one of the tissue libraries, but present in the other library). Interestingly, we found a significant enrichment for the GO terms cell-cell signaling, ion transport and immune response in both the colon-specific and the small intestine-specific subsets of transcripts, whereas genes supporting the term metabolic processes, which would be expected in both tissues, were significantly depleted in both samples. Although few of the transcripts underlying each term overlap, the findings strengthen the hypothesis that processes like cell-cell signaling and ion transport are indeed pivotal regulators of tissue identity. Furthermore, the results also strengthen the view that fundamentally different immune processes occur in the small and large intestine and may reflect a higher abundance of mucosa-associated lymphoid tissue (MALT) in the small intestine. Results for the investigated GO terms are listed in Additional file 3. Figure 2D shows enrichment or depletion of the mentioned gene ontology terms.
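The enrichment and depletion calls described above can be illustrated with a short sketch. The text does not state which statistical test was used at this point, so the cumulative hypergeometric test below is an assumption chosen for illustration, and all function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, population, annotated, drawn):
    """P(X = k): exactly k annotated genes among `drawn` genes.

    population: all genes in the background set (hypothetical input)
    annotated:  background genes carrying the GO term
    drawn:      size of the tissue-specific gene subset
    """
    return (comb(annotated, k) * comb(population - annotated, drawn - k)
            / comb(population, drawn))


def enrichment_p(k, population, annotated, drawn):
    """Upper tail P(X >= k): small values suggest enrichment of the term."""
    upper = min(annotated, drawn)
    return sum(hypergeom_pmf(i, population, annotated, drawn)
               for i in range(k, upper + 1))


def depletion_p(k, population, annotated, drawn):
    """Lower tail P(X <= k): small values suggest depletion of the term."""
    lower = max(0, drawn - (population - annotated))
    return sum(hypergeom_pmf(i, population, annotated, drawn)
               for i in range(lower, k + 1))


if __name__ == "__main__":
    # Toy numbers only, not taken from the study.
    print(enrichment_p(2, 10, 4, 3))
    print(depletion_p(0, 10, 4, 3))
```

In practice the p-values for many GO terms would also need correction for multiple testing, which this sketch leaves out.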
Benchmarking of applied screening protocol and comparison to microarray
Novel identified transcriptionally active regions are clustered in neighbourhood of known genes and orientation is usually in sense orientation to the related transcript
Basic expression of nTARs is reduced compared to related genes, but can differ between intestinal tissues
Our findings describe a novel two-step RNA-Seq approach to systematically identify novel transcribed elements and for the first time present a view of the landscape of gene expression in the murine intestinal tract by means of massively parallel sequencing. Using this method we demonstrate high intestinal transcriptome complexity, with expression of 74.1% of RefSeq annotated transcripts. The observed values for uniquely mappable reads are similar to those of other RNA-Seq studies employing murine and human complex tissues [9, 19]. Compared to other tissues investigated by massively parallel sequencing, such as the brain (58.7% of known genes were reported as expressed in embryonic and neonatal mouse brain), the intestine thus shows a higher complexity at the molecular level, and the majority of genes are present in both small intestine and colon. RNA-Seq shows almost no background noise and allows an absolute quantification of transcripts. Thus, it is of note that our study clearly demonstrates the tissue-specific absence of certain transcripts, which are not covered by even a single sequence in the small intestine but are present at a relevant per-base coverage in the colon (e.g. H+/K+-Transporter Atp12a) or, vice versa, are only present in the small intestine (e.g. type 2 glucose transporter SLC2A2). Highly abundant transcripts observed in our data sets are consistent with earlier microarray studies; several of these transcripts have been shown to be strongly expressed in the intestine. Several of the most abundant transcripts are well-known players in intestinal physiology, e.g. carbonic anhydrase 1 in the colon or fatty acid binding protein 2 in small intestine tissue. In addition, for exclusively expressed transcripts the highest significance values for the enrichment of certain gene ontology terms were found in processes clearly associated with the investigated tissue (e.g. ion transport, cell-cell signaling).
As the detection and quantification of transcripts presented here is based on only a limited number of datasets from two different tissues of the same individuals, conclusions about transcripts being present in only the colon or only the small intestine remain descriptive. Transcripts displaying strong differential expression between colon and small intestine should be considered only as exemplary observations and may indicate biological processes that are more prominent in one tissue than in the other. Using the data as a first blueprint, it will be interesting to discriminate the roles of absent gene expression and rare transcripts in the determination of intestinal tissue identity and function.
Yet, several advantages can clearly be identified in this benchmarking study: (a) Digital gene expression analysis by RNA-Seq has a wide dynamic range (adaptable to experimental requirements) and also allows a detailed picture of extremely rare transcript forms. (b) Unlike microarrays, RNA-Seq is not limited to the detection of a priori determined sequences and thus allows the detection of unknown transcripts. We have chosen a two-step approach to identify and validate novel transcribed elements that result in polyadenylated transcripts using two independent RNA-Seq library preparation methods. For the intestine we show a high number of non-annotated regions of transcriptional activity; 20,699 of these could be verified by an independent protocol, emphasizing the still limited knowledge of tissue-specific mammalian transcriptome signatures. Interestingly, the classification of nTARs in relation to annotated transcripts confirmed a strong clustering in the vicinity of known genes, as recently reported for other nTARs in different tissues from mouse and in human cell lines. Even though there is evidence for still unknown transcripts expressed in the intestine (NGA), the more considerable lack of information seems to be in the fine structure of known gene loci. (c) The method allows a simple discrimination and annotation of read strandedness and thus a deeper insight into identified transcriptionally active regions. As an example, we have focused on the sense-antisense distribution of novel RNA sequences in the vicinity of known transcripts. The majority of identified gene-associated nTARs are in sense orientation, although a distinct number of pure antisense elements and also mixed nTARs could be identified. Most intronic nTARs are expressed at a lower level than adjacent or directly linked genes. Thus, some of the detected nTARs may also represent premature, non-spliced RNA molecules.
However, it is plausible that many of the transcriptionally active regions in the vicinity of known genes represent tissue-specific modulatory events. In particular, we demonstrate an unprecedented diversity of nTARs at the 5' or 3' border of known genes, which are realized in both sense and antisense direction. While some of the antisense findings may point to novel regulatory antisense transcripts, the finding of sense nTARs downstream of known genes highlights the leakiness of many of the known polyadenylation signals [35, 36] and points to a highly diverse and tissue-specific realization of 3'-untranslated regions.
In summary, the current study provides a public data resource for other researchers (e.g. for the identification of context-dependent transcript isoform and/or regulatory antisense transcript expression) and demonstrates the power of RNA-Seq approaches in order to identify novel strand-specific transcriptional units. Our observations may point to complex and so far undetected sense/antisense regulation events in many of the transcripts that warrant functional in-depth investigation and may ultimately lead to novel insights into intestinal biology.
Materials and methods
Total RNA was isolated from liquid nitrogen-frozen intestinal tissues of a total of six 9- to 10-week-old C57B6 mice (housed under SPF conditions), using either the RNeasy mini kit (Qiagen) followed by mRNA enrichment with the Oligotex mRNA purification kit (Qiagen) for SMART sequencing, or the mirVana miRNA isolation kit (Ambion) for use with the whole transcriptome analysis kit (WTAK, Ambion). RNA was isolated from either total small intestine tissue (jejunum) or colon tissue (distal colon).
Mice were maintained in a 12-h light-dark cycle under standard conditions and were provided with food and water ad libitum. Procedures involving animal care were conducted in conformity with national and international laws and policies.
500 ng enriched mRNA was used for SMART cDNA synthesis. For second strand synthesis and amplification, a 5'-biotinylated version of PCR primer II was employed. 13 cycles of amplification led to a yield of more than 2 µg cDNA. Subsequently, the SOLiD V2 fragment library protocol (Applied Biosystems) was applied and transcript ends were depleted by two rounds of Dynabeads M-280 streptavidin (Invitrogen) treatment. For SOLiD WTAK (RNA fragmentation protocol), 10 µg total RNA was enriched for polyadenylated RNA and used as input for library construction following the manufacturer's instructions (Applied Biosystems). The first type of library (cDNA fragmented) was sequenced on SOLiD V2 and V2.5 (replicates) sequencing-by-ligation sequencers following the manufacturer's instructions, the second type (RNA fragmented) on a SOLiD V4. The full datasets have been submitted to a public data repository (Gene Expression Omnibus, http://www.ncbi.nlm.nih.gov/geo accession number: GSE21746).
Colour space reads (.csfasta) were mapped against the mouse genome reference (mm9) using SOLiD™ BioScope™ Software V1.2.1 (Applied Biosystems) with a mismatch penalty of -2. The mapping pipeline first searches for short matches (seeds) between a read and the reference. For this initial seed we used 30 bp for the 35 bp reads and 38 bp for the 50 bp reads, allowing up to 3 mismatches in each case. Additionally, we used a repetitive mapping scheme for the 50 bp reads, with a 25 bp seed and up to 2 mismatches. Successfully placed seeds are then extended, adding +1 to the score for every match and subtracting a penalty of 2 for each mismatch. Finally, the shortest of the best-scoring alignments is chosen; for details see the BioScope user manual at http://www3.appliedbiosystems.com. The coverage custom tracks visualize the SAMtools pileup output.
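The seed-and-extend scoring described above can be sketched as follows. This is a simplified illustration and not the BioScope implementation: it works on plain base letters rather than colour space, assumes the seed is already placed at the start of the read, and the function name and arguments are hypothetical.

```python
def extend_seed(read, reference, seed_len, match=1, mismatch=-2):
    """Score an ungapped extension of a seed placed at offset 0.

    Every matching position adds `match` (+1) and every mismatching
    position adds `mismatch` (-2). Among all alignment lengths that are at
    least `seed_len`, the best score is kept; on ties the shortest
    alignment wins because later lengths must be strictly better.
    Returns (best_score, aligned_length).
    """
    score = 0
    best_score = None
    best_len = 0
    for length, (r, g) in enumerate(zip(read, reference), start=1):
        score += match if r == g else mismatch
        if length >= seed_len and (best_score is None or score > best_score):
            best_score, best_len = score, length
    return best_score, best_len


if __name__ == "__main__":
    # Two trailing mismatches: the alignment is cut back to 8 bases.
    print(extend_seed("ACGTACGTAA", "ACGTACGTCC", seed_len=6))
    # One mismatch followed by two matches only ties the seed score,
    # so the shorter alignment is kept.
    print(extend_seed("ACGTTAA", "ACGTGAA", seed_len=4))
```

The strict comparison is what implements "shortest of the best-scoring alignments": a longer extension replaces the current one only when it improves the score.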
Oligonucleotide DNA microarray hybridization
Total RNA was processed as previously described and hybridized to an Affymetrix Mouse 430 2.0 array (Affymetrix Inc, Santa Clara, CA) according to the manufacturer's protocol. Data were normalized using RMA (AGCC, Affymetrix) and signals with a detection p-value of ≤ 0.05 were considered present. The experimental and analytical parts of the microarray analysis were performed following the MIAME standards. The datasets have been submitted to a public data repository (Gene Expression Omnibus, http://www.ncbi.nlm.nih.gov/geo accession number: GSE21746).
Transcript expression rates were calculated using the BioScope *.bam output files and transcript annotations from the UCSC homepage (refGene table of build mm9, 21 February 2011). To interrogate expression levels we calculated FPKM using the published tool Cufflinks. As we rely on a small sample set, we defined a conservative threshold of a 3-fold difference between the two tissues in order to filter for potentially interesting results. A present/absent threshold was set to 0.01 FPKM as reported previously. Present transcripts were required to have at least two independent start points. Gene Ontology analysis was performed as previously published by comparing genes present or absent only in either colon or small intestine. Biological processes associated with the transcripts were retrieved from the Gene Ontology Consortium (http://www.geneontology.org).
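The thresholds above can be made concrete with a small sketch. It uses the plain textbook definition of FPKM (fragments per kilobase of transcript per million mapped fragments) rather than the statistical estimate that Cufflinks computes, and it assumes that the 0.01 FPKM and 3-fold cut-offs are inclusive; function names and example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_present(fpkm_value, start_points, min_fpkm=0.01, min_start_points=2):
    """Present call: FPKM threshold plus at least two independent start points.

    Treating the FPKM threshold as inclusive is an assumption.
    """
    return fpkm_value >= min_fpkm and start_points >= min_start_points


def passes_fold_filter(fpkm_a, fpkm_b, min_fold=3.0):
    """True if the two tissues differ by at least `min_fold` in either direction."""
    low, high = sorted((fpkm_a, fpkm_b))
    if high == 0:
        return False          # not detected in either tissue
    if low == 0:
        return True           # detected in one tissue only
    return high / low >= min_fold


if __name__ == "__main__":
    # Toy values: 280 fragments on a 1 kb transcript, 28 million mapped fragments.
    value = fpkm(280, 1000, 28_000_000)
    print(value, is_present(value, start_points=5))
    print(passes_fold_filter(12.0, 3.5))
```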
Gene saturation plot and estimation of total detectable RefSeq transcripts by regression analyses
To further improve the estimation of the total number of expressed genes, the second intersection of this initial regression curve with the experimentally collected data points was determined, and another non-linear regression curve was calculated for the points to the right of the intersection. This was repeated until the correlation of the regression curve reached 0.99 and no further improvement could be achieved.
Detection of tissue-specific increased gene expression
nTAR detection and classification algorithm
To investigate hitherto unannotated but transcriptionally active regions (nTARs), the *.bam files were screened for chained, covered bases which were not present in the investigated databases (refGene, ensGene). In order to use a conservative strategy and to avoid a high false positive rate (e.g. around exon/intron boundaries), we chose a minimum nTAR length of 50 bp and defined a detection threshold of 5 reads and two independent start points. Although deeper annotations (e.g. FANTOM, ENCODE) exist, we have chosen a design similar to a previous study, using a combination of RefSeq and ENSEMBL as standard gene databases to detect novel elements. Depending on the position of the nTAR relative to annotated genes, 9 classes were defined. nTARs with a distance to annotated genes greater than 10 kb were classified as non-gene associated (NGA) events. nTARs within the 10 kb range which did not start or end directly adjacent to annotated genes were classified as upstream or downstream gene neighbourhood (UGN, DGN). nTARs starting or ending directly adjacent to annotated genes were classified as upstream or downstream gene intersections (UGI, DGI). All other nTARs were located within annotated genes. nTARs extending exons were classified as exon-linked up- or downstream (ELU, ELD) events. Intragenic elements (IGE) were defined by no overlap with exons, whereas intron spanning elements (ISE) covered a whole intron. Hits fitting into several classes were counted only once following a priority list: ISE, ELD/ELU, UGI/DGI, IGE, UGN/DGN, NGA. Hits in the neighbourhood of two genes were assigned to the closest one. To verify the nTARs, reads derived from the RNA fragmentation protocol were investigated. Validation of an nTAR required at least 3 reads from the RNA fragmentation protocol. For all nTARs, RNA fragmentation protocol reads were counted separately for sense and antisense direction (compared to the respective gene).
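The thresholds, the priority list and the intergenic classes described above can be sketched in a few lines. This is an illustrative simplification rather than the pipeline used in the study: coordinates are assumed to be 0-based half-open intervals, "directly adjacent" is interpreted as a gap of 0 bp, the order within the pairs ELD/ELU, UGI/DGI and UGN/DGN is chosen arbitrarily, and the intragenic classes (ISE, ELU, ELD, IGE) are only handled in the priority step. All names are hypothetical.

```python
# Priority order from the text; order within each pair is an assumption.
PRIORITY = ["ISE", "ELD", "ELU", "UGI", "DGI", "IGE", "UGN", "DGN", "NGA"]
MAX_GENE_DISTANCE = 10_000  # nTARs further away than 10 kb are NGA


def passes_detection(length_bp, reads, start_points):
    """Detection thresholds: >= 50 bp, >= 5 reads, >= 2 independent start points."""
    return length_bp >= 50 and reads >= 5 and start_points >= 2


def is_validated(rna_fragmentation_reads):
    """Validation requires at least 3 reads from the RNA fragmentation protocol."""
    return rna_fragmentation_reads >= 3


def resolve_class(candidate_classes):
    """Pick the single class with the highest priority among the candidates."""
    for name in PRIORITY:
        if name in candidate_classes:
            return name
    raise ValueError("no known class among candidates")


def classify_intergenic(ntar_start, ntar_end, gene_start, gene_end, gene_strand):
    """Classify an nTAR lying outside a gene as NGA, UGN/DGN or UGI/DGI."""
    if ntar_end <= gene_start:
        gap, on_left = gene_start - ntar_end, True
    elif ntar_start >= gene_end:
        gap, on_left = ntar_start - gene_end, False
    else:
        raise ValueError("nTAR overlaps the gene; use the intragenic classes")
    if gap > MAX_GENE_DISTANCE:
        return "NGA"
    # Upstream means before the 5' end, which depends on the gene strand.
    upstream = on_left == (gene_strand == "+")
    prefix = "U" if upstream else "D"
    return prefix + ("GI" if gap == 0 else "GN")


if __name__ == "__main__":
    print(classify_intergenic(100, 200, 5000, 9000, "+"))    # neighbourhood
    print(classify_intergenic(9000, 9100, 5000, 9000, "-"))  # intersection
    print(resolve_class({"IGE", "ISE"}))
```

Keeping the priority list as ordered data makes the "counted only once" rule a single lookup and keeps the ranking easy to audit against the text.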
Method and equations for calculation of the detection ratio of nTARs to related genes and differential regulation of nTARs in different tissues can be found in SI methods.
We gratefully acknowledge the technical assistance of Lena Bossen and Melanie Friskovec. We thank Kai Lao for helpful advice. The work was supported by the BMBF Network "Systematic Genomics of chronic inflammation" GP9/GP10, the DFG Clusters of Excellence Inflammation at Interfaces and Future Ocean, and the SFB 877 subproject B9 (SS and PR). UCK was supported by the Dr. Helmut Robert memorial foundation.
- Gerhard DS, Wagner L, et al: The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 2004, 14: 2121-2127.
- Carninci P, Kasukawa T, et al: The transcriptional landscape of the mammalian genome. Science. 2005, 309: 1559-1563.
- Kawai J, Shinagawa A, et al: Functional annotation of a full-length mouse cDNA collection. Nature. 2001, 409: 685-690. 10.1038/35055500.
- Okazaki Y, Furuno M, Kasukawa T, et al: Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature. 2002, 420: 563-573. 10.1038/nature01266.
- Birney E, Stamatoyannopoulos JA, Dutta A, et al: Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007, 447: 799-816. 10.1038/nature05874.
- Gerstein M, Snyder M, Wang Z: RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009, 10: 57-63. 10.1038/nrg2484.
- Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M: The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008, 320: 1344-1349. 10.1126/science.1158441.
- Gan Q, Chepelev I, Wei G, Tarayrah L, Cui K, Zhao K, Chen X: Dynamic regulation of alternative splicing and chromatin structure in Drosophila gonads revealed by RNA-seq. Cell Res. 2010
- Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D, Schmidt D, O'Keeffe S, Haas S, Vingron M, Lehrach H, Yaspo M-L: A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science. 2008, 321: 956-60. 10.1126/science.1160342.
- Hillier LW, Reinke V, Green P, Hirst M, Marra MA, Waterston RH: Massively parallel sequencing of the polyadenylated transcriptome of C. elegans. Genome Res. 2009, 19: 657-666. 10.1101/gr.088112.108.
- Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008, 5: 621-8. 10.1038/nmeth.1226.
- Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ: Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 2008, 40: 1413-5. 10.1038/ng.259.
- Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, Veyrieras J-B, Stephens M, Gilad Y, Pritchard JK: Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature. 2010, 464: 768-772. 10.1038/nature08872.
- Klevebring D, Bjursell M, Emanuelsson O, Lundeberg J: In-depth transcriptome analysis reveals novel TARs and prevalent antisense transcription in human cell lines. PLoS ONE. 2010, 5: e9762. 10.1371/journal.pone.0009762.
- Cloonan N, Forrest ARR, Kolle G, Gardiner BBA, Faulkner GJ, Brown MK, Taylor DF, Steptoe AL, Wani S, Bethel G, Robertson AJ, Perkins AC, Bruce SJ, Lee CC, Ranade SS, Peckham HE, Manning JM, McKernan KJ, Grimmond SM: Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nat Methods. 2008, 5: 613-9. 10.1038/nmeth.1223.
- Guttman M, Garber M, Levin JZ, Donaghey J, Robinson J, Adiconis X, Fan L, Koziol MJ, Gnirke A, Nusbaum C, Rinn JL, Lander ES, Regev A: Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nat Biotechnol. 2010, 28: 503-510. 10.1038/nbt.1633.
- Barbacioru C, Wang Y, Nordman E, Lee C, Xu N, Wang X, Bodeau J, Tuch BB, Siddiqui A, Lao K, Surani MA, Tang F: mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods. 2009, 6: 377-382. 10.1038/nmeth.1315.
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L: Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010, 28: 511-515. 10.1038/nbt.1621.
- Han X, Wu X, Chung W-Y, Li T, Nekrutenko A, Altman NS, Chen G, Ma H: Transcriptome of embryonic and neonatal mouse cortex by high-throughput RNA sequencing. Proc Natl Acad Sci USA. 2009
- Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A: Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 2009
- Matkovich SJ, Zhang Y, Van Booven DJ, Dorn GW: Deep mRNA sequencing for in vivo functional analysis of cardiac transcriptional regulators: application to Galphaq. Circ Res. 2010, 106: 1459-1467. 10.1161/CIRCRESAHA.110.217513.
- Bates MD, Erwin CR, Sanford LP, Wiginton D, Bezerra JA, Schatzman LC, Jegga AG, Ley-Ebert C, Williams SS, Steinbrecher KA, Warner BW, Cohen MB, Aronow BJ: Novel genes and functional relationships in the adult mouse gastrointestinal tract identified by microarray analysis. Gastroenterology. 2002, 122: 1467-1482. 10.1053/gast.2002.32975.
- Schröder N, Sekhar A, Geffers I, Müller J, Dittrich-Breiholz O, Kracht M, Wedemeyer J, Gossler A: Identification of mouse genes with highly specific expression patterns in differentiated intestinal epithelium. Gastroenterology. 2006, 130: 902-907. 10.1053/j.gastro.2005.12.025.
- Clevers H: At the crossroads of inflammation and cancer. Cell. 2004, 118: 671-674. 10.1016/j.cell.2004.09.005.
- Schreiber S, Rosenstiel P, Albrecht M, Hampe J, Krawczak M: Genetics of Crohn disease, an archetypal inflammatory barrier disease. Nat Rev Genet. 2005, 6: 376-388.
- Jager M, Ott C-E, Grunhagen J, Hecht J, Schell H, Mundlos S, Duda GN, Robinson PN, Lienau J: Composite Transcriptome Assembly of RNA-seq data in a Sheep Model for Delayed Bone Healing. BMC Genomics. 2011, 12: 158. 10.1186/1471-2164-12-158.
- Jiang H, Wong WH: Statistical inferences for isoform expression in RNA-Seq. Bioinformatics. 2009, 25: 1026-1032. 10.1093/bioinformatics/btp113.
- Kuo WP, Liu F, Trimarchi J, Punzo C, Lombardi M, Sarang J, Whipple ME, Maysuria M, Serikawa K, Lee SY, McCrann D, Kang J, Shearstone JR, Burke J, Park DJ, Wang X, Rector TL, Ricciardi-Castagnoli P, Perrin S, Choi S, Bumgarner R, Kim JH, Short GF, Freeman MW, Seed B, Jensen R, Church GM, Hovig E, Cepko CL, Park P, Ohno-Machado L, Jenssen T-K: A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies. Nat Biotech. 2006, 24: 832-840. 10.1038/nbt1217.
- Rhead B, Karolchik D, Kuhn RM, Hinrichs AS, Zweig AS, Fujita PA, Diekhans M, Smith KE, Rosenbloom KR, Raney BJ, Pohl A, Pheasant M, Meyer LR, Learned K, Hsu F, Hillman-Jackson J, Harte RA, Giardine B, Dreszer TR, Clawson H, Barber GP, Haussler D, Kent WJ: The UCSC Genome Browser database: update 2010. Nucleic Acids Res. 2010, 38: D613-619. 10.1093/nar/gkp939.
- He Y, Vogelstein B, Velculescu VE, Papadopoulos N, Kinzler KW: The antisense transcriptomes of human cells. Science. 2008, 322: 1855-1857. 10.1126/science.1163853.
- Torres TT, Metta M, Ottenwälder B, Schlötterer C: Gene expression profiling by massively parallel sequencing. Genome Res. 2008, 18: 172-177.
- Feng L, Liu H, Liu Y, Lu Z, Guo G, Guo S, Zheng H, Gao Y, Cheng S, Wang J, Zhang K, Zhang Y: Power of Deep Sequencing and Agilent Microarray for Gene Expression Profiling Study. Mol Biotechnol. 2010
- van Bakel H, Nislow C, Blencowe BJ, Hughes TR: Most "dark matter" transcripts are associated with known genes. PLoS Biol. 2010, 8: e1000371. 10.1371/journal.pbio.1000371.
- Yelin R, Dahary D, Sorek R, Levanon EY, Goldstein O, Shoshan A, Diber A, Biton S, Tamir Y, Khosravi R, Nemzer S, Pinner E, Walach S, Bernstein J, Savitsky K, Rotman G: Widespread occurrence of antisense transcription in the human genome. Nat Biotechnol. 2003, 21: 379-386. 10.1038/nbt808.
- Takagaki Y, Seipelt RL, Peterson ML, Manley JL: The polyadenylation factor CstF-64 regulates alternative processing of IgM heavy chain pre-mRNA during B cell differentiation. Cell. 1996, 87: 941-952. 10.1016/S0092-8674(00)82000-0.
- Tian B, Hu J, Zhang H, Lutz CS: A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res. 2005, 33: 201-212. 10.1093/nar/gki158.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup: The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009, 25: 2078-2079. 10.1093/bioinformatics/btp352.
- Häsler R, Begun A, Freitag-Wolf S, Kerick M, Mah N, Zvirbliene A, Spehlmann ME, von Wurmb-Schwark N, Kupcinskas L, Rosenstiel P, Schreiber S: Genetic control of global gene expression levels in the intestinal mucosa: a human twin study. Physiol Genomics. 2009, 38: 73-79. 10.1152/physiolgenomics.00010.2009.
- Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM: Systematic determination of genetic network architecture. Nat Genet. 1999, 22: 281-285. 10.1038/10343.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.