Whole-transcriptome analyses are an essential tool for understanding disease mechanisms. Approaches based on next-generation sequencing provide fast and affordable data but rely on the availability of annotated genomes. However, many areas of biomedical research require non-standard animal models for which genome information is not available. One example is the Syrian hamster (Mesocricetus auratus), an important model for dyslipidaemia because it mirrors many aspects of human disease and pharmacological responses. We show that the complementary use of two independent next-generation sequencing technologies, combined with mapping to multiple genome databases, allows unambiguous transcript annotation and quantitative transcript imaging. We refer to this approach as “triple match sequencing” (TMS).
Contigs assembled from a normalized Roche 454 hamster liver library comprising 1.2 million long reads were used to identify 10,800 unique transcripts based on homology to RefSeq database entries from human, mouse, and rat. For mRNA quantification, we mapped 82 million SAGE tags (SOLiD) from the same RNA source to the annotated hamster liver transcriptome contigs. We compared the liver transcriptome of hamster with equivalent data from human, rat, minipig, and cynomolgus monkey to highlight differential gene expression, with a focus on lipid metabolism. We identify a cluster of five genes functionally related to HDL metabolism that is expressed in human, cynomolgus monkey, minipig, and hamster but is lacking in rat, a non-responder species for lipid-lowering drugs.
The TMS approach is suited for fast and inexpensive transcript profiling in cells or tissues of species for which a fully annotated genome is not available. The continuously growing number of well-annotated reference genomes will further empower reliable transcript identification and thereby raise the utility of the method for any species of interest.
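
The two computational steps described above, annotating contigs by homology to several reference databases and then counting tags per annotated transcript, can be illustrated with a minimal Python sketch. It is not the pipeline used in this work. The consensus rule (accept an annotation only when all databases that return a hit agree on the gene symbol), the function names `consensus_annotation` and `tags_per_million`, the normalization to tags per million, and all contig and gene identifiers are assumptions made purely for illustration.

```python
from collections import Counter

def consensus_annotation(hits):
    """Return one gene symbol for a contig, or None.

    `hits` maps a reference species to the best-hit gene symbol (or None).
    Assumed rule: accept only if at least one database returned a hit and
    all returned symbols agree (case-insensitive).
    """
    symbols = {s.upper() for s in hits.values() if s}
    return symbols.pop() if len(symbols) == 1 else None

def tags_per_million(tag_hits, annotation):
    """Count tags per annotated gene and scale to tags per million.

    `tag_hits` lists the contig each tag mapped to. Tags on unannotated
    contigs are ignored, so the scaling uses annotated tags only.
    """
    counts = Counter()
    for contig in tag_hits:
        gene = annotation.get(contig)
        if gene:
            counts[gene] += 1
    total = sum(counts.values())
    return {g: c * 1e6 / total for g, c in counts.items()} if total else {}

# Hypothetical toy input: best hits per contig in three reference databases.
best_hits = {
    "contig_1": {"human": "GENEA", "mouse": "Genea", "rat": "Genea"},
    "contig_2": {"human": "GENEB", "mouse": None, "rat": None},
    "contig_3": {"human": "GENEC", "mouse": "Gened", "rat": "Genec"},  # conflict
}
annotation = {c: consensus_annotation(h) for c, h in best_hits.items()}

# Hypothetical tag mappings: one list entry per mapped tag.
tag_hits = ["contig_1", "contig_1", "contig_1", "contig_2", "contig_3"]
expression = tags_per_million(tag_hits, annotation)
```

In this toy input, `contig_3` stays unannotated because its hits disagree, so its tag does not contribute to the expression estimate. A real analysis would additionally need alignment score thresholds and a policy for tags that map to more than one contig.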