Skip to main content
Fig. 1 | BMC Genomics

Fig. 1

From: High-throughput and quantitative genome-wide messenger RNA sequencing for molecular phenotyping

Fig. 1

DeTCT pipeline workflow. Between nine and 11 pairs of mutant and normal zebrafish embryos were collected from one clutch and RNA extracted. a Following DNaseI treatment and chemical fragmentation, molecules representing the 3′ end of transcripts were enriched by pulldown using an anchored biotinylated oligo dT primer attached to streptavidin magnetic beads (orange line). An RNA oligo matching part of the Illumina read 2 adapter (purple line) was ligated onto the 5′ end, the RNA eluted and annealed to an oligo comprising partial read 1 Illumina adapter (dark blue line) followed by 12 random bases (beige line), then an eight base indexing sequence specific to each sample (light blue line) and finally a 14 base anchored polyT sequence (grey line). After reverse transcription the Illumina adapter sequences were completed during library amplification. Libraries were quantified, pooled in equimolar amounts and sequenced by Illumina HiSeq 2500. b After decoding the indexing sequence, the trimmed zebrafish sequences (read 1 in green and read 2 in red) were mapped to the reference genome and duplicate reads were flagged. c The coordinate representing the transcript counting 3′ end (TC 3′ end) was predicted using the base immediately 3′ of the polyT sequence in read 1 (green dashed arrow and green curved line). After calling peaks using all mapped read 2s the resulting counts were associated with their respective sample (red curved line). The count data were used to identify differential transcript abundance between conditions using DESeq2 [28] and reported as a fold change with an adjusted p-value. The TC 3′ ends were matched to the closest Ensembl transcript 3′ ends on the same strand (black line). Gene list tables were produced and ordered by the lowest adjusted p-value. These gene lists were filtered for genes showing differential transcript abundance using the adjusted p-value and the proximity of the TC 3′ end and Ensembl gene end (typically adjusted p-value <= 0.05 and within -100 and +5000 bases)

Back to article page