Fig. 3 | BMC Genomics

Fig. 3

From: The effects of sequencing depth on the assembly of coding and noncoding transcripts in the human genome

Effect of sequencing depth on the numbers and correctness of transcripts assembled from simulated short-read data. Samples of 10 million (n = 20), 20 million (n = 10), 50 million (n = 4), 100 million (n = 2) and 200 million (n = 1) hPSC reads were simulated using SRR597912 and SRR597895 reads. Transcript assembly and coding potential evaluation were done for each simulated read sample. The relationships between the number of reads and transcript counts (panel A), coding-to-noncoding ratio (panel B), sensitivity (panel C) and precision (panel D) are presented for each sample. The transcript assembly obtained from the merged SRR597912 and SRR597895 reads was used as the reference for the estimation of sensitivity and precision. E. Sensitivity is positively correlated with the number of reads for GENCODE-annotated and novel transcripts. Samples with the same read numbers produced similar numbers of transcripts. F. The numbers of samples in which different categories of transcripts were correctly assembled. The assemblies of 20 replicates of 10 million reads were investigated to highlight correctly assembled transcripts. G. The proportion of TE-containing transcripts was similar across various read depths. H, I. The sensitivity of assemblies of coding transcripts (H) and noncoding transcripts (I), showing the differences between TE-containing and TE-lacking transcripts. J. Cumulative expression of assembled transcripts. The transcripts were ranked based on expression level, and the cumulative expression is presented in percentages. K. Cumulative precision of transcripts based on the expression levels. Based on the expression levels and the cumulative precision in 200 and 10 million read assemblies (lower panel), the transcripts were grouped into three classes. The upper panel shows the percentages of GENCODE-annotated transcripts in the three classes of transcripts. L. Cumulative coding-to-noncoding ratio varied with expression levels and the numbers of reads.
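As an illustration of how the sensitivity and precision in panels C and D could be computed against a reference assembly, the following is a minimal sketch. It assumes each assembly has already been reduced to a set of transcript identifiers and that a transcript counts as correctly assembled when its identifier also appears in the reference; the function name and the example identifiers are hypothetical, and the matching criteria actually used for the figure are not stated in the caption.

```python
def sensitivity_precision(assembled, reference):
    """Return (sensitivity, precision) of an assembly against a reference.

    Assumption: transcripts are compared by identifier only; a real
    comparison would typically match intron chains or exon structures.
    """
    assembled = set(assembled)
    reference = set(reference)
    # Transcripts present in both the sample assembly and the reference.
    true_positives = len(assembled & reference)
    # Sensitivity: fraction of reference transcripts that were recovered.
    sensitivity = true_positives / len(reference) if reference else 0.0
    # Precision: fraction of assembled transcripts found in the reference.
    precision = true_positives / len(assembled) if assembled else 0.0
    return sensitivity, precision


# Hypothetical identifiers, for illustration only.
reference = {"t1", "t2", "t3", "t4"}
sample = {"t1", "t2", "t5"}
sens, prec = sensitivity_precision(sample, reference)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Under this definition, a lower-depth sample that recovers fewer reference transcripts has lower sensitivity, while precision depends only on how many of its own transcripts are supported by the reference.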
