Fig. 5 | BMC Genomics

Fig. 5

From: The effects of sequencing depth on the assembly of coding and noncoding transcripts in the human genome

Single-cell data reveal the heterogeneity of transcripts assembled from different data sources and read depths. The top 3,000 barcodes, corresponding to 3,000 cells, were extracted from the alignment of the c11/S0730 cell line. Panels A–B and E–F show the results from short-read transcript assembly using SRR597912 and SRR597895, and panels C–D and G–H show the results for the transcript assembly from the long-read H9 sample. For the short-read assembly, the normalized counts of the coding (panel A) and noncoding (panel B) transcripts are positively correlated with the percentages of cells in which expression of the transcript is detected in the scRNA-seq data. The normalized counts of the coding (panel C) and noncoding (panel D) transcripts from the long-read assembly are likewise positively correlated with the percentages of cells with detectable expression in the scRNA-seq data. The means of the expression levels and of the percentages of cells with detectable expression are shown in red in panels A–D. The numbers of transcripts detectable in the scRNA-seq data were compared for transcripts assembled from different read numbers of short-read (panels E and F) and long-read (panels G and H) data. For both coding (panel E) and noncoding (panel F) transcripts, more of the transcripts assembled from 200 million reads were detected in the scRNA-seq data than of the transcripts assembled from 10 million reads. Similarly, more coding (panel G) and noncoding (panel H) transcripts detectable in the scRNA-seq data were found for the assembly from 75% of the long-read data than for the assembly from 25% of the long-read data. In panels E–H, each point represents a cell, and the diagonal marks the points at which the two compared assemblies yield equal numbers of detected transcripts.
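The two quantities described in the caption, the percentage of cells in which a transcript is detected (panels A–D) and the per-cell number of detected transcripts for two assemblies (panels E–H), can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy count matrix, the transcript names, the two assembly sets, and the rule that a count above zero means "detectable". It is not the authors' pipeline.

```python
# Hypothetical toy data: rows are cells (barcodes), columns are transcripts.
# A count > 0 is treated as "detectable expression" (an assumed threshold).
counts = [
    [3, 0, 1, 0],
    [0, 0, 2, 0],
    [5, 1, 0, 0],
]
transcripts = ["t1", "t2", "t3", "t4"]  # hypothetical transcript IDs


def percent_cells_detected(counts):
    """Percentage of cells with a nonzero count, per transcript (column)."""
    n_cells = len(counts)
    n_tx = len(counts[0])
    return [
        100.0 * sum(1 for row in counts if row[j] > 0) / n_cells
        for j in range(n_tx)
    ]


def detected_per_cell(counts, transcripts, assembly):
    """Number of transcripts belonging to `assembly` detected in each cell."""
    idx = [j for j, t in enumerate(transcripts) if t in assembly]
    return [sum(1 for j in idx if row[j] > 0) for row in counts]


# Hypothetical assemblies from a shallower and a deeper read set.
shallow = {"t1", "t4"}
deep = {"t1", "t2", "t3", "t4"}

pct = percent_cells_detected(counts)                 # y-axis analogue, panels A-D
x = detected_per_cell(counts, transcripts, shallow)  # one value per cell
y = detected_per_cell(counts, transcripts, deep)     # one value per cell

# Cells above the diagonal have more detected transcripts from the deeper assembly.
above_diagonal = sum(1 for a, b in zip(x, y) if b > a)
print(pct, x, y, above_diagonal)
```

Plotting `x` against `y` with one point per cell would give a scatter of the kind described for panels E–H, where points above the diagonal correspond to cells with more detected transcripts from the deeper assembly.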
