Table 4 Summary of CDS and 5′ and 3′ untranslated region (UTR) statistics, including overlaps between gene sets and with assembled Illumina transcripts and RNAseq Illumina reads

From: Improving eukaryotic genome annotation using single molecule mRNA sequencing

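| Gene region | Statistic | AC-Orig: All genes | AC-Orig: Overlapping genes | AC-Orig: Unique genes | AC-PB: All genes | AC-PB: Overlapping genes | AC-PB: Unique genes |
|---|---|---|---|---|---|---|---|
| CDS | # of genes | 16,026 | 15,808 | 218 | 17,540 | 15,931 | 1609 |
| CDS | Average length (bp) | 962.8 | 962.4 | 994.3 | 894.3 | 922.0 | 619.8 |
| CDS | Total length (kbp) | 15,430.0 | 15,213.2 | 216.8 | 15,685.3 | 14,688.0 | 997.3 |
| CDS | Coverage by Illumina reads: % of genes | 78.1% | 78.0% | 80.7% | 79.6% | 78.2% | 93.7% |
| CDS | Coverage by Illumina reads: % of bases | 66.9% | 66.8% | 68.7% | 67.5% | 66.4% | 83.7% |
| CDS | Coverage by Illumina reads: length not covered (kbp) | 5111.7 | 5043.9 | 67.8 | 5104.8 | 4942.2 | 162.7 |
| CDS | Coverage by Illumina Stringtie contigs: % of genes | 65.2% | 65.1% | 67.9% | 66.9% | 64.8% | 84.0% |
| CDS | Coverage by Illumina Stringtie contigs: % of bases | 61.8% | 61.8% | 64.3% | 62.4% | 61.2% | 79.5% |
| CDS | Coverage by Illumina Stringtie contigs: length not covered (kbp) | 5895.8 | 5818.4 | 77.4 | 5899.6 | 5694.7 | 204.9 |
| 5′ UTR | # of genes | 1101 | 1083 | 18 | 3404 | 2702 | 702 |
| 5′ UTR | Average length (bp) | 57.7 | 58.2 | 24.3 | 88.2 | 83.9 | 104.8 |
| 5′ UTR | Total length (kbp) | 63.5 | 63.1 | 0.4 | 300.2 | 226.7 | 73.6 |
| 5′ UTR | Coverage by Illumina reads: % of genes | 74.1% | 74.1% | 72.2% | 78.7% | 79.8% | 74.2% |
| 5′ UTR | Coverage by Illumina reads: % of bases | 61.5% | 61.3% | 89.7% | 73.1% | 74.7% | 68.4% |
| 5′ UTR | Coverage by Illumina reads: length not covered (kbp) | 18.7 | 18.6 | 0.1 | 80.7 | 57.4 | 23.3 |
| 5′ UTR | Coverage by Illumina Stringtie contigs: % of genes | 77.3% | 77.2% | 83.3% | 79.5% | 80.4% | 76.2% |
| 5′ UTR | Coverage by Illumina Stringtie contigs: % of bases | 70.6% | 70.5% | 86.7% | 66.8% | 69.2% | 59.5% |
| 5′ UTR | Coverage by Illumina Stringtie contigs: length not covered (kbp) | 24.5 | 24.4 | 0.0 | 99.5 | 69.8 | 29.8 |
| 3′ UTR | # of genes | 1234 | 1218 | 16 | 6608 | 5363 | 1245 |
| 3′ UTR | Average length (bp) | 78.4 | 78.8 | 52.3 | 232.1 | 234.7 | 221.0 |
| 3′ UTR | Total length (kbp) | 96.8 | 95.9 | 0.8 | 1533.9 | 1258.8 | 275.1 |
| 3′ UTR | Coverage by Illumina reads: % of genes | 50.1% | 50.3% | 31.3% | 73.3% | 74.6% | 67.5% |
| 3′ UTR | Coverage by Illumina reads: % of bases | 73.9% | 73.9% | 75.6% | 77.0% | 78.2% | 71.2% |
| 3′ UTR | Coverage by Illumina reads: length not covered (kbp) | 31.1 | 30.8 | 0.3 | 513.6 | 402.5 | 111.1 |
| 3′ UTR | Coverage by Illumina Stringtie contigs: % of genes | 56.3% | 56.7% | 25.0% | 70.4% | 71.5% | 65.3% |
| 3′ UTR | Coverage by Illumina Stringtie contigs: % of bases | 67.9% | 67.9% | 60.0% | 66.5% | 68.0% | 59.6% |
| 3′ UTR | Coverage by Illumina Stringtie contigs: length not covered (kbp) | 25.3 | 25.1 | 0.2 | 353.1 | 273.9 | 79.2 |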
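
The derived statistics in the table can be related to the raw totals with a little arithmetic. The following is a minimal sketch, not the authors' code: it assumes that average length is total length divided by the number of genes, and that "% of bases" is one minus the uncovered length over the total length. The table does not state these formulas, and the function names are hypothetical. The inputs are the CDS "All genes" values for AC-Orig and AC-PB with coverage by Illumina reads; the UTR rows are not checked here.

```python
def average_length_bp(total_kbp, n_genes):
    """Average feature length in bp, from total length in kbp (assumed formula)."""
    return total_kbp * 1000.0 / n_genes


def percent_bases_covered(total_kbp, not_covered_kbp):
    """Percentage of bases covered, from total and uncovered kbp (assumed formula)."""
    return 100.0 * (1.0 - not_covered_kbp / total_kbp)


# CDS, "All genes" columns, coverage by Illumina reads (values copied from Table 4).
cds_all_genes = {
    "AC-Orig": {"n_genes": 16026, "total_kbp": 15430.0, "not_covered_kbp": 5111.7},
    "AC-PB": {"n_genes": 17540, "total_kbp": 15685.3, "not_covered_kbp": 5104.8},
}

for gene_set, row in cds_all_genes.items():
    avg = average_length_bp(row["total_kbp"], row["n_genes"])
    pct = percent_bases_covered(row["total_kbp"], row["not_covered_kbp"])
    print(f"{gene_set}: average length {avg:.1f} bp, bases covered {pct:.1f}%")
```

Under these assumptions the computed values agree, to one decimal place, with the reported CDS average lengths and "% of bases" figures for both gene sets.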