
Table 1 Transcriptome assembly statistics per species. The initial number of reads used, the number of reads after Trimmomatic processing, the empirical mean insert size of the RNA-Seq libraries, the number of distinct 21-mers, the number of initially assembled transcripts, the number of transcripts removed by CroCo, and the final number of transcripts, as well as the mean transcript length and number of bases in the final assemblies, are shown. The BUSCO score is given as the percentage of complete (C) genes, divided into those present as single copies (S) or duplicates (D), and fragmented (F) genes of the 978-gene metazoa set. The next three rows detail the TransRate score, the number of transcripts remaining after TransDecoder translation and CD-HIT clustering, and the number of transcripts considered in the DE analysis. Below this is a summary of the results from the Trinotate annotations, giving the number of transcripts (and the corresponding percentage of the whole transcriptome in brackets) with a given annotation: ORF, contains a predicted open reading frame; BLASTX, the predicted ORF and/or the entire transcript produced a hit in the protein database; Pfam, a protein family domain was found; SignalP, a signal peptide was detected; TMHMM, a transmembrane helix is predicted.

From: RNA-Seq of three free-living flatworm species suggests rapid evolution of reproduction-related genes

Assembly statistics M. hystrix M. spirale M. pusillum
Initial reads 160,231,340 173,766,431 157,755,458
Reads post trimming 148,699,208 160,248,517 147,615,465
Mean insert size 146 143 145
Distinct 21-mers 160,907,099 235,628,648 194,772,389
Assembled transcripts 169,758 296,658 177,453
Removed transcripts 217 156 274
Final transcripts 169,541 296,502 177,179
Mean transcript length 1094 764 756
Number of bases 185,792,353 226,578,146 134,085,334
BUSCO score (Metazoa gene set, %)
 C 90.1 87.8 89.2
 S 49.3 37.3 55.8
 D 40.8 50.5 33.4
 F 3.4 4.7 4.1
TransRate score 0.28 0.29 0.28
CD-HIT transcripts 53,132 74,135 53,416
DESeq2 transcripts 43,126 66,139 41,418
Annotation
 ORF 59,889 (35.3) 70,808 (23.9) 49,456 (27.9)
 BLASTX 47,837 (28.2) 50,033 (16.9) 42,940 (24.2)
 Pfam 42,330 (25.0) 43,840 (14.8) 34,726 (19.6)
 SignalP 6486 (3.8) 6601 (2.2) 5380 (3.0)
 TMHMM 15,399 (9.1) 16,322 (5.5) 14,537 (8.2)
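
The rows of the table are related by simple arithmetic, which the following minimal Python sketch spells out. It uses only values copied from the table; the dictionary layout, the function name `check`, and the rounding tolerance are illustrative assumptions. It also assumes that "percentage of the whole transcriptome" means the annotated count divided by the final number of transcripts, which is consistent with the reported figures but not stated explicitly in the caption.

```python
# Values copied from Table 1; structure and names are illustrative only.
table = {
    "M. hystrix": {"assembled": 169_758, "removed": 217, "final": 169_541,
                   "busco": {"C": 90.1, "S": 49.3, "D": 40.8},
                   "orf": (59_889, 35.3)},
    "M. spirale": {"assembled": 296_658, "removed": 156, "final": 296_502,
                   "busco": {"C": 87.8, "S": 37.3, "D": 50.5},
                   "orf": (70_808, 23.9)},
    "M. pusillum": {"assembled": 177_453, "removed": 274, "final": 177_179,
                    "busco": {"C": 89.2, "S": 55.8, "D": 33.4},
                    "orf": (49_456, 27.9)},
}

TOLERANCE = 0.05  # assumed: table values are rounded to one decimal place


def check(stats):
    """Return three booleans: transcript counts, BUSCO sum, ORF percentage."""
    # Final transcripts = assembled transcripts minus those removed by CroCo.
    final_ok = stats["assembled"] - stats["removed"] == stats["final"]

    # Complete BUSCO genes = single-copy plus duplicated.
    busco = stats["busco"]
    busco_ok = abs(busco["S"] + busco["D"] - busco["C"]) <= TOLERANCE

    # Bracketed percentage = annotated count / final transcripts (assumption).
    count, reported_pct = stats["orf"]
    pct_ok = abs(100 * count / stats["final"] - reported_pct) <= TOLERANCE

    return final_ok, busco_ok, pct_ok


for species, stats in table.items():
    print(species, check(stats))
```

The same percentage check could be applied to the BLASTX, Pfam, SignalP and TMHMM rows by adding their counts to the dictionary.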