
Table 1 Transcriptome assembly statistics per species. Shown are the initial number of reads used, the number of reads after Trimmomatic processing, the empirical mean insert size of the RNA-Seq libraries, the number of distinct 21-mers, the number of initially assembled transcripts, the number of transcripts removed by CroCo, and the final number of transcripts, as well as the mean transcript length and the number of bases in the final assemblies. The BUSCO score is given as the percentage of complete (C) genes, divided into those present as single copies (S) or duplicates (D), and of fragmented (F) genes of the 978-gene metazoa set. The next three rows give the TransRate score, the number of transcripts remaining after TransDecoder translation and CD-HIT clustering, and the number of transcripts considered in the DE analysis. Below this is a summary of the Trinotate annotation results, giving the number of transcripts (with the corresponding percentage of the whole transcriptome in brackets) that carry a given annotation: ORF, contains a predicted open reading frame; BLASTX, the predicted ORF and/or the entire transcript produced a hit in the protein database; Pfam, a protein family domain was found; SignalP, a signal peptide was detected; TMHMM, a transmembrane helix is predicted.

From: RNA-Seq of three free-living flatworm species suggests rapid evolution of reproduction-related genes

| Assembly statistics | M. hystrix | M. spirale | M. pusillum |
| --- | --- | --- | --- |
| Initial reads | 160,231,340 | 173,766,431 | 157,755,458 |
| Reads post trimming | 148,699,208 | 160,248,517 | 147,615,465 |
| Mean insert size | 146 | 143 | 145 |
| Distinct 21-mers | 160,907,099 | 235,628,648 | 194,772,389 |
| Assembled transcripts | 169,758 | 296,658 | 177,453 |
| Removed transcripts | 217 | 156 | 274 |
| Final transcripts | 169,541 | 296,502 | 177,179 |
| Mean transcript length | 1094 | 764 | 756 |
| Number of bases | 185,792,353 | 226,578,146 | 134,085,334 |
| BUSCO score, % (Metazoa gene set) | C: 90.1; S: 49.3; D: 40.8; F: 3.4 | C: 87.8; S: 37.3; D: 50.5; F: 4.7 | C: 89.2; S: 55.8; D: 33.4; F: 4.1 |
| TransRate score | 0.28 | 0.29 | 0.28 |
| CD-HIT transcripts | 53,132 | 74,135 | 53,416 |
| DESeq2 transcripts | 43,126 | 66,139 | 41,418 |
| Annotation: ORF, n (%) | 59,889 (35.3) | 70,808 (23.9) | 49,456 (27.9) |
| Annotation: BLASTX, n (%) | 47,837 (28.2) | 50,033 (16.9) | 42,940 (24.2) |
| Annotation: Pfam, n (%) | 42,330 (25.0) | 43,840 (14.8) | 34,726 (19.6) |
| Annotation: SignalP, n (%) | 6486 (3.8) | 6601 (2.2) | 5380 (3.0) |
| Annotation: TMHMM, n (%) | 15,399 (9.1) | 16,322 (5.5) | 14,537 (8.2) |
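
The derived rows of Table 1 can be reproduced from the counts with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the bracketed annotation percentages are taken relative to the final transcript count (the caption says "the whole transcriptome"; this reading reproduces the ORF values shown in the table).

```python
# Counts copied from Table 1; the dictionary layout is an assumption for illustration.
table = {
    "M. hystrix":  {"assembled": 169_758, "removed": 217, "final": 169_541, "orf": 59_889, "orf_pct": 35.3},
    "M. spirale":  {"assembled": 296_658, "removed": 156, "final": 296_502, "orf": 70_808, "orf_pct": 23.9},
    "M. pusillum": {"assembled": 177_453, "removed": 274, "final": 177_179, "orf": 49_456, "orf_pct": 27.9},
}


def final_transcripts(assembled: int, removed: int) -> int:
    """Transcripts left after subtracting those removed by CroCo."""
    return assembled - removed


def annotation_percentage(annotated: int, final: int) -> float:
    """Share of the final transcriptome carrying an annotation, to one decimal.

    Assumption: the denominator is the final transcript count.
    """
    return round(100 * annotated / final, 1)


for species, row in table.items():
    # Final transcripts = assembled transcripts - removed transcripts.
    assert final_transcripts(row["assembled"], row["removed"]) == row["final"]
    # The ORF percentage in brackets matches ORF count / final transcripts.
    assert abs(annotation_percentage(row["orf"], row["final"]) - row["orf_pct"]) < 0.05
    print(species, final_transcripts(row["assembled"], row["removed"]),
          annotation_percentage(row["orf"], row["final"]))
```

The same `annotation_percentage` call can be applied to the BLASTX, Pfam, SignalP, and TMHMM counts to check the other bracketed values.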