From: Characterization of a second secologanin synthase isoform producing both secologanin and secoxyloganin allows enhanced de novo assembly of a Catharanthus roseus transcriptome

Composition of the clustered dataset (CD97) obtained by processing the combination of all single assemblies with CD-HIT-EST at 97% identity. (a) Integration of contigs from single assemblies into clusters. Contigs that cannot be grouped with others are called singletons. True clusters, i.e. clusters containing at least two different contigs, may be formed from contigs of one or more initial single assemblies. (b) Correlation plot of single assemblies. The contigs found in each cluster (singletons and true clusters) were identified and counted per initial assembly. Two initial assemblies are therefore strongly correlated (Pearson correlation coefficient) if their contigs are found in the same clusters. (c) Composition of true clusters. This graph shows how many single assemblies are represented within each true cluster.
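The counting behind panels a to c can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the toy `clusters` mapping, the assembly names (`asmA`, `asmB`, `asmC`) and all function names are hypothetical, and it assumes that each assembly is represented by a vector of per-cluster contig counts, which is one plausible reading of the caption.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy input: cluster id -> list of (assembly name, contig id).
clusters = {
    "c1": [("asmA", "A_1"), ("asmB", "B_7")],
    "c2": [("asmA", "A_2"), ("asmA", "A_3"), ("asmB", "B_9")],
    "c3": [("asmC", "C_4")],  # singleton: a contig grouped with no other
    "c4": [("asmA", "A_5"), ("asmB", "B_2"), ("asmC", "C_8")],
}

def split_clusters(clusters):
    """Panel a: separate singletons from true clusters (>= 2 contigs)."""
    singletons = [cid for cid, members in clusters.items() if len(members) == 1]
    true_clusters = [cid for cid, members in clusters.items() if len(members) >= 2]
    return singletons, true_clusters

def count_vectors(clusters):
    """Panel b: per assembly, the number of its contigs in each cluster."""
    assemblies = sorted({asm for members in clusters.values() for asm, _ in members})
    vectors = {asm: [] for asm in assemblies}
    for cid in sorted(clusters):
        counts = Counter(asm for asm, _ in clusters[cid])
        for asm in assemblies:
            vectors[asm].append(counts.get(asm, 0))
    return vectors

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length count vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return float("nan")  # undefined for a constant vector
    return cov / (norm_x * norm_y)

def assemblies_per_true_cluster(clusters):
    """Panel c: how many true clusters contain 1, 2, 3, ... assemblies."""
    return Counter(
        len({asm for asm, _ in members})
        for members in clusters.values()
        if len(members) >= 2
    )

singletons, true_clusters = split_clusters(clusters)
vectors = count_vectors(clusters)
r_ab = pearson(vectors["asmA"], vectors["asmB"])
r_ac = pearson(vectors["asmA"], vectors["asmC"])
composition = assemblies_per_true_cluster(clusters)
```

In this toy example `asmA` and `asmB` place their contigs in the same clusters and are positively correlated, whereas `asmA` and `asmC` mostly do not and are negatively correlated.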

