Table 1 Comparison of the existing virome pipeline tools

From: Vipie: web pipeline for parallel characterization of viral populations from multiple NGS samples

| Pipeline Tool | Vipie | ViromeScan [11] | VirusTAP [8] | Virome [16] | Metavir [14] | Taxonomer [6] | MetaShot [12] |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Primary goal | Web-based parallel analysis of multiple viral metagenomes, suited for molecular epidemiology studies. | To profile viromes using databases of existing eukaryotic viruses without assembly. | Identification of viruses in a sample, after a thorough elimination of known non-viral sequences. | Classification of all putative ORFs found in a viral metagenome, characterization of viral communities. | Analysis of viromes, diversity metrics and marker gene phylogenies. | Ultra-fast metagenomic analysis focusing on detection of microorganisms, including viruses and bacteria. | Highly accurate and comprehensive workflow for host-associated microbiome classification on multiple samples. |
| Web based | Yes. | No. | Yes. | Yes (Flash required). | Yes. | Yes. | No. |
| Outputs | Interactive table, plots and raw downloads. Clustered heatmaps with dynamic group assignment re-plots. | Static population pie charts. Sample-based clustered heatmaps. | Contig-based hits and seamless web BLAST interface. | Rich collection of sample source virome ORF and sequence categories. | Comparative analysis of viromes and annotations including networks, nonmetric distance and tree maps. | Interactive pie charts with kingdoms in bins and also impressive sunburst flare sub-classifiers. | A Krona graph and an interactive taxonomy HTML table, along with a CSV file. |
| Source data | Paired-end reads; fastq format. | Single-end or paired-end reads; fastq format. | Paired-end reads; also accepts single-end reads; fastq format. | sff or fastq; intended for 454-generated metagenomes. | Reads (>300 bases) or assembled contigs. | Paired-end reads in fastq and fasta formats. | Paired-end reads in fastq format. |
| Trimming and filtering | Yes, as the first step. | Yes, after selection of viral reads, at the level of a bam file. | Yes, as the first step. | Yes: quality based; duplicate filtering; contamination. | Not specified. | Not specified. | Yes, as the first step. |
| De-novo assembly | Yes, a choice of assemblers. | No. | Yes, a choice of assemblers; done after subtraction steps. | No. | No. | No. | No. |
| Subtraction of human ref. and bacterial ribosomal sequences | Optional, only for the output of dark matter sequences. | Yes, using Human Best Match Tagger. No for ribosomal. | Yes, also other host databases available (mouse etc.). | Not specified for human. Ribosomal sequences are removed using BLAST against an rDNA database. | Not specified. | Not subtracted but reported as part of detection. | Yes, reports identification of human host reads and bacterial mappings. |
| Means of virus identification | (a) BLAST against a pan-viral database. (b) Remapping of original reads to the identified candidates. | Mapping to the members of the virus database using bowtie2 [24]. | BLAST search against the NCBI nt database. | BLASTP against two databases. Several tiers of classification of the ORFs. | Not specified. | Taxonomer Binner DB with 21 bp k-mers as unique identifiers of known viruses. | Custom similarity workflow with Hamming distance. |
| Virus database for identification | A custom database containing 20759 human, animal, plant and bacterial viruses. | Eukaryotic viruses only. Four custom databases available for download. | Specificity is maintained by the subtraction steps prior to assembly and BLAST search. | UniRef 100 peptide database, five annotated protein databases, MetaGenomes On-line. | GAAS tool (https://sourceforge.net/projects/gaas/). | Binner DB needs to be built using KAnalyze [42] (https://sourceforge.net/projects/kanalyze/files/). | TANGO [43] and NCBI Taxonomy [44]. |
| Action when a read maps to different viruses | Score is split among the hit reference sequences. | Not specified. | Not specified. | Not specified. | Not specified. | Assigns as ambiguous. | Parsed for human endogenous retroviruses; otherwise classified as ambiguous and discarded. |

  1. Most tools use BLAST [23] for initial detection of known references. Vipie uniquely allows web-based parallel analysis of multiple samples and accounts for read hits to multiple viral references for comprehensive population profiling.
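
The table states that Vipie splits a read's score among the reference sequences it hits, but does not give the weighting. The Python sketch below illustrates one possible reading, assuming each read carries a score of 1 that is divided equally among its distinct hit references. The function name `split_read_scores`, the input layout and the equal-share rule are all illustrative assumptions, not Vipie's actual implementation.

```python
from collections import defaultdict


def split_read_scores(read_hits):
    """Distribute each read's unit score across the references it hit.

    read_hits: dict mapping a read ID to an iterable of reference IDs.
    Returns a dict mapping each reference ID to its accumulated score.
    ASSUMPTION: equal shares per distinct reference; the source table
    only says the score "is split among the hit reference sequences".
    """
    scores = defaultdict(float)
    for refs in read_hits.values():
        unique_refs = set(refs)          # count each reference once per read
        if not unique_refs:
            continue                     # reads without hits contribute nothing
        share = 1.0 / len(unique_refs)   # equal split of the read's unit score
        for ref in unique_refs:
            scores[ref] += share
    return dict(scores)


# Hypothetical toy input: three mapped reads and one unmapped read.
hits = {
    "read1": ["virusA"],
    "read2": ["virusA", "virusB"],
    "read3": ["virusB", "virusC"],
    "read4": [],
}
scores = split_read_scores(hits)
```

Under this scheme the per-reference scores always sum to the number of reads with at least one hit, so a multi-mapping read is neither discarded nor counted more than once.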