From: Peptimapper: proteogenomics workflow for the expert annotation of eukaryotic genomes
Name | Pipeline interface | Database-driven for peptide identification | De novo peptide interpretation | User-friendly for biologists | Results curation | Results visualization | Description | Relevance |
---|---|---|---|---|---|---|---|---|
Peptimapper (released in 2018) | Command line, Docker image, Galaxy tools | – | √ | √ | √ | √ | Peptide Sequence Tags (PSTs) obtained from partial interpretation of ion trap mass spectra are mapped onto the six-frame translation of genomic sequences, yielding hits. Hits are then clustered to detect potential coding regions. Clusters are evaluated and further compared to existing gene predictions. Clusters are available as a GFF file that can be uploaded into a genome viewer. https://galaxy.protim.eu https://hub.docker.com/r/dockerprotim/peptimapper/ or https://docker-ui.genouest.org/app/#/container/dockerprotim/peptimapper https://github.com/laeticlo/Ectoline | Improves genome annotation |
IPAW (2018) [61] | Command line | √ | – | – | √ | – | This is an Integrated Proteomics Analysis Workflow: i) Peptide spectra are searched against two different databases in parallel: VarDB, filtered by class-specific FDR for SAAV peptides, and the 6FT of the human genome, filtered by peptide pI. ii) SAAV candidates are curated by SpectrumAI and potential novel proteins are searched with BLAST against public databases. iii) Curated results are validated by different controls. https://github.com/yafeng/proteogenomics_python | Identification of pseudogenes, lncRNAs, nsSNPs and somatic mutations |
JUMPg (2016) [62] | Command line | √ | – | – | √ | √ | This pipeline includes construction of multiple customized databases, tag-based database search, peptide-spectrum match filtering, and data visualization. https://github.com/gatechatl/JUMPg/ | Improves genome annotation |
PGMiner (2016) [63] | Command line | √ | – | – | √ | √ | This workflow allows acquisition of mass spectrometric data, peptide identification against preprocessed sequence databases, assignment of statistical confidence to identified peptides, and mapping confident peptides to gene models. https://github.com/olalonde/pgtools | Improves genome annotation |
PROTEO-FORMER (2015) [64] | Command line, Virtual machine, Galaxy tools | √ | – | √ | √ | √ | RIBO-seq NGS data are processed to delineate proteoforms. RIBO-seq-derived sequences are then translated and mapped to a public database, creating a custom search database for peptide-to-MS/MS matching. | Identification of novel translation products |
PGTools (2015) [65] | Command line | √ | – | – | √ | √ | The software is divided into 2 phases: Phase 1 contains 8 modules to analyse MS/MS data using known protein databases. Phase 2 contains 5 modules and 7 customized databases that allow MS/MS data to be analysed against the genome. The software includes applications, libraries, customized databases and visualization tools. | Improves genome annotation |
NextSearch (2015) [66] | Command line | – | – | – | √ | √ | Nucleotide EXon-graph Transcriptome Search identifies peptides by directly searching the nucleotide exon graph against tandem mass spectra. NextSearch outputs proteome-genome/transcriptome mappings that can be visualized using public tools. | Improves genome annotation |
ProteoAnnotator (2014) [52] | Command line, Stand-alone application | √ | – | √ | √ | √ | MS spectra are searched with one or several proteomics database search engines (MASCOT, OMSSA, X!Tandem or MS-GF+) and results are converted into GFF, adding genome coordinates and statistical confidence values. It exports mzIdentML files. | Improves genome annotation |
Peppy (2013) [67] | Command line, Stand-alone application | √ | – | N/A | √ | – | This workflow generates a peptide database from a genome, tracks peptide loci, matches peptides to MS/MS spectra and assigns FDR confidence values to those matches. | Improves genome annotation |
Protk (released in 2012) | Command line, Galaxy tools | √ | – | √ | – | √ | It is a suite of proteomics tools providing the following analysis tasks: (i) MS/MS data search with X!Tandem, Mascot, OMSSA and MS-GF+; (ii) peptide and protein inference with PeptideProphet, iProphet and ProteinProphet; (iii) conversion of pepXML or protXML to tabular format; and (iv) mapping of peptides to genomic coordinates. https://github.com/iracooke/protk | Improves genome annotation |
IggyPep (2010) [54] | Web interface | √ | √ | N/A | – | – | The pipeline is based on a database system with an advanced indexing and querying strategy, which holds the translated genome in all six reading frames. It can be queried with de novo sequences or partial peptide sequence tags (PSTs). It determines the ORF amino acid sequences comprising these tags and compiles a FASTA-formatted sequence file for a database-driven search. www.iggypep.org (no longer accessible) | Improves genome annotation |
PepLine (2008) [18] | Command line | – | √ | N/A | √ | – | Peptide Sequence Tags (PSTs) obtained from partial interpretation of QTOF mass spectra are mapped onto the six-frame translation of genomic sequences, yielding hits. Hits are then clustered to detect potential coding regions. www.grenoble.prabi.fr/protehome/software/pepline (no longer accessible) | Improves genome annotation |
Workflows for Proteomics Informed by Transcriptomics (2015) [57] | Galaxy tools | √ | – | √ | √ | √ | Galaxy Integrated Omics (GIO) provides workflows for 4 common use cases: i) a standard search against a reference proteome; ii) PIT protein identification without a reference genome; iii) PIT protein identification using a genome guide; and iv) PIT genome annotation. http://gio.sbcs.qmul.ac.uk | Improves genome annotation |
Workflows for proteogenomics studies using Galaxy-P (2014–2018) [55, 56, 58, 59] | Galaxy tools | √ | – | √ | √ | √ | These modular workflows incorporate both established and customized software tools that improve the depth and quality of proteogenomic results. http://galaxyp.org | Improves genome annotation |
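
The tag-mapping strategy described for Peptimapper and PepLine in the table (map PSTs onto the six-frame translation, cluster the hits, export clusters as GFF) can be illustrated with a minimal Python sketch. This is not the code of either tool. It assumes a PST is reduced to a plain amino acid string matched exactly (real PSTs also carry flanking masses), and the function names, the `max_gap` parameter, the `sketch` source label, the `region` feature type and the `hits` attribute are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

Hit = Tuple[str, int, int]           # (strand, start, end), 0-based half-open
Cluster = Tuple[str, int, int, int]  # (strand, start, end, number of hits)


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def map_tag(genome: str, tag: str) -> List[Hit]:
    """Find exact matches of an amino acid tag in all six reading frames.

    Coordinates are reported on the forward strand of the genome.
    """
    if not tag:
        return []
    genome = genome.upper()
    n = len(genome)
    hits: List[Hit] = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(tag)
            while pos != -1:
                start = frame + 3 * pos
                end = start + 3 * len(tag)
                if strand == "-":
                    # Convert reverse-strand positions to forward coordinates.
                    start, end = n - end, n - start
                hits.append((strand, start, end))
                pos = protein.find(tag, pos + 1)
    return sorted(hits)


def cluster_hits(hits: List[Hit], max_gap: int = 100) -> List[Cluster]:
    """Merge same-strand hits separated by at most max_gap nucleotides."""
    clusters: List[Cluster] = []
    for strand, start, end in sorted(hits):
        if clusters and clusters[-1][0] == strand and start - clusters[-1][2] <= max_gap:
            s, c_start, c_end, count = clusters[-1]
            clusters[-1] = (s, c_start, max(c_end, end), count + 1)
        else:
            clusters.append((strand, start, end, 1))
    return clusters


def to_gff_line(seqid: str, cluster: Cluster, index: int) -> str:
    """Format a cluster as one GFF3 line (GFF is 1-based, end-inclusive)."""
    strand, start, end, count = cluster
    return "\t".join([seqid, "sketch", "region", str(start + 1), str(end),
                      ".", strand, ".", f"ID=cluster{index};hits={count}"])
```

For example, `map_tag("ATGGCCAAATTTGGG", "AKF")` locates the tag in forward frame 0, and passing the pooled hits of many tags to `cluster_hits` groups nearby matches into candidate coding regions. Clustering by a fixed nucleotide gap is the simplest possible rule; the published tools evaluate clusters with their own criteria, which this sketch does not attempt to reproduce.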