
Table 2 Size of proteomic, simulated proteogenomic, and real proteogenomic databases for human

From: Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification

Category                 Database (target + decoy)  # Target (AA)  # Decoy (AA)
Proteomic                1Th + 1Dh                  35,856,033     35,856,033
Simulated proteogenomic  1T1Dh + 2Dh                71,712,066     71,712,066
Simulated proteogenomic  1T2Dh + 3Dh                107,568,099    107,568,099
Simulated proteogenomic  1T5Dh + 6Dh                215,136,198    215,136,198
Real proteogenomic       6FTTh + 6FTDh              2,136,069,837  2,136,069,837
Real proteogenomic       SGTh + SGDh                123,364,545    123,364,545
  1. Database sizes are measured as the total length, in amino acids (AA), of the contained peptides. 1Th: human reference protein database. nDh: decoy database whose size is n times that of 1Th. 6FTTh: proteogenomic database constructed by 6-frame translation of the human genome. 6FTDh: decoy database for 6FTTh. SGTh: proteogenomic database constructed from splicing information in human RNA sequencing data. SGDh: decoy database for SGTh.
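
The sizes in the table can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper. It assumes that a simulated target labelled 1TkDh is 1Th padded with k copies' worth of decoy-sized sequence, so its size is (1 + k) times that of 1Th; the reported figures are consistent with this reading. The helper names `simulated_target_size` and `inflation` are hypothetical.

```python
# Size of the human reference protein database 1Th, in amino acids (from Table 2).
BASE_AA = 35_856_033

# Reported target sizes from Table 2; k is the multiplier read off the label 1TkDh.
SIMULATED = {
    "1T1Dh": (1, 71_712_066),
    "1T2Dh": (2, 107_568_099),
    "1T5Dh": (5, 215_136_198),
}
REAL = {
    "6FTTh": 2_136_069_837,
    "SGTh": 123_364_545,
}


def simulated_target_size(k, base=BASE_AA):
    """Assumed size of 1TkDh: the reference database plus k times its size."""
    return (1 + k) * base


def inflation(size, base=BASE_AA):
    """Ratio of a database size to the reference database size."""
    return size / base


if __name__ == "__main__":
    for label, (k, reported) in SIMULATED.items():
        matches = simulated_target_size(k) == reported
        print(f"{label}: expected {simulated_target_size(k):,} AA, matches table: {matches}")
    for label, size in REAL.items():
        print(f"{label}: about {inflation(size):.2f} times the size of 1Th")
```

Under this reading, the three simulated databases are exactly 2, 3, and 6 times the size of 1Th, while the real proteogenomic databases fall at non-integer multiples of it.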