TY - STD
TI - Haft DH, Loftus BJ, Richardson DL, Yang F, Eisen JA, Paulsen IT, White O. TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res. 2001; 29(1):41–43. doi:http://dx.doi.org/10.1093/nar/29.1.41.
UR - http://dx.doi.org/10.1093/nar/29.1.41
ID - ref1
ER -
TY - STD
TI - Wu CH, Huang H, Yeh LSL, Barker WC. Protein family classification and functional annotation. Comp Biol Chem. 2003; 27(1):37–47. doi:http://dx.doi.org/10.1016/S1476-9271(02)00098-1.
UR - http://dx.doi.org/10.1016/S1476-9271(02)00098-1
ID - ref2
ER -
TY - STD
TI - Brown D, Krishnamurthy N, Sjölander K. Automated protein subfamily identification and classification. PLoS Comput. Biol. 2007; 3(8). doi:http://dx.doi.org/10.1371/journal.pcbi.0030160.
UR - http://dx.doi.org/10.1371/journal.pcbi.0030160
ID - ref3
ER -
TY - STD
TI - Liu B, Gibbons T, Ghodsi M, Treangen T, Pop M. Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences. BMC Genomics. 2011; 12(2):1–10. doi:http://dx.doi.org/10.1186/1471-2164-12-S2-S4.
UR - http://dx.doi.org/10.1186/1471-2164-12-S2-S4
ID - ref4
ER -
TY - STD
TI - Segata N, Waldron L, Ballarini A, Narasimhan V, Jousson O, Huttenhower C. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat. Methods. 2012; 9(8):811–4. doi:http://dx.doi.org/10.1038/nmeth.2066.
UR - http://dx.doi.org/10.1038/nmeth.2066
ID - ref5
ER -
TY - STD
TI - Nguyen N, Mirarab S, Liu B, Pop M, Warnow T. TIPP: taxonomic identification and phylogenetic profiling. Bioinformatics. 2014; 30(24):3548–3555. doi:http://dx.doi.org/10.1093/bioinformatics/btu721.
UR - http://dx.doi.org/10.1093/bioinformatics/btu721
ID - ref6
ER -
TY - STD
TI - Nayfach S, Bradley PH, Wyman SK, Laurent TJ, Williams A, Eisen JA, Pollard KS, Sharpton TJ. Automated and accurate estimation of gene family abundance from shotgun metagenomes. PLoS Comput Biol. 2015; 11(11):1004573. doi:http://dx.doi.org/10.1371/journal.pcbi.1004573.
UR - http://dx.doi.org/10.1371/journal.pcbi.1004573
ID - ref7
ER -
TY - STD
TI - Rost B. Twilight zone of protein sequence alignments. Protein Eng. 1999; 12(2):85–94. doi:http://dx.doi.org/10.1093/protein/12.2.85.
UR - http://dx.doi.org/10.1093/protein/12.2.85
ID - ref8
ER -
TY - STD
TI - Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990; 215(3):403–10. doi:http://dx.doi.org/10.1016/S0022-2836(05)80360-2.
UR - http://dx.doi.org/10.1016/S0022-2836(05)80360-2
ID - ref9
ER -
TY - STD
TI - Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997; 25(17):3389–3402. doi:http://dx.doi.org/10.1093/nar/25.17.3389, http://nar.oxfordjournals.org/content/25/17/3389.full.pdf+html. Accessed 21 Sept 2016.
UR - http://nar.oxfordjournals.org/content/25/17/3389.full.pdf+html
ID - ref10
ER -
TY - STD
TI - Söding J. Protein homology detection by HMM-HMM comparison. Bioinformatics. 2005; 21(7):951–960. doi:http://dx.doi.org/10.1093/bioinformatics/bti125.
UR - http://dx.doi.org/10.1093/bioinformatics/bti125
ID - ref11
ER -
TY - JOUR
AU - Eddy, S. R.
PY - 1998
DA - 1998//
TI - Profile hidden Markov models
JO - Bioinformatics
VL - 14
UR - https://doi.org/10.1093/bioinformatics/14.9.755
DO - 10.1093/bioinformatics/14.9.755
ID - Eddy1998
ER -
TY - STD
TI - Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer ELL, Tate J, Punta M. Pfam: the protein families database. Nucleic Acids Res. 2014; 42(D1):222–230. doi:http://dx.doi.org/10.1093/nar/gkt1223, http://nar.oxfordjournals.org/content/42/D1/D222.full.pdf+html. Accessed 21 Sept 2016.
UR - http://nar.oxfordjournals.org/content/42/D1/D222.full.pdf+html
ID - ref13
ER -
TY - JOUR
AU - Finn, R. D.
AU - Clements, J.
AU - Eddy, S. R.
PY - 2011
DA - 2011//
TI - HMMER web server: interactive sequence similarity searching
JO - Nucleic Acids Res
VL - 39
UR - https://doi.org/10.1093/nar/gkr367
DO - 10.1093/nar/gkr367
ID - Finn2011
ER -
TY - JOUR
AU - Eddy, S. R.
PY - 2009
DA - 2009//
TI - A new generation of homology search tools based on probabilistic inference
JO - Genome Inform
VL - 23
ID - Eddy2009
ER -
TY - STD
TI - Skewes-Cox P, Sharpton T, Pollard K, DeRisi J. Profile hidden Markov models for the detection of viruses within metagenomic sequence data. PLOS ONE. 2014; 9. doi:http://dx.doi.org/10.1371/journal.pone.0105067.
UR - http://dx.doi.org/10.1371/journal.pone.0105067
ID - ref16
ER -
TY - STD
TI - Remmert M, Biegert A, Hauser A, Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods. 2012; 9(2):173–175. doi:http://dx.doi.org/10.1038/nmeth.1818.
UR - http://dx.doi.org/10.1038/nmeth.1818
ID - ref17
ER -
TY - STD
TI - Söding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005; 33(Web Server issue):244–8. doi:http://dx.doi.org/10.1093/nar/gki408.
UR - http://dx.doi.org/10.1093/nar/gki408
ID - ref18
ER -
TY - STD
TI - Perdigao N, Heinrich J, Stolte C, Sabir KS, Buckley MJ, Tabor B, Signal B, Gloss BS, Hammang CJ, Rost B, Schafferhans A, O’Donoghue SI. Unexpected features of the dark proteome. Proc Natl Acad Sci USA. 2015; 112(52):15898–15903. doi:http://dx.doi.org/10.1073/pnas.1508380112.
UR - http://dx.doi.org/10.1073/pnas.1508380112
ID - ref19
ER -
TY - STD
TI - Qian B, Goldstein RA. Detecting distant homologs using phylogenetic tree-based HMMs. Proteins: Structure, Function and Genetics. 2003; 52(3):446–453. doi:http://dx.doi.org/10.1002/prot.10373.
UR - http://dx.doi.org/10.1002/prot.10373
ID - ref20
ER -
TY - STD
TI - Mitchison G, Durbin R. Tree-based maximal likelihood substitution matrices and hidden Markov models. J Mol Evol. 1995; 41(6):1139–1151.
doi:http://dx.doi.org/10.1007/BF00173195.
UR - http://dx.doi.org/10.1007/BF00173195
ID - ref21
ER -
TY - STD
TI - Mitchison GJ. A probabilistic treatment of phylogeny and sequence alignment. J Mol Evol. 1999; 49(1):11–22. doi:http://dx.doi.org/10.1007/PL00006524.
UR - http://dx.doi.org/10.1007/PL00006524
ID - ref22
ER -
TY - STD
TI - Afrasiabi C, Samad B, Dineen D, Meacham C, Sjölander K. The PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification. Nucleic Acids Res. 2013; 41(Web Server issue):1–7. doi:http://dx.doi.org/10.1093/nar/gkt399.
UR - http://dx.doi.org/10.1093/nar/gkt399
ID - ref23
ER -
TY - STD
TI - Krishnamurthy N, Brown D, Sjölander K. Flowerpower: clustering proteins into domain architecture classes for phylogenomic inference of protein function. BMC Evol Biol. 2007; 7(1):1–11. doi:http://dx.doi.org/10.1186/1471-2148-7-S1-S12.
UR - http://dx.doi.org/10.1186/1471-2148-7-S1-S12
ID - ref24
ER -
TY - STD
TI - Qian B, Goldstein RA. Performance of an iterated T-HMM for homology detection. Bioinformatics. 2004; 20(14):2175–2180. doi:http://dx.doi.org/10.1093/bioinformatics/bth181.
UR - http://dx.doi.org/10.1093/bioinformatics/bth181
ID - ref25
ER -
TY - JOUR
AU - Mirarab, S.
AU - Nguyen, N.
AU - Warnow, T.
PY - 2012
DA - 2012//
TI - SEPP: SATé-enabled phylogenetic placement
JO - Proceedings of the Pac Symp Biocomput.
VL - 17
ID - Mirarab2012
ER -
TY - STD
TI - Nguyen N, Mirarab S, Kumar K, Warnow T. Ultra-large alignments using phylogeny-aware profiles. Genome Biol. 2015; 16(1):124. doi:http://dx.doi.org/10.1186/s13059-015-0688-z.
UR - http://dx.doi.org/10.1186/s13059-015-0688-z
ID - ref27
ER -
TY - STD
TI - Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PLOS ONE. 2010; 5(3):9490. doi:http://dx.doi.org/10.1371/journal.pone.0009490.
UR - http://dx.doi.org/10.1371/journal.pone.0009490
ID - ref28
ER -
TY - STD
TI - Nguyen N. HIPPI README.
2016. https://github.com/smirarab/sepp/blob/master/README.HIPPI.md. Accessed 26 July 2016.
UR - https://github.com/smirarab/sepp/blob/master/README.HIPPI.md
ID - ref29
ER -
TY - STD
TI - The UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 2015; 43(D1):204–212. doi:http://dx.doi.org/10.1093/nar/gku989, http://nar.oxfordjournals.org/content/43/D1/D204.full.pdf+html. Accessed 21 Sept 2016.
UR - http://nar.oxfordjournals.org/content/43/D1/D204.full.pdf+html
ID - ref30
ER -
TY - STD
TI - Xu Q, Dunbrack RL. Assignment of protein sequences to existing domain and family classification systems: Pfam and the PDB. Bioinformatics. 2012; 28(21):2763–2772. doi:http://dx.doi.org/10.1093/bioinformatics/bts533.
UR - http://dx.doi.org/10.1093/bioinformatics/bts533
ID - ref31
ER -
TY - STD
TI - Sunagawa S, Mende DR, Zeller G, Izquierdo-Carrasco F, Berger SA, Kultima JR, Coelho LP, Arumugam M, Tap J, Nielsen HB, Rasmussen S, Brunak S, Pedersen O, Guarner F, de Vos WM, Wang J, Li J, Doré J, Ehrlich SD, Stamatakis A, Bork P. Metagenomic species profiling using universal phylogenetic marker genes. Nat Methods. 2013; 10:1196–1199. doi:http://dx.doi.org/10.1038/nmeth.2693.
UR - http://dx.doi.org/10.1038/nmeth.2693
ID - ref32
ER -
TY - STD
TI - Nguyen N. HIPPI dataset. 2016. https://doi.org/10.13012/B2IDB-6795126_V1. Accessed 8 Aug 2016.
UR - https://doi.org/10.13012/B2IDB-6795126_V1
ID - ref33
ER -