Software-based analysis of bacteriophage genomes, physical ends, and packaging strategies
- Bryan D. Merrill†1,
- Andy T. Ward†1,
- Julianne H. Grose1 and
- Sandra Hope1
© The Author(s). 2016
- Received: 25 April 2015
- Accepted: 13 August 2016
- Published: 26 August 2016
Phage genome analysis is a rapidly growing field. Recurrent obstacles include software access and usability, as well as genome sequences that vary in sequence orientation and/or start position. Here we describe modifications to the phage comparative genomics software program, Phamerator, provide public access to the code, and include instructions for creating custom Phamerator databases. We further report genomic analysis techniques to determine phage packaging strategies and identification of the physical ends of phage genomes.
The original Phamerator code can be successfully modified and custom databases can be generated using the instructions we provide. Results of genome map comparisons within a custom database reveal obstacles in performing the comparisons if a published genome has an incorrect complementarity or an incorrect location of the first base of the genome, which are common issues in GenBank-downloaded sequence files. To address these issues, we review phage packaging strategies and provide results that demonstrate identification of the genome start location and orientation using raw sequencing data and software programs such as PAUSE and Consed to establish the location of the physical ends of the genome. These results include determination of exact direct terminal repeats (DTRs) or cohesive ends, or whether phages may use a headful packaging strategy. Phylogenetic analysis using ClustalO and phamily circles in Phamerator demonstrates that the large terminase gene can be used to identify the phage packaging strategy and thereby aid in identifying the physical ends of the genome.
Using available online code, the Phamerator program can be customized and utilized to generate databases with individually selected genomes. These databases can then provide fruitful information in the comparative analysis of phages. Researchers can identify packaging strategies and physical ends of phage genomes using raw data from high-throughput sequencing in conjunction with phylogenetic analyses of large terminase proteins and the use of custom Phamerator databases. We promote publication of phage genomes in an orientation consistent with the physical structure of the phage chromosome and provide guidance for determining this structure.
- Phylogenetic tree
- DNA packaging
- Comparative genomics
Phamerator is a computer program written to analyze the many Mycobacteriophages isolated and sequenced through the Howard Hughes Medical Institute (HHMI) Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science (SEA-PHAGES) program. Phamerator is popular among the large groups studying Mycobacteriophages and Bacillus phages [5, 6] and is steadily gaining traction in other areas of phage research [7–11].
Herein we describe software-based methods to study phage genomes, determine phage genome ends, and identify phage DNA packaging strategies. There are several limitations to the original version of Phamerator that we sought to overcome. First, as originally written, Phamerator could only read existing databases hosted on remote servers and could not create custom databases to be explored on local computers. Second, no detailed documentation existed to describe how to make custom databases or use features other than the graphical user interface. The goal of this work was to enhance the existing code, and to make Phamerator accessible to all phage researchers by providing instructions on how to build and use a custom database in Phamerator. In addition, we describe best practices when preparing phage genomes for publication and effective downstream analysis using Phamerator and other programs. These contributions enable phage researchers to use this powerful program and provide a basis for more consistent deposition of phage genomes into NCBI that will facilitate downstream analyses.
Phamerator computer coding and database setup
Phamerator is written in Python and runs on the Ubuntu Linux operating system. Ubuntu can be installed on any computer as a virtual machine through programs like VirtualBox (https://www.virtualbox.org). Phamerator compiles Structured Query Language (SQL) databases of bacteriophage genomes using GenBank-formatted files. Phamerator compares all gene products in the database using ClustalW or ClustalO and BLASTP and then groups these gene products into “phamilies” (phams) based on percent identity or BLASTP expect value (E-value) with other gene products in the pham. Phamerator also prepares linear genome maps for gene order and content (genome synteny) comparison, and includes nucleotide homology output. Researchers can manually assign phages into different clusters within a database, such as groups based on genome similarity [11, 16, 17], genera, or host preference.
Phamerator database setup requires four main processing steps. In the first and second steps, Phamerator aligns all possible pairs of gene products in the database using both BLASTP and ClustalW and saves all statistically significant results. In the third step, the user specifies an E-value and a percent identity used to group proteins into phamilies. Other versions of Phamerator have been modified to instead use kClust to assign phamilies [19, 20] and run natively on Windows, Linux, and macOS. These phamilies can help identify homologous gene products. In the final step, Phamerator identifies conserved domains in every protein in the database using the Conserved Domain Database (CDD). These tools provide powerful analyses to study gene synteny and conservation.
Phamerator reads the phage data stored in the SQL database and displays it in a graphical user interface. Phamerator has two main graphical outputs: linear genome maps and phamily circles. The features and purposes of these graphics are described in the original Phamerator publication.
Main features of Phamerator for comparative phage genomics
Phage genome orientation
Effective Phamerator analysis of similar phage genomes requires consistency in the genome orientation and the location of the first base. As phage genomes are published it is important that the orientation and complementarity are intentional, reflect physical properties of the phage chromosome, and are consistent with well-characterized phages. Phage genomes are currently deposited with a wide variety in the base one calls for even very similar phages. Thus, one crucial step in preparing phage genomes from GenBank files for Phamerator and other analyses is to rearrange genomes that are oriented incorrectly so that genome content and gene order may be easily compared. Proper identification of physical ends and phage packaging strategies allows researchers to arrange phage genomes correctly before publishing them.
Although wet lab methods for determining phage ends and packaging strategies have been described previously, these experiments consume time and resources and may be inconclusive. Software-based methods using raw next-generation sequencing data provide insight into physical ends and packaging strategies. These data can guide, clarify, or potentially replace wet lab experiments, especially when working with large datasets.
Modifications to the original Phamerator code fix errors and allow for continued compatibility
Phamerator Features and Modifications: Updates provided in new version
- Works with Biopython 1.64: Continued compatibility with future Biopython versions
Building the Phamerator database
- Added prompts for username, password, server location, and database name at each step: The new prompts replace what was once written directly into the code
- ClustalO may be used instead of ClustalW to perform alignments: ClustalO is newer and is faster
- Fixed script displaying the progress of BLAST and ClustalW: Helps users estimate when these jobs will finish
Pham and cluster tables
- Column listing conserved domains for each pham was added to these tables: Used to quickly determine putative functions of proteins in a pham
Domain and pham labels in genome maps
- Added whitespace to the right of these maps: Labels near the end of these maps are now visible
Delete BLAST and ClustalW scores
- Users are prompted to delete or keep all scores when adding or removing phages: Scores can be deleted following major modifications to the database
The Graphical User Interface (GUI) of Phamerator is run on various operating systems with the aid of virtual software
The graphical interface of Phamerator is widely used among universities involved in the SEA-PHAGES program and is growing in popularity among other phage researchers as well. SEA-PHAGES members can download a pre-configured Ubuntu virtual hard drive file (www.hhmi.org/seawiki) and gain access to the Mycobacteriophage database managed by Graham Hatfull at the University of Pittsburgh and Steve Cresawn at James Madison University. The virtual hard drive can be run using VirtualBox (www.virtualbox.org) or other virtualization software. At BYU, Phamerator is accessible in the Windows environment by forwarding an X11 window over SSH from a Linux virtual machine (VM) running on a server. This always-on VM keeps local computers fast because no resources are spent running a local VM. The server VM supports multiple users, which also saves each user the time it takes to install and manage a virtual machine. North Carolina State University (NCSU) has also successfully built its own Phamerator databases, which are currently used for teaching and research purposes. A Virtual Computing Lab at NCSU allows students to log on to an Ubuntu virtual machine from anywhere on campus and access Phamerator.
After a Phamerator database of phage genomes is compiled and processed it can be viewed and studied using the graphical user interface. Prior to our work, database setup was exclusive to the SEA-PHAGES program. The following section describes how to prepare a Phamerator database using GenBank-formatted genome sequences so that any user can prepare a custom database for analysis.
A custom Phamerator database can be generated
Phamerator has three main parts: the graphical user interface (GUI), the Python scripts, and the SQL database. The GUI is the window used to view linear genome maps, pham circles, etc. Each Python script performs a specific function such as importing phages or computing Clustal scores. The SQL database is a set of linked tables where all of the phage gene sequences, alignment scores, etc. are stored. The database must be populated with phage genomes and processed before the end-user can view the desired genomes and access the features of Phamerator.
Install Ubuntu on a computer or inside a virtual machine.
Install Phamerator and the programs it needs to run.
Create a blank MySQL database.
Insert table headers into the blank database so Phamerator knows where to store and access phage data.
Create GenBank-formatted files for recently sequenced phage genomes or retrieve phage GenBank files from NCBI. Use a program, such as DNA Master (http://cobamide2.bio.pitt.edu), to fix any formatting errors.
Import phage genome files into the SQL database.
Run Clustal comparisons on all phage gene products in the database. Each Clustal “job” compares one phage gene product against all others in the database and records significant alignments.
Run BLASTP comparisons on all phage gene products in the database. Each BLASTP “job” compares one phage gene product against all others and records significant E-values.
Run phamBuilder to group similar gene products into phamilies. Gene products are joined into a pham when they are similar to at least one other member, either by a Clustal percent identity at or above a user-defined cutoff or by a BLASTP E-value at or below a user-defined cutoff. Commonly used values are 32.5 % identity and 1e-50 E-value.
Run cddSearch to identify conserved domains in gene products in the database using the CDD.
Export the database to a single SQL file to be shared with others.
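The grouping rule in the phamBuilder step amounts to single-linkage clustering: two gene products end up in the same pham if a chain of sufficiently similar pairs connects them. The following is a minimal Python sketch of that idea, not Phamerator's own code. The function name `build_phams` and the tuple layout of `scores` are hypothetical, and the default cutoffs are simply the commonly used values quoted above.

```python
def build_phams(genes, scores, min_identity=32.5, max_evalue=1e-50):
    """Group gene products into phams by single linkage (illustrative only).

    genes  : iterable of gene product IDs
    scores : iterable of (gene_a, gene_b, percent_identity, evalue) tuples;
             use None for a score that was not recorded. Every gene named
             in scores must also appear in genes.
    """
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the root, shortening the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b, identity, evalue in scores:
        # A pair is linked if EITHER score passes its cutoff.
        similar = (identity is not None and identity >= min_identity) or (
            evalue is not None and evalue <= max_evalue
        )
        if similar:
            parent[find(a)] = find(b)

    phams = {}
    for g in parent:
        phams.setdefault(find(g), set()).add(g)
    return sorted(sorted(members) for members in phams.values())


genes = ["A_1", "B_1", "C_1", "D_1"]
scores = [
    ("A_1", "B_1", 45.0, 1e-10),  # linked by percent identity
    ("B_1", "C_1", 20.0, 1e-60),  # linked by E-value
    ("C_1", "D_1", 10.0, 1.0),    # not linked
]
print(build_phams(genes, scores))
```

In this toy input, A_1 and C_1 share a pham even though they were never scored against each other directly, because each is linked to B_1. Gene products with no passing score remain alone in their own pham.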
Detailed instructions to execute these steps have been deposited at our website, http://phagehunters.byu.edu/Phamerator and are also included as Additional file 2. The instructions describe the process in detail to assist users through the technical tasks required to set up Phamerator. For example, Phamerator is currently only available for computers running Ubuntu. In most cases, this means that Ubuntu must be installed as a virtual machine. Processing a Phamerator database requires a computer with a powerful processor. An additional 40 GB of hard drive space is needed to set up a local copy of the CDD so conserved domains can be added to gene products in Phamerator. In the instruction manual, we provide descriptions of common errors that can occur due to variations in GenBank files and include a troubleshooting section for these errors. For example, GenBank files imported into Phamerator must contain unique locus tags, a “gene” feature, and a “CDS” feature for each gene. In addition, to avoid translation errors during importing, each gene in the file must use the “Bacterial and Plant Plastid” translation table. Furthermore, genomes that are arranged incorrectly or contain genes that wrap around the genome from the end to the beginning must first be modified using a program such as DNA Master, written by Dr. Jeffrey Lawrence and available online at http://cobamide2.bio.pitt.edu.
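The GenBank requirements listed above (unique locus tags, a "gene" feature for each "CDS", and the Bacterial and Plant Plastid translation table, which is NCBI translation table 11) can be checked before import. The sketch below is a hedged illustration using only the Python standard library. It is not part of Phamerator, the function names are hypothetical, and the parser assumes the standard GenBank flat-file layout (feature keys indented 5 spaces, qualifiers indented 21 spaces) and ignores multi-line qualifier values.

```python
import re
from collections import Counter

FEATURE_RE = re.compile(r"^ {5}(\S+)\s+\S")              # e.g. "     CDS   1..50"
QUALIFIER_RE = re.compile(r'^ {21}/(\w+)=?"?([^"]*)"?')  # e.g. /locus_tag="X_1"


def parse_features(genbank_text):
    """Return [(feature_key, {qualifier: value}), ...] from the FEATURES table."""
    features, in_table = [], False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_table = True
        elif in_table and line and not line.startswith(" "):
            break  # ORIGIN or another top-level keyword ends the table
        elif in_table:
            key = FEATURE_RE.match(line)
            qual = QUALIFIER_RE.match(line)
            if key:
                features.append((key.group(1), {}))
            elif qual and features:
                features[-1][1].setdefault(qual.group(1), qual.group(2))
    return features


def check_for_import(genbank_text):
    """List problems of the kinds described in the text above."""
    features = parse_features(genbank_text)
    gene_tags = [q.get("locus_tag") for k, q in features if k == "gene"]
    cds = [q for k, q in features if k == "CDS"]
    cds_tags = [q.get("locus_tag") for q in cds]
    problems = []
    if None in gene_tags or None in cds_tags:
        problems.append("feature without a locus_tag")
    for tag, count in Counter(t for t in cds_tags if t).items():
        if count > 1:
            problems.append(f"duplicate locus_tag {tag}")
    for tag in sorted(set(t for t in cds_tags if t) - set(gene_tags)):
        problems.append(f"CDS {tag} has no matching gene feature")
    for q in cds:
        if q.get("transl_table") != "11":
            problems.append(
                f"CDS {q.get('locus_tag')} does not use translation table 11"
            )
    return problems
```

A file that passes these checks can still fail for other reasons, so the troubleshooting section of the instruction manual remains the authoritative guide.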
Publication of phage genomes without a standardized genome start location or orientation hinders analysis using comparative genomics software
Sequencing data can reveal phage DNA packaging strategy to select the genome start and orientation
Regardless of the packaging strategy or physical ends, all tailed bacteriophages (Caudovirales) end up with a linear DNA molecule packaged in the capsid of the mature virion. This genome is then injected into a new host, wherein most phage chromosomes circularize. The mechanism of circularization is dependent on the packaging strategy and the type of physical ends produced. Therefore, identification of the packaging strategy can reveal the location of the physical start of a phage genome, and sequencing data can often be analyzed to determine the packaging strategy used [23, 25, 26].
Phages that have circularly permuted DTRs due to headful packaging will always show reads that run off one end of the genome when sequenced completely. These wrap-around reads contain bases coinciding with the other end of the genome (Fig. 11). If PAUSE shows consistent read depth throughout the genome, wrap-around reads are identified by Consed, no putative exact DTR repeat regions are identified, and there are no sudden drops in coverage near the large terminase gene indicative of cohesive ends (Fig. 10), then the phage is likely circularly permuted and uses headful packaging.
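The reasoning in the preceding paragraph can be written down as a small decision rule. The sketch below is only an illustration of that logic under the assumption that each observation has already been reduced to a yes/no answer by inspecting PAUSE and Consed output. The function name and its arguments are hypothetical, and the returned strings are suggestions for follow-up, not conclusions.

```python
def suggest_packaging(uniform_depth, wraparound_reads,
                      exact_dtr_candidate, coverage_drop_near_terminase):
    """Turn four yes/no observations into a hedged packaging hypothesis."""
    if exact_dtr_candidate:
        return "exact DTRs possible; inspect the putative repeat region"
    if coverage_drop_near_terminase:
        return "cohesive ends possible; inspect the drop in coverage"
    if wraparound_reads and uniform_depth:
        return "likely circularly permuted with headful packaging"
    if not wraparound_reads:
        return "no wrap-around reads; sequencing data alone are inconclusive"
    return "inconclusive"


# Consistent depth, wrap-around reads, no DTR candidate, no coverage drop:
print(suggest_packaging(True, True, False, False))
```

The order of the checks reflects the paragraph above: headful packaging is suggested only after exact DTRs and cohesive ends have been ruled out.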
Phages rely on terminase proteins to identify replicated phage chromosomes from among the other DNA inside of the host. Terminases package phage chromosomes into phage capsids and cut concatemers into genome-sized lengths. The role of the terminase varies depending on the packaging mechanism. Therefore, terminases with similar amino acid sequences usually package DNA using similar mechanisms and create similar physical ends [22, 23]. Phylogenetic analysis has been used to gain additional insight into the packaging strategies of novel or poorly-studied phages [32–34] and is one way to predict the type of ends, including whether a phage has host ends. Analysis of large terminase proteins from phages listed in Additional file 1 indicates that large terminases with similar packaging strategies tend to clade together (Fig. 6). The clades of the phylogenetic tree correspond exactly to the cluster grouping that was assigned in Phamerator based on the phams to which each large terminase belongs (A1-F2). Casjens and Gilcrease reported packaging strategies based on phylogenetic analysis and defined 11 groups: 5’ cos (Lambda, P2); 3’ cos (HK97); headful (P22, Sf6, T4, 933W, GTA); host ends (Mu and D3112); and short DTRs (T7). Here, we propose five additional groups based on phylogenetic and Phamerator analysis: short DTRs (N4, C-st); headful (phiPLPE, phiKZ); and long DTRs (SPO1).
There are several considerations in making a phylogenetic tree containing large terminases. Although large terminases are well-conserved and are even similar among phages that infect different hosts, the overall diversity of large terminases is often too great to reliably analyze them all in one phylogenetic tree. This diversity causes instability of the branches and nodes as additional sequences are added. When adding a large terminase protein to a phylogenetic tree, some stability can be maintained by also including several BLAST hits that are similar to the terminase being queried, especially those hits that come from phages with experimentally determined packaging strategies.
Read pileups, wrap-around reads, changes in coverage density, and terminase phylogenies can guide researchers in making the appropriate “base one” call prior to publication or in designing wet lab experiments to verify the phage ends and packaging strategies. Exact DTRs in phages can be annotated and these genomes are generally published with one repeat sequence on each end.
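Changes in coverage density can be screened for programmatically before inspecting the reads by eye. The following sketch flags runs of positions whose read depth departs from the genome-wide median. It is an assumption-laden illustration and not the method used by PAUSE or Consed: the function name is hypothetical, the per-base depth list is assumed to come from whatever tool produced the assembly, and the thresholds (1.8 times and 0.5 times the median) are arbitrary starting points. Elevated runs may merit inspection as repeat candidates and depressed runs as possible ends, but neither is proof.

```python
from statistics import median


def flag_depth_changes(depth, high=1.8, low=0.5):
    """Return (start, end, kind) runs, 1-based inclusive, where per-base
    depth is >= high * median ("high") or <= low * median ("low")."""
    base = median(depth)
    if base == 0:
        return []  # no usable baseline
    runs, current = [], None
    for pos, d in enumerate(depth, start=1):
        if d >= high * base:
            kind = "high"
        elif d <= low * base:
            kind = "low"
        else:
            kind = None
        if current and kind is not None and current[2] == kind:
            current[1] = pos  # extend the open run
        else:
            if current:
                runs.append(tuple(current))
            current = [pos, pos, kind] if kind else None
    if current:
        runs.append(tuple(current))
    return runs


# Toy depth profile: a short elevated stretch and a short dip.
depth = [100] * 10 + [210] * 3 + [100] * 10 + [30] * 2 + [100] * 5
print(flag_depth_changes(depth))
```

Real coverage is noisy, so smoothing the depth over a window or requiring a minimum run length would likely be needed before applying this to actual data.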
The complementarity of the genome is considered when making a base one call for phages that have exact DTRs, have host ends, or use protein-primed replication. For phages with cohesive ends, 5’ overhangs are placed at the beginning of the published genome, and 3’ overhangs are placed at the end. Base one calls for circularly permuted phages are more complicated because software-based methods cannot yet identify the pac sequence or pac fragment by looking at changes in coverage. Wet lab methods can occasionally identify the pac fragment as a piece of DNA that spans between the origin of replication and the site where the terminase makes the first cut. Because the large terminase protein is responsible for identifying and cutting at the pac site, the sequence of the pac site and the sequence of the large terminase protein often lie very close to each other, with the pac site often just upstream of the large terminase protein. We typically determine base one calls in circularly permuted phages at or just upstream of the large terminase gene with the large terminase gene in the forward direction. Standardizing base one call methods for all phage types, especially for circularly permuted phages, will facilitate comparison of phage genomes and easier identification of homologs.
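For a circularly permuted assembly, applying a base one call is a rotation of the sequence, optionally followed by taking the reverse complement so that the large terminase gene reads in the forward direction. The sketch below shows the sequence arithmetic only. The function names are hypothetical, it handles unambiguous A/C/G/T bases only, and it does not update annotation coordinates, which would still need to be adjusted in a program such as DNA Master.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def set_base_one(seq, new_start, flip=False):
    """Rotate a circular or circularly permuted sequence.

    new_start is the 1-based position, in the ORIGINAL coordinates, of the
    base that should become base one. With flip=True the result is on the
    opposite strand and begins with the complement of that base.
    """
    n = len(seq)
    if not 1 <= new_start <= n:
        raise ValueError("new_start must be between 1 and len(seq)")
    i = new_start - 1
    if not flip:
        return seq[i:] + seq[:i]
    rc = reverse_complement(seq)
    j = n - 1 - i  # index of the chosen base on the opposite strand
    return rc[j:] + rc[:j]


print(set_base_one("AAACGT", 4))             # CGTAAA
print(set_base_one("AAACGT", 4, flip=True))  # GTTTAC
```

This operation is only appropriate when the assembly is genuinely circular or circularly permuted. Genomes with defined physical ends, such as exact DTRs or cohesive ends, should instead be arranged to match those ends.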
Although the analyses we describe of high-throughput data can give a good indication of the packaging strategy and the physical ends of the phage chromosome, the data may not always provide a definitive answer. For instance, at least two packaging mechanisms are known to produce linear chromosomes with no wrap-around sequences, exemplified by phage Mu and phage phi29. Such packaging strategies may be difficult to distinguish from phages with cohesive ends that do not generate artificially circular sequences. Phage Mu inserts copies of its DNA into the host chromosome via replicative transposition. When Mu DNA is excised from the host chromosome prior to being packaged, segments of the host chromosome become the ends of the linear phage DNA. Each segment of DNA packaged into a progeny phage contains different ends since they all came from different parts of the bacterial chromosome. These chromosomes are circularized but are not believed to produce artificially circular genomes when sequenced. Phages like Bacillus phage phi29 also circularize in the host but have a protein covalently linked to each end that serves to prime DNA replication. Phages with host ends or terminal proteins do not generate artificially circular sequences because there is no repeated sequence at the phage ends. Raw sequencing data may rule out cohesive ends, headful packaging, and exact DTRs without confirming whether a phage has host ends or covalent terminal proteins. Wet lab experiments, similarity to previously sequenced and characterized phages, or comparison of large terminase proteins are necessary to verify whether phages have host ends or covalent terminal proteins.
A custom Phamerator database can be used to identify packaging strategies based on the large terminase protein
Our modifications to Phamerator combined with new documentation for setting up custom databases and troubleshooting errors make this powerful software widely available and user-friendly. We plan to release additional updates to Phamerator that will add new features and resolve persistent problems, including: display of pham circle relationships using parameters identical to those used to build phamilies, display of pham tooltips when the map alignment is changed, display of pham circles when no phages are assigned to the singleton cluster, and display of phage tRNAs on the linear genome map.
Using the techniques we described, high-throughput sequencing data can be used to determine packaging strategies and physical ends of phage chromosomes. Understanding the principles of phage genome packaging and utilizing phage genome comparison software will lead to informed decisions when publishing phage genomes, standardizing phage genome submission. Because phage genomes are being added to GenBank at a rapid rate, publishing them in a consistent manner will allow straightforward phage characterization and comparison using Phamerator and other programs.
Accession numbers for the 43 phage genomes and large terminase proteins used in this paper are listed in Additional file 1. We downloaded bacteriophage genomes in GenBank format from NCBI and used them to build a Phamerator database according to the instructions found in Additional file 2. Phage gene products in these genomes were compiled into a pham if they shared a BLASTP E-value of 1e-35 or less, or at least 32.5 % identity as computed by ClustalO, with at least one other gene product in the pham. The phylogenetic tree of 43 large terminase proteins was computed by the neighbor-joining method in ClustalX with a bootstrap value of 1000 and was displayed using Dendroscope.
The authors gratefully acknowledge Dr. Steven Cresawn for writing Phamerator initially, and for providing feedback as we modified Phamerator. We appreciate the assistance of Byron Doyle and Scott Carlson of BYU Life Sciences IT for making changes to the Phamerator code and managing the GitHub repository. We also appreciate the knowledge and assistance of Dan Russell and Dr. Graham Hatfull at the University of Pittsburgh, and are grateful for the support of the HHMI SEA-PHAGES program.
Funding for this work was provided by the Microbiology and Molecular Biology Department at Brigham Young University and by generous donations through LDS Philanthropies.
Availability of data and materials
The code for the Phamerator software is available at the GitHub website, https://github.com/byuphamerator/phamerator-dev/. Our instructions to set up the Phamerator software and to create a Phamerator database are available at our website at http://phagehunters.byu.edu/Phamerator. The Nexus file containing the phylogenetic tree of the large terminases (Fig. 6) is available in the TreeBase repository at https://treebase.org/treebase-web/home.html. The custom Phamerator SQL database we constructed to generate the phamily circles of large terminases (Fig. 13) is available as an example Phamerator database, “terminasephages.sql,” on our website at http://phagehunters.byu.edu/Phamerator.
BDM, JHG, and SH were responsible for the design and coordination of the research. BDM drafted the manuscript and wrote the Phamerator instructions. BDM and ATW made further modifications to the Phamerator code, edited the Phamerator instructions, gathered and analyzed data, and generated figures. ATW created the Phamerator database used in this analysis. SH and JHG edited extensively. All authors contributed to editing of the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethical approval and consent to participate
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- McAuliffe O, Ross RP, Fitzgerald GF. The new phage biology: from genomics to applications. In: McGrath S, Van Sinderen D, editors. Bacteriophage: Genetics and Molecular Biology. Norfolk, England: Caister Academic Press; 2007.
- Jordan TC, Burnett SH, Carson S, Caruso SM, Clase K, DeJong RJ, Dennehy JJ, Denver DR, Dunbar D, Elgin SC, Findley AM, Gissendanner CR, Golebiewska UP, Guild N, Hartzog GA, Grillo WH, Hollowell GP, Hughes LE, Johnson A, King RA, Lewis LO, Li W, Rosenzweig F, Rubin MR, Saha MS, Sandoz J, Shaffer CD, Taylor B, Temple L, Vazquez E, et al. A broadly implementable research course in phage discovery and genomics for first-year undergraduate students. mBio. 2014;5(1):e01051–01013.
- Cresawn SG, Bogel M, Day N, Jacobs-Sera D, Hendrix RW, Hatfull GF. Phamerator: a bioinformatic tool for comparative bacteriophage genomics. BMC Bioinformatics. 2011;12:395.
- Jacobs-Sera D, Marinelli LJ, Bowman C, Broussard GW, Guerrero Bustamante C, Boyle MM, Petrova ZO, Dedrick RM, Pope WH, SEA-PHAGES Program A, Modlin RL, Hendrix RW, Hatfull GF. On the nature of mycobacteriophage diversity and host preference. Virology. 2012;434(2):187–201.
- Lorenz L, Lins B, Barrett J, Montgomery A, Trapani S, Schindler A, Christie GE, Cresawn SG, Temple L. Genomic characterization of six novel Bacillus pumilus bacteriophages. Virology. 2013;444(1–2):374–83.
- Grose JH, Belnap DM, Jensen JD, Mathis AD, Prince JT, Merrill B, Burnett SH, Breakwell DP. The genomes, proteomes and structure of three novel phages that infect the Bacillus cereus group and carry putative virulence factors. J Virol. 2014;88(20):11846–60.
- Smith MC, Hendrix RW, Dedrick R, Mitchell K, Ko CC, Russell D, Bell E, Gregory M, Bibb MJ, Pethick F, Jacobs-Sera D, Herron P, Buttner MJ, Hatfull GF. Evolutionary relationships among actinophages and a putative adaptation for growth in Streptomyces spp. J Bacteriol. 2013;195(21):4924–35.
- Merrill BD, Grose JH, Breakwell DP, Burnett SH. Characterization of Paenibacillus larvae bacteriophages and their genomic relationships to Firmicute bacteriophages. BMC Genomics. 2014;15(1):745.
- Marinelli LJ, Fitz-Gibbon S, Hayes C, Bowman C, Inkeles M, Loncaric A, Russell DA, Jacobs-Sera D, Cokus S, Pellegrini M, Kim J, Miller JF, Hatfull GF, Modlin RL. Propionibacterium acnes bacteriophages display limited genetic diversity and broad killing activity against bacterial skin isolates. mBio. 2012;3(5):e00279–12.
- Sencilo A, Jacobs-Sera D, Russell DA, Ko CC, Bowman CA, Atanasova NS, Osterlund E, Oksanen HM, Bamford DH, Hatfull GF, Roine E, Hendrix RW. Snapshot of haloarchaeal tailed virus genomes. RNA Biol. 2013;10(5):803–16.
- Grose JH, Casjens SR. Understanding the enormous diversity of bacteriophages: the tailed phages that infect the bacterial family Enterobacteriaceae. Virology. 2014;468–470:421–43.
- Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2014;42(Database issue):D32–7.
- Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23(21):2947–8.
- Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li WZ, Lopez R, McWilliam H, Remmert M, Soding J, Thompson JD, Higgins DG. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.
- Hatfull GF, Jacobs-Sera D, Lawrence JG, Pope WH, Russell DA, Ko CC, Weber RJ, Patel MC, Germane KL, Edgar RH, Hoyte NN, Bowman CA, Tantoco AT, Paladin EC, Myers MS, Smith AL, Grace MS, Pham TT, O’Brien MB, Vogelsberger AM, Hryckowian AJ, Wynalek JL, Donis-Keller H, Bogel MW, Peebles CL, Cresawn SG, Hendrix RW. Comparative genomic analysis of 60 Mycobacteriophage genomes: genome clustering, gene acquisition, and gene size. J Mol Biol. 2010;397(1):119–43.
- Grose JH, Jensen GL, Burnett SH, Breakwell DP. Genomic comparison of 93 Bacillus phages reveals 12 clusters, 14 singletons and remarkable diversity. BMC Genomics. 2014;15(1):855.
- King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ. Virus taxonomy: Classification and nomenclature of viruses: Ninth report of the International Committee on Taxonomy of Viruses. London: Academic Press; 2012.
- Lamine JG, DeJong RJ, Nelesen SM. PhamDB: a web-based application for building Phamerator databases. Bioinformatics. 2016;32(13):2026–8.
- Hauser M, Mayer CE, Söding J. kClust: fast and sensitive clustering of large protein sequence databases. BMC Bioinformatics. 2013;14(1):1–12.
- Marchler-Bauer A, Zheng CJ, Chitsaz F, Derbyshire MK, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Lanczycki CJ, Lu F, Lu SN, Marchler GH, Song JS, Thanki N, Yamashita RA, Zhang DC, Bryant SH. CDD: conserved domains and protein three-dimensional structure. Nucleic Acids Res. 2013;41(D1):D348–52.
- Casjens SR, Gilcrease EB. Determining DNA packaging strategy by analysis of the termini of the chromosomes in tailed-bacteriophage virions. Methods Mol Biol. 2009;502:91–111.
- Li S, Fan H, An X, Fan H, Jiang H, Chen Y, Tong Y. Scrutinizing virus genome termini by high-throughput sequencing. PLoS One. 2014;9(1):e85806.
- Darling ACE, Mau B, Blattner FR, Perna NT. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14(7):1394–403.
- Ritz MP, Perl AL, Colquhoun JM, Chamakura KR, Kuty Everett GF. Complete genome of Bacillus subtilis myophage CampHawk. Genome Announc. 2013;1(6):e00984–13.
- DeCrescenzo AJ, Ritter MA, Chamakura KR, Kuty Everett GF. Complete genome of Bacillus megaterium siphophage Slash. Genome Announc. 2013;1(6):e00862–13.
- Chung Y-B, Nardone C, Hinkle DC. Bacteriophage T7 DNA packaging. J Mol Biol. 1990;216(4):939–48.
- Zhang X, Studier FW. Multiple Roles of T7 RNA Polymerase and T7 Lysozyme During Bacteriophage T7 Infection. J Mol Biol. 2004;340(4):707–30.
- Dunn JJ, Studier FW, Gottesman M. Complete nucleotide sequence of bacteriophage T7 DNA and the locations of T7 genetic elements. J Mol Biol. 1983;166(4):477–535.
- Gordon D, Green P. Consed: a graphical editor for next-generation sequencing. Bioinformatics. 2013;29(22):2936–7.
- Adams MB, Hayden M, Casjens S. On the sequential packaging of bacteriophage P22 DNA. J Virol. 1983;46(2):673–7.
- Dorscht J, Klumpp J, Bielmann R, Schmelcher M, Born Y, Zimmer M, Calendar R, Loessner MJ. Comparative genome analysis of Listeria bacteriophages reveals extensive mosaicism, programmed translational frameshifting, and a novel prophage insertion site. J Bacteriol. 2009;191(23):7206–15.
- Fouts DE, Rasko DA, Cer RZ, Jiang LX, Fedorova NB, Shvartsbeyn A, Vamathevan JJ, Tallon L, Althoff R, Arbogast TS, Fadrosh DW, Read TD, Gill SR. Sequencing Bacillus anthracis typing phages Gamma and Cherry reveals a common ancestry. J Bacteriol. 2006;188(9):3402–8.
- Casjens SR, Gilcrease EB, Winn-Stapley DA, Schicklmaier P, Schmieger H, Pedulla ML, Ford ME, Houtz JM, Hatfull GF, Hendrix RW. The generalized transducing Salmonella bacteriophage ES18: complete genome sequence and DNA packaging strategy. J Bacteriol. 2005;187(3):1091–104.
- The GenBank Submissions Handbook [http://www.ncbi.nlm.nih.gov/books/NBK51157/]. Accessed 19 Aug 2016.
- Harshey RM. The Mu story: how a maverick phage moved the field forward. Mob DNA. 2012;3(1):21.
- Harshey RM, Bukhari AI. Infecting bacteriophage Mu DNA forms a circular DNA-protein complex. J Mol Biol. 1983;167(2):427–41.
- Ortin J, Vinuela E, Salas M. DNA-protein complex in circular DNA from phage phi29. Nature New Biol. 1971;234(52):275.
- Huson DH, Scornavacca C. Dendroscope 3: an interactive tool for rooted phylogenetic trees and networks. Syst Biol. 2012;61(6):1061–7.
- Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 2013;30(12):2725–9.
- Bravo A, Alonso JC, Trautner TA. Functional analysis of the Bacillus subtilis bacteriophage SPP1 pac site. Nucleic Acids Res. 1990;18(10):2881–6.
- Chai SH, Kruft V, Alonso JC. Analysis of the Bacillus subtilis bacteriophages SPP1 and SF6 gene 1 product: a protein involved in the initiation of headful packaging. Virology. 1994;202(2):930–9.