In this study we analyzed a large set of poxvirus genes to identify those experiencing adaptive molecular evolution. Of 235 genes in the Cowpox virus (CPXV) genome, 175 were analyzed using Phylogenetic Analysis by Maximum Likelihood (PAML), followed by Bayes empirical Bayes analysis to determine the probability of each codon falling into one of the three possible site classes: purifying, neutral or diversifying selection. We identified 79 genes under diversifying selection, representing 45% of the analyzed genes in the genome. Of those, 25 (14% of genes analyzed) were identified with both models, indicating high confidence. Thus we identify diversifying selection as an important mechanism of evolution in poxviruses.
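Detection of a gene "with both models" presupposes a significance test for each model. In PAML site-model analyses this is conventionally a likelihood ratio test between a null model and an alternative that adds a class of sites with ω > 1. The following is a minimal sketch of such a test, assuming nested comparisons with 2 degrees of freedom (as is standard for the M1a vs. M2a and M7 vs. M8 pairs). The function name and the log-likelihood values are hypothetical and are not taken from this study.

```python
import math
from typing import Tuple


def lrt_two_df(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models differing by 2 parameters.

    Returns the test statistic 2*(lnL_alt - lnL_null) and its p-value
    under a chi-square distribution with 2 degrees of freedom.
    """
    # The alternative model can only fit as well or better; clamp small
    # negative values that arise from optimizer noise.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # For df = 2 the chi-square survival function is exactly exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods for one gene family (illustration only).
stat, p = lrt_two_df(lnl_null=-2450.31, lnl_alt=-2441.87)
print(f"2*delta lnL = {stat:.2f}, p = {p:.4g}")
```

Under this sketch, a gene would count as detected by a given model pair when the p-value falls below the chosen significance level; the level itself is an analysis choice not specified here.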
Analysis of genes spanning the CPXV genome revealed that diversifying selection is a more important mechanism of molecular evolution in the genome's terminal regions than in the central genes. This may be due to diversifying pressures applied by host interactions. Among different orthopoxvirus species and even strains, terminal regions are more diverse in both gene content and gene sequence. Many terminally located genes are host response modifiers that directly interact with components of the host immune response or cellular response to infection. Grouping the genes identified in this study into broad functional categories (Figure 3), we identified several host response modifiers and host range genes that have sites under diversifying selection. These may be sites involved in host-specific interactions, reflecting adaptation of the virus to its particular host. Previous studies have also demonstrated diversifying selection in poxvirus host response modifying genes [19–21]. The current study uses the largest dataset and so provides the most comprehensive analysis of diversifying selection in poxvirus genomes. Among the host response modifiers identified were several secreted immunomodulators (IL-18 binding protein, IL-1 beta receptor and IFN gamma receptor) that are known to contribute to virulence [22–24]. Genes that modulate the cellular response to infection were also identified, including the mitochondrial-associated apoptosis inhibitor, an inhibitor of the protein kinase R (PKR) response to double-stranded RNA, a ubiquitin ligase, and an inhibitor of Toll-like receptor signaling [25–28].
In addition to host response modifying genes, we also showed, using model M8, that genes involved in viral replication and virion structure experience diversifying selection, possibly because their protein products are packaged in the virion. A proteomics study identified 75 viral proteins in the VACV virion, 18 of which are present at abundances greater than 1% of the weight of the virion. Of those, 6 were identified in the current study, including the major core protein (A4), the most abundant protein in the virion by weight. Further, the immune response to poxvirus infection induces the formation of neutralizing antibodies to several virion membrane proteins [29, 30]. We identified 3 known major targets of neutralizing antibodies: an immature virion membrane protein (A7, CPXV-BR-154), carbonic anhydrase (D8, CPXV-BR-129) and an enveloped virion protein (B5, CPXV-BR-205). Thus diversifying selection is not limited to host response modifying genes but extends more broadly to genes whose products interact with the host, such as major antigens.
Designation of families in the VOCs database was based on a BLAST expect value of 10⁻¹⁷; thus families are likely to be composed of orthologs. Horizontal gene transfer (HGT) is recognized as a factor in the evolution of poxvirus genomes. It is possible that some of the genes within a gene family are not orthologs if they arose through multiple independent HGT events. Several lines of evidence are needed to demonstrate HGT, including phylogenetic clustering of the gene with taxa unrelated to the genome under study. The origins of most poxvirus genes are unknown, although most chordopoxvirus genes show greater similarity to eukaryotic genes than to other viral genes. Full phylogenies of each poxvirus gene family are needed to determine whether multiple HGT events have occurred within the family. Such research is underway and will be valuable in further interpretation of the current data.
Like highly conserved amino acids, variable positions under diversifying selection may indicate functionally or structurally important positions. Computational identification of amino acids under diversifying selection has revealed important functional or antigenic sites in several studies [32, 33]. One of the genes identified in this study was the interleukin-18 binding protein (IL-18BP), which was previously shown to attenuate the immune response in mice. The crystal structure of the Ectromelia virus (ECTV) IL-18 binding protein was recently determined. Bayes empirical Bayes analysis demonstrated that most residues are under purifying selection, while some are neutral and a few are under diversifying selection (Figure 4). The crystal structure and previous mutagenesis studies identified contact residues important in binding the ligand, IL-18. Most of these residues were found to be under purifying selection, supporting a requirement for conservation of these residues to maintain function. Interestingly, 2 residues shown through crystallography to be contacts in the binding interface were also identified as being under diversifying selection (D48 and I115 in CPXV-BR, E48 and L115 in ECTV). Both of these residues interact with binding site C on human IL-18. Evidence of diversifying selection at these positions may suggest a role for these residues in adaptation to the IL-18 of the specific host species of the virus.
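The per-codon classification described above can be sketched as follows. This is a minimal illustration, assuming each codon comes with three posterior probabilities (purifying, neutral, diversifying) of the kind a Bayes empirical Bayes analysis reports, and assuming a 0.95 posterior cutoff for calling a site diversifying. The cutoff, the function name and the example probabilities are hypothetical and are not values from this study.

```python
from typing import Dict, List, Tuple

SITE_CLASSES = ("purifying", "neutral", "diversifying")


def classify_codons(
    posteriors: Dict[int, Tuple[float, float, float]],
    cutoff: float = 0.95,  # assumed threshold, not taken from the study
) -> Tuple[Dict[int, str], List[int]]:
    """Assign each codon to its most probable site class.

    Also returns the codons whose posterior probability of belonging to
    the diversifying class meets or exceeds the cutoff.
    """
    labels: Dict[int, str] = {}
    flagged: List[int] = []
    for codon in sorted(posteriors):
        probs = posteriors[codon]
        # Index of the class with the highest posterior probability.
        best = max(range(len(SITE_CLASSES)), key=lambda i: probs[i])
        labels[codon] = SITE_CLASSES[best]
        if probs[2] >= cutoff:
            flagged.append(codon)
    return labels, flagged


# Hypothetical posteriors for four codons (illustration only).
example = {
    1: (0.97, 0.03, 0.00),
    2: (0.20, 0.70, 0.10),
    3: (0.01, 0.02, 0.97),
    4: (0.05, 0.35, 0.60),
}
labels, flagged = classify_codons(example)
print(labels)
print(flagged)
```

Separating the most probable class from the high-confidence list keeps borderline sites (such as codon 4 in the example) visible without reporting them as confidently diversifying.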
Another important host response modifying gene that we identified is the mitochondrial-associated apoptosis inhibitor (F1L in VACV-Cop). Apoptosis is an important cellular response to infection that serves to limit viral replication through removal of infected cells. F1L inhibits apoptosis by binding to the pro-apoptotic protein Bak [36, 37], thereby inhibiting the permeabilization of the mitochondrial membrane, a critical step in apoptosis. Interaction with Bak is via Bcl2-like homology (BH) domains that are highly divergent but nonetheless form characteristic BH domain folds. Among the poxviruses, the C-terminus (containing the BH domains) of the F1L family is highly conserved. There are 7 residues under diversifying selection located in this domain (Figure 4B). Of those, 2 residues identified in this study are located in the BH domains. BH3 and BH1, primarily located on alpha helices α2 and α5, make up the BH3 binding pocket responsible for binding Bak. One residue identified in this study (A173 in CPXV), corresponding to A144 in the F1L homolog of Modified Vaccinia Ankara (MVA), is located in the binding pocket; mutagenesis showed that substituting phenylalanine at this position increases binding affinity. M124 in CPXV (I95 in MVA) is located in α2 and therefore could be involved in ligand binding or in determining the shape of the pocket. Another 3 sites under diversifying selection are located immediately C-terminal to BH2. Overall, the capacity of the Bayes empirical Bayes analysis to identify residues known to be important in protein function, such as those in the IL-18BP and the mitochondrial-associated apoptosis inhibitor, suggests that it may be valuable to test other predicted sites across the genome for their role in protein function.
The identification of host-interacting genes in poxviruses as ones experiencing adaptive molecular evolution is consistent with the findings of several other studies identifying genes involved in the "host-pathogen arms race" or other co-evolutionary processes, a pattern seen throughout nature. Chitinase and other plant defense proteins show evidence of diversifying selection, and mutagenesis studies have confirmed the functional importance of the identified sites [32, 38]. The wsp gene of the bacterium Wolbachia, which encodes an outer membrane protein, shows evidence of diversifying selection when in a parasitic relationship with arthropods, but not in a mutualistic relationship with nematodes. In other viruses, all the major genes of HIV [40, 41] and the capsid protein of Foot-and-Mouth Disease virus (FMDV) experience diversifying selection. Importantly, the amino acids identified computationally in the FMDV capsid are known to be antigenic sites identified by monoclonal antibody escape mutants. In host species, diversifying selection has been shown in the antigen recognition sites of the major histocompatibility complex (MHC) genes [42, 43].
As has been demonstrated with the FMDV capsid protein, interaction between an antigen and the immune response can drive diversifying selection. Several studies have found evidence of diversifying selection in surface proteins on viruses, including HIV env, influenza hemagglutinin, and others [44–46]. Recent proteomics studies have identified the major and minor virion-associated proteins [16, 17]. In this study, 39% of the genes identified are associated with the virion (Figure 3). Virion structural components and virion-associated enzymes may have greater exposure to antibody responses, leading to diversifying selection.
A few small whole genomes have been analyzed for diversifying selection, showing that diversifying selection can play strikingly different roles in the molecular evolution of organisms. A study of Picornaviridae shows evidence of diversifying selection in structural proteins but not in non-structural proteins. Only a few codons in the astrovirus genome are under diversifying selection, while 9–38% of sites in human rhinoviruses experience diversifying selection. The poxvirus genome is significantly larger than any other viral genome analyzed by this method, and the results indicate a more important role for diversifying selection in poxvirus genome evolution than in other viruses. This may be related to the large size of the poxvirus genome and the large number of accessory "non-essential" genes such as the host-response modifiers or the large number of encapsidated proteins. Over 1700 genes of the Streptococcus genome were analyzed and approximately 8% of the genes were found to be under diversifying selection. Of those, 29% are related to virulence, and many others show tissue-specific expression during invasive disease. A large fraction of poxvirus genes under diversifying selection are also known to be virulence factors, echoing the findings in Streptococcus.
In Streptococcus, several essential core function genes were identified, indicating that virulence is more complex than simply the presence of pathogen-associated genes. Similarly, analysis of E. coli genomes found 29 genes common to pathogenic and non-pathogenic strains that showed evidence of diversifying selection, many of which are involved in functions like DNA metabolism and nutrient acquisition. Several core poxvirus genes, not typically thought of as virulence factors, were also found to be under diversifying selection in the poxvirus genome. Some (but not all) of these are packaged in the virion and may be exposed to the host antibody response. Because not all of them are, this finding suggests that other poxvirus systems, in addition to manipulation of the host response, may be important in virulence. In attempting to explain virulence differences between strains of poxviruses, it may therefore be important to consider not only major genomic differences such as gene complement, but also diversifying selection in well-conserved genes.