Transfer of energy pathway genes in microbial enhanced biological phosphorus removal communities
© Wong and Beiko. 2015
Received: 19 February 2015
Accepted: 6 July 2015
Published: 16 July 2015
Lateral gene transfer (LGT) is an important evolutionary process in microbial evolution. In sewage treatment plants, LGT of antibiotic resistance and xenobiotic degradation-related proteins has been suggested, but the role of LGT outside these processes is unknown. Microbial communities involved in Enhanced Biological Phosphorus Removal (EBPR) have been used to treat wastewater in the last 50 years and may provide insights into adaptation to an engineered environment. We introduce two different types of analysis to identify LGT in EBPR sewage communities, based on identifying assembled sequences with more than one strong taxonomic match, and on unusual phylogenetic patterns. We applied these methods to investigate the role of LGT in six energy-related metabolic pathways.
The analyses identified overlapping but non-identical sets of transferred enzymes. All of these were homologous with sequences from known mobile genetic elements, and many were also in close proximity to transposases and integrases in the EBPR data set. The taxonomic method had higher sensitivity than the phylogenetic method, identifying more potential LGTs. Both analyses identified the putative transfer of five enzymes within an Australian community, two in a Danish community, and none in a US-derived culture.
Our methods were able to identify sequences with unusual phylogenetic or compositional properties as candidate LGT events. The association of these candidates with known mobile elements supports the hypothesis of transfer. The results of our analysis strongly suggest that LGT has influenced the development of functionally important energy-related pathways in EBPR systems, but transfers may be unique to each community due to different operating conditions or taxonomic composition.
KeywordsEnhanced biological phosphorus removal Lateral gene transfer Metagenomics Phylogenetics Mobile genetic elements
Enhanced biological phosphorus removal (EBPR) communities are a common form of microbial treatment developed by Banard  that removes phosphorus and occasionally nitrogen from sewage. EBPR is environmentally sustainable and affordable , with microbial communities typically seeded from the local environment or from a seed stock. Considerable effort has been put into understanding EBPR, from community diversity (e.g. [3–5]), to metabolic function (e.g. [2, 6]) and engineering (e.g. [7, 8]), with the objective of improving efficiency and stability. A substantial amount of work has gone into understanding what organisms are present in EBPR plants [4, 5, 9–11], which organisms tend to be associated with each other (e.g. [4, 5]), their ecology (e.g. [10, 12, 13]), and how to engineer the EBPR process (e.g. [11, 14]). Recently, a conceptual ecosystem model  and a core microbiome  have been proposed, based mainly on 25 plants in Denmark, revealing a taxonomically broad group of characterized and uncharacterized organisms. However, the majority of EBPR-associated organisms are not found in all EBPR samples.
To develop the EBPR process, a carbon source, typically acetate or propionate, is input to the system, and anaerobic and aerobic conditions are cycled in a bioreactor to select for phosphate accumulating organisms (PAOs). Other organisms perform functions such as fermentation and hydrolysis, and are often referred to as the “flanking community” (e.g. [5, 16]). Phosphorus uptake occurs during the anaerobic cycle, and carbon- and energy-providing polymers are stored as polyhydroxyalkanoates (PHAs). During the aerobic phase, energy stored in the PHAs is used for growth and reproduction. The type of input carbon source is taken up at different rates for different organisms in EBPR, which could affect treatment plant operation . Because anaerobic and aerobic cycling is so important for EBPR community function, emphasis has been placed on metabolic pathways related to PHA metabolism (e.g. [17, 18]), glycolysis and gluconeogenesis (e.g. [16, 17, 19]), the pentose phosphate pathway (e.g. [2, 20]), and the citric acid cycle (e.g. [2, 16, 19, 20]) as a means to understand how EBPR functions, and through usage of a particular metabolic pathway, a way to make EBPR more efficient at removing phosphate (e.g. ).
Metagenomic sequencing of two lab-scale EBPR enrichment reactors allowed the elucidation of EBPR-relevant metabolism of a major PAO, the Betaproteobacterium Candidatus Accumulibacter phosphatis (CAP) Clade IIA strain UW-1 , including phosphate uptake and PHA degradation during the aerobic phase, and PHA storage and polyphosphate degradation during the anaerobic stage. The amount of sequence generated, and the technology used (Sanger sequencing, which generates relatively long reads) allowed the eventual assembly of the first complete genome of CAP Clade IIA strain UW-1. Recently, draft genome sequences of other CAP have been obtained [21, 22]. Sequencing of a full-scale reactor metagenome from Denmark highlighted an enrichment of genes associated with biofilm and phosphate metabolism, and the taxonomic diversity of full-scale reactor communities . Despite the existence of a group of organisms considered to be common in EBPR communities, the community structure and specific strains present can vary considerably between treatment plants .
Lateral gene transfer (LGT) is a well-established mode of evolution in bacteria that can be studied through a variety of approaches using genome sequences (e.g. [24–28]). LGT plays an important role in adaptation, for example, in heavy-metal metabolism (e.g. [29, 30]), and in antibiotic resistance (e.g. [29, 31]). Transfers tend to take place between close relatives, but many examples of transfer between more distant relatives have been reported as well (e.g. [24, 32]). LGT is known to have occurred in sewage treatment plants, impacting antibiotic resistance genes (e.g. [29, 33–36]), and xenobiotic degradation (e.g. [37, 38]). Many of these transfers are mediated by mobile genetic elements (MGEs) such as plasmids and transposons (e.g. [37, 38]). Engineering of treatment plants have used plasmids to bioaugment communities to allow metabolism of xenobiotics . Other mechanisms of LGT exist, such as gene transfer agents (e.g. ) and transformation (e.g. ), but their role in sewage treatment communities is not known.
The metagenomes of two non-EBPR sludge community plasmids were sequenced , revealing substantial differences in genes from a plant with primarily industrial waste and a plant with primarily household waste. The differences suggested that the prominence of carbohydrate metabolism genes from the industrial waste plant, and the genes related to defense factors in the household waste plant, were the result of selection in each of those communities. Others have noted that transferred plasmids in non-EBPR sludge can have a mosaic of functional genes . Some evidence of LGT has been identified in PAO genomes  but no such events have been proposed from metagenome data thus far.
There are many different bioinformatic approaches for the identification of LGT events (reviewed in [27, 42]), but most rely on whole-genome sequences. Different methods can identify very different sets of genes as putatively acquired via LGT (e,g, [28, 43, 44]). Metagenomic data introduce several challenges that make identification of LGT difficult, in particular, metagenome sequence fragments are short (typically < 1000 nucleotides in length) and of uncertain provenance in the community. Incorrectly assembled chimeric contigs often combine sequences from multiple members of the same genus, species or strain [45–47]. Chimeric contigs are more common in more diverse communities (e.g. ), and when using short-read sequencing technology with closely related strains , and can often lead to incorrect classification of contigs.
Despite these challenges, it would be an important step to develop sequence-based approaches to identify LGT within an environment to further our understanding of microbial adaptation. Approaches such as genetic exchange networks  could identify transfers between multiple taxonomic groups. Here we develop and apply two different analyses to identify candidate LGT events in EBPR metagenomic data for six relevant metabolic pathways. We focus on class-level gene transfers to avoid any errors in assembly at lower taxonomic levels that can affect the accuracy of classification. Our first method, classification discordance, exploits disagreement between taxonomic classifications of genes and longer assemblies. Our second method relies on phylogenetic incongruence. Both are then filtered by homology with known MGEs to identify putative cases of LGT that have been putatively transferred through MGEs.
The EBPR enrichment culture metagenomes for lab-scale bioreactors in Madison, Wisconsin, United States of America (USA) and Brisbane, Australia (AU) that comprised the first EBPR metagenome study , both sequenced using Sanger sequencing, were downloaded on April 21st 2009 from the Joint Genome Institute (http://genome.jgi.doe.gov/OZEBPRsludge/OZEBPRsludge.download.htm, projectID=201007& metagenome=4463936.3). The USA community is composed of 15,866 contigs and assemblies, 25,312,906 nucleotides, with reads an average of 986 nucleotides in length, and the AU community 11,188 contigs and 24,385,629 nucleotides, with reads an average of 1038 nucleotides in length. The EBPR metagenome for a full-scale bioreactor in Aalborg, Denmark (DK) that performs nitrogen removal in addition to phosphate removal , sequenced using Illumina GAII (2 x 72 paired end), was downloaded from the SEED (http://metagenomics.anl.gov/metagenomics.cgi?page=MetagenomeOverview& metagenome=4463936.3), and is composed of 269,385 contigs and 145,725,513 nucleotides of sequence data. We used the assemblies and predicted genes and putative proteins as generated by the original sequencing projects.
Mobile genetic element sequence data consisted of MGEs from the Phast  and the ACLAME databases . The Phast database is composed primarily of viral sequences and the ACLAME database is composed of plasmids, phage genomes and transposons. We also included the complete NCBI plasmid database, and added other plasmids from NCBI that were not in the plasmid database, but matched the search terms “sewage treatment”, “waste-water” and “wastewater”. In total, this amalgamated database contained 7,584,934 sequences.
Taxonomic and functional annotation of metagenomic contigs
Class-level taxonomic classification of contigs was done using RITA . RITA uses a reference database to assign a taxonomic classification to sequence data using both homology and nucleotide composition. We used RITA v1.0.1 with a reference data set of over 2986 genomes representing 65 different taxonomic classes (Additional file 1), using USEARCH v4.1.93  for homology searches and FCP v1.0.3  for nucleotide composition matching. RITA performs taxonomic classification and assigns sequences to one of four confidence groups based on the strength of evidence in favor of that classification. Sequences with identical taxonomic predictions from both homology and composition were assigned to Group I. Group II comprised sequences where the expectation value for the best-matching genome was at least 10 orders of magnitude smaller than the best-matching genome from a different class. Group III assignments are made when the NB likelihood score for the best-matching genome is at least 1.5 times greater than the NB likelihood for the best-matching genome from another class. Group IV assignments are based only on the best NB likelihood value. Accuracy of classifications increases with longer contigs , so only contigs at least 1000 nucleotides in length were used.
List of enzymes by Enzyme Commission number, and common name in text
Enzyme Commission Number
Identification of putative LGT events
Sequenced reference genomes are typically used for the identification of LGT, but metagenomes rarely produce reliable complete genome sequences. The complete genome of CAP Clade IIA strain UW-1 was however reconstructed from the USA EBPR metagenome. We used this genome to look for initial evidence of LGT in this EBPR community. We performed homology searches, using BLAST, of its genome against itself and 2773 reference genomes, and MGEs used in the EBPR-MGE homology searches. The top hits with a minimum of 60 % shared alignment were used as evidence of potential LGT.
We used two complementary approaches to identify putative LGT events in the EBPR metagenomes. The first approach identified strong disagreement between taxonomic classifications (“classification discordance”) of entire contigs and individual genes within those contigs. The second approach considered incongruence in phylogenetic trees as evidence of LGT. LGT identified by the two approaches were then filtered by homology with known MGEs.
Poor assembly could lead to chimeric contigs and spurious LGT inference. To assess the possible effects of misassembly, we examined the quality of the assemblies from which LGTs were inferred. Reads are available for the AU and USA communities (http://genome.jgi.doe.gov/OZEBPRsludge/OZEBPRsludge.download.htm, projectID=201007), but not the DK community, and none of the three communities had explicit mappings of reads to contigs. We mapped reads back to contigs using a BLASTN search (version 2.2.23), where 70 % of a read was required to align to a contig with a maximum expectation value of 10−30, and searched for reads that spanned some or all of the putatively transferred ORF and its neighbors.
The taxonomic classification of a whole contig suggests the lineage of the organism from which it was sequenced, but individual protein-coding open reading frames (ORFs) from the contig may differ in their taxonomic assignments. Such disagreements can suggest LGT events with an implied direction of transfer; the donor is the classification of the ORF, and the recipient is the classification of the entire contig. Each predicted ORF was classified at the class level using RITA with the same command-line parameters used above for the contig classifications, with ORFs from group I and group II RITA classifications considered as accurate.
Spuriously classified ORFs originating from classified contigs meeting our length requirements would lead to a questionable inference of LGT. To prevent this, we filtered out candidate transferred ORFs whose best composition-based prediction (i.e., the Naïve Bayes likelihood score) was not at least 15 % better than the contig prediction. If this criterion was satisfied, then the contig was considered a transfer recipient of the implicated ORF.
Phylogenetic methods incorporate models of the evolutionary process, providing a more accurate representation of evolutionary relationships amongst homologous sequences. We first performed all-versus-all BLAST (version 2.2.23) searches within each community to identify clusters of putative homologous proteins. These sets were then compared with 1642 reference prokaryotic genomes to expand and join clusters. Clusters were represented as an undirected graph using the “networkx” Python package (1.8.1). In the network, a node represents each sequence, and an undirected edge represents a homologous relationship between two sequences. For an edge to be drawn between two EBPR proteins, they must have 70 % sequence identity, and share 60 % alignment length with an e-value of 10−5 or smaller. This network was expanded by drawing edges between the nodes, the reference genome sequences and EBPR homologs meeting the BLAST similarity requirements. The network was then split into connected components, or a set of nodes that are connected to each other by a path of edges, where each connected component is considered a cluster.
The resulting clusters were often very large (≥1000 sequences), and included distantly related proteins of little use to LGT inference. To obtain sub-clusters, we constructed phylogenies and extracted subtrees. Sequence alignments were constructed from large clusters using MUSCLE (version 3.8.31)  with default settings, and trees were constructed using FastTree (version 2.1.4)  with the WAG model of amino acid evolution . We then manually extracted subtrees where FastTree Shimodaira-Hasegawa (SH)-test-based [60, 61] branch support values of at least 70 % denoted clusters of closely related sequences. Subtree extraction, alignment and phylogeny construction was repeated until subtrees comprised a maximum of approximately 200 sequences.
For detecting LGT, phylogenies are typically compared against a reference species tree (e.g. ). However, because EBPR community structure can vary over time (e.g. ) and metagenomes can represent incomplete samples of the total genetic material , crucial taxa including donors of genetic material may not be present in the sample. We used the DendroPy library  to calculate the patristic (branch-length) distances between sequences in the same phylogenetic tree, finding for each EBPR sequence the closest EBPR sequence from the same community and the closest reference sequence with an absolute branch length of 0.3 substitutions per site. Sequences with shorter branch lengths should be closest relatives.
Identifying candidate mobile genetic elements
Potential LGTs from each of the phylogenetic incongruence and classification discordance methods were then filtered by sequences that have homologs, as identified using BLAST with a maximum e-value of 10−30 against our custom database of MGEs. We included EBPR sequences with hits to MGE sequences of a different taxonomic class from the EBPR sequence, and had an alignment length of at least 60 % of the query EBPR sequence and 60 % of the subject MGE sequence.
The published CAP genome was used to find evidence of recent LGT, possibly in the context of the EBPR community. Of the 4562 sequences in the CAP genome, 1438 sequences had hits to genomes outside the Betaproteobacteria with the same e-value as the top CAP hit, suggesting the acquisition of many genes by CAP. The high degree of similarity indicates the possibility that many of these transfers occurred very recently. The observation of these recent transfers led us to search for LGT events in all sampled EBPR community members.
Summary of sequences used in analyses from all communities
# (%) of contigs retained
7,610 (47.96 %)
7,331 (65.52 %)
18,024 (6.69 %)
# (%) of ORFs retained
22,894 (66.06 %)
25,003 (81.15 %)
30,516 (10.14 %)
# enzymes in energy pathways (%) of annotated enzymes
645 (22.13 %)
714 (22.60 %)
524 (22.40 %)
Our first approach to identify putative LGT compared the taxonomic classification of an entire contig with the classification of its predicted ORFs. Of the ORFs that had hits to the metabolic pathways of interest, at least 50 % from each community (US: 20 ORFs, 68.9 %, AU: 88 ORFs, 55.7 %, and DK: 58 ORFs, 54.2 %) satisfied the criteria for discordance. All LGTs suggested by this method had hits to annotated MGEs from our database. The number of inferred transfers, the implicated enzymes and the participating taxonomic groups vary among metabolic pathways and communities (Additional file 5). However, some members appear to be more common recipients or donors of gene transfer in all communities and metabolic pathways, with Betaproteobacteria to Gammaproteobacteria (21 transfers) in AU the most common direction of transfer (Additional file 6). LGT events with Alphaproteobacteria as donor and Betaproteobacteria as recipient were the only pattern identified in all three communities.
A set of 987 trees covering 46,031 proteins from 1622 reference organisms were extracted from an initial set of 981 trees covering 243,031 proteins from 1642 reference organisms. The direction of transfer is difficult to infer as metagenomic sequencing and quality-filtering approaches remove possible within-community donor and recipient lineages, and tree topologies often cannot distinguish which of two implicated lineages is the most likely donor. A total of 14, 27, and 1 (DK, AU, and USA communities, respectively) predicted EBPR proteins differed from a reference sequence or an EBPR protein of a different taxonomic class by less than 0.3 substitutions per site (Additional file 16). This represented 4.71 %, 8.84 %, and 0.65 % (DK, AU, USA communities, respectively) of all sequences whose closest relative was a member of a different taxonomic class, but did not meet the 0.3 substitutions per site branch length cutoff.
The recipient of the single proposed transfer within the USA community is classified as Gammaproteobacteria, with predicted function associated with glycolysis and gluconeogenesis (EC 126.96.36.199). The AU community accounted for the majority of transfers, with some transfers identified on the same contig, but not evenly distributed across each metabolic pathway. The DK community had the largest number of inferred transfers in the citric acid cycle and nitrogen metabolism pathway. In the AU community, transfers consistently involved sequences belonging to contigs classified as Gammaproteobacteria and Betaproteobacteria, with Alphaproteobacteria, Bacilli and Chlorobia also implicated in transfer of some of the metabolic pathways. For the DK community, no common taxonomic groups were shared across metabolic pathways, and no sequences identified as transferred were classified as Betaproteobacteria. The Cytophagia were implicated in three pathways (butanoate metabolism, citric acid cycle and nitrogen metabolism), while a mixture of the Alphaproteobacteria, Bacteroidia, Flavobacteriia, Gammaproteobacteria, Methanomicrobia, Sphingobacteria are other classes present in the other three pathways (gluconeogenesis and glycolysis, pentose phosphate pathway, propanoate metabolism).
MGE homology and a common prediction set
Each method of LGT detection differs in its ability to identify different types of LGT events. All high-confidence LGT events have homology with sequences in known MGEs. A substantial number of sequences from each community had hits to known MGEs: 11,718 of 30,516 sequences from the DK community, 16,156 of 24,956 sequences from the AU community, and 15,530 of 22,662 sequences from the USA community. Of those MGE homologs, 2097 DK, 824 AU, and 875 USA community sequences are enzymes in KEGG pathways (Additional file 17).
A total of ten sequences, representing five enzymes, were identified as putatively transferred by the two approaches: enoyl-CoA hydratase (EC 188.8.131.52), acetyl-CoA C-acetyltransferase (EC 184.108.40.206), cytochrome-c oxidase (EC 220.127.116.11), phosphoglycerate kinase (EC 18.104.22.168), and 6-phosphofructokinase (EC 22.214.171.124). Of those ten sequences, eight were identified in the AU community, two in the DK community, and none in the USA community. All of the identified enzymes were present on plasmids in the ACLAME database, suggesting a possible mode of transfer. Both analyses almost always identified the same taxonomic classes as donors or as the top hit. The only transfer in DK, enzyme 126.96.36.199, was associated with nitrogen metabolism. Enzyme 188.8.131.52 was unique to GM. Two enzymes, 184.108.40.206 and 220.127.116.11, are common to BM and PM while 18.104.22.168 is common to glycolysis and gluconeogenesis and the pentose phosphate pathway. No common transfers were found that belonged to the CAC. For AU, six of the eight recipient contigs were classified as Gammaproteobacteria, with the remainder Betaproteobacteria and Chlorobia. For DK, the recipient contigs were classified as Bacilli and Cytophagia.
Length statistics for contigs with putative LGT events
Average Contig Length
Average Number of Genes
DK LGT six pathways
DK LGT transposases & integrases 6 pathways
AU LGT six pathways
AU LGT transposases & integrases 6 pathways
USA LGT six pathways
USA LGT transposases & integrases 6 pathways
Since DK reads were not available and USA LGTs were not part of the shared set of transfers from both analyses, we were only able to assess coverage of AU LGTs. Only seven of the eight AU LGTs had matching reads, but all homologous reads had an expectation value of 0.0. Of the 57 reads with at least partial homology to the putatively transferred ORFs, 50 had alignments that extended into adjacent ORFs, suggesting that the inferred events were not due to misassembly. Two putatively transferred ORFs each had an aligned read that spanned the full length of the ORF (Additional file 18). One LGT had two reads that did not extend into neighbouring ORFs, and started or ended in intergenic regions. Alignments for the remaining five reads partially covered the putatively transferred ORFs.
Using a series of approaches that are applicable to metagenomic data, we found strong evidence that LGT has impacted six energy metabolism pathways in EBPR communities. Some genes appear to have been independently transferred in more than one community. Although some groups are associated with multiple LGT events, no clear patterns of donor/recipient partners emerged for all three communities. The common set of transfers between the two analyses, and MGE homology filtering, provide the strongest evidence for LGT. The majority of transfers shared by both analyses were identified in the AU community, none in the USA community, and only two genes transferred in the DK community, which were the only shared transfers identified in nitrogen metabolism. Differences in predicted events across the three communities may represent independent evolutionary trajectories, differences in local community composition, or biases in observation due to incomplete sampling of the metagenome.
Our contig length and ORF taxonomic quality-filtering approaches favored the detection of a relatively small set of high-confidence LGT predictions. Although choosing the class level decreases the number of potential LGTs found and precludes detection of LGT between members of the same class, the long-range transfers we have identified show the strongest evidence for discordance. Our use of contigs in excess of 1000 nucleotides long considerably reduced the proportion of sequences being retained, especially for the DK community, where the average contig length was 504 nucleotides. However, longer contigs are better for detecting LGT (Table 3). This could be due to a higher probability that genes from a different source are found on longer contigs, or inaccurate classification due to short contigs. Additionally, longer contigs were needed to identify transposases and integrases in tandem with our genes of interest.
Mapping of metagenomic reads to contigs validated most of our LGT inferences; however, one putatively transferred ORF in our high-confidence set did not have any matching reads. Accuracy of assemblies, including metagenomic assemblies, depends on sequencing technology and the complexity of communities [65, 66]. Less-complex communities (~10 genomes) have the most accurate assemblies with Sanger sequencing, and complex communities (100+ genomes) have the most accurate assemblies with Illumina sequencing . Regardless of assembly accuracy, it is unclear why this ORF should be present in the assembled contigs, while having no corresponding match in the reads used to generate those contigs.
Different methods of detecting LGT are often biased towards finding certain types of transfer events [43, 44]. Our approaches do not identify transfers at lower taxonomic levels and are biased towards detection of complete genes. Naïve Bayes likelihood ORF filtering should eliminate many dubious classifications, but does not provide any information about the age of the transfer event. The phylogenetic approach provides information about age of transfers, but identified the fewest candidate LGT events. This is because it requires that the donor lineage in the community or a close relative be sampled, and LGT events that do not appreciably distort the tree will not be detected by this approach.
Different EBPR plants have distinct population characteristics [4, 9], with different operational parameters between the sampled EBPR communities, and full-scale plants being more complex and dynamic than lab-scale reactors [10, 11, 19]. All three communities use different carbon feeds: molasses in DK , propionate in AU and acetate in USA . Propionate has been shown to be a more desirable carbon source relative to acetate, providing PAOs a selective advantage over competitors, and resulting in a more stable community over time [12, 17, 67, 68]. The propanoate metabolic pathway, which shows different amounts of evidence for LGT between the three communities, with very few transfers in the acetate-fed USA community, and a large number of transfers in the propionate-fed AU community, especially between the Betaproteobacteria and Gammaproteobacteria. The DK community has an intermediate number of transfers, but with more taxonomic groups implicated than the AU community. The taxonomic composition of EBPR communities is known to change over time , and with changing carbon sources ; this variability may also manifest through gene exchange between constituents of the community.
Focusing on LGT in energy-related metabolic pathways considered relevant to EBPR function provides context to the role of LGT in EBPR communities. LGT events not in the six energy pathways are also likely to be important in EBPR communities, such as phosphate metabolism, bacteriophage resistance, and flocculation/biofilm formation. Future analyses should also focus on other metabolic pathways for insights into alternative metabolism, and in broad functional categories for overall community functional aspects of LGT. Additional sequencing of EBPR communities would provide further insight into whether there are common LGT events.
DHJW was supported by the TULA Foundation and Genome Atlantic. RGB acknowledges the support of the Canada Research Chairs program. Computational analyses were supported by a grant from the Canada Foundation for Innovation to RGB. We are also grateful to Christian Blouin, Norbert Zeh, Elizabeth Edwards, Cheryl Devine, and Frances Slater for their feedback and enlightening discussions.
- Banard JL. A review of biological phosphorus removal in activated sludge. Water Sa. 1976;2:136–44.Google Scholar
- Oehmen A, Lemos PC, Carvalho G, Yuan Z, Keller J, Blackall LL, et al. Advances in enhanced biological phosphorus removal: from micro to macro scale. Water Research. 2007;41:2271–300.PubMedView ArticleGoogle Scholar
- He S, Gu AZ, McMahon KD. Fine-scale differences between Accumulibacter-like bacteria in enhanced biological phosphorus removal activated sludge. Water Sci Technol. 2006;54:111–7.PubMedView ArticleGoogle Scholar
- Mielczarek AT, Nguyen HTT, Nielsen JL, Nielsen PH. Population dynamics of bacteria involved in enhanced biological phosphorus removal in Danish wastewater treatment plants. Water Research. 2013;47:1529–44.PubMedView ArticleGoogle Scholar
- Nielsen PH, Saunders AM, Hansen AA, Larsen P, Nielsen JL. Microbial communities involved in enhanced biological phosphorus removal from wastewater—a model system in environmental biotechnology. Current Opinion in Biotechnology. 2012;23:452–9.PubMedView ArticleGoogle Scholar
- Yuan Z, Pratt S, Batstone DJ. Phosphorus recovery from wastewater through microbial processes. Current Opinion in Biotechnology. 2012;23:878–83.PubMedView ArticleGoogle Scholar
- Tu Y, Schuler AJ. Low acetate concentrations favor polyphosphate-accumulating organisms over glycogen-accumulating organisms in enhanced biological phosphorus removal from wastewater. Environmental Science & Technology. 2013;47:3816–24.View ArticleGoogle Scholar
- Zhang T, Liu Y, Fang HHP. Effect of pH change on the performance and microbial community of enhanced biological phosphate removal process. Biotechnology and Bioengineering. 2005;92:173–82.PubMedView ArticleGoogle Scholar
- He S, McMahon KD. Microbiology of “Candidatus Accumulibacter” in activated sludge. Microbial Biotechnology. 2011;4:603–19.PubMed CentralPubMedView ArticleGoogle Scholar
- Kong Y, Ong SL, Ng WJ, Liu W-T. Diversity and distribution of a deeply branched novel proteobacterial group found in anaerobic-aerobic activated sludge processes. Environmental Microbiology. 2002;4:753–7.PubMedView ArticleGoogle Scholar
- Wong M-T, Mino T, Seviour RJ, Onuki M, Liu W-T. In situ identification and characterization of the microbial community structure of full-scale enhanced biological phosphorous removal plants in Japan. Water Research. 2005;39:2901–14.PubMedView ArticleGoogle Scholar
- Gonzalez-Gil G, Holliger C. Dynamics of microbial community structure and enhanced biological phosphorus removal of propionate- and acetate-cultivated aerobic granules. Applied and Environmental Microbiology. 2011;77:8041–51.PubMed CentralPubMedView ArticleGoogle Scholar
- He S, Gu AZ, McMahon KD. Progress toward understanding the distribution of Accumulibacter among full-scale enhanced biological phosphorus removal systems. Microb Ecol. 2008;55:229–36.PubMedView ArticleGoogle Scholar
- Zhang T, Zhang X-X, Ye L. Plasmid metagenome reveals high levels of antibiotic resistance genes and mobile genetic elements in activated sludge. PLoS ONE. 2011;6:e26041.PubMed CentralPubMedView ArticleGoogle Scholar
- Nielsen PH, Mielczarek AT, Kragelund C, Nielsen JL, Saunders AM, Kong Y, et al. A conceptual ecosystem model of microbial communities in enhanced biological phosphorus removal plants. Water Research. 2010;44:5070–88.PubMedView ArticleGoogle Scholar
- García Martín H, Ivanova N, Kunin V, Warnecke F, Barry KW, McHardy AC, et al. Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities. Nat Biotechnol. 2006;24:1263–9.PubMedView ArticleGoogle Scholar
- Oehmen A, Yuan Z, Blackall LL, Keller J. Comparison of acetate and propionate uptake by polyphosphate accumulating organisms and glycogen accumulating organisms. Biotechnology and Bioengineering. 2005;91:162–8.PubMedView ArticleGoogle Scholar
- Seviour RJ, Mino T, Onuki M. The microbiology of biological phosphorus removal in activated sludge systems. FEMS Microbiology Reviews. 2003;27:99–127.PubMedView ArticleGoogle Scholar
- Lanham AB, Oehmen A, Saunders AM, Carvalho G, Nielsen PH, Reis MAM. Metabolic versatility in full-scale wastewater treatment plants performing enhanced biological phosphorus removal. Water Research. 2013;47:7032–41.PubMedView ArticleGoogle Scholar
- McIlroy SJ, Albertsen M, Andresen EK, Saunders AM, Kristiansen R, Stokholm-Bjerregaard M, et al. “Candidatus Competibacter-”lineage genomes retrieved from metagenomes reveal functional metabolic diversity. The ISME Journal. 2013;8:613–24.PubMed CentralPubMedView ArticleGoogle Scholar
- Flowers JJ, He S, Malfatti S, del Rio TG, Tringe SG, Hugenholtz P, et al. Comparative genomics of two “Candidatus Accumulibacter” clades performing biological phosphorus removal. The ISME Journal. 2013;7:2301–14.PubMed CentralPubMedView ArticleGoogle Scholar
- Mao Y, Yu K, Xia Y, Chao Y, Zhang T. Genome reconstruction and gene expression of “Candidatus Accumulibacter phosphatis” Clade IB performing biological phosphorus removal. Environmental Science & Technology. 2014;48:10363–71.View ArticleGoogle Scholar
- Albertsen M, Hansen LBS, Saunders AM, Nielsen PH, Nielsen KL. A metagenome of a full-scale microbial community carrying out enhanced biological phosphorus removal. The ISME Journal. 2011;6:1094–106.PubMed CentralPubMedView ArticleGoogle Scholar
- Beiko RG, Harlow TJ, Ragan MA. Highways of gene sharing in prokaryotes. Proceedings of the National Academy of Sciences of the United States of America. 2005;102:14332–7.PubMed CentralPubMedView ArticleGoogle Scholar
- Koonin EV, Galperin MY. Prokaryotic genomes: the emerging paradigm of genome-based microbiology. Current Opinion in Genetics & Development. 1997;7:757–63.View ArticleGoogle Scholar
- Lawrence JG, Ochman H. Amelioration of bacterial genomes: rates of change and exchange. Journal of Molecular Evolution. 1997;44:383–97.PubMedView ArticleGoogle Scholar
- Ragan MA. Detection of lateral gene transfer among microbialgenomes. Current Opinion in Genetics & Development. 2001;11:620–6.View ArticleGoogle Scholar
- Ragan MA. On surrogate methods for detecting lateral gene transfer. FEMS Microbiology Letters. 2001;201:187–91.PubMedView ArticleGoogle Scholar
- Sentchilo V, Mayer AP, Guy L, Miyazaki R, Green Tringe S, Barry K, et al. Community-wide plasmid gene mobilization and selection. The ISME Journal. 2013;6:1173–86.View ArticleGoogle Scholar
- Sobecky PA, Coombs JM. Horizontal gene transfer in metal and radionuclide contaminated soils. In: Gogarten MBB, Gogarten JP, Lorraine O, editors. Horizontal Gene Transfer, Genomes in Flux. Clifton, NJ: Humana Press; 2009. p. 455–72.View ArticleGoogle Scholar
- Barlow M. What antimicrobial resistance has taught us about horizontal gene transfer. In: Gogarten MBB, Gogarten JP, Lorraine O, editors. Horizontal Gene Transfer, Genomes in Flux. Clifton, NJ: Humana Press; 2009. p. 397–411.View ArticleGoogle Scholar
- Popa O, Hazkani-Covo E, Landan G, Martin W, Dagan T. Directed networks reveal genomic barriers and DNA repair bypasses to lateral gene transfer among prokaryotes. Genome Research. 2011;21:599–609.PubMed CentralPubMedView ArticleGoogle Scholar
- Hong H, Ko H-J, Choi I-G, Park W. Previously undescribed plasmids recovered from activated sludge confer tetracycline resistance and phenotypic changes to Acinetobacter oleivorans DR1. Microb Ecol. 2014;67:369–79.PubMedView ArticleGoogle Scholar
- Ma L, Zhang X-X, Zhao F, Wu B, Cheng S, Yang L. Sewage treatment plant serves as a hot-spot reservoir of integrons and gene cassettes. Journal of Environmental Biology. 2013;34(2 Spec No):391–9.PubMedGoogle Scholar
- Szczepanowski R, Bekel T, Goesmann A, Krause L, Krömeke H, Kaiser O, et al. Insight into the plasmid metagenome of wastewater treatment plant bacteria showing reduced susceptibility to antimicrobial drugs analysed by the 454-pyrosequencing technology. Journal of Biotechnology. 2008;136:54–64.PubMedView ArticleGoogle Scholar
- Zhang, T., Zhang, X.-X., & Ye, L. Plasmid metagenome reveals high levels of antibiotic resistance genes and mobile genetic elements in activated sludge. PLoS ONE. 2011; 6(e26041).Google Scholar
- Schlüter A, Krahn I, Kollin F, Bönemann G, Stiens M, Szczepanowski R, et al. IncP-1-beta plasmid pGNB1 isolated from a bacterial community from a wastewater treatment plant mediates decolorization of triphenylmethane dyes. Applied and Environmental Microbiology. 2007;73:6345–50.PubMed CentralPubMedView ArticleGoogle Scholar
- Top EM, Springael D, Boon N. Catabolic mobile genetic elements and their potential use in bioaugmentation of polluted soils and waters. FEMS Microbiology Ecology. 2002;42:199–208.PubMedView ArticleGoogle Scholar
- Bathe S, Schwarzenbeck N, Hausner M. Plasmid-mediated bioaugmentation of activated sludge bacteria in a sequencing batch moving bed reactor using pNB2. Letters in Applied Microbiology. 2005;41:242–7.PubMedView ArticleGoogle Scholar
- Lang AS, Zhaxybayeva O, Beatty JT. Gene transfer agents: phage-like elements of genetic exchange. Nature Reviews Microbiology. 2012;10:472–82.PubMed CentralPubMedGoogle Scholar
- Thomas CM, Nielsen KM. Mechanisms of, and barriers to, horizontal gene transfer between bacteria. Nat Rev Microbiol. 2005;3:711–21.PubMedView ArticleGoogle Scholar
- Zhaxybayeva O. Detection and quantitative assessment of horizontal gene transfer. In: Gogarten MBB, Gogarten JP, Lorraine O, editors. Horizontal Gene Transfer, Genomes in Flux. Clifton, NJ: Humana Press; 2009. p. 195–213.View ArticleGoogle Scholar
- Lawrence JG, Ochman H. Reconciling the many faces of lateral gene transfer. Trends Microbiol. 2002;10:1–4.PubMedView ArticleGoogle Scholar
- Ragan MA, Harlow TJ, Beiko RG. Do different surrogate methods detect lateral genetic transfer events of different relative ages? Trends in Microbiology. 2006;14:4–8.PubMedView ArticleGoogle Scholar
- Charuvaka A, Rangwala H. Evaluation of short read metagenomic assembly. BMC Genomics. 2011;12 Suppl 2:S8.PubMed CentralPubMedView ArticleGoogle Scholar
- Mavromatis K, Ivanova N, Barry K, Shapiro H, Goltsman E, McHardy AC, et al. Use of simulated data sets to evaluate the fidelity of metagenomic processing methods. Nat Methods. 2007;4:495–500.PubMedView ArticleGoogle Scholar
- Pignatelli M, Moya A. Evaluating the fidelity of de novo short read metagenomic assembly using simulated data. PLoS ONE. 2011;6:e19984.PubMed CentralPubMedView ArticleGoogle Scholar
- Skippington E, Ragan MA. Lateral genetic transfer and the construction of genetic exchange communities. FEMS Microbiology Reviews. 2011;35:707–35.PubMedView ArticleGoogle Scholar
- Zhou Y, Liang Y, Lynch KH, Dennis JJ, Wishart DS. PHAST: a fast phage search tool. Nucleic Acids Res. 2011;39(Web Server issue):W347–52.PubMed CentralPubMedView ArticleGoogle Scholar
- Leplae R, Lima-Mendez G, Toussaint A. ACLAME: a CLAssification of Mobile genetic Elements, update 2010. Nucleic Acids Res. 2010;38(Database issue):D57–61.PubMed CentralPubMedView ArticleGoogle Scholar
- Macdonald NJ, Parks DH, Beiko RG. Rapid identification of high-confidence taxonomic assignments for metagenomic data. Nucleic Acids Res. 2012;40:e111.PubMed CentralPubMedView ArticleGoogle Scholar
- Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26:2460–1.PubMedView ArticleGoogle Scholar
- Parks DH, Macdonald NJ, Beiko RG. Classifying short genomic fragments from novel lineages using composition and homology. BMC Bioinformatics. 2011;12:328.PubMed CentralPubMedView ArticleGoogle Scholar
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. Journal of Molecular Biology. 1990;215:403–10.PubMedView ArticleGoogle Scholar
- Klimke W, Agarwala R, Badretdin A, Chetvernin S, Ciufo S, Fedorov B, et al. The National Center for Biotechnology Information's Protein Clusters Database. Nucleic Acids Res. 2009;37(Database issue):D216–23.PubMed CentralPubMedView ArticleGoogle Scholar
- Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research. 2000;28:27.PubMed CentralPubMedView ArticleGoogle Scholar
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.PubMed CentralPubMedView ArticleGoogle Scholar
- Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Molecular Biology and Evolution. 2009;26:1641–50.PubMed CentralPubMedView ArticleGoogle Scholar
- Goldman N, Whelan S. Statistical tests of gamma-distributed rate heterogeneity in models of sequence evolution in phylogenetics. Molecular Biology and Evolution. 2000;17:975–8.PubMedView ArticleGoogle Scholar
- Shimodaira H, Hasegawa M. Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic Inference. Molecular Biology and Evolution. 1999;16:1114–6.View ArticleGoogle Scholar
- Goldman N, Anderson JP, Rodrigo AG. Likelihood-based tests of topologies in phylogenetics. Systematic Biology. 2000;49:652–70.PubMedView ArticleGoogle Scholar
- Slater FR, Johnson CR, Blackall LL, Beiko RG, Bond PL. Monitoring associations between clade-level variation, overall community structure and ecosystem function in enhanced biological phosphorus removal (EBPR) systems using terminal-restriction fragment length polymorphism (T-RFLP). Water Research. 2010;44:4908–23.PubMedView ArticleGoogle Scholar
- Ni J, Yan Q, Yu Y. How much metagenomic sequencing is enough to achieve a given goal? Scientific Reports. 2013;3:1968.PubMed CentralPubMedGoogle Scholar
- Sukumaran J, Holder MT. DendroPy: a Python library for phylogenetic computing. Bioinformatics. 2010;26:1569–71.PubMedView ArticleGoogle Scholar
- Sims D, Sudbery I, Ilott NE, Heger A, Ponting CP. Sequencing depth and coverage: key considerations in genomic analyses. Nature Reviews Genetics. 2014;15:121–32.PubMedView ArticleGoogle Scholar
- Mende DR, Waller AS, Sunagawa S, Järvelin AI, Chan MM, Arumugam M, et al. Assessment of metagenomic assembly using simulated next generation sequencing data. PLoS ONE. 2012;7:e31386.PubMed CentralPubMedView ArticleGoogle Scholar
- Chen Y, Randall AA, McCue T. The efficiency of enhanced biological phosphorus removal from real wastewater affected by different ratios of acetic to propionic acid. Water Research. 2004;38:27–36.PubMedView ArticleGoogle Scholar
- Thomas M, Wright P, Blackall L, Urbain V, Keller J. Optimisation of Noosa BNR plant to improve performance and reduce operating costs. Water Sci Technol. 2003;47:141–8.PubMedGoogle Scholar
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.