Genomic data mining reveals a rich repertoire of transport proteins in Streptomyces

Background: Streptomycetes are soil-dwelling Gram-positive bacteria that are best known as the major producers of antibiotics used in the pharmaceutical industry. The evolution of exceptionally powerful transporter systems in streptomycetes has enabled their adaptation to the complex soil environment.
Results: Our comparative genomic analyses revealed that each of the eleven Streptomyces species examined possesses a rich repertoire of 761 to 1258 transport proteins, accounting for 10.2 to 13.7 % of each respective proteome. These transporters can be divided into seven functional classes and 171 transporter families. Among them, the ATP-binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS) represent more than 40 % of all the transport proteins in Streptomyces. They play important roles in both nutrient uptake and substrate secretion, especially in the efflux of drugs and toxicants. The evolutionary flexibility across the eleven Streptomyces species is seen in the lineage-specific distribution of transport proteins in two major protein translocation pathways: the general secretory (Sec) pathway and the twin-arginine translocation (Tat) pathway.
Conclusions: Our results present a catalog of transport systems in eleven Streptomyces species. These expansive transport systems are important mediators of complex processes, including nutrient uptake, concentration balance of elements, efflux of drugs and toxins, and the timely and orderly secretion of proteins. A better understanding of transport systems will allow enhanced optimization of production processes for both pharmaceutical and industrial applications of Streptomyces, which are widely used in antibiotic production and heterologous expression of recombinant proteins.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2899-4) contains supplementary material, which is available to authorized users.


Background
Streptomyces is a group of soil-dwelling Gram-positive bacteria that are well known for their ability to produce a broad array of secondary metabolites, including antibiotics, antifungals, antiparasitic drugs, anticancer agents, immunosuppressants, and herbicides [1,2]. They are also ideal systems in biotechnology for heterologous expression of recombinant proteins with simple downstream processing and high yields [3,4]. In order to survive in the complex soil environment, streptomycetes have evolved exceptionally powerful transport systems [5,6]. For example, in Streptomyces coelicolor, there are more than 600 predicted transport proteins, a large proportion of which are ATP-binding Cassette (ABC) and Major Facilitator Superfamily (MFS) transporters, which have been implicated in the transport of secondary metabolites including antibiotics [7]. In addition to secondary metabolites, streptomycetes also secrete a large number of proteins into the environment through the general secretory (Sec) pathway and the twin-arginine translocation (Tat) pathway [8][9][10]. These secretory systems are known to facilitate nutrient acquisition. For example, secreted cellulases and chitinases can degrade otherwise insoluble nutrient sources.
Transporters are of critical importance to all living organisms in facilitating metabolism, intercellular communication, biological synthesis and reproduction. They are involved in the uptake of nutrients from the environment, the secretion of metabolites, the efflux of drugs and toxins, the maintenance of ion concentration gradients across membranes, the secretion of macromolecules (such as sugars, lipids, proteins and nucleic acids) and signaling molecules, and the translocation of membrane proteins [11]. A Transporter Classification (TC) system has been developed by the Saier group [11,12]. To date, more than 10,000 non-redundant transport proteins comprising about 750 families have been collected in their Transporter Classification Database (TCDB) [13]. These families are divided among seven major classes: Channels/Pores (Class 1), Electrochemical Potential-driven Transporters (Class 2), Primary Active Transporters (Class 3), Group Translocators (Class 4), Transmembrane Electron Carriers (Class 5), Accessory Factors Involved in Transport (Class 8), and Incompletely Characterized Transport Systems (Class 9). This classification system has been applied to in-depth studies of transporters in a number of microbial genomes [14][15][16][17], and is adopted in this study for Streptomyces.
The availability of genomes from closely related Streptomyces species enables comprehensive analysis of the transport protein families in Streptomyces. In this study, we report a catalog and comparative genomic analysis of transporters in eleven Streptomyces species with complete genome sequences and annotations, including S. coelicolor (SCO), S. avermitilis (SAV), S. bingchenggensis (SBI), S. cattleya (SCAT), S. flavogriseus (SFLA), S. griseus (SGR), S. hygroscopicus (SHJG), S. scabiei (SCAB), Streptomyces sp. SirexAA-E (SACTE), S. venezuelae (SVEN) and S. violaceusniger (STRVI) [7][18][19][20][21][22][23][24]. We identified and classified these Streptomyces transporters using the nomenclature in the TCDB. The class, transmembrane topology and substrate specificity of these transporters were investigated in detail. An improved understanding of Streptomyces transporters will bring new insights into the mechanisms underlying the unique and powerful secretion systems of secondary metabolites and proteins in this group of bacteria of enormous economic and biomedical significance.

Results and discussion
Abundant transporters are present in eleven Streptomyces genomes
Strong material intake and secretion capacity, powered by transport systems, is an adaptive attribute of soil-dwelling bacteria [1]. We used the coding sequences from eleven Streptomyces genomes to query the TCDB [13,25] using BLASTP and identified 761-1258 transporters in each of these eleven genomes, which accounted for 10.2 to 13.7 % of each respective proteome (Table 1 and Additional file 1). S. bingchenggensis, which has the largest genome and the largest number of protein-coding genes, has the largest number of transporters, whereas S. cattleya contains only 761 transporters, the lowest number and proportion of transporters among the eleven Streptomyces species.

Streptomyces transporters show diverse transmembrane topology
The capacity of a transporter is often associated with the complexity and topology of its transmembrane region(s), where the major events of substrate uptake or output across the cell membrane take place. Using the TMHMM (TransMembrane prediction using Hidden Markov Models) algorithm [26], we analyzed the transmembrane topology of the Streptomyces transporters to identify their transmembrane segments (TMSs). The number of TMSs ranges from 0 to 24. The largest number of TMSs observed in a single transporter varies from 16 to 24 across the eleven Streptomyces genomes (Table 2). It is possible that the 12-TMS transporters have arisen from the primordial 6-TMS form via intragenic duplication [27]. Among the transporters with more than 6 TMSs, those with an even number of TMSs are more abundant than those with an odd number of TMSs (Fig. 1). The distribution of TMSs in S. griseus transporters is unique: this bacterium has 53 transporters with 9 TMSs, mostly ABC transporters, accounting for 5.4 % of its total transporters. This proportion is significantly higher than that of the other ten sibling species. On the other hand, S. griseus has the lowest proportion of 12-TMS transporters (7.3 %), most of which are also ABC transporters. These topology patterns suggest that during the evolution of transporters in S. griseus, the "6 + 3" events may have been more frequent than the typical "6 + 6" events observed in the ten other Streptomyces species [27,28].
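The TMS comparison described above can be illustrated with a minimal Python sketch. It assumes that a list of predicted TMS counts (one integer per transporter) has already been extracted from the topology predictions; the function name `tms_summary` and the toy counts are hypothetical and are not data from this study.

```python
from collections import Counter

def tms_summary(tms_counts):
    """Summarize predicted TMS counts for the transporters of one genome.

    tms_counts: iterable of ints, one predicted TMS count per transporter
    (parsing of the topology predictions themselves is not shown here).
    """
    hist = Counter(tms_counts)
    total = sum(hist.values())
    if total == 0:
        raise ValueError("no transporters supplied")
    # Compare even and odd TMS counts only among transporters with more
    # than 6 TMSs, mirroring the comparison made in the text.
    even = sum(n for k, n in hist.items() if k > 6 and k % 2 == 0)
    odd = sum(n for k, n in hist.items() if k > 6 and k % 2 == 1)
    # Percentage of all transporters found at each TMS count.
    percent = {k: 100.0 * n / total for k, n in sorted(hist.items())}
    return {
        "histogram": dict(sorted(hist.items())),
        "percent": percent,
        "even_gt6": even,
        "odd_gt6": odd,
        "max_tms": max(hist),
    }

# Toy, made-up counts purely for illustration (not data from the study).
toy_counts = [0, 1, 6, 6, 9, 10, 12, 12, 12, 14]
summary = tms_summary(toy_counts)
print(summary["even_gt6"], summary["odd_gt6"], summary["max_tms"])
```

Running such a summary per genome would give the per-species proportions of, for example, 9-TMS and 12-TMS transporters that are compared in the paragraph above.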

Transporters in eleven Streptomyces genomes can be divided into seven classes and 171 families
The Streptomyces transporters fall into seven classes and 171 transporter families according to the TCDB system (Table 3 and Additional file 2). The distribution of transporters in each species is depicted in Fig. 2.
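As a rough illustration of how such a tally could be produced, the following Python sketch counts transporters per TC class and collects the distinct families from a list of assigned TC numbers. The class names follow the TC system as described in this article; the function names and the example TC assignments are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter

# Class names as listed in the description of the TC system in this article.
TC_CLASSES = {
    "1": "Channels/Pores",
    "2": "Electrochemical Potential-driven Transporters",
    "3": "Primary Active Transporters",
    "4": "Group Translocators",
    "5": "Transmembrane Electron Carriers",
    "8": "Accessory Factors Involved in Transport",
    "9": "Incompletely Characterized Transport Systems",
}

def tc_family(tc_number):
    """Return the family-level prefix (class.subclass.family), e.g. '2.A.1'."""
    parts = tc_number.strip().split(".")
    if len(parts) < 3 or parts[0] not in TC_CLASSES:
        raise ValueError(f"not a recognizable TC number: {tc_number!r}")
    return ".".join(parts[:3])

def tally(tc_numbers):
    """Count transporters per class and collect the distinct families."""
    per_class = Counter()
    families = set()
    for tc in tc_numbers:
        family = tc_family(tc)
        families.add(family)
        per_class[TC_CLASSES[family.split(".")[0]]] += 1
    return per_class, families

# Hypothetical TC assignments for illustration only.
example = ["3.A.1.1.1", "3.A.1.2.3", "2.A.1.2.7", "2.A.64.1.1", "3.A.5.1.1"]
per_class, families = tally(example)
print(per_class.most_common())
print(sorted(families))
```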
The Primary Active Transporters (Class 3) form the most abundant class of transporters in Streptomyces, comprising 365-705 transporters per genome (about 48.0-57.5 % of the total transport machinery). This class of transporters plays important roles in various aspects of the bacterial life cycle, especially in the import and export of secondary metabolites and in cation transport.
Class 2 transporters, the electrochemical potential-driven transporters, are also widely found in Streptomyces. Between 212 and 330 transporters in each of the eleven Streptomyces genomes belong to this class, accounting for 24.4 %-31.4 % of all the transporters. The porters in this class include uniporters, symporters and antiporters. The most abundant family in Class 2, the MFS, has been implicated in drug efflux. Lineage-specificity is also observed in this class of transporters. For example, S. bingchenggensis possesses two Ion-gradient-driven Energizers (TC 2.C), while the other ten Streptomyces species only have Porters (uniporters, symporters, antiporters) (TC 2.A).
Class 1 transporters are not abundant, but are functionally important for Streptomyces. Between 22 and 34 channel/pore transporters are present in each of these eleven genomes, accounting for 2.3 %-3.2 % of all the transporters. The majority of these channel-type proteins are alpha-type channels (TC 1.A), which have been implicated in stress responses of Gram-positive bacteria, especially responses to osmotic pressure [27]. A small number of proteins are β-type porins, and even fewer are putative Channel-Forming Toxins (TC 1.C). The membrane-bounded channel (TC 1.I) subclass is rare in Streptomyces; only S. bingchenggensis has a transport protein from this subclass.
Classes 4, 5, and 8 are relatively less abundant. About 3.0 %-5.3 % of all the transport proteins are Class 4 transporters. Two major families observed in Class 4 are the PTS Glucose-Glucoside (Glc) family (4.A.1) and the Fatty Acid Transporter (FAT) family (4.C.1), which are responsible for the transport of glucose and glucosides and of fatty acids, respectively. Notably, S. cattleya, which has the smallest repertoire of transporters among the eleven Streptomyces species, does not seem to contain any Glc transporters; it remains unknown whether it uses an alternative system.
... [31,32]. The ABC transport system is composed of the intake system and the efflux system.
The 30 intake families (TC 3.A.1-3.A.33) that we identified in the Streptomyces genomes are specialized in the uptake of diverse nutrient substances. This intake system includes families of Carbohydrate Uptake Transporters (TC 3.A.1.1, 3.A.1.2) that transport saccharides, Polar Amino Acid Uptake Transporters and Hydrophobic

The MFS transporters
Unlike the ABC transporters, the MFS transporters are driven by an electrochemical potential formed by ion concentration gradients across the cytoplasmic membrane [30]. There are 90-169 (10.1 %-15.0 %) MFS transporters in each of the eleven Streptomyces genomes. Streptomyces possesses 39 subfamilies of MFS transporters, including 20 intake systems, 13 efflux systems and 6 systems whose transport direction is unknown. The substances transported by the intake systems are mainly saccharides and organic acids.
One of the most important roles of the MFS transporters is drug efflux [30].

The wide distribution of substrates for Streptomyces transporters
The capacity of the complex and powerful transporter system in Streptomyces is evidenced by the broad scope of the substrates being transported. Figure 3a shows the distribution of transporters that transport different types of substrates in Streptomyces, including carbon sources, drugs, toxicants, electrons, inorganic molecules, macromolecules, amino acids and derivatives, nucleotides and derivatives, vitamins, and accessory factors. The carbon source transporters are the most abundant, accounting for 21.7 to 31.6 % of all the transport proteins in the eleven genomes. Notably, the substrates of an average of 6.4 % of the transporters in the Streptomyces genomes examined cannot be determined based on genomic analysis, and await advanced structural and biochemical characterization.
Streptomyces transporters can be divided into three classes, uptake, efflux and bidirectional, according to the direction of the substrates transported (Fig. 3b).

Streptomyces have lineage-specific protein secretion systems
Streptomyces have two major lineage-specific protein transport systems, the Tat system (TC 2.A.64) and the Sec system (TC 3.A.5) [8,9]. The Tat system has been shown to contribute to the pathogenicity of bacterial pathogens [33]. In S. scabiei, the transporters in the Tat pathway secrete several toxicity-associated proteins [34]. While the key component proteins of the Tat system, TatA, TatB and TatC, are present in all eleven Streptomyces genomes examined, lineage-specificity is clearly shown by the copy number variation of these genes (Table 4). Only one copy each of the tatB and tatC genes is present in nine Streptomyces genomes; S. flavogriseus has two copies of the tatB gene and S. hygroscopicus has two copies of the tatC gene. The copy number of the tatA gene ranges from one to three in the eleven genomes (Table 4). Phylogenetic analysis shows that the multiple copies of the tatA gene may have different evolutionary origins and can be divided into three independent clades, namely tatA1, tatA2 and tatA3 (Fig. 4a).
The tatA paralogous genes in the majority of the Streptomyces genomes belong to different clades. Notably, all three tatA paralogous genes in S. cattleya cluster in the tatA3 clade, indicative of recent gene duplication events.
Similarly, the Sec system is species-specific. This system includes SecA, SecY, SecE, SecG, SecD, SecF, YajC, FtsY, etc. [35], all of which are highly conserved in Streptomyces (Table 5). There is only one copy of the secE, secG, secD, secF, yajC and ftsY genes in each of the eleven Streptomyces genomes. Interestingly, there is a second set of secA2/secY2 genes in several species, which may be involved in the secretion of proteins with specific functions, for example, the secretion of toxic proteins [36]. In S. avermitilis, for instance, there are two copies of the secA gene, and S. venezuelae has two copies of the secY gene.
The evolutionary pattern in the secD and the secF genes is particularly interesting (Fig. 4b). In bacteria, these genes encode accessory factors in the Sec pathway that can accelerate the translocation of protein substrates. There are two forms of the secD and secF genes: in the first form, these two genes are adjacent but separate, while in the second form, the two genes are fused into a single secDF gene. The fused secDF is present in seven Streptomyces genomes. Unlike most bacteria that have one of the two forms, the majority of Streptomyces species have both the separated form and the fused form [37]. The acquisition of a second copy may confer a selective advantage to Streptomyces by enhancing the capacity and the effectiveness of protein transport.

Conclusions
Comparative genomic analyses of eleven Streptomyces genomes revealed an abundant repertoire of 761-1258 transporters, belonging to seven transporter classes and 171 transporter families. The powerful transport systems in Streptomyces play critical roles in drug efflux, protein secretion and stress response. A better understanding of transport systems will allow enhanced optimization of production processes for both pharmaceutical and industrial applications of Streptomyces.

Data
The complete whole-genome data of the eleven Streptomyces species (Table 1), including amino acid sequences and functional annotations of all the proteins, were downloaded from the NCBI database (http://www.ncbi.nlm.nih.gov/genome/browse/). The transporter classification and amino acid sequences of all classified transporters were downloaded from the TCDB database (http://www.tcdb.org/) [13]. We also collected data from the TransporterDB database [38] (http://www.membranetransport.org/), which included the transporter classification data of S. coelicolor and S. avermitilis, and from the Transporter Inference Parser database [39] (http://biocyc.org/), which identifies transporters according to their functional annotation and included the relevant data of S. coelicolor, S. avermitilis, S. griseus and S. scabiei.

Identification and classification of transporters
A BLASTP search of all the proteins in the eleven Streptomyces species against all the transport proteins in the TCDB was conducted to identify Streptomyces transporters that are homologous to known and predicted transporters in the TCDB [13,25]. The threshold for homologous genes was set as follows: E-value ≤ 10^-5, similarity ≥ 50 %, and the
On the basis of the degree of similarity with known or predicted transporters in the TCDB, as well as the conserved domains and the number and location of TMSs, we further classified the Streptomyces transporters into families and subfamilies of homologous transporters according to the TC system [13]. The TC number generally has five components, V.W.X.Y.Z, representing the transporter class, subclass, family, subfamily and the substrate or range of substrates transported [11,12]. Most Streptomyces transporters were classified at the transporter family level. The transporters in superfamilies such as ABC and MFS were classified at the subfamily level.
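The two homology thresholds that are legible above (E-value and similarity) can be sketched in Python as a simple filter over search hits. This is only an illustration under stated assumptions: the input is assumed to be a tab-separated table with four columns (query ID, TCDB subject ID, E-value, percent similarity), the best hit per query is taken to be the one with the lowest E-value, and any additional criterion used in the study is not implemented. The function name and the identifiers in the demo rows are hypothetical.

```python
import csv
import io

E_VALUE_MAX = 1e-5      # E-value threshold stated in the text
SIMILARITY_MIN = 50.0   # percent similarity threshold stated in the text

def best_hits(handle):
    """Keep, per query, the lowest-E-value hit that passes both thresholds.

    Assumed input (not specified by the source): tab-separated rows of
    query_id, tcdb_subject_id, e_value, percent_similarity.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, similarity = float(row[2]), float(row[3])
        if evalue > E_VALUE_MAX or similarity < SIMILARITY_MIN:
            continue  # hit fails at least one threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue, similarity)
    return best

# Made-up rows for illustration; all identifiers are hypothetical.
demo = io.StringIO(
    "protA\ttcdb_entry_1\t1e-40\t72.5\n"
    "protA\ttcdb_entry_2\t1e-12\t55.0\n"
    "protB\ttcdb_entry_3\t1e-3\t80.0\n"   # fails the E-value threshold
    "protC\ttcdb_entry_4\t1e-20\t41.0\n"  # fails the similarity threshold
)
hits = best_hits(demo)
print(sorted(hits))
```

Each retained query would then inherit the TC number of its best TCDB hit as a starting point for the family and subfamily assignment described above.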
The substrate and transport direction of each Streptomyces transporter was predicted based on homology to functionally characterized transporters in the TCDB. Classification of a putative transporter into a family or subfamily according to the TC system allows for the prediction of substrate types and transport direction with confidence [13,17,41].

Phylogenetic analysis of transport protein families
Multiple sequence alignments were obtained using Clustal X 2.1 [42]. Phylogenetic trees were reconstructed using MEGA6 with neighbor-joining (NJ), maximum parsimony (MP) and maximum likelihood (ML) methods [43].
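Distance-based methods such as neighbor-joining start from a matrix of pairwise distances computed on the aligned sequences. The Python sketch below computes the simple p-distance (proportion of differing sites) as a stand-in; the distance model actually used in MEGA6 is not stated here, so this choice, the function names and the toy alignment are all assumptions for illustration.

```python
from itertools import combinations

def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared

def distance_matrix(alignment):
    """Pairwise p-distances for a dict mapping sequence name to aligned sequence."""
    names = sorted(alignment)
    return {(m, n): p_distance(alignment[m], alignment[n])
            for m, n in combinations(names, 2)}

# Toy alignment with made-up sequences (not data from the study).
toy_alignment = {"seq1": "MKT-LLA", "seq2": "MKTALLA", "seq3": "MRT-LIA"}
matrix = distance_matrix(toy_alignment)
for pair, d in sorted(matrix.items()):
    print(pair, round(d, 3))
```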