Horizontal gene transfer contributes to virulence and antibiotic resistance of Vibrio harveyi 345 based on complete genome sequence analysis

Background Horizontal gene transfer (HGT), which is affected by environmental pollution and climate change, promotes genetic communication, changing bacterial pathogenicity and drug resistance. However, few studies have been conducted on the effect of HGT on the high pathogenicity and drug resistance of the opportunistic pathogen Vibrio harveyi. Results V. harveyi 345 that was multidrug resistant and infected Epinephelus oanceolutus was isolated from a diseased organism in Shenzhen, Southern China, an important and contaminated aquaculture area. Analysis of the entire genome sequence predicted 5678 genes including 487 virulence genes contributing to bacterial pathogenesis and 25 antibiotic-resistance genes (ARGs) contributing to antimicrobial resistance. Five ARGs (tetm, tetb, qnrs, dfra17, and sul2) and one virulence gene (CU052_28670) on the pAQU-type plasmid p345–185, provided direct evidence for HGT. Comparative genome analysis of 31 V. harveyi strains indicated that 217 genes and 7 gene families, including a class C beta-lactamase gene, a virulence-associated protein D gene, and an OmpA family protein gene were specific to strain V. harveyi 345. These genes could contribute to HGT or be horizontally transferred from other bacteria to enhance the virulence or antibiotic resistance of 345. Mobile genetic elements in 71 genomic islands encoding virulence factors for three type III secretion proteins and 13 type VI secretion system proteins, and two incomplete prophage sequences were detected that could be HGT transfer tools. Evaluation of the complete genome of V. harveyi 345 and comparative genomics indicated genomic exchange, especially exchange of pathogenic genes and drug-resistance genes by HGT contributing to pathogenicity and drug resistance. Climate change and continued environmental deterioration are expected to accelerate the HGT of V. harveyi, increasing its pathogenicity and drug resistance. Conclusion This study provides timely information for further analysis of V. harveyi pathogenesis and antimicrobial resistance and developing pollution control measurements for coastal areas.

The wide use of antibiotics, increased environmental pollution and global climate change are leading to the enhancement of drug resistance of Vibrio spp., including V. harveyi. Several studies show that drug resistance and multidrug resistance (MDR) are occurring in aquaculture worldwide, including in shrimp ponds in Thailand, Malaysia, India, and China [16][17][18][19]; shellfish farms in Malaysia, Korea, USA, Poland and China [20][21][22][23][24]; and fish farms in Italy, Korea, and China [25][26][27]. Enhanced drug resistance leads to stronger virulence resulting in difficulty in preventing and treating V. harveyi infection [15,16]. For example, Nakayama et al. [28] found that gradually increasing antibiotic concentration and frequent subculturing enhances V. harveyi antibiotic resistance, elevating the toxicity of V. harveyi. The pathogenic and drug-resistant genes of V. harveyi are a key to the fundamental cause of pathogenicity and drug resistance. Therefore, studying the pathogenic and drug-resistance genes of V. harveyi will provide an important foundation for determining pathogenic and drug-resistance mechanisms.
Although V. harveyi a recognized pathogen of marine animals, different strains vary in their ability to cause disease [29]. The main virulence factors of V. harveyi are extracellular proteases, outer membrane proteins, hemolysins, esterases, phospholipases, exotoxin and secretion systems [29,30]. Production of antimicrobial enzymes that inactivate antibacterial drugs is an important resistant mechanism [31]. Inactivating enzymes can be expressed by genes on plasmids or chromosomes [32]. Virulence factors and antibiotic-resistance genes (ARGs) can also be vertically transferred and spread via horizontal gene transfer (HGT) through mobile genetic elements (MGEs) such as plasmids, bacteriophages, transposons, integrative and conjugative elements (ICEs), and genetic islands (GIs) [33]. Park et al. [34] found that MGEs that are prophages, GIs and pathogenicity islands carry different combinations of virulence factors that promote immune evasion and superantigens that contribute to serious Staphylococcus aureus infection. Le et al. [35] reported that 0.92% (36 of 38,895) analyzed proteins from 31 Rickettsiales genomes are associated with strong bootstrap support for HGT with function as ATPases, aldolases, transporter activities, cystathionine beta-lyases, sugar phosphate permeases, growth inhibitors and antitoxin activities. For horizontal transfer, exogenous DNA must evade the bacterial immune system such as restriction modification systems (RM systems) and CRISPR-Cas systems [36,37].
Serious infection and multidrug resistance lead to the emergency of the availability of whole genome sequences for different V. harveyi strains to determine pathogenic and antimicrobial-resistance mechanisms. Evidence is growing that HGT is an important driving force for prokaryotic evolution affecting pathogenicity and drug resistance [34,38]. Coastal urbanization has recently intensified, resulting in the production of a large amount of antibiotics, heavy metals, and nutrients pollutants (Additional file 1: Table S1), which may increase the pathogenicity and drug resistance of V. harveyi by affecting HGT [39,40]. V. harveyi is the dominant species that causes serious infection and mortality of farmed fish in Guangdong, Southern China [14,15]. However, little information is available on the mechanism of V. harveyi pathogenesis and drug resistance.
The predominant strain V. harveyi 345 is multidrug resistant to ampicillin, rifampicin, tetracycline, pediatric compound sulfamethoxazole tablets, vancomycin, doxycycline, trimethoprim, streptomycin, kanamycin, sulfamethoxazole, furazolidone, cefixime, and chloramphenicol. This strain was isolated from a diseased E. oanceolutus in Shenzhen, Southern China. V. harveyi 345 is suspected to lead to kidney enlargement and softening, spleen enlargement, and anal bleeding of E. oanceolutus and has a median lethal dose (LD 50 ) of 9.83 × 10 5 CFU·g − 1 [41]. Therefore, its complete genome information and HGT events are helpful for clarifying V. harveyi pathogenesis and drug resistance, controlling V. harveyi disease and reducing economic losses. In this study, we present the entire genome sequence of V. harveyi 345 with comparative genomics analysis of its pathogenesis, antimicrobial resistance and genome expansion caused by HGT.

Gene annotation
A total of 5678 genes (fewer than ZJ0603 and more than 29 other strains) (Additional file 2: Table S2) were predicted to be an average 929 bp with a mean GC content of 45.68%, accounting for 85.29% of the genome. The three highest levels of gene length distribution were 400-500 bp, 600-700 bp, and 300-400 bp and the three lowest were 0-100 bp, 1900-2000 bp, and 1800-1900 bp (Fig. 2a). In addition, 128 tRNA, 31 rRNA and 13 sRNA genes were identified with confidence. A total of 89 tandem repeats, including 44 minisatellite DNA and 35 microsatellite DNA, were predicted (Table 1 and Fig. 1).
Predicted open reading frames (ORFs) were further classified into Clusters of Orthologous Groups (COG) (4586 ORFs in total contenting 80.77%, Fig. 2b). According to COG categorization analysis, the top five groups in abundance were amino acid transport and metabolism (361 ORFs), transcription (350 ORFs), general function prediction only (343 ORFs), signal transduction mechanisms (312 ORFs) and carbohydrate transport and metabolism (312 ORFs) (Fig. 2b). In addition, 3247 genes were classified into 40 functional Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Additional file 3: Figure S1). Some genes were involved in more than one KEGG pathway and included the top five pathways for replication and repair (784 ORFs), carbohydrate metabolism (633 ORFs), signal transduction (506 ORFs), infectious diseases (463 ORFs) and membrane transport (432 ORFs). Furthermore, 3453 genes were classified into 40 functional Gene Ontology (GO) classifications (Additional file 4: Figure S2). Some genes were involved in more than one GO classifications and included the top five classifications of metabolic process (1914 ORFs), cellular process (1881 ORFs), catalytic activity (1785 ORFs), single-organism process (1526 ORFs) and binding (1494 ORFs).
Twelve RM system genes were annotated in V. harveyi strain 345 (Table 2). Ten belonged to Type I RM systems, encoding three R subunits (hsdR), two S subunits (hsdS) and five M subunits (hsdM). The other two belonged to type III RM systems, encoding a methyltransferase (mod) and a restriction enzyme (res).

Genomic islands, prophages and CRISPR-Cas systems
Forty-seven GIs were detected on ChI (Additional file 6: Table S4) and 24 on ChII in V. harveyi 345 (Additional file 7: Table S5). In the GIs of ChI, 40 transposases were encoded belonging to the IS110, IS21, IS256, IS3, IS5/IS1182, IS66, IS91, ISL3, and ISNCY families and seven belonging to two GIs. Another eight genes were identified that encoded IS66 family insertion sequence hypothetical proteins. Three integrases and one serine recombinase were encoded. Three genes were  predicted to encode type III secretion proteins and one was predicted to encode another virulence factor (CU052_RS12915). In the GIs of ChII, nine genes encoding transposases belonging to the IS3 and ISNCY families were detected with four belonging to more than one GI. We found three genes encoding integrases and 13 genes on three GIs predicted to encode type VI secretion system (T6SS) proteins. Two intact prophage sequences were identified (Additional file 8: Table S6 and no CRISPR elements were predicted. Phylogenetic analysis, pan-core genes, dispensable and strain-specific genes Phylogenetic trees showed that V. harveyi 345 closely related to other V. harveyi strains, especially V. harveyi VHJR7 and V. harveyi CAIM463 (Fig. 3).
The pan-gene refers to the total number of genes in all 31 strains, while core-gene represents orthologous genes shared among them. Dispensable genes were not included in at least one of the 31 strains and strainspecific genes were included in only one strain. Using 31 genomes (Additional file 2: Table S2), 9878 pan-genes with a total size of 2,764,797 amino acid (aa) and 3338 core-genes with a total size of 1,133,465 aa were identified ( Fig. 4a-b-c). A dispensable gene heatmap showed that each strain contained strain-specific genes (Fig. 4d). A total of 217 genes were specific to V. harveyi 345 with 10 on p345-67, 94 on ChI, 17 on ChII, and 96 on p345-185 (Additional file 9: Table S7, Fig. 5a). Twenty-nine genes were annotated by Non-Redundant (NR) database and the others were hypothetical protein-encoding genes (Additional file 9: Table S7, Fig. 5b). Four annotated genes encoded MGEs, including two type IV conjugative transfer proteins (CU052_28715 and CU052_28750) and two recombinases (CU052_29120 and CU052_00050) on p345-185 or p345-67. Four (CU052_13455, CU052_ 13535, CU052_13525, and CU052_13445) were phagerelated genes on ChI. Four genes (CU052_14535, CU052_14240, CU052_23640, and CU052_13020) on Chr I encoded membrane proteins. One beta-lactamase class C-encoding gene (CU052_00920) was on Chr I. Two DNA methylase-encoding genes (CU052_28385 and CU052_28885) belonging to the bacterial immune system were on p345-185. One virulence-associated protein D (CU052_13320) was on Chr I (Additional file 9: Table S7).

Strain-specific gene family
A strain-specific gene family was present in one of the 31 strains. Seven gene families were strain specific to V. harveyi 345 with > 50% identity to Pseudovibrio spp., Photobacterium spp., Escherichia spp., and other Vibrio spp. (V. parahaemolyticus) and were involved in recombination, peptide synthesis, conjugation and integration (Table 5).
HGT candidates on chromosomes CU052_00920 in V. harveyi 345 encoded a betalactamase class C consisting of 486 amino acids. It shared no similarity with any protein of V. harveyi, Vibrio spp. or even γ-proteobacteria presently known. Instead, it shared more than 40% identity with the protein in four Pseudovibrio spp. (58.76-59.21% identity) suggesting recent acquisition of this gene from a Pseudovibrio strain ( Fig. 6a and d). Two proteins were annotated, one as a serine hydrolase and the other as a class A beta-lactamase-related serine hydrolase.
The CU052_14535 gene, encoding a 167 amino acid membrane protein in V. harveyi 345, was identified as an OmpA family protein in many other homologs. CU052_14535 showed more than 40% identity with many strains of Photobacterium spp., Salinivibrio spp., Aeromonas spp., Ferrimonas spp., Grimontia spp., Aliivibrio spp., and other Vibrio species, but not with V. harveyi (data not shown). The top 15 similar strains were selected for muti-alignment and phylogenetic analysis      Fig. 6c and f). CU052_14535 was close to an outer membrane beta-barrel protein in V. splendidus (97.04% identity), and an OmpA family protein in V. crassostreae (97.04% identity), suggesting recent acquisition of this gene from V. splendidus or V. crassostreae ( Fig. 6c and f).

Discussion
V. harveyi 345 was isolated and studied because of its multidrug resistance and serious virulence to E. oanceolutus in Shenzhen, Southern China. The complete genome  transfer is the exchange of genetic material within species without any sexual mechanism [43]. This phenomenon is widely documented in bacteria and has a role in bacterial evolution and adaptation [43]. Comparative genomics of 31 V. harveyi strains was conducted to analyze HGT events by identifying strain-specific genes and strain-gene families. Considering the average gene number of 5087 for the 31 V. harveyi strains, the 3338 core genes represented approximately 66% of the total genome, meaning that approximately two-thirds of the genome was conserved among all strains. However, flower plots and dispensable heatmaps showed that each stain contained strain-specific genes, probably obtained by HGT. A total of 217 genes were specific to V. harveyi 345, which was more than other strains except for ZJ0603. This result suggested that a large number of HGT events happened in 345. In addition, seven gene families were specific to V. harveyi 345. Most (80.65%, 175/217) of the strain-specific genes had > 40% identify to other Vibrio species. The remaining came from other genera such as Shewanella, Photobacterium, Pseudovibrio, and Escherichia. We focused on the characterization of three strain-specific genes, CU052_00920, CU052_13320, and CU052_14535. CU052_00920 encoded a class C betalactamase and was transferred from Pseudovibrio spp. Betalactamase enzymes are reported to inactivate beta-lactam antibiotics by hydrolyzing the peptide bond of the characteristic four-membered beta-lactam ring, rendering the antibiotic ineffective [44]. This result suggested that CU052_00920 contributed to antibiotic resistance, especially ampicillin resistance of V. harveyi 345. CU052_13320 encoded a virulence-associated protein D and closes to a virulence factor in Pseudomonas aeruginosa (73.40% identity) and Aeromonas sobria (75.53% identity). CU052_ 13320 probably contributed to the virulence of 345. CU052_14535 encoded an OmpA family protein and was acquired from V. splendidus or V. crassostreae. Among pathogenic bacteria, OmpA proteins are important for pathogenesis including bacterial adhesion, invasion, or intracellular survival and evasion of host defenses or stimulation of proinflammatory cytokine production [45]. This result suggested that CU052_14535 regulates the virulence of V. harveyi 345. DNA can also be horizontally transmitted between bacteria through plasmids, phages or uptake of naked DNA from environment [46]. Homology analysis showed that p345-185 had 99.97% homology (query cover = 99%) to pVPS62 in V. parahaemolyticus. PVPS62 is reported to be a pAQU-type plasmid and emerge MDR conjugative plasmid among important pathogens [47]. Five antimicrobial-resistance genes (tetm, tetb, qnrs, dfra17, and sul2) involved in resistance to tetracycline, fluoroquinolone, trimethoprim, and sulfonamide, and one virulence gene (CU052_28670) involved in immune evasion were located on plasmid p345-185. These genes could be horizontally transferred, promoting drug resistance and virulence. Especially, CU052_29140 (sul2) showing 100% amino acid identity to sul2 in Acinetobacter baumannii, Escherichia coli, Actinobacillus pleuropneumoniae, Histophilus somnim, and Salmonella enteric, should contribute to the resistance of pediatric compound sulfamethoxazole tablets and sulfamethoxazole. Similarly, Klein et al. [48] found that chloramphenicol, oxytetracycline and chlortetracycline could be successful transferred by R-plasmids. Except for plasmids, HGT encompasses a variety of genetic units, collectively known as MGEs [49] and including phages, GIs, and integrating conjugative elements (ICEs). A total of 47 GIs, with encoding many transposases, integrases, and recombinases, and two incomplete prophage sequences were identified in V. harveyi 345. These genetic units probably acted as HGT delivery tools, contributing to pathogenesis, drug resistance and environmental adaptation of V. harveyi 345. Several type III/VI secretion proteins along with IS family transposases were predicted on the same GI that could be easily transmitted. Dissemination of these genes could further compromise Vibrio infections, limiting treatment options. These results further indicated that HGT contributed to virulence and antibiotic resistance of V. harveyi 345.
The 345 strain was isolated from a diseased E. oanceolutus in Shenzhen, Southern China. Recent increasing  aquaculture density has led to frequent breeding diseases and increased use of antibiotics in Southern China, resulting in serious antibiotic residues and pollution [50]. Intensification of coastal urbanization has led to massive discharge of pollutants (heavy metals, nutrients, biocides), and intensification of human activities has caused environmental climate change, especially global warming [51]. Many studies show that the wide use of antibiotics; pollution by heavy metals, nutrients and biocides; and global climate change regulate HGT by affecting plasmid replication, changing phage activity, adjusting HGT-related enzyme activity, and damaging immune systems. These factors regulate bacterial resistance and pathogenicity [52][53][54]. For example, Beaber et al. [53] found that ciprofloxacin induces transfer of SXT, an ICE derived from V. cholera encoding genes that confer resistance to chloramphenicol, sulphamethoxazole, trimethoprim and streptomycin. The mechanism is increasing expression of SXT activators, enhancing drug resistance. Similarly, high environmental temperatures in combination with UV irradiation accelerate the spread of stx (Shiga toxin) genes by enhancing Stx prophage induction and Stx phagemediated gene transfer [52]. Therefore, environmental pollution and climate change may increase the communication of virulence genes and drug resistance genes by affecting HGT. The result will enhance the toxicity and drug resistance of V. harveyi and affect the ecological safety of aquaculture ecosystems.

Conclusions
We present the first complete genome of the serious disease-causing V. harveyi 345 which has a multidrugresistant phenotype. We did comparative genomics between this strain and 30 other V. harveyi strains. The genome was determined to study V. harveyi 345 virulence and antimicrobial resistance. Multiple virulence factors and resistance genes were predicted in this strain, consistent with results of Jiang et al. [38]. We searched for evidence of HGT and evaluated genomic traits relating to HGT. The high quality complete genome sequence generated in this study will aid further studies for a deeper understanding of the mechanisms of Vibrio pathogenesis and antibiotic resistance. These studies could improve seafood quality and reduce economic loss. We recommend that more research be done on the response mechanism of HGT relative to environmental change and antibiotic use. This research will be an important scientific basis for predicting outbreaks and controlling V. harveyi disease. As environmental pollution including use of antibiotics and climate change enhance the virulence and drug resistance of pathogens, research should explore management approaches, for example by bacteriophages, to manage the occurrence of V. haveyi in the environment [55].

DNA extraction
V. harveyi 345 was grown overnight in 2216E medium (BD, USA) at 28°C with vigorous shaking. Overnight cultures were inoculated with a 1:1000 dilution into 20 ml fresh 2216E medium and grown until OD600 = 0.5. Cells were collected by centrifugation and genomic DNA extracted using the CTAB method [56]. The quantity and quality of genomic DNA were evaluated using a Qubit Fluorometer (Invitrogen, Carlsbad, CA, USA) and 1% agarose gel electrophoresis. DNA was stored at − 80°C and prepared for genome sequencing.

General annotation
ORFs were predicted by Glimmer v.3.02 [57], and genome annotation was completed using the NCBI prokaryotic genome automatic annotation pipeline with identity ≥40% and E_value ≤10 − 5 (Blast and annotation for all study sets used the same identities and E_values). COG, KEGG, GO and NR databases were used to search domain architecture. The tRNA genes were identified by tRNAscan-SE v.1.23 [58] and rRNA genes with RNAmmer v.1.2 [59]. The sRNAs were predicted by blasting nucleic acid against Rfam database [60]. Tandem repeats were predicted by Tandem Repeat Finder v.4.04 [61].

Comparative genomics
Comparative genome analysis was conducted among 31 V. harveyi strains with V. harveyi VH2 as the reference strain and an average gene number of 5087 (Additional file 2: Table S2) to assess V. harveyi evolution. A phylogenetic tree was constructed based on the single nucleotide polymorphism (SNP) matrix of the 31 strains with the maximum-likelihood method, bootstrapped 1000 times via Treebest-1.9.2 software [67].
Extraction of pan-, core-, dispensable and strain-specific genes Core-pan genes were analyzed using a BLAST method [68]: Genes were taken from reference genome V. harveyi VH2 as a gene pool. Genes predicted in one strain among other 30 strains (Query samples) were BLAST with the gene pool, and results filtered by length and identity (> 40%). BLAST coverage ratios (BCR, reference BCR = match/reference length × 100%, query BCR = match/query length × 100%, and match stands for the length used for BLAST) of genes from the gene pool and Query samples were calculated separately. If BCR values from the reference and Query sample were smaller than the setting value (40%), the reference gene did not have homology with the Query gene. The nonhomologous gene from the Query genome was added to the gene pool. Query samples were processed and the final gene pool used as the pan gene pool. The non-homolog genes from the reference genome were the strain specific genes against the Query sample. Nonhomologous genes were BLAST with the gene pool of another Query sample, repeated for all samples, and the final nonhomologous gene pool was used as the strain-specific gene pool. After aligning to genes from samples, BCR values of genes from the pan gene pool were calculated for each sample. The coverage array was generated for this pool. If the BCR value of a gene in each sample was larger than the setting value, the gene was a core gene. If the gene was predicted from an assembled result, BLAST results were filtered and the sequence was removed if the number N (N means uncertain nucleotides in gene) was larger than the setting (30%) for the gene. Dispensable genes were included in the pan-gene pool but not the core-gene pool.

Construction of gene families
Gene families were constructed using genes from the reference strain and the target bacterium. Protein sequences were aligned with basic local alignment search tool of protein (Blastp) and redundancy was eliminated. Gene family TreeFam clustering was carried out with alignment results and Hcluster_sg software. Alignment results for proteins were converted into multiple sequence amino acids in the CDS area after multiple sequence alignments with the clustered gene family using Muscle software [69]. Gene family tree construction analysis was carried out for multiple sequence alignment results using Muscle and neighborhood-joining (NJ) method with Treebest software [67].

Characterization of putative HGT
Of the strain-specific genes, we extracted CU052_00920, encoding a beta-lactamase class C; CU052_13320, encoding a virulence-associated protein D; and CU052_ 14535 encoding an OmpA family protein for further analysis on contribution to the antibiotic resistance or virulence of V. harveyi 345. The genes were compared to the NR protein database of NCBI using the Blastp tool. The results were filtered manually to find significant hits to proteins belonging to species other than V. harveyi. Multiple sequence alignment was conducted using ClustalW [70]. An alignment phylogenetic tree was constructed from aligned sequences using the Kimura 2-parameter model [71] with the neighborjoining method, bootstrapped 1000 times, with MEGA6.0 software [72].
In addition, the entire sequence of p345-185 was compared to the NR nucleotide database of NCBI using a Blastn tool. A phylogenetic tree was constructed with the neighbor-joining method using max seq difference = 0.75 and BLAST pairwise alignments. The characterization method was the same for CU052_29140 (sul2) on p345-185 as for CU052_00920, CU052_13320, and CU052_ 14535.