- Research article
- Open Access
Whole genome evaluation of horizontal transfers in the pathogenic fungus Aspergillus fumigatus
BMC Genomics volume 11, Article number: 171 (2010)
Numerous cases of horizontal transfers (HTs) have been described for eukaryote genomes, but in contrast to prokaryote genomes, no whole genome evaluation of HTs has been carried out. This is mainly due to a lack of parametric methods specially designed to take the intrinsic heterogeneity of eukaryote genomes into account. We applied a simple and tested method based on local variations of genomic signatures to analyze the genome of the pathogenic fungus Aspergillus fumigatus.
We detected 189 atypical regions containing 214 genes, accounting for about 1 Mb of DNA sequences. However, the fraction of atypical DNA detected was smaller than the average amount detected in the same conditions in prokaryote genomes (3.1% vs 5.6%). About one third of these regions contained no annotated genes, a proportion far greater than in prokaryote genomes. When analyzing the origin of these HTs by comparing their signatures to a homemade database of species signatures, 3 groups of donor species emerged: bacteria (40%), fungi (25%), and viruses (22%). Notably, although inter-domain exchanges are confirmed, we found evidence of very few exchanges between eukaryotic kingdoms.
In conclusion, we demonstrated that HTs are not negligible in eukaryote genomes, though less extensive than in prokaryote genomes, bearing in mind that under our stringent conditions the amount detected is a floor value. The biological mechanisms underlying those transfers remain to be elucidated, as well as the biological functions of the transferred genes.
Horizontal transfers in eukaryotes
Horizontal transfers (HTs) are a major force of evolution in prokaryotes [1–5]. The average amount of DNA transferred in prokaryote genomes varies from 0 to 17% according to different studies [4, 6–8]. The transferred genes remaining in the genome either increase fitness or allow the colonization of new environments [2, 3, 9–11]. However, the extent of HT in eukaryotes is less well known, although HTs were proposed to play a role as important as in prokaryotes [12–19]. In fact, most of the documented cases concern insertions of viruses (especially retroviruses) into eukaryote genomes [20–23] and exchanges between symbiont, parasite [18, 24] or organelle genomes [25, 26] and their host genome. Finally, as conjugation between distant species by meiosis is unlikely, gene introgression following hybridization between closely related species was proposed as a possible route of transfer between eukaryotes.
For the former examples, the biological mechanisms are understood, demonstrated or are hypotheses with strong support. However, the mechanisms involved in DNA exchanges between distant species are mostly unknown, whether between eukaryotes or in the numerous reported HTs between prokaryotes and eukaryotes. Among the mechanisms acting in prokaryotes, transformation by free DNA is possible for eukaryotes but is less efficient than it is for prokaryotes. Transduction was hypothesized, but its likelihood varies between species families from possible to unlikely, owing to a lack of vectors. Alternative mechanisms were also suggested, such as phagocytosis or bacterial type IV secretion systems that could promote the transfer of DNA from prokaryotes to eukaryotes [13, 30]. Thus, while the results of HT are observed, the underlying mechanisms are yet to be discovered.
Choice of a HT detecting method for A. fumigatus
The HT detection methods generally used in eukaryotes are based on gene homology. The method of determination depends on the number of homologs of the target and on their phylogenetic distribution. When Blast detects only a few homologs for a gene of interest, an alignment analysis showing more homology with genes/proteins of distant species than with those of closer ones indicates a horizontal transfer event for this gene. A typical example of such detection is finding a close prokaryotic homolog of a eukaryotic gene [31–34]. A more reliable method can be used when homologs are numerous and broadly distributed in the evolutionary tree. In this latter case, a phylogenetic analysis is performed and incongruence in the phylogenetic tree leads to a similar conclusion [35, 36]. In each of these cases, the study concerns only a particular gene or a small group of genes [37–42]. Indeed, due to the restrictions described above, a fair number of genes cannot be analyzed this way: ORFans of course, but also genes with only one or a small number of homologs, which lead to an inconclusive situation. Moreover, due to the patchiness of eukaryote sequences in Genbank, it is difficult to assess horizontal transfer between eukaryote species, while it is easier to assess transfers between prokaryote and eukaryote species. However, many newly sequenced genomes were analyzed for horizontally transferred genes (HGT) and in some cases a large number of HGTs were detected by phylogenetic analyses (for instance 587 genes - 5.6% of all genes - were found to be of bacterial origin in the diatom P. tricornutum). Therefore, while a high number of genes could be found to be of alien origin, these studies, as discussed above, cannot be qualified as whole genome studies.
In order to analyze whole genomes and to cope with the difficulties discussed previously, the so-called parametric methods were designed. They are based either on the whole set of genes of a species or on variations of the composition characteristics of the genomic sequence itself. Methods using gene information are based on differences in codon usage between highly expressed, lowly expressed and alien genes [45–48]. However, none of the methods based on codon usage can be applied to eukaryote genomes as gene regulation is different from prokaryotes and no tool has been designed to cope with this fact [47, 48].
The other methods are based on the variations of base composition detected by different order Markov models along a genome: the so-called genomic signature [49, 50]. This genomic signature was demonstrated to be species-specific and quite similar all along the genome [50–54]. This species-specificity was used to detect horizontally transferred DNA by analyzing a genome and searching for regions exhibiting a different signature than the majority of the genome [4, 6–8, 55–65]. These methods use only the information contained in the analyzed genome and when applied to the whole genome sequence they allow the detection of atypical regions containing no annotated genes [6, 61].
Phylogenetic and parametric methods, while detecting common genes, diverge in certain cases. It was proposed that these two types of methods address different types of HGTs [66, 67], and that combining signature-based and gene-based methods increases either the specificity or the sensitivity of HT detection [33, 58, 68].
In general, when compared to prokaryotic genomes, eukaryote genomes are larger and more complex due to the presence of non-coding sequences, low complexity regions, isochores and fragmented genes. Therefore, most of the parametric methods used for prokaryotes are either inefficient or not suitable for eukaryotic genomes. Likewise, methods based on variations of the G+C composition work poorly due to the intrinsic variations of base composition in eukaryote genomes. For these reasons, no genome-wide study of horizontal transfers in a eukaryotic genome using parametric methods has been published. However, some eukaryotic genomes present characteristics close to prokaryotic ones, which makes it possible to attempt the use of parametric methods on them. For instance, it has been shown that variation of short oligonucleotide usage is moderate in some fungal genomes and that parametric methods based on this type of criterion could be applied to them [50, 70, 71]. Moreover, HTs seem to play an important role in the evolution of fungi [29, 72–75]. Therefore, we chose to analyze the extent of horizontal transfers in the genome of Aspergillus fumigatus [76–78]. A. fumigatus is a pathogenic fungus causing a wide range of diseases including mycotoxicosis, systemic diseases and allergic reactions. The mortality rate is high in infected patients, especially in immuno-compromised ones. Here we propose to use a simple and tested method based on short oligonucleotide usage to evaluate the amount of HTs in the genome of Aspergillus fumigatus.
We found that HTs in fungi are not negligible, accounting for 1 Mb, representing about 3% of the genome and that donor species belong mainly to 3 classes, bacteria, fungi and viruses.
The Aspergillus fumigatus Af293 genome (Genbank NC_007194 - NC_007201) has a size of 29.4 Mb and is composed of 8 chromosomes. Its base composition is balanced: G+C% = 49.8%.
HT detection method
We used a method based on the variations of tetranucleotide frequencies along a sequence. The method was already described and tested on prokaryotic genomes, and its principles are recalled hereafter. The specificities of eukaryotic genomes required a pretreatment: in a first step, we removed from the genome all the centromeric and telomeric low complexity regions, which exhibited an atypical signature and did not correspond to transferred DNA. The genome was subsequently analyzed with a 5 kb sliding window and a step of 500 bp. The signature of each window was calculated and the Euclidean distance of every window signature to the whole genome signature was assessed. Then, the window signatures were clustered by a k-means algorithm, and a partition was performed based on the distance distributions per class and the average distances of the classes to the genome. In a previous work with prokaryotic genomes, fewer than 8 classes (average 4) were required to take into account the intrinsic genome variation and the atypical signatures. Due to an increased intrinsic variation of base composition in eukaryote genomes, the number of classes was raised to 20 for the A. fumigatus genome. The partition separated the k-means classes into one group exhibiting rather homogeneous signatures whose distance to the whole genome signature was small (90% of the windows) and one group of heterogeneous classes with a large distance to the genome signature (10% of windows). Thus, we considered that the first group of classes represents the host genome and calculated the average signature of this host genome. The Euclidean distances of all the window signatures were recalculated with regard to this new host signature. Afterwards, taking into account only the windows of this putative host genome, we established a threshold equal to the 99th percentile of the Euclidean distance to the host genome.
All the windows whose signature exhibited a distance above this threshold were considered atypical and potentially corresponding to foreign DNA. We chose a high threshold in order to favor specificity rather than sensitivity.
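The detection steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the signature is taken here as the relative frequency of the 256 overlapping tetranucleotides on one strand, the k-means partition into 20 classes is replaced by a much cruder stand-in (the 90% of windows closest to the whole-sequence signature are treated as the host group), and all function names are hypothetical.

```python
from itertools import product
from math import ceil, sqrt

KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def signature(seq):
    """Relative frequencies of the 256 overlapping tetranucleotides (assumed definition)."""
    counts = [0] * len(KMERS)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # words containing N or other symbols are skipped
            counts[idx] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(KMERS)


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def percentile(values, fraction):
    """Nearest-rank percentile, with fraction given in [0, 1]."""
    ordered = sorted(values)
    rank = max(1, ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def detect_atypical(seq, size=5000, step=500, host_fraction=0.90, q=0.99):
    """Return the start positions of windows flagged as atypical."""
    starts = list(range(0, len(seq) - size + 1, step))
    sigs = [signature(seq[s:s + size]) for s in starts]

    # Pass 1: distance of every window to the whole-sequence signature.
    genome_sig = signature(seq)
    d_genome = [euclidean(sig, genome_sig) for sig in sigs]

    # Stand-in for the k-means partition: the closest windows form the host group.
    cutoff = percentile(d_genome, host_fraction)
    is_host = [d <= cutoff for d in d_genome]
    host_sigs = [sig for sig, h in zip(sigs, is_host) if h]
    host_sig = [sum(col) / len(host_sigs) for col in zip(*host_sigs)]

    # Pass 2: distances to the host signature; threshold set on host windows only.
    d_host = [euclidean(sig, host_sig) for sig in sigs]
    threshold = percentile([d for d, h in zip(d_host, is_host) if h], q)
    return [s for s, d in zip(starts, d_host) if d > threshold]
```

Raising `q` makes the test more specific and less sensitive, which corresponds to the trade-off chosen in the text; the default window and step values mirror the 5 kb and 500 bp quoted above.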
Atypical region analysis
All genes included in the atypical regions were analyzed: we investigated their functions and compared them to Genbank by BlastP (E-value ≤ 10^-10 and coverage ≥ 80%) in order to identify the closest homologous sequences, if any. For atypical regions containing no annotated coding sequences, a BlastX analysis (E-value ≤ 10^-1) was done in order to identify remnants of coding sequences, and a BlastN analysis (E-value ≤ 10^-1) to find homology at the DNA level.
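As an illustration of the BlastP selection criteria, the following sketch filters a list of hits by E-value and query coverage. It assumes the thresholds read as 10^-10 and 80%, and that coverage is measured as aligned length over query length; the `Hit` record and its field names are hypothetical and do not correspond to any particular Blast output format.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    subject: str      # identifier of the matched sequence (hypothetical field)
    evalue: float     # Blast E-value of the alignment
    align_len: int    # number of query residues covered by the alignment
    query_len: int    # full length of the query protein


def passes(hit, max_evalue=1e-10, min_coverage=0.80):
    """True when the hit satisfies both the E-value and the coverage criteria."""
    coverage = hit.align_len / hit.query_len
    return hit.evalue <= max_evalue and coverage >= min_coverage
```

For example, a hit with an E-value of 1e-50 covering 450 of 500 residues is retained, whereas a hit covering only 300 of 500 residues is discarded whatever its E-value.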
Protein sequences from the Blast analysis were aligned by ClustalW. The trees (neighbor joining algorithm) were bootstrapped (1000 trees) and the consensus trees calculated with the PHYLIP package. Species trees were inferred by retaining only one homolog per species (the best strain or the best homolog; the less significant paralogs were discarded).
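The reduction to one homolog per species can be sketched as follows. This is a minimal illustration that assumes "best" means the lowest E-value, which the text does not state explicitly; the tuple layout and the function name are hypothetical.

```python
def best_homolog_per_species(hits):
    """Keep, for each species, the protein with the lowest E-value.

    hits: iterable of (species, protein_id, evalue) tuples.
    """
    best = {}
    for species, protein_id, evalue in hits:
        if species not in best or evalue < best[species][1]:
            best[species] = (protein_id, evalue)
    return {species: protein_id for species, (protein_id, _) in best.items()}
```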
We have derived and updated Genstyle, a database of species signatures. Our database contains about 65,000 signatures of species strains, organelles, viruses and plasmids. It was composed as follows: for each entry, all non-redundant sequences of at least 1 kb were gathered from Genbank, then concatenated for signature calculation. We calculated the signature of each atypical region and searched the database for the closest signatures in terms of Euclidean distance. As genomic signatures are species-specific [6, 50, 52–54, 83–85], the species with the closest signatures could be considered as potential donors of the atypical regions only if the distances obtained were below the average threshold used for HT detection (241 AU).
In a first step, we checked that, as already shown for other eukaryotes, all chromosomes of A. fumigatus presented a similar signature and intrinsic variability. The concatenated sequence of the 8 chromosomes was then used to establish the threshold. The study of the signature variations along the genome distinguished 189 distinct atypical regions (Figure 1, Additional file 1). They represented 3.1% of the total genome (908 kb, Table 1). The average size of the atypical regions was 4.5 kb, ranging from 500 bp to 52.5 kb. In general, the atypical regions were spread along all chromosomes, indicating no chromosome preference for foreign DNA insertions (Figure 1, Table 2).
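The donor search amounts to a nearest-neighbor lookup with a distance cutoff, which can be sketched as below. The dictionary-based database, the function name and the toy two-dimensional signatures in the usage note are hypothetical; the scale of the cutoff (241 AU in the text) depends on how signatures are normalized, which is not specified here, so the cutoff is left as a parameter.

```python
from math import dist


def closest_donors(region_sig, database, max_distance, top=3):
    """Return up to `top` (name, distance) pairs whose signature lies within max_distance.

    database: mapping from entry name to signature vector (same length as region_sig).
    """
    scored = sorted((dist(region_sig, sig), name) for name, sig in database.items())
    return [(name, d) for d, name in scored[:top] if d <= max_distance]
```

With toy signatures `{"bacterium_X": [0.0, 1.0], "fungus_Y": [3.0, 4.0], "virus_Z": [10.0, 10.0]}` and a query of `[0.0, 0.0]`, a cutoff of 6.0 keeps the first two entries and rejects the third; a region with no entry under the cutoff gets no proposed donor.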
HT distribution on chromosomes
Though all chromosomes contained atypical regions, some seemed to exhibit a particular distribution, such as a sub-telomeric trend on chromosome 4 or an under-representation on the short arm of chromosome 2. We also noted that in some cases atypical regions were physically clustered, as can be seen at position 2.3 Mb of chromosome 6 (c6r14-c6r23, representing 53 kb of atypical sequences out of 107 kb of genomic DNA) (Figure 1, Table 2).
Content of atypical regions
The 189 atypical regions detected can be divided into two groups: those containing annotated genes (134) and those with no coding features (55). A total of 214 annotated genes are encoded in the atypical regions. We checked by BlastP whether new homologs had been sequenced since the genome analysis [76, 78]. Most of these genes exhibited homologous counterparts (Additional files 1 and 2), with the exception of ORFans. The ORFans can be divided into 2 classes: 16 genes from A. fumigatus have no homologs at all in GenBank and 5 have a homolog only in N. fischeri, a very close neighbor of A. fumigatus.
The functions of 81 transferred genes are unknown. Considering the other 133 genes, a function is inferred for 39 of them and a putative one for the 94 others (Additional file 3). The majority of them (91; 68%) belong to central and intermediate metabolism. We detected few genes involved in virulence [78, 86] among the horizontally transferred genes, although this means of virulence spreading was already demonstrated for pathogenic fungi [16, 72, 87, 88]. We detected a few genes proposed to play a role in pathogenicity: 1 lipase, 4 peptide transporters, 5 genes of gliotoxin synthesis involved in virulence [90, 91] and two genes coding for allergenic proteins. We also observed a high number of mobile elements in the atypical regions. Alongside the 214 genes, we found 129 transposons belonging to 5 families: Copia, Gypsy, hAT, Line and DDE1. In some cases, these transposons are clustered in a single region (Additional file 1, see c2p24, c4p18 or c6p2 for instance). We checked the signatures of mobile elements and found that they exhibited a signature close to that of the host genome, and so were not the cause of the detection of the region but more likely markers of the transfer events.
Fifty-five atypical regions lacked annotated genes. Despite this, BlastX and BlastN analyses suggested the presence of gene relics in 24 (47%) of these regions (Table 3). Besides some rRNA genes (regions c4r5 and c4r6), supposedly not transferred but detected by the method, and transposons, we found pseudogenes of nuclear or mitochondrial origin and plasmid parts. Figure 2 shows an example of such a region containing both transposons and a pseudogene. The large number of transposons contained in these regions (Table 3) supports their status as horizontally transferred regions. It is interesting to note that 3 annotated genes and a pseudogene are of mitochondrial origin, indicating HTs between mitochondrial and nuclear genomes.
Putative origin of the atypical regions
It is possible from the BlastP analysis to get an indication of the donor species, except for the ORFans (Table 4, Additional file 2) or genes/proteins with few homologs. The majority of the homologs detected originated only from fungal species (56%). It is to be noted that 16 genes are specific to A. fumigatus (no homolog in other fungal species). All the other genes had homologs in at least one other Aspergillus sp. or in Neosartorya fischeri (a very close relative of A. fumigatus). This supports the view that most of the transfer events occurred before the Aspergillus speciation. For instance, out of the 120 genes exhibiting homologs mainly in fungi, 18 (15%, not taking into account the ORFans) had homologs only in Aspergillus sp. or in N. fischeri. However, the patchiness of the Aspergillus species represented by the different genes suggests numerous rearrangements and gene losses in these species. Another point is that some genes had homologs only in N. fischeri (5, Table 4), confirming the very close relationship between A. fumigatus and N. fischeri. From the BlastP analysis, it can be noted that 19% exhibited homologs in other domains of life; for instance, 26 genes had homologs exclusively in prokaryotes apart from fungal homologs (Table 4). We also detected 19 homologs exclusively in other eukaryotic kingdoms (Table 4). From this analysis it is possible not only to confirm the transferred status of the genes but also to propose, in particular cases, a source of these genes. The criteria for a confident result are a very high conservation (very low E-value), a coverage over 90% and an alternation of fungal species with those from other domains or kingdoms. For instance, gene AFUA_7G06140 possibly originates from Amoebozoa species, gene AFUA_1G11310 from Metazoa species, and genes AFUA_1G01660, AFUA_6G09600 and AFUA_6G09660, among others, would be of prokaryotic origin.
Other genes exhibit a more complex, perturbed evolutionary history, like genes AFUA_1G05200 and AFUA_4G14130 originating from other eukaryotic kingdoms, and some exhibit a very complex history mixing eukaryotic and prokaryotic origins, like genes AFUA_1G06810, AFUA_1G10110, AFUA_2G00720, AFUA_4G07710 or AFUA_5G10120 for instance. To confirm the transferred status and search for an origin when the homologs were all of fungal origin or when the origin was more difficult to ascertain, phylogenetic trees were inferred (examples of phylogenetic trees are shown in Figure 3 and Additional file 4). These phylogenetic protein trees exhibited large incongruences as compared to their respective SSU rRNA trees. This confirmed a perturbed evolutionary history and supported the transferred status of these genes.
It is difficult to assess the species of origin of the transferred genes from the Blast or the phylogenetic analyses due to the bias in homologous sequenced genes. Another way to propose a species of origin for an HT is to exploit the species-specificity of the genomic signature. If the horizontally transferred regions have kept the characteristics of their species of origin, then by comparing their genomic signature to a homemade database of species signatures, we can obtain indications about their origin. We compared the signatures of the 189 atypical regions to the database [53, 82], and for 117 of them plausible donor species could be assigned (samples of regions and of their closest neighbor signatures are shown in Figure 4). Due to possible misassignments caused either by the representativeness of the database or by the amelioration of the transferred sequences, only broad categories of donors are presented. Figure 5 presents the distribution of these donor species as a function of their origin. Three major groups of donors are identified: bacteria (40%), fungi (25%) and viruses (22%). Among the bacterial species two groups are over-represented: Proteobacteria and Actinobacteria. An important point is the very small number of exchanges between fungi and non-fungal eukaryotes detected either from the BlastP or the signature analyses.
Discussion and Conclusion
As parametric methods were not used until now to detect horizontal transfers in eukaryote genomes, we used a method which requires only generic hypotheses, i.e. a signature quite homogeneous for the major part of the genome and a minority of regions exhibiting different signatures, these regions supposedly containing horizontally transferred DNA sequences. A. fumigatus is a genome of choice for this type of study, being an intermediate genome in terms of coding density (50%) between high coding density prokaryotic genomes (often above 95%) or lower eukaryotes (P. tetraurelia ≈ 75%) and the very low coding density of higher eukaryote genomes (Homo sapiens ≈ 1.5%). Moreover, the intrinsic variability of the A. fumigatus genome is quite low, allowing the use of this type of parametric method (Figure 1).
The parameters used here are such that we favored specificity over sensitivity. In fact, the 99th percentile threshold used in the definition of the host genome is very strict. It was already shown that lowering the threshold level, while increasing sensitivity, decreases specificity such that the number of false positives increases [6, 94]. Besides, the use of sliding windows does not allow the detection of short isolated genes, and it is recommended to use it in combination with a gene-based method [33, 58, 68]. In our conditions, the quantity of HTs detected is probably under-estimated and could be considered, in the absence of a gold standard, as a minimum value. The Blast and phylogenetic analyses confirmed the transferred status of the annotated genes embedded in the detected regions (Table 4, Additional file 2, Figure 3 and Additional file 4). These analyses were possible only when the number of homologs was sufficient. Nevertheless, the agreement between all these methods supports the importance of horizontal transfers in A. fumigatus.
In our conditions, we were able to detect 189 regions, accounting for 3.1% of the genome, exhibiting a signature different from that of the majority of the A. fumigatus genome (Table 1). The total amount of atypical DNA is substantial (almost 1 Mb) but, with regard to the size of the genome, it is under the average percentage detected in prokaryote genomes [6–8]. For instance, using the same method and in the same conditions, Dufraigne et al. detected an average of 5.6% of atypical regions for 22 prokaryote genomes, as compared to the 3.1% detected here in A. fumigatus. We also tested a lower threshold (97.5th percentile) to evaluate its effect on the quantity of atypical sequence detected. In this latter case, the atypical sequences accounted for 4.6% of the genome, so about a 50% increase as compared to the 99th percentile threshold, but still lower than the amount detected in prokaryotic genomes. There are few direct comparison data for eukaryote genomes, as all the studies are based on Blast or phylogenetic analyses and so concern only genes. For instance, in the diatom P. tricornutum, 587 genes were considered to be of bacterial origin (about 6% of the total gene content but only about 2% of the genome sequence); this is far more than the 214 annotated genes detected here in a genome of comparable size and coding density. Gene-based methods do not take into account the whole transfer event, which could contain intergenic regions or regions lacking annotated genes (relics of HT events) that could bring information on genome evolution as well as on transfer mechanisms.
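The proportions quoted in this paragraph can be checked with a few lines of arithmetic. The figures are those given in the text (908 kb of atypical DNA, a 29.4 Mb genome, 3.1% and 4.6% at the two thresholds); the variable names are illustrative only.

```python
genome_kb = 29_400           # 29.4 Mb genome
atypical_kb = 908            # atypical DNA at the 99th percentile threshold

fraction_99 = atypical_kb / genome_kb * 100           # percentage of the genome
relative_increase = (4.6 - 3.1) / 3.1 * 100           # effect of the 97.5th percentile threshold

print(f"atypical fraction: {fraction_99:.1f}% of the genome")
print(f"increase at the lower threshold: {relative_increase:.0f}%")
```

The first value rounds to the 3.1% reported, and the second is consistent with the "about 50%" increase stated for the lower threshold.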
Different causes could account, in the present state of our knowledge, for the apparently lower amount of transfers in eukaryotes compared to prokaryotes. Either HT is biologically more difficult in eukaryotes because the mechanisms responsible differ from those in prokaryotes, which lowers its frequency, or HTs occur at the same rate in both domains but foreign DNA is eliminated faster in eukaryote genomes. It must also be taken into account that, considering gene exchange, the transferred genes must be selected and "ameliorated" to be expressed in a new eukaryotic environment. The high proportion of non-coding regions could be interpreted as an accelerated inactivation of useless genes, for instance because they originated from other domains of life and could not be expressed due to the differences in gene expression machinery. This phenomenon could account for the greater amount of detected regions lacking annotated genes, which could be in the process of elimination, as supported by the presence of pseudogenes.
The putative HTs are spread among all eight chromosomes, exhibiting no positional bias (Figure 1, Table 2). The number of HTs per chromosome is proportional to the chromosome's size (Table 2). However, it seems that the average size of the transferred regions is a bit larger in A. fumigatus than the average in 22 prokaryote species (4.5 kb vs 2.8 kb). Among the 189 atypical regions detected, six were larger than 20 kb and 35 (19%) exhibited the minimum detectable size of 500 bp.
Two detected regions (c4r5 and c4r6, Table 3, Additional file 1) are possibly false positives. Indeed, they contain rRNA, and it was already shown that rRNA exhibits a specific signature [6, 61]. One region (c4r6, 3 kb) contains almost exclusively rRNA (Table 3). The other is an ambiguous case: it is larger (8 kb) than c4r6, contains rRNA as well as two transposons, and could be a remnant of a horizontal transfer event or a composite region with an HT event close to rRNA sequences (Table 3, Additional file 1).
For many of the genes included in the atypical regions, it was not possible to assign a function: we were able to assign a function, inferred or putative, to 133 (62%) of the 214 atypical genes, and 21 of the genes are ORFans. This fraction of HGTs with a function is comparable to that in a recent publication where around 50% of the detected genes have no known function. It is to be noted that 55 of the 189 atypical regions lack annotated genes and, apart from those containing rRNA (see above), they could be considered as remnants of HTs (see Table 3 for those containing pseudogenes or transposons), as the original gene content was presumably of no use for A. fumigatus. This proportion is far greater than for prokaryotic genomes, where only a few regions with no genes were detected [6, 61]. Finally, the high number of transposons detected in atypical regions supports their horizontally transferred status.
The functions of transferred genes belonged mainly to the central and intermediate metabolism. Few genes seemed to be involved directly in pathogenicity; however, 5 genes (8 genes when using the 97.5% threshold, see above) out of 10 of the gliotoxin synthesis cluster, which is involved in virulence, are detected as transferred. This result supports the hypothesis already proposed on the foreign origin of this cluster [91, 96, 97]. It is possible to propose a history of the evolution of this gene cluster. The original cluster was transferred en bloc to an ancestor of Aspergillus sp. on chromosome 6, then a duplication occurred, giving rise to a second, reduced cluster on chromosome 3 (7 genes). This small cluster was "ameliorated" (not detected), as is often the case for duplicated genes. The original cluster has also undergone amelioration for some genes, as some of its genes cannot be detected in our conditions.
We obtained information of two different types on the origin of the transfers in A. fumigatus: one for genes only, with the BlastP and the phylogenetic analyses, and another for whole HT regions, with the signature analysis. These results are complementary and in rather good agreement if we take into account the fact that the first two analyses are based on genes and the last on detected regions (including those with no annotated gene). The only discrepancy concerns the fact that we found no homologous genes in viruses (Table 4 and Figure 5). The BlastP analysis provided two striking facts. First, few horizontally transferred genes are species-specific to A. fumigatus, as we found only 16 genes (≈ 4% of the annotated transferred genes) with no homolog in other Aspergillus species or in N. fischeri. Second, resulting from the previous statement, all the other genes exhibit homologous counterparts in other Aspergillus species or in N. fischeri, indicating that these genes were transferred in a common ancestor of Aspergillus sp. and N. fischeri before the clade formation. This is why these genes belong to the Aspergillus core genome as defined by Fedorova et al. From the Blast analysis, we detected only 26 genes whose homologous counterparts are found only in fungal and prokaryotic genomes (Additional file 2); this number is in the lower range of those reported for sequenced protist genomes by Keeling and Palmer (in "supplementary Table S1"). Complementary information is provided by the search for the origin of the transferred regions as a whole. First of all, it is the only way to propose an origin for HT regions lacking annotated genes. Of course, due to amelioration processes, the species proposed could be different from the donor species. However, it was already shown that even if we do not get the true species, we get information on the domain, the kingdom or the family as a function of the distance between the signature of the HT region and that of the proposed donors.
For this reason, we only took into account broad categories of species to analyze the signature data (Figure 5). As already shown by different studies, the origin of HT regions is diverse and encompasses all domains of life (Figure 5) [12–14, 16, 24, 29, 72]. However, 3 groups of donor species are dominant here: bacteria, fungi and viruses (Figure 3). It was proposed that transduction was unlikely for HT in fungi due to a lack of knowledge about possible vectors. Nevertheless, it appears that 22% of the donor species are viruses (Figure 5). A hypothesis to explain this fact would be that free viral DNA present in the environment or in the intracellular compartment during phagocytosis [13, 30] may be involved in transformation in the same way as in prokaryotes.
Exchanges between eukaryotic species or between prokaryotes and eukaryotes are documented (see  for a review). However, while bacteria are represented by numerous donors belonging to Proteobacteria or Actinobacteria, archaea are seldom involved in HT in A. fumigatus (about 3% of the donor species and few homologs in Blast analysis, Additional file 2 and Figure 5). It is to be noted that although we proposed donor species from other domains of life, there are very few donor species from other eukaryotic kingdoms (only 9%, Figure 5) outside of the fungal kingdom (25%) whatever the method used (Table 4 and Figure 5), and the next eukaryotic group is plants (around 5%). This suggests that inter-kingdom exchange of genetic material is more restricted than exchange from the bacterial domain. However, due to the patchiness of the database for eukaryotic sequences, this result could change in the future when more sequences become available for eukaryotic species. We also observed HT from organelle genomes, as some mitochondrial fragments are embedded in atypical regions (Table 3, Additional files 1 and 3).
This work opens a field of study for evaluating the contribution of HTs to eukaryote genomes. The genomes concerned would be those presenting a low intrinsic variation, i.e. fungi, plants, lower eukaryotes, etc., with the exception of the highly intrinsically variable genomes of warm-blooded vertebrates until appropriate methods are designed. Finally, the biological mechanisms underlying those transfers remain to be elucidated, as well as the biological role of the transferred genes.
Horizontally transferred gene
Doolittle WF: Lateral genomics. Trends Cell Biol. 1999, 9: M5-8. 10.1016/S0962-8924(99)01664-5.
Dutta C, Pan A: Horizontal gene transfer and bacterial diversity. J Biosci. 2002, 27: 27-33. 10.1007/BF02703681.
Eisen JA: Horizontal gene transfer among microbial genomes: new insights from complete genome analysis. Current Opinion in Genetics & Development. 2000, 10: 606-611.
Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer and the nature of bacterial innovation. Nature. 2000, 405 (6784): 299-304. 10.1038/35012500.
Ruiting L, Reeves PR: Gene transfer is a major factor in bacterial evolution. Mol Biol Evol. 1996, 13: 47-55.
Dufraigne C, Fertil B, Lespinats S, Giron A, Deschavanne P: Detection and characterization of horizontal transfers in prokaryotes using genomic signature. Nucleic Acids Res. 2005, 33 (1): e6-10.1093/nar/gni004.
Garcia-Vallvé S, Romeu A, Palau J: Horizontal gene transfer in bacterial and archaeal complete genomes. Genome Research. 2000, 10: 1719-1725. 10.1101/gr.130000.
Nakamura Y, Itoh T, Matsuda H, Gojobori T: Biased biological functions of horizontally transferred genes in prokaryotic genomes. Nature Genetics. 2004, 36: 760-766. 10.1038/ng1381.
Regeard C, Maillard J, Dufraigne C, Deschavanne P, Holliger C: Indications for acquisition of reductive dehalogenase genes through horizontal gene transfer by Dehalococcoides ethenogenes strain 195. Appl Environ Microbiol. 2005, 71 (6): 2955-61. 10.1128/AEM.71.6.2955-2961.2005.
Smith MW, Feng DF, Doolittle RF: Evolution by acquisition: the case for horizontal gene transfers. Trends in Biochemical Sciences (TIBS). 1992, 17: 489-493. 10.1016/0968-0004(92)90335-7.
Syvanen M: Horizontal gene transfer: evidence and possible consequences. Annu Rev Genet. 1994, 28: 237-261.
Gogarten JP: Gene transfer: gene swapping craze reaches eukaryotes. Curr Biol. 2003, 13 (2): R53-4. 10.1016/S0960-9822(02)01426-4.
Doolittle WF: You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes. Trends in Genetics. 1998, 14: 307-311. 10.1016/S0168-9525(98)01494-2.
Genereux DP, Logsdon JM: Much ado about bacteria-to-vertebrate lateral gene transfer. Trends Genet. 2003, 19 (4): 191-5. 10.1016/S0168-9525(03)00055-6.
Andersson JO: Lateral gene transfer in eukaryotes. Cell Mol Life Sci. 2005, 62 (11): 1182-97. 10.1007/s00018-005-4539-z.
Sanders IR: Rapid disease emergence through horizontal gene transfer between eukaryotes. Trends Ecol Evol. 2006, 21 (12): 656-8. 10.1016/j.tree.2006.10.006.
Watkins RF, Gray MW: The frequency of eubacterium-to-eukaryote lateral gene transfers shows significant cross-taxa variation within amoebozoa. J Mol Evol. 2006, 63 (6): 801-14. 10.1007/s00239-006-0031-0.
Moran NA: Symbiosis as an adaptive process and source of phenotypic complexity. Proc Natl Acad Sci USA. 2007, 104 (Suppl 1): 8627-33. 10.1073/pnas.0611659104.
Keeling PJ, Palmer JD: Horizontal gene transfer in eukaryotic evolution. Nat Rev Genet. 2008, 9 (8): 605-18. 10.1038/nrg2386.
Stocking C, Kozak CA: Murine endogenous retroviruses. Cell Mol Life Sci. 2008
Van Blerkom LM: Role of viruses in human evolution. Am J Phys Anthropol. 2003, Suppl 37: 14-46. 10.1002/ajpa.10384.
Hedges DJ, Batzer MA: From the margins of the genome: mobile elements shape primate evolution. Bioessays. 2005, 27 (8): 785-94. 10.1002/bies.20268.
Bannert N, Kurth R: The evolutionary dynamics of human endogenous retroviral families. Annu Rev Genomics Hum Genet. 2006, 7: 149-73. 10.1146/annurev.genom.7.080505.115700.
Dunning Hotopp JC: Widespread Lateral Gene Transfer from Intracellular Bacteria to Multicellular Eukaryotes. Science. 2007, 317: 1753-6. 10.1126/science.1142490.
Doolittle WF, Boucher Y, Nesbo CL, Douady CJ, Andersson JO, Roger AJ: How big is the iceberg of which organellar genes in nuclear genomes are but the tip?. Philos Trans R Soc Lond B Biol Sci. 2003, 358 (1429): 39-57. 10.1098/rstb.2002.1185. discussion 57-8.
Bock R, Timmis JN: Reconstructing evolution: gene transfer from plastids to the nucleus. Bioessays. 2008, 30 (6): 556-66. 10.1002/bies.20761.
Burke JM, Arnold ML: Genetics and the fitness of hybrids. Annu Rev Genet. 2001, 35: 31-52. 10.1146/annurev.genet.35.102401.085719.
Vlassov VV, Laktionov PP, Rykova EY: Extracellular nucleic acids. Bioessays. 2007, 29 (7): 654-67. 10.1002/bies.20604.
Rosewich UL, Kistler HC: Role of Horizontal Gene Transfer in the Evolution of Fungi. Annu Rev Phytopathol. 2000, 38: 325-363. 10.1146/annurev.phyto.38.1.325.
Backert S, Meyer TF: Type IV secretion systems and their effectors in bacterial pathogenesis. Curr Opin Microbiol. 2006, 9 (2): 207-17. 10.1016/j.mib.2006.02.008.
Butler MI, Gray J, Goodwin TJ, Poulter RT: The distribution and evolutionary history of the PRP8 intein. BMC Evol Biol. 2006, 6: 42-10.1186/1471-2148-6-42.
Paoletti M, Buck KW, Brasier CM: Selective acquisition of novel mating type and vegetative incompatibility genes via interspecies gene transfer in the globally invading eukaryote Ophiostoma novo-ulmi. Mol Ecol. 2006, 15 (1): 249-62. 10.1111/j.1365-294X.2005.02728.x.
Podell S, Gaasterland T: DarkHorse: a method for genome-wide prediction of horizontal gene transfer. Genome Biol. 2007, 8 (2): R16-10.1186/gb-2007-8-2-r16.
Richmond GS, Smith TK: A novel phospholipase from Trypanosoma brucei. Mol Microbiol. 2007, 63 (4): 1078-95. 10.1111/j.1365-2958.2006.05582.x.
Lecointre G, Rachdi L, Darlu P, Denamur E: Escherichia coli molecular phylogeny using the incongruence length difference test. Mol Biol Evol. 1998, 15: 1685-1695.
Wolf YI, Aravind L, Grishin NV, Koonin EV: Evolution of aminoacyl-tRNA synthetases - Analysis of unique domain architectures and phylogenetic trees reveals a complex history of horizontal gene transfer events. Genome Res. 1999, 9: 689-710.
Ponger L, Li WH: Evolutionary diversification of DNA methyltransferases in eukaryotic genomes. Mol Biol Evol. 2005, 22 (4): 1119-28. 10.1093/molbev/msi098.
Andersson JO: A genomic survey of the fish parasite Spironucleus salmonicida indicates genomic plasticity among diplomonads and significant lateral gene transfer in eukaryote genome evolution. BMC Genomics. 2007, 8: 51-10.1186/1471-2164-8-51.
Kamikawa R, Inagaki Y, Sako Y: Direct phylogenetic evidence for lateral transfer of elongation factor-like gene. Proc Natl Acad Sci USA. 2008, 105 (19): 6965-9. 10.1073/pnas.0711084105.
Richards TA, Dacks JB, Campbell SA, Blanchard JL, Foster PG, McLeod R, Roberts CW: Evolutionary origins of the eukaryotic shikimate pathway: gene fusions, horizontal gene transfer, and endosymbiotic replacements. Eukaryot Cell. 2006, 5 (9): 1517-31. 10.1128/EC.00106-06.
Klotz MG, Loewen PC: The molecular evolution of catalatic hydroperoxidases: evidence for multiple lateral transfer of genes between prokaryota and from bacteria into eukaryota. Mol Biol Evol. 2003, 20 (7): 1098-112. 10.1093/molbev/msg129.
Hanekamp K, Bohnebeck U, Beszteri B, Valentin K: PhyloGena--a user-friendly system for automated phylogenetic annotation of unknown sequences. Bioinformatics. 2007, 23 (7): 793-801. 10.1093/bioinformatics/btm016.
Khaldi N, Collemare J, Lebrun MH, Wolfe KH: Evidence for horizontal transfer of a secondary metabolite gene cluster between fungi. Genome Biol. 2008, 9 (1): R18-10.1186/gb-2008-9-1-r18.
Bowler C: The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature. 2008, 456 (7219): 239-44. 10.1038/nature07410.
Karlin S, Mrazek J, Campbell AM: Codon usages in different gene classes of the Escherichia coli genome. Mol Microbiol. 1998, 29 (6): 1341-55. 10.1046/j.1365-2958.1998.01008.x.
Medigue C, Rouxel T, Vigier P, Henaut A, Danchin A: Evidence for horizontal gene transfer in Escherichia coli speciation. J Mol Biol. 1991, 222: 851-6. 10.1016/0022-2836(91)90575-Q.
Sharp PM, Matassi G: Codon usage and genome evolution. Current Opinion in Genetics & Development. 1994, 4: 851-860. 10.1016/0959-437X(94)90070-1.
Carbone A, Zinovyev A, Kepes F: Codon adaptation index as a measure of dominating codon bias. Bioinformatics. 2003, 19 (16): 2005-15. 10.1093/bioinformatics/btg272.
Karlin S, Burge C: Dinucleotide relative abundance extremes: a genomic signature. Trends In Genetics. 1995, 11: 283-290. 10.1016/S0168-9525(00)89076-9.
Deschavanne PJ, Giron A, Vilain J, Fagot G, Fertil B: Genomic signature: characterization and classification of species assessed by Chaos Game Representation of sequences. Molecular Biology and Evolution. 1999, 16: 1391-1399.
Deschavanne P, Giron A, Vilain J, Dufraigne C, Fertil B: Genomic signature is preserved in short DNA fragments. BIBE2000 IEEE international Symposium on bio-informatics & biomedical engineering, Washington USA, 8-10 november 2000. 2000, 161-167.
Sandberg R, Winberg G, Bränden C-I, Kaske A, Ernberg I, Cöster J: Capturing whole-genome characteristics in short sequences using a naïve bayesian classifier. Genome Research. 2001, 11: 1404-1409. 10.1101/gr.186401.
Teeling H, Meyerdierks A, Bauer M, Amann R, Glockner FO: Application of tetranucleotide frequencies for the assignment of genomic fragments. Environ Microbiol. 2004, 6 (9): 938-47. 10.1111/j.1462-2920.2004.00624.x.
Chapus C, Dufraigne C, Edwards S, Giron A, Fertil B, Deschavanne P: Exploration of phylogenetic data using a global sequence analysis method. BMC Evol Biol. 2005, 5: 63-83. 10.1186/1471-2148-5-63.
Hayes WS, Borodovsky M: How to interpret an anonymous bacterial genome: machine learning approach to gene identification. Genome Res. 1998, 8: 1154-1171.
Karlin S: Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends in Microbiology. 2001, 9 (7): 335-343. 10.1016/S0966-842X(01)02079-0.
Lawrence JG, Ochman H: Molecular archaeology of the Escherichia coli genome. Proc. Natl. Acad. Sci. USA. 1998, 95: 9413-9417. 10.1073/pnas.95.16.9413.
Azad RK, Lawrence JG: Use of artificial genomes in assessing methods for atypical gene detection. PLoS Comput Biol. 2005, 1 (6): e56-10.1371/journal.pcbi.0010056.
Cortez D, Delaye L, Lazcano A, Becerra A: Composition-based methods to identify horizontal gene transfer. Methods Mol Biol. 2009, 532: 215-25.
Mrazek J, Karlin S: Detecting alien genes in bacterial genomes. Ann N Y Acad Sci. 1999, 870: 314-29. 10.1111/j.1749-6632.1999.tb08893.x.
Nicolas P, Bize L, Muri F, Hoebeke M, Rodolphe F, Ehrlich S, Prum B, Bessieres P: Mining Bacillus subtilis chromosome heterogeneities using hidden Markov models. Nucleic Acids Res. 2002, 30: 1418-26. 10.1093/nar/30.6.1418.
van Passel MW, Bart A, Thygesen HH, Luyf AC, van Kampen AH, van der Ende A: An acquisition account of genomic islands based on genome signature comparisons. BMC Genomics. 2005, 6: 163-10.1186/1471-2164-6-163.
Tsirigos A, Rigoutsos I: A new computational method for the detection of horizontal gene transfer events. Nucleic Acids Res. 2005, 33 (3): 922-33. 10.1093/nar/gki187.
Scherer S, McPeek MS, Speed TP: Atypical regions in large genomic DNA sequences. Proc Natl Acad Sci USA. 1994, 91 (15): 7134-8. 10.1073/pnas.91.15.7134.
Merkl R: SIGI: score-based identification of genomic islands. BMC Bioinformatics. 2004, 5: 22-10.1186/1471-2105-5-22.
Lawrence J, Ochman H: Reconciling the many faces of lateral gene transfer. Trends Microbiol. 2002, 10: 1-4. 10.1016/S0966-842X(01)02282-X.
Ragan MA: On surrogate methods for detecting lateral gene transfer. FEMS Microbiology letters. 2001, 201: 187-191. 10.1111/j.1574-6968.2001.tb10755.x.
Becq J, Gutierrez MC, Rosas-Magallanes V, Rauzier J, Gicquel B, Neyrolles O, Deschavanne P: Contribution of horizontally acquired genomic islands to the evolution of the tubercle bacilli. Mol Biol Evol. 2007, 24 (8): 1861-71. 10.1093/molbev/msm111.
Sharp PM, Lloyd AT: Regional base composition variation along yeast chromosome III: evolution of chromosome primary structure. Nucleic Acids Research. 1993, 21: 179-183. 10.1093/nar/21.2.179.
Karlin S, Brocchieri L, Trent J, Blaisdell BE, Mrazek J: Heterogeneity of genome and proteome content in bacteria, archaea, and eukaryotes. Theor Popul Biol. 2002, 61 (4): 367-90. 10.1006/tpbi.2002.1606.
Gentles AJ, Karlin S: Genome-scale compositional comparisons in eukaryotes. Genome Res. 2001, 11 (4): 540-6. 10.1101/gr.163101.
Richards TA, Dacks JB, Jenkinson JM, Thornton CR, Talbot NJ: Evolution of filamentous plant pathogens: gene exchange across eukaryotic kingdoms. Curr Biol. 2006, 16 (18): 1857-64. 10.1016/j.cub.2006.07.052.
Slot JC, Hibbett DS: Horizontal transfer of a nitrate assimilation gene cluster and ecological transitions in fungi: a phylogenetic study. PLoS ONE. 2007, 2 (10): e1097-10.1371/journal.pone.0001097.
Slot JC, Hallstrom KN, Matheny PB, Hibbett DS: Diversification of NRT2 and the origin of its fungal homolog. Mol Biol Evol. 2007, 24 (8): 1731-43. 10.1093/molbev/msm098.
Morgenstern I, Klopman S, Hibbett DS: Molecular evolution and diversity of lignin degrading heme peroxidases in the Agaricomycetes. J Mol Evol. 2008, 66 (3): 243-57. 10.1007/s00239-008-9079-3.
Fedorova ND: Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus. PLoS Genet. 2008, 4 (4): e1000046-10.1371/journal.pgen.1000046.
Denning DW, Anderson MJ, Turner G, Latge JP, Bennett JW: Sequencing the Aspergillus fumigatus genome. Lancet Infect Dis. 2002, 2 (4): 251-3. 10.1016/S1473-3099(02)00243-8.
Nierman WC: Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature. 2005, 438 (7071): 1151-6. 10.1038/nature04332.
Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22 (22): 4673-80. 10.1093/nar/22.22.4673.
Saitou N, Nei M: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987, 4 (4): 406-25.
Felsenstein J: PHYLIP (Phylogeny Inference Package). ver. 3.573c Mac.
Fertil B, Massin M, Lespinats S, Devic C, Dumee P, Giron A: GENSTYLE: exploration and analysis of DNA sequences with genomic signature. Nucleic Acids Res. 2005, 33 (Web Server issue): W512-5. 10.1093/nar/gki489.
Joseph J, Sasikumar R: Chaos game representation for comparison of whole genomes. BMC Bioinformatics. 2006, 7: 243-10.1186/1471-2105-7-243.
van Passel MW, Kuramae EE, Luyf AC, Bart A, Boekhout T: The reach of the genome signature in prokaryotes. BMC Evol Biol. 2006, 6: 84-10.1186/1471-2148-6-84.
Wang Y, Hill K, Singh S, Kari L: The spectrum of genomic signatures: from dinucleotides to chaos game representation. Gene. 2005, 346: 173-85. 10.1016/j.gene.2004.10.021.
Rementeria A, Lopez-Molina N, Ludwig A, Vivanco AB, Bikandi J, Ponton J, Garaizar J: Genes and molecules involved in Aspergillus fumigatus virulence. Rev Iberoam Micol. 2005, 22 (1): 1-23. 10.1016/S1130-1406(05)70001-2.
Brinkman FS, Macfarlane EL, Warrener P, Hancock RE: Evolutionary relationships among virulence-associated histidine kinases. Infect Immun. 2001, 69 (8): 5207-11. 10.1128/IAI.69.8.5207-5211.2001.
Panaccione DG, Scott-Craig JS, Pocard JA, Walton JD: A cyclic peptide synthetase gene required for pathogenicity of the fungus Cochliobolus carbonum on maize. Proc Natl Acad Sci USA. 1992, 89 (14): 6590-4. 10.1073/pnas.89.14.6590.
Butler G: Evolution of pathogenicity and sexual reproduction in eight Candida genomes. Nature. 2009
Gardiner DM, Waring P, Howlett BJ: The epipolythiodioxopiperazine (ETP) class of fungal toxins: distribution, mode of action, functions and biosynthesis. Microbiology. 2005, 151 (Pt 4): 1021-32. 10.1099/mic.0.27847-0.
Patron NJ, Waller RF, Cozijnsen AJ, Straney DC, Gardiner DM, Nierman WC, Howlett BJ: Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes. BMC Evol Biol. 2007, 7: 174-10.1186/1471-2148-7-174.
Frost LS, Leplae R, Summers AO, Toussaint A: Mobile genetic elements: the agents of open source evolution. Nature Rev. Microbiol. 2005, 3: 722-732. 10.1038/nrmicro1235.
Lawrence JG, Ochman H: Amelioration of bacterial genomes: rates of change and exchange. Journal of Molecular Evolution. 1997, 44: 383-397. 10.1007/PL00006158.
Cortez DQ, Lazcano A, Becerra A: Comparative analysis of methodologies for the detection of horizontally transferred genes: a reassessment of first-order Markov models. In Silico Biol. 2005, 5 (5-6): 581-92.
Silva JC, Loreto EL, Clark JB: Factors that affect the horizontal transfer of transposable elements. Curr Issues Mol Biol. 2004, 6 (1): 57-71.
Deng J, Carbone I, Dean RA: The evolutionary history of cytochrome P450 genes in four filamentous Ascomycetes. BMC Evol Biol. 2007, 7: 30-10.1186/1471-2148-7-30.
Cramer RA, Stajich JE, Yamanaka Y, Dietrich FS, Steinbach WJ, Perfect JR: Phylogenomic analysis of non-ribosomal peptide synthetases in the genus Aspergillus. Gene. 2006, 383: 24-32. 10.1016/j.gene.2006.07.008.
JB and LM were supported by grants from the French Education and Research Ministry.
LM and JB carried out the experiments. PD designed the study and wrote the paper. LM and JB helped to draft the paper. All the authors read and approved the final manuscript.