HC degradation across domain bacteria
The alkB/M and almA/ladA genes encode alkane mono-oxygenases that initiate the degradation of short-chain (C5–C15) and long-chain (> C15) alkanes, respectively. AlkB/M is rubredoxin-dependent, while AlmA and LadA are flavin-dependent mono-oxygenases.
The genes pheA (phenol 2-monooxygenase), xylM (xylene monooxygenase), xylX (toluate/benzoate 1,2-dioxygenase subunit alpha), todC1 (benzene 1,2-dioxygenase subunit alpha), and tmoA (toluene-4-monooxygenase system, hydroxylase component subunit alpha) for monocyclic compounds, and bphA1 (biphenyl 2,3-dioxygenase subunit alpha) and ndoB (naphthalene 1,2-dioxygenase system, large oxygenase component) for polycyclic compounds, code for the catalytic domains of ring-hydroxylating oxygenases (RHOs) that add -OH group(s) to compounds undergoing degradation (Supplementary Figure S1). We explored the distribution of these genes and associated degradation pathways in a total of 23,446 representatives out of 143,512 bacterial genomes available in release 89 of the GTDB database that have been annotated via AnnoTree [12]. These annotated genomes are dominated by representatives of the phyla Proteobacteria (32.5%), Actinobacteriota (13.3%), Bacteroidota (12.13%), and Firmicutes (8.01%) (Supplementary Figure S2 and Supplementary Table S4). Among the 123 represented bacterial phyla, 58 phyla had ≤ five genomes available per phylum and combined represented only 0.57% of the explored genomes. To avoid misinterpretations due to this uneven taxonomic distribution of representative genomes, we explored the contribution of members of each phylum to the HC degradation process by showing what proportion of the microbes containing each HC degrading enzyme belong to each phylum (panel A of Supplementary Figures S3-S6). We also analyzed the percentage of members of each phylum containing each HC degrading enzyme to ensure that we consider the contributions of underrepresented phyla to HC degradation (panel B of Supplementary Figures S3-S6).
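The two complementary views described above (panel A and panel B) can be illustrated with a minimal sketch. The genome records, the function name `phylum_views`, and the toy counts are hypothetical and are not taken from the study data; the sketch only shows how the two percentages differ for a phylum with very few genomes.

```python
from collections import Counter

# Hypothetical toy annotation table: (genome_id, phylum, carries_gene).
# These records are illustrative only and are not the study data.
genomes = [
    ("g1", "Proteobacteria", True),
    ("g2", "Proteobacteria", True),
    ("g3", "Proteobacteria", False),
    ("g4", "Proteobacteria", False),
    ("g5", "Actinobacteriota", True),
    ("g6", "Actinobacteriota", False),
    ("g7", "Binatota", True),
]


def phylum_views(records):
    """Return two percentage tables keyed by phylum.

    share_of_carriers: of all genomes carrying the gene, the percentage
                       that belongs to each phylum (panel A style).
    within_phylum:     of all genomes in a phylum, the percentage that
                       carries the gene (panel B style).
    """
    total = Counter(phylum for _, phylum, _ in records)
    carriers = Counter(phylum for _, phylum, has_gene in records if has_gene)
    n_carriers = sum(carriers.values())
    share_of_carriers = {
        p: (100 * carriers[p] / n_carriers if n_carriers else 0.0) for p in total
    }
    within_phylum = {p: 100 * carriers[p] / total[p] for p in total}
    return share_of_carriers, within_phylum


panel_a, panel_b = phylum_views(genomes)
for phylum in panel_a:
    print(f"{phylum}: {panel_a[phylum]:.1f}% of carriers, "
          f"{panel_b[phylum]:.1f}% of the phylum")
```

In this toy table the single Binatota genome accounts for only a quarter of all carriers, yet every genome of that phylum carries the gene, which is the kind of signal that the second view is meant to expose for underrepresented phyla.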
As expected, representatives of the phylum Proteobacteria (Pseudomonadales and Burkholderiales orders) presented the highest abundance of aliphatic and aromatic HC degrading enzymes, followed by Actinobacteriota and Bacteroidota for aliphatic and Actinobacteriota and Firmicutes for aromatic HC degrading enzymes (Supplementary Figures S3 and S4, panel A).
Underrepresented phyla remain mainly uncultured and are notably underexplored for metabolic potential (58 of 123 phyla, n = 131 genomes). Our analyses revealed that representatives of these taxa contain HC degrading enzymes involved in both the initiation and downstream steps of HC degradation processes. For example, phyla Tectomicrobia (Entotheonella), Binatota, Firmicutes_K, and Firmicutes_E contained mono-aromatic HC degradation enzymes (Fig. 2). In addition to these phyla, we annotated enzymes involved in the degradation of aliphatic HC in representatives of phyla SAR324, Eremiobacterota (Baltobacterales), Bdellovibrionota_B, and Chloroflexota_B (Fig. 1).
Other enzymes in the degradation pathways beyond the key genes for the initial degradation (Supplementary Table S1) are typically involved in several degradation pathways and are broadly distributed accordingly. As an example, the process of converting catechol to non-aromatic compounds with further conversion to intermediates of the TCA cycle (e.g., acetaldehyde and pyruvate) (Supplementary Figure S1) is shared among degradation pathways of xylene, naphthalene/phenanthrene, and phenol (blue color in Fig. 2). The name of genes involved in mentioned part are xylE/dmpB/nahH (catechol oxygenase), xylF/dmpD/nahN (2-hydroxymuconate semialdehyde hydrolase), xylG/dmpC/nahI (2-hydroxymuconic semialdehyde dehydrogenase), xylH/nahJ (2-hydroxymuconate tautomerase), xylI/dmpH/nahK (4-oxalocrotonate decarboxylase), xylJ/nahL (2-hydroxypent-2,4-dienoate hydratase), xylK/bphI/nahM (4-hydroxy-2-oxovalerate aldolase), xylR/pdeR (transcriptional regulatory protein). These ring-cleavage enzymes are also involved in the degradation of aromatic amino acids. Our analysis showed that representatives of phyla Firmicutes (mainly from the orders Bacillales and Staphylococcales), Firmicutes_I, Firmicutes_K, Firmicutes_E, Firmicutes_G, Firmicutes_H, Eremiobacterota, Deinococcota, Chloroflexota, Campylobacterota, Myxococcota and Bdellovibrionota play a significant role in this part of HC degradation process (blue color in Fig. 2).
Distribution of key genes involved in the degradation of alkanes
At lower taxonomic ranks, the alkB/M and ladA genes were differently distributed across members of Gammaproteobacteria, Alphaproteobacteria, and Actinobacteriota, hinting at their capacity for degrading hydrocarbons of variable chain length. Altogether, 2089 genomes in the orders Mycobacteriales (23.95%), Rhodobacterales (20.46%), Pseudomonadales (17.13%), Flavobacteriales (8.3%), Burkholderiales (6.16%), Cytophagales (3.66%), Propionibacteriales (2.47%), Rhizobiales (1.89%), and Chitinophagales (1.81%) contained alkB/M genes, while ladA was present in 2154 genomes from Pseudomonadales (21.05%), Rhizobiales (16.27%), Burkholderiales (14.44%), Actinomycetales (13.44%), Mycobacteriales (13.05%), Bacillales (4.74%), Enterobacterales (3.7%), Acetobacterales (2.31%), and Streptomycetales (1.91%). Several representative genomes carrying the ladA gene belonged to the Tectomicrobia, UBA8248, and SAR324 phyla, in which this gene has not been reported previously (Figs. 3 and 4, panel B, Supplementary Table S6).
An indirect role of Cyanobacteria in HC degradation, especially in microbial mats, has been previously reported. These primary producers often have nitrogen-fixing ability and can fuel and promote aerobic and anaerobic sulfate/nitrate-reducing HC degrading microorganisms in microbial mats [13]. There are also reports of a minor role of some Cyanobacteria members such as Phormidium, Nostoc, Aphanothece, Synechocystis, Anabaena, Oscillatoria, Plectonema, and Aphanocapsa in direct HC degradation [14, 15]. In this study, we detected the long-chain alkane degrading genes ladA and almA in different members of Cyanobacteria: 0.31% of the genomes in this phylum contained ladA (in Elainella saxicola and Nodosilinea sp000763385) and 12.54% contained almA (in Synechococcales, Cyanobacteriales, Elainellales, Phormidesmiales, Thermosynechococcales, Gloeobacterales, and Obscuribacterales).
Phylogenetic reconstruction of recovered alkB/M and ladA genes grouped them into five and nine main clades, respectively (Figs. 3 and 4, panel A). The branching pattern of these clades partially followed the taxonomic signal of the genomes they were retrieved from, specifically for the most dominant phyla. However, as is evident in Figs. 3 and 4, some branches contained alkB/M and ladA genes originating from genomes belonging to distantly related phyla. The placement of phylogenetically diverse groups in one branch hints at horizontal transfer of these genes between microbial taxa [16]. Additionally, apart from the chromosomal type, both alkB/M and ladA genes have previously been reported to be located on plasmids (OCT and pLW1071), corroborating their potential for horizontal transfer. For instance, there are reports on the intraspecies transfer of alkB/M among Pseudomonas members [17]. The placement of ladA genes originating from genomes affiliated with rare microbial taxa among clusters V-IX of the ladA phylogeny puts forward the possibility of a prominent role for Actinobacteriota and Firmicutes members in expanding the distribution of this gene among rare taxa, as representatives of Actinobacteriota and Firmicutes frequently contain these genes (Fig. 4).
We also detected several genomes with multiple copies of the alkB gene that were not necessarily branching together in the reconstructed alkB phylogeny, hinting at either gene duplication, the occurrence of paralogues, or HGT. Examples of genomes with six or more copies of alkB/M are Polycyclovorans algicola (10), Nevskia ramosa (7), Zhongshania aliphaticivorans (7), Solimonas aquatica (7), Immundisolibacter cernigliae (6), and Rhodococcus qingshengii (6). Multiple copies were also detected in representatives of the genera Nocardia, Rhodococcus, and Alcanivorax (the full list of genomes with multiple copies of the alkB gene is available in Supplementary Table S6).
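Tallying gene copies per genome, as summarized above, amounts to counting annotation hits. The following is a minimal sketch assuming a flat list of (genome, gene) hits; the genome labels, the hit list, and the helper `multi_copy_genomes` are hypothetical and do not reproduce the annotation pipeline used in this study.

```python
from collections import Counter

# Hypothetical annotation hits: one (genome, gene) pair per detected gene copy.
hits = [
    ("genome_A", "alkB"),
    ("genome_A", "alkB"),
    ("genome_A", "alkB"),
    ("genome_A", "ladA"),
    ("genome_B", "alkB"),
    ("genome_C", "ladA"),
    ("genome_C", "ladA"),
]


def multi_copy_genomes(hits, gene, min_copies=2):
    """Return {genome: copy_number} for genomes with at least min_copies of gene."""
    counts = Counter(genome for genome, g in hits if g == gene)
    return {genome: n for genome, n in counts.items() if n >= min_copies}


print(multi_copy_genomes(hits, "alkB"))
print(multi_copy_genomes(hits, "ladA"))
```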
Furthermore, the ladA gene was detected with six copies per genome in Mycolicibacterium dioxanotrophicus, Cryobacterium_A sp003065485, Kineococcus rhizosphaerae, Microbacterium sp003248605, Paenibacillus_S sp001956045, Pararhizobium polonicum, Mycolicibacterium septicum, and Microbacterium sp000799385. Several genomes in the genera Pseudomonas_E, Bradyrhizobium, Rhizobium, and Paraburkholderia also had more than one copy (904 genomes) (Supplementary Table S6).
The presence of multiple copies of alkane hydroxylase genes has been hypothesized to enable cells to use an expanded range of n-alkanes or to adapt to different environmental conditions. However, the exact evolutionary rationale has not yet been established [18, 19]. To evaluate this hypothesis, we compared the different sequences of each gene within an individual genome (those mentioned above for ladA and alkB) using BLAST (Supplementary Table S7). The results showed that the identity of multiple gene copies in a single genome was in the range of 30 to 70 percent, although the copies are still predicted to have the same function. This wide range of BLAST identity between gene copies suggests that these genes potentially originated from different sources and were transferred horizontally. We explore the case of the 21 xylX gene copies in Immundisolibacter cernigliae in more detail below.
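To make the notion of pairwise identity between gene copies concrete, the following minimal sketch computes percent identity for every pair of copies in one genome. It is a simplified stand-in and not BLAST: it assumes the sequences are already aligned to equal length with "-" as the gap character, whereas BLAST reports identity over local alignments. The sequences and copy names are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of identical positions between two pre-aligned sequences.

    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch. This is a simplification and does
    not reproduce how BLAST computes identity.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100 * matches / compared if compared else 0.0


def pairwise_identities(copies):
    """Return {(name1, name2): percent identity} for all pairs of copies."""
    return {
        (n1, n2): percent_identity(s1, s2)
        for (n1, s1), (n2, s2) in combinations(sorted(copies.items()), 2)
    }


# Hypothetical aligned protein fragments for three copies in one genome.
copies = {
    "copy1": "MKTAYIAK",
    "copy2": "MKTAYLAK",
    "copy3": "MRSA-LGK",
}

identities = pairwise_identities(copies)
for pair, value in identities.items():
    print(pair, f"{value:.1f}%")
```

A wide spread of values across pairs within one genome, as in this toy case, is the pattern that the text interprets as a hint of different origins for the copies.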
Distribution of key genes (ring-hydroxylating oxygenases (RHOs)) involved in the degradation of aromatic HCs
Genomes containing RHOs (2761 genomes, 16 phyla) present an overall lower phylogenetic diversity than alkane mono-oxygenases (4669 genomes, 21 phyla for both alkB/M and ladA). In general, AlkB/M and LadA enzymes consist of FA_desaturase (PF00487) and Bac_luciferase-like mono-oxygenase (PF00296) domains, respectively (Supplementary Table S5). They act non-specifically on a wide range of alkanes of different chain lengths. Therefore, they are likely to be more widespread in genomes, especially because alkane compounds do not exclusively originate from petroleum. For instance, in pristine marine ecosystems, primary producers such as Cyanobacteria can release long chain-length aliphatic compounds (e.g., pentadecane, heptadecane). Alkane-producing Cyanobacteria include prominent and globally abundant genera such as Prochlorococcus and Synechococcus [20, 21]. Therefore, marine microorganisms are broadly exposed to aliphatic compounds with different chain lengths, even in environments without oil spills or industrial influence. This can explain why marine ecosystems host a plethora of hydrocarbonoclastic bacteria [22, 23]. This would probably be the reason why the number of genomes with alkB/M is higher than RHO-bearing genomes.
Enzymes XylX, NdoB, BphA1, and TodC1 are composed of two pfam domains, PF00355 (Rieske center) and PF00848 (Ring_hydroxyl_A). These common domains impact the branching in the phylogenetic tree and lead to the neighboring branching of these mentioned genes (Fig. 5).
RHO enzymes are predominantly present in Burkholderiales, Pseudomonadales, Sphingomonadales, Caulobacterales, and Nevskiales orders of the phylum Proteobacteria (35 different Proteobacterial orders) (Fig. 5, B part). However, a significant number of pheA (phenol 2-monooxygenase) gene and, to a lesser degree, xylX and tmoA genes were also present in Actinobacteriota phylum (9 different Actinobacteriotal orders) (Fig. 5, B part).
Sphingomonadales are prominent bacteria in the rhizosphere and are also abundant in littoral zones of inland waters. Accordingly, we suggest that these bacteria may have evolved a capacity to degrade different aromatic compounds in response to the high concentrations of aromatic secondary metabolites typically seen in the plant rhizosphere. Additionally, Sphingomonadales are known for their large plasmids with intraspecies transmission [24].
Representatives of Burkholderiales were one of the other prominent members with RHO genes. This potential was mainly attributed to Paraburkholderia, Caballeronia, Burkholderia, Cupriavidus, Ralstonia and Massilia genera. All of the mentioned genera in our study had a genome size between 6.5–7 Mb and have been reported to have multi-chromosomes (multipartite) [25]. According to the reports, the second chromosome has a prominent role in niche adaptation in Burkholderiaceae family [25]. Their HC degradation capability might be related to this feature.
Among Pseudomonadales, representatives of the Pseudomonas, Acinetobacter, Halomonas, Marinobacter, Marinobacterium, and Psychrobacter genera were found to contain RHO genes in their genomes. There are multiple reports of representatives of the Pseudomonas genus degrading various HCs, where they mainly organize their degrading genes on conjugal plasmids [26, 27].
Acinetobacter representatives also had RHO genes. Acinetobacter spp. can produce bioemulsifier/biosurfactants and are naturally transformable species that might be relevant for their high degradation potential [28, 29]. Marinobacter and Halomonas genera were frequently reported as hydrocarbonoclastic bacteria which were isolated from oil-polluted marine water and sediment samples. Representatives of the Halomonas genus are also capable of EPS/biosurfactant production that accelerates the degradation of HC compounds, especially under saline conditions [30,31,32].
In our study the main RHO-bearing genome in the Caulobacterales order belonged to Hyphomonas genus. Representatives of this genus have been isolated from marine aromatic hydrocarbon polluted sites with the capability to degrade HCs [33, 34]. Nevskiales aromatic degradation potential was mainly detected in Nevskia, Polycyclovorans, and Solimonas genera. A Nevskia representative was reported to have biosurfactant production ability and Polycyclovorans representatives were shown to prefer HC as a nutrient compared to glucose [35, 36].
Among all investigated RHO genes, the highest phylogenetic diversity was observed in tmoA (208 genomes in 12 phyla and 38 orders) and xylX (1486 genomes in 9 phyla and 38 orders) genes (Fig. 5, B part). In the case of tmoA gene, it might be due to the wide range of HC compounds susceptible to this enzyme (e.g., benzenes, some PAHs, and alkenes) [37, 38]. Therefore, in our survey here we see that a diverse set of genera harbor tmoA gene and could potentially degrade different types of HCs (Fig. 5, B part).
Underrepresented microbial groups with a limited number of RHO genes also featured tmoA, xylX, and pheA genes. Myxococcota, Acidobacteriota, Chloroflexota, Firmicutes_I,E,K, and Cyanobacteria with tmoA gene were clustered separately, reflecting their distinct protein sequence and the lower possibility of HGT among these groups. For xylX, Eremiobacterota affiliated genes were placed together with genes from Gammaproteobacteria, and Tectomicrobia, Binatota, Chloroflexota, and Firmicutes_I were placed in separate branches near Actinobacteriota. In addition, Acidobacteriota, Eremiobacterota, and Campylobacteria with pheA gene were nested within Alphaproteobacteria members. The phylogeny of RHO genes was also more consistent with taxonomy than the phylogeny of alkB/M and ladA.
Binatota, a recently described phylum shown to be efficient in HC degradation, harbored todC1, bphA1 (in the Binatales order), and xylX (Bin18, Binatales) genes from RHOs and ladA (in Bin18) from alkane hydroxylases. Representatives of this phylum have been reported to play a role in methane and alkane metabolism [39]. However, we also noted the further potential of the Binatales and Bin18 orders of this phylum in aromatic HC degradation.
RHOs can be located either on the chromosome or on a plasmid, depending on the organism. For instance, todC1, bphA1, and tmoA genes were reported to be on the chromosome [40], while in another study they were detected on a plasmid [41]. Other RHOs, including xylX, xylM, pheA, and ndoB, have mainly been reported to be hosted by plasmids [38, 40]. One of the main examples is the TOL plasmid (carrying genes for xylene and toluene degradation), which has been reported to be transferable between Pseudomonas spp., Rhizobium spp., and Erwinia spp. [42, 43].
Multiple copies of RHO genes in one genome were detected for xylX and pheA. Immundisolibacter cernigliae surprisingly contained 21 variants of xylX. This genome also had six copies of alkB/M and was isolated from a PAH-contaminated site [44]. The high HC degradation potential of other members of this genus has also been reported in the marine ecosystem [45, 46]. Rugosibacter aromaticivorans (containing 5, 2 and 2 copies of xylX, ndoB, and tmoA genes, respectively), Pseudoxanthomonas_A spadix_B (with 4, 2 and 2 copies of xylX, todC1 and bphA1 genes, respectively), Thauera sp002354895 (4), Pigmentiphaga sp002188635 (4) are other examples of genomes that have multiple copies of the xylX gene. Although xylX gene was detected in Actinobacteriota, multiple copies in a genome were seen only among the Proteobacteria phylum.
The pairwise BLAST identity among the 21 different variants of the xylX gene present in the Immundisolibacter cernigliae genome ranged from 35 to 81 percent. Among these 21 xylX copies, three sequences (i.e., xylX numbers 18, 19, and 22 in Supplementary Figure S7) showed higher BLAST identity percentages with the xylX gene of the Rugosibacter genus than with other xylX copies present in the Immundisolibacter cernigliae genome itself (Supplementary Table S7 and Supplementary Figure S7). Several xylX copies of I. cernigliae (i.e., xylX numbers 10, 11, 13, and 15 in Supplementary Figure S7) had more edges than others in the network, and their interactions (Supplementary Figure S7, highlighted in red) represent their similarity with xylX copies of the Caballeronia, Sphingobium, Pseudoxanthomonas, Pseudomonas, and Thauera genera. In addition, xylX numbers 5 and 7 of Immundisolibacter had nearly the same BLAST identities with the xylX gene of the Pigmentiphaga genus as with other xylX copies in I. cernigliae itself. These results suggest that copies of the xylX gene in I. cernigliae potentially originate from horizontal transfer from other microbial groups.
On the other hand, Glutamicibacter mysorens (4), Enteractinococcus helveticum (4), and many other genomes from the Castellaniella, Kocuria, and Halomonas genera had several pheA copies in their individual genomes. To a lesser degree, the tmoA gene was present in multiple copies in Pseudonocardia dioxanivorans (4), Rhodococcus sp003130705 (3), Amycolatopsis rubida (3), and Zavarzinia compransoris_A (3).
While bphA1 and todC1 have different KO identifiers (Supplementary Table S1), our manual checks showed that they had the same conserved domain based on NCBI CD-Search [47]. We kept both annotations for cases where one gene was annotated with both KO identifiers. Previous studies also report similar homology and substrate specificity between todC1 and bphA1 [41].
XylM, as one of the enzymes mediating the initial steps in toluene/xylene degradation, showed the lowest abundance and phylogenetic diversity (27 genomes in 1 phylum and 6 orders). Toluene/benzene can generally be degraded through different routes and three of the most prevalent approaches were studied here. XylX, TodC1, and TmoA are the initial oxygenase enzymes of these three pathways. They are diverse in starting the degradation and composed of different domains, while downstream degradation converges to catechol derivatives as intermediates. xylM can also initiate toluene degradation in addition to xylene. In this pathway xylX then converts produced benzoate to catechol. Therefore, while we see a lower diversity and abundance of genomes harboring xylM, we want to highlight the possibility of the presence of alternative degradation enzymes in different microorganisms that can degrade the same compound and initiate the degradation process.
As the number of rings in aromatic compounds increases, the number and diversity of microbial groups capable of degrading them decreases, and microbial groups with ndoB (naphthalene 1,2-dioxygenase) accordingly showed the lowest abundance after xylM gene. The genomes hosting ndoB had limited phylogenetic diversity (35 genomes in 1 phylum and 6 orders) and were found mainly in representatives of Alphaproteobacteria (Sphingomonadales (17) and Caulobacterales (2)) and Gammaproteobacteria (Pseudomonadales (5), Burkholderiales (1), Nevskiales (1)).
Ecological strategy of HC degrading bacteria
Microorganisms are broadly divided into two main functional growth categories, i.e., oligotrophic/slow-growing/K-strategist or copiotrophic/fast-growing/r-strategist. These ecological strategies are associated with genome size, which, in turn, directly correlates with GC content [48]. To gain further insight into the ecological strategies of organisms that feature HC degrading genes, we compared the distribution of GC content and estimated genome size. This analysis revealed that HC degrading genes were present in genomes with a broad range of genome sizes (1.34 to 16.9 Mb) and GC contents (26.9 to 76.6%) (Supplementary Figure S8, data available in Supplementary Table S8). Genomes containing HC degrading enzymes with a GC content equal to or lower than 30% mainly had the alkB gene and taxonomically belonged to the Flavobacteriales order (genome sizes in the range of 1.4 to 4.2 Mb). The largest genome included in this study was Minicystis rosea from the phylum Myxococcota (genome size of 16.9 Mb), which also contained alkB. In the phylogenetic reconstruction, the alkB gene of Minicystis rosea clustered together with homologs from representatives of the class Gammaproteobacteria (Immundisolibacter and Cycloclasticus genera) (Fig. 3). The large genome size of Minicystis rosea, together with the phylogenetic placement of its alkB gene among Gammaproteobacteria representatives, hints at the horizontal transfer of this gene to the Minicystis rosea genome. These analyses indicate that HC degradation ability is present in both K-strategist and r-strategist microorganisms. Earlier studies have shown that r-strategists serve as the principal HC degraders after oil spills and at other point sources of pollution in marine environments [49,50,51]. Indeed, most obligate hydrocarbonoclastic bacteria are r-strategists (phylum Proteobacteria) and are mainly reported to be isolated from marine samples [52].
The r-strategists are adapted to live in oligotrophic environments with transient nutrient inputs and rapid consumption of substrates upon episodic inputs by means of fast growth and population expansion [53]. In contrast, studies on oil-polluted soil samples suggest a predominance of K-strategists, especially in the harsh conditions (high concentration of HC, soil dryness, etc.) commonly seen in many such soil environments [54,55,56]. Hosting multiple copies of genes coding for HC degrading enzymes seems to be a shared feature of both r- and K-strategists with small and large genome sizes alike and appears to be a universal evolutionary strategy for HC degradation.
Genome-level analysis of HC degradation
Microorganisms are known to use division of labor or mutualistic interactions to perform HC degradation in the environment [57, 58]. However, 92 genomes (less than 0.5%) of the 23,446 investigated bacterial genomes do, in fact, contain all the enzymes required to degrade at least one HC compound completely. These 92 genomes all belong to Actinobacteriota (n = 25) and Proteobacteria (n = 67) (Fig. 6).
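The idea of a genome containing all enzymes of a pathway can be sketched as a set-membership check. In the following minimal example, each step lists the alternative gene names given above for the conversion of catechol to TCA-cycle intermediates, and a genome counts as complete only if at least one alternative is present for every step. Treating every listed step as strictly required is a simplifying assumption for illustration; the completeness criteria actually applied in this study are not reproduced here, and the two toy genomes are hypothetical.

```python
# Steps of the catechol-to-TCA route, each with the alternative gene names
# listed in the text. Requiring every step is an assumption of this sketch.
META_CLEAVAGE_STEPS = [
    {"xylE", "dmpB", "nahH"},  # catechol oxygenase
    {"xylF", "dmpD", "nahN"},  # 2-hydroxymuconate semialdehyde hydrolase
    {"xylG", "dmpC", "nahI"},  # 2-hydroxymuconic semialdehyde dehydrogenase
    {"xylH", "nahJ"},          # 2-hydroxymuconate tautomerase
    {"xylI", "dmpH", "nahK"},  # 4-oxalocrotonate decarboxylase
    {"xylJ", "nahL"},          # 2-hydroxypent-2,4-dienoate hydratase
    {"xylK", "bphI", "nahM"},  # 4-hydroxy-2-oxovalerate aldolase
]


def missing_steps(genome_genes, steps):
    """Return indices of steps for which no alternative gene is annotated."""
    return [i for i, alternatives in enumerate(steps) if not (alternatives & genome_genes)]


def is_complete(genome_genes, steps):
    """True if every step has at least one annotated alternative gene."""
    return not missing_steps(genome_genes, steps)


# Hypothetical annotated gene sets for two toy genomes.
genome_x = {"ndoB", "nahH", "nahN", "nahI", "nahJ", "nahK", "nahL", "nahM"}
genome_y = {"xylE", "xylF", "xylJ"}

print(is_complete(genome_x, META_CLEAVAGE_STEPS))
print(missing_steps(genome_y, META_CLEAVAGE_STEPS))
```

Reporting the missing steps rather than only a yes/no answer makes it easier to see which genomes would depend on partners for the remaining reactions.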
Microorganisms have evolved two pathways for naphthalene degradation that involve the production of either catechol or gentisate as the aromatic degradation intermediate (Supplementary Figure S1). Catechol can, in turn, be further degraded via meta- or ortho-cleavage. Several microorganisms, including Novosphingobium naphthalenivorans, Pseudomonas_E fluorescens_AQ, Pseudomonas_E frederiksbergensis_E, and Herbaspirillum sp000577615, feature both of the mentioned pathways and even have the genes to potentially perform ortho- and meta-cleavage simultaneously (Fig. 6).
Moreover, Cupriavidus pauculus_A (long-chain alkanes and also biphenyl), Cycloclasticus sp002700385 and Paraburkholderia_B oxyphila (cycloalkane and xylene/benzene), Pigmentiphaga sp002188465 (cycloalkane and phenol), Rhodococcus sp003130705, Burkholderia puraquae, and Paraburkholderia_B mimosarum (toluene and biphenyl) contain the genetic potential to degrade more than one HC compound autonomously (Fig. 6).
Members of Burkholderiales had the genetic potential to degrade even more diverse compounds individually, while Actinobacteriota representatives mainly had the potential to contribute to the degradation of aliphatic compounds. This ability was also apparent in Figs. 1, 3, and 4. The potential for autonomous HC degradation was not detected in genomes of rarer bacterial phyla. Moreover, none of the archaeal genomes investigated in this study contained all genes for the complete degradation of HCs.
HC degradation across domain archaea
Generally, HC degradation ability seems to be less prevalent among archaea than among bacteria. The phylum Halobacterota had the highest proportion of enzymes involved in the degradation of both aliphatic (n = 14 enzymes of the aliphatic degradation pathway) and aromatic (n = 25 enzymes of the aromatic degradation pathway) compounds among the studied archaea (Supplementary Figure S9). The alkB gene, responsible for short-chain alkane degradation, was detected in two copies in a single member of the phylum Nanoarchaeota (ARS21 sp002686215). This gene clustered together with alkB identified in Gammaproteobacteria representatives (GCA-002705445 order) (Fig. 3). Genes needed to initiate the degradation of long-chain alkanes and of cyclododecane/cyclohexane as well as cyclopentane, via ladA and cddA/chnB, were more prevalent among Halobacterota representatives (75 genomes in 7 families: Haloferacaceae, Haloarculaceae, Natrialbaceae, Halococcaceae, Halalkalicoccaceae, Haloadaptaceae, and Halobacteriaceae) (Fig. 4 and Supplementary Figure S9). Among the investigated RHOs, only tmoA, which initiates toluene degradation, was present in 5 Sulfolobales and 2 Thermoproteales genomes of the phylum Crenarchaeota (Fig. 5). Detected archaeal tmoA and ladA genes branched separately from bacteria in the phylogenetic trees (Figs. 4 and 5). Apart from alkB, gene duplications were present in several genomes for both tmoA (Sulfolobus and Acidianus genera) and ladA (Halopenitus persicus and Halopenitus malekzadehii).
Key enzymes needed to initiate HC degradation were rarely present in archaea (Figs. 3, 4, and 5), indicating that Archaea might not play a significant role in the typically rate-limiting initial degradation of HCs. However, several studies report the ability of halophilic archaeal isolates (e.g., Halorubrum sp., Halobacterium sp., Haloferax sp., Haloarcula sp.) to degrade both aliphatic (n-alkanes with chain lengths up to C18 and longer) and aromatic (e.g., naphthalene, phenanthrene, benzene, toluene and p-hydroxybenzoic acid) HCs and use them as their sole source of carbon [59,60,61]. This may imply that archaea carry alternative and hitherto unknown enzymes for triggering HC degradation. However, there is no complete genome information available for the mentioned isolates to screen them for the presence of alternative degrading enzymes [11]. The Haloferax sp., capable of using a wide range of HCs as its sole source of carbon, present in the AnnoTree database (RS_GCF_000025685.1), contained none of the key degrading genes. The AnnoTree website chooses representative genomes having completeness of higher than 90%, which reduces the likelihood of incompleteness of the studied genome as a reason for the absence of these genes. Therefore, alternative HC degrading genes that are present in the accessory part of the genomes might be responsible for the observed degradation.
On the other hand, three recently reconstructed metagenome-assembled Thermoplasmatota genomes (Poseidonia, MGIIa-L2, MGIIb-N1) from oil-exposed marine water samples (not included in GTDB release 89) contained enzymes involved in alkane (alkB) and xylene (xylM) degradation [46]. Hence, as these global genome repositories continue to expand, we may have to revise or update our findings.
A total of 597 archaeal genomes contain enzymes involved in the degradation of aromatic compounds, specifically in the conversion of catechol to TCA intermediates. This is observed in the phyla Halobacterota (176 genomes in Haloferacaceae, Haloarculaceae, Natrialbaceae, Halococcaceae, Halobacteriaceae, Methanocullaceae, Methanoregulaceae, Methanosarcinaceae, Archaeoglobaceae, and some other methano-prefixed families), Thermoplasmatota (175 genomes in Poseidoniales, Marine Group III, Methanomassiliicoccales, UBA10834, Acidiprofundales, DHVEG-1, UBA9212), and Crenarchaeota (110 genomes in Nitrososphaerales, Desulfurococcales, Sulfolobales, Thermoproteales). This widespread capacity for degrading downstream intermediates of aromatic HC degradation implies that archaea interact closely with bacteria in HC degradation.