A comparative genomics approach to understanding the biosynthesis of the sunscreen scytonemin in cyanobacteria
© Soule et al. 2009
Received: 06 January 2009
Accepted: 24 July 2009
Published: 24 July 2009
The extracellular sunscreen scytonemin is the most common and widespread indole-alkaloid among cyanobacteria. Previous research using the cyanobacterium Nostoc punctiforme ATCC 29133 revealed a unique 18-gene cluster (NpR1276 to NpR1259 in the N. punctiforme genome) involved in the biosynthesis of scytonemin. We provide further genomic characterization of these genes in N. punctiforme and extend it to homologous regions in other cyanobacteria.
Six putative genes in the scytonemin gene cluster (NpR1276 to NpR1271 in the N. punctiforme genome), with no previously known protein function and annotated in this study as scyA to scyF, are likely involved in the assembly of scytonemin from central metabolites, based on genetic, biochemical, and sequence similarity evidence. Also in this cluster are redundant copies of genes encoding aromatic amino acid biosynthetic enzymes. These can theoretically lead to tryptophan and the tyrosine precursor p-hydroxyphenylpyruvate (the expected biosynthetic precursors of scytonemin) from end products of the shikimic acid pathway. Redundant copies of the genes coding for the key regulatory and rate-limiting enzymes of the shikimic acid pathway are found there as well. We identified four other cyanobacterial strains containing orthologues of all of these genes, three of them by database searches (Lyngbya PCC 8106, Anabaena PCC 7120, and Nodularia CCY 9414) and one by targeted sequencing (Chlorogloeopsis sp. strain Cgs-089; CCMEE 5094). Genomic comparisons revealed that most scytonemin-related genes were highly conserved among strains and that two additional conserved clusters, NpF5232 to NpF5236 and a putative two-component regulatory system (NpF1278 and NpF1277), are likely involved in scytonemin biosynthesis and regulation, respectively, on the basis of conservation and location. Since many of the protein product sequences for the newly described genes, including ScyD, ScyE, and ScyF, have export signal domains, while others have putative transmembrane domains, it can be inferred that scytonemin biosynthesis is compartmentalized within the cell. Basic structural monomer synthesis and initial condensation are most likely cytoplasmic, while later reactions are predicted to be periplasmic.
We show that scytonemin biosynthetic genes are highly conserved among evolutionarily diverse strains, likely include more genes than previously determined, and are predicted to involve compartmentalization of the biosynthetic pathway in the cell, an unusual trait for prokaryotes.
The UV-absorbing ability of scytonemin is based on its chemical structure, a symmetrical indole-alkaloid consisting of fused heterocyclic units. The biosynthesis of scytonemin likely involves tryptophan and tyrosine derivatives, both of which are known to absorb ambient UVB irradiation [9, 10]. Although much is known about the biochemistry and ecology of scytonemin, very little was known until recently concerning its biosynthesis and molecular genetics.
In our previous study, using the model organism Nostoc punctiforme ATCC 29133 (N. punctiforme), we were able to characterize an 18-gene region associated with the biosynthesis of scytonemin, and compare that genomic region to a similar gene cluster in Anabaena PCC 7120 (Anabaena). Since then, two additional cyanobacterial genomes have been sequenced, Lyngbya PCC 8106 (Lyngbya) and Nodularia spumigena CCY 9414 (Nodularia), which also contain orthologues to the scytonemin-associated genes from N. punctiforme, and the putative roles of the initial genes in scytonemin biosynthesis have been corroborated in a recent study. Additionally, we were able to sequence several putative biosynthetic genes from this region in another scytonemin-producing cyanobacterium, Chlorogloeopsis sp. strain Cgs-089. Chlorogloeopsis can also be identified by the strain number CCMEE 5094, maintained by the Culture Collection of Microorganisms from Extreme Environments at the University of Oregon (http://cultures.uoregon.edu/). Genomic comparisons of the scytonemin-associated genes from all five cyanobacteria above suggest many similarities and have resulted in the discovery of additional genes in N. punctiforme that may be associated with scytonemin biosynthesis and regulation. Here we describe and characterize genes that appear to be essential for scytonemin biosynthesis, and develop the first hypothetical model for the cellular compartmentalization of scytonemin biosynthesis.
The predicted structural features of the proteins encoded by some of these genes are also interesting and support a cellular compartmentalization of scytonemin biosynthesis. For example, ScyD, ScyE, and ScyF, none of which had been assigned a protein function by annotation, each contain a signal peptide export domain in their derived protein sequence. These N-terminal signature sequences are often associated with periplasmic proteins, suggesting that some stages of scytonemin biosynthesis may occur in the periplasm. Furthermore, the protein sequences of ScyA, TyrP, and NpR1259 all contain at least one transmembrane domain. The software program PSLpred, which predicts the subcellular localization of bacterial proteins based on their protein sequences, suggests that TyrP may also function on the periplasmic side, while ScyA and NpR1259 likely function on the cytoplasmic side. While the protein sequence of NpR1268 does not have an N-terminal export domain, the fact that it resembles DsbA, a dithiol-disulfide isomerase (oxidoreductase) that facilitates the formation of disulfide bridges in the folding of periplasmic proteins, suggests that it may also localize to the periplasm. This leads us to speculate that a dithiol-disulfide isomerase of this kind could be important as an accessory to the other proteins predicted to be active in the periplasm. Thus, the upstream region of the cluster is composed of novel genes likely involved directly in the assembly of scytonemin, with early condensing reactions occurring in the cytoplasm and later steps presumably localized to the periplasm.
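The localization reasoning in the preceding paragraph can be summarized as a small lookup. The following Python sketch is only an illustration: the feature labels, the dictionary layout, and the function name `predicted_compartment` are assumptions introduced here, and the rule order simply mirrors the evidence described in the text (signal peptide, similarity to DsbA, PSLpred call). It records only features that the text states explicitly and is not the output of any prediction software.

```python
# Sequence-based evidence per protein, transcribed from the paragraph above.
# Feature labels are hypothetical names chosen for this illustration.
EVIDENCE = {
    "ScyD": {"signal_peptide"},
    "ScyE": {"signal_peptide"},
    "ScyF": {"signal_peptide"},
    "ScyA": {"transmembrane", "pslpred_cytoplasmic_side"},
    "TyrP": {"transmembrane", "pslpred_periplasmic_side"},
    "NpR1259": {"transmembrane", "pslpred_cytoplasmic_side"},
    "NpR1268": {"dsba_like"},
}


def predicted_compartment(protein):
    """Map the listed evidence for one protein to the compartment the text infers."""
    features = EVIDENCE.get(protein, set())
    # An export signal peptide or similarity to DsbA points to the periplasm.
    if "signal_peptide" in features or "dsba_like" in features:
        return "periplasm"
    # For membrane proteins, the side is taken from the PSLpred call.
    if "pslpred_periplasmic_side" in features:
        return "membrane, periplasmic side"
    if "pslpred_cytoplasmic_side" in features:
        return "membrane, cytoplasmic side"
    return "no evidence listed"


for name in sorted(EVIDENCE):
    print(name, "->", predicted_compartment(name))
```

Keeping the evidence separate from the rule makes it easy to revise the model if, for example, a localization experiment contradicts a sequence-based prediction.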
Most of the genes located towards the downstream portion of the cluster are clearly associated by similarity with the biosynthesis of aromatic amino acids [11, 20]. Furthermore, they do not contain structural motifs that predict their association with cellular membranes or their transport to the periplasmic space. In this region of the cluster are genes predicted to code for the first two enzymes of the shikimic acid pathway (aroG, aroB), leading to the formation of 5-dehydroquinate. All of the genes necessary for the biosynthesis of tryptophan from chorismate (trpE, trpC, trpA, trpB, trpD) are also present, while only prephenate dehydrogenase (encoded by tyrA) is present from the tyrosine biosynthesis pathway, thus ending that pathway at p-hydroxyphenylpyruvate, one amination short of tyrosine [21, 22]. In fact, on the basis of chemical structures, p-hydroxyphenylpyruvate is a theoretically more direct precursor for scytonemin than tyrosine.
One of the most significant observations regarding these aromatic amino acid genes is that there is at least one other copy of each of them elsewhere in the genome of N. punctiforme at dispersed loci. Genes in this dispersed set find homologues in all other cyanobacteria sequenced so far and thus likely have a housekeeping function. The cluster of redundant copies of aromatic amino acid biosynthetic genes, by contrast, appears to be unique and always spatially associated with the scytonemin cluster in the few cyanobacterial genomes that have it. Therefore, it is reasonable to hypothesize that the downstream region of the scytonemin cluster is dedicated to supplying the building blocks for the biosynthesis of scytonemin, while the standard housekeeping copies remain important for central metabolism. This is supported by the differential up-regulation of these redundant genes along with the induction of scytonemin synthesis in N. punctiforme, while the expression levels of the housekeeping genes remain unaltered.
Two genes in the downstream region of the cluster have previously been assigned putative protein functions not related to aromatic amino acid biosynthesis. NpR1270 shows similarity to a putative glycosyltransferase, with 77% identity to a glycosyltransferase in Nodularia. Interestingly, some glycosyltransferases in bacteria have been linked to exopolysaccharide biosynthesis. Specifically, in Nostoc commune, the synthesis of scytonemin is coupled to the synthesis of the exopolysaccharide. The protein sequence of NpR1263 has a transmembrane domain and is annotated as a putative tyrosinase, TyrP, a copper monooxygenase that can hydroxylate monophenols and oxidize o-diphenols to o-quinones. Indeed, NpR1263 has the essential conserved residues for Cu2+ binding and is a putative tyrosinase-like protein. It is unique in that it does not have any cyanobacterial protein sequence homologs in GenBank, and it can be predicted to play an important role in scytonemin biosynthesis, as explained below. A third downstream gene, NpR1259, is the last gene in this cluster. It has two putative transmembrane domains and was annotated as a hypothetical membrane protein, since it lacks significant similarity to known genes.
Upstream from the gene cluster are two genes that might be involved in the regulation of scytonemin biosynthesis, given their high degree of conservation in sequence and location among distantly related strains (see below). These protein sequences reveal strong similarities to two-component signal transduction systems. These systems typically involve the autophosphorylation of a histidine kinase (in our case, NpF1277) and the subsequent transfer of the phosphate group to an aspartate on the protein. This phosphorylated aspartate then acts as a phospho-donor to a response regulator protein (in our case, NpF1278), which ultimately turns on the transcription of the genes the system regulates [27, 28]. NpF1277 likely belongs to class II histidine kinases, which are characterized by the presence of PAS/PAC sensory domains that are generally sensitive to oxygen, redox, or light. NpF1278 is a class II response regulator (RR) predicted to be a positive transcriptional regulator. A working hypothesis is that NpF1277 and NpF1278 might regulate the adjacent genomic region (NpR1276 to NpR1259) associated with scytonemin biosynthesis.
Table: Cyanobacterial orthologs to the scytonemin-associated genes of N. punctiforme. Columns: gene in Nostoc; description in Nostoc; % identity to Anabaena; % identity to Nodularia; % identity to Lyngbya.
In the scytonemin gene cluster of Anabaena, Lyngbya, and Nodularia, there are five conserved genes downstream of scyF that are absent from the N. punctiforme cluster (shown in black in Figure 3A). On closer inspection, orthologs of these genes could readily be identified elsewhere on the N. punctiforme chromosome. There, they comprise a five-gene satellite cluster with all five genes oriented in the same transcriptional direction (NpF5232 to NpF5236). In N. punctiforme, NpF5232 and NpF5235 are annotated as unknown hypothetical proteins, while NpF5233, NpF5234, and NpF5236 are annotated as a putative metal-dependent hydrolase, prenyltransferase (ubiA), and type I phosphodiesterase, respectively. However, these annotations are based on weak similarity, and the orthologs of each of these genes are annotated as unknown hypothetical proteins in the Anabaena, Lyngbya, and Nodularia genomes. Given this ambiguity, we take a cautious approach and postpone a specific annotation for these genes.
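The gene counts quoted for each cluster follow directly from the locus tags. The short Python sketch below illustrates that arithmetic under two assumptions that are ours, not statements from the annotation: that each consecutive locus number corresponds to exactly one gene, and that the letter after `Np` (F or R) marks the coding strand. The function names `parse_locus_tag` and `genes_in_range` are hypothetical helpers written for this example.

```python
import re


def parse_locus_tag(tag):
    """Split a tag such as 'NpR1276' into (strand_letter, locus_number)."""
    match = re.fullmatch(r"Np([FR])(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognized locus tag: {tag}")
    return match.group(1), int(match.group(2))


def genes_in_range(first_tag, last_tag):
    """Count genes spanned by two tags, assuming one gene per consecutive number."""
    strand_a, number_a = parse_locus_tag(first_tag)
    strand_b, number_b = parse_locus_tag(last_tag)
    if strand_a != strand_b:
        # A shared letter is taken here as a proxy for a shared orientation.
        raise ValueError("tags carry different strand letters")
    return abs(number_a - number_b) + 1


print(genes_in_range("NpR1276", "NpR1259"))  # main cluster: 18
print(genes_in_range("NpR1276", "NpR1271"))  # scyA to scyF: 6
print(genes_in_range("NpF5232", "NpF5236"))  # satellite cluster: 5
```

Under these assumptions the counts agree with the 18-gene cluster, the six scy genes, and the five-gene satellite cluster described in the text.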
In a previous study we determined that Anabaena was unable to produce scytonemin, even though it contained many of the genes in the scytonemin cluster, and interpreted this as a case of relic genetic information. It was thus important to test whether scytonemin was produced in the other strains used in the comparisons. We could not elicit the production of scytonemin in either Lyngbya or Nodularia upon exposing cultures of each strain to UVA radiation, which is the standard procedure to achieve biosynthetic induction (see Methods). It is possible that these strains had the ability to produce scytonemin at some point in their evolutionary history but have now lost it, since laboratory strains are rarely, if ever, exposed to the doses of UVA required for scytonemin biosynthesis. Furthermore, since scytonemin is a passive sunscreen, it is most effective in environments with pulsed resource availability, as explained above. Since Anabaena and Nodularia are planktonic, their need for a passive sunscreen is not as crucial as it is for the Nostoc and Chlorogloeopsis strains of terrestrial habitats. Although some strains of Lyngbya produce scytonemin, Lyngbya PCC 8106 does not. This may be because the marine intertidal zone from which it was isolated had varying degrees of resource availability and UV exposure; thus, this Lyngbya strain may not have needed a passive sunscreen.
Given these results, it seemed important to obtain sequences for the scytonemin-associated region from another scytonemin-producing strain besides N. punctiforme. Chlorogloeopsis sp. strain Cgs-O-89, a cyanobacterium known to produce scytonemin, was selected for this purpose. Using targeted PCR based on primers designed from the N. punctiforme genome, we were able to amplify and sequence several genes from the genomic region associated with scytonemin biosynthesis of Chlorogloeopsis, and found that their genomic arrangement was very similar to that of N. punctiforme (Figure 3B). Additionally, the five-gene satellite cluster from N. punctiforme was found and sequenced in Chlorogloeopsis as a continuous segment. As in N. punctiforme, the Chlorogloeopsis satellite gene cluster was not continuous with the scytonemin-associated gene cluster. Although we were unable to link all of the scytonemin-associated gene orthologs of Chlorogloeopsis into a single contig, we could establish clear similarities between the Chlorogloeopsis and N. punctiforme gene clusters (Figure 3).
Scytonemin is a symmetrical dimeric molecule, and it is expected that each monomer is synthesized separately before condensing to form the dimer. In theory, if tryptophan and tyrosine were used as building blocks, the biosynthesis of scytonemin could involve as few as four to six biosynthetic steps. In fact, structural, genetic, and preliminary radiotracer evidence indicates that the biosynthesis of scytonemin starts from aromatic amino acid (or related) precursors [7, 8, 11]. Previously isolated natural products with structural similarities to putative scytonemin subunits also provide useful biosynthetic clues. Nostodione A (Figure 1B) has not only been obtained by ozonolysis of scytonemin, but has also been isolated from Nostoc commune and Scytonema hofmanni, two typical scytonemin-producing strains. It is thus logical to assume that nostodione A is the most likely monomeric intermediate of scytonemin. Prenostodione (Figure 1C), the methylated carboxylic acid precursor of nostodione A, has been reported from Nostoc sp. TAU strain IL-235, further suggesting that the biosynthetic pathway of scytonemin originates from a condensation of tryptophan- and phenylpropanoid-derived subunits. Indeed, a recent study found that deaminated tryptophan and tyrosine (indole-3-pyruvic acid and p-hydroxyphenylpyruvate, respectively) condense, through the action of ScyA and ScyB, to form an intermediate that is structurally similar to diolmycin A1 (Figure 1D). Diolmycin A1 has been isolated from Streptomyces sp. and is a plausible intermediate in the scytonemin biosynthetic pathway. Furthermore, oxidation of the tyrosine moiety appears to be essential for the biosynthesis of nostodione A, an essential precursor to scytonemin as mentioned above. We propose that this oxidation could be carried out by the tyrosinase-like TyrP encoded in the scytonemin gene cluster, since tyrosinases are known to promote monooxygenation in similar moieties.
It is interesting to note that the only scytonemin-associated gene shared by N. punctiforme and Chlorogloeopsis (the two proven scytonemin producers) but absent from the other three strains (which, in our hands, do not produce it) is tyrP (putative tyrosinase). In fact, the gene appears to be absent from the genomes of these Lyngbya, Anabaena, and Nodularia strains altogether, as is the case for all other fully sequenced cyanobacterial genomes. We do note, however, that while the genome of Anabaena is complete, the Lyngbya and Nodularia genome projects are not yet finished, so we cannot determine with absolute certainty at the time of this publication whether tyrP is absent from these genomes.
The conservation of genes and genomic arrangements between the N. punctiforme scytonemin biosynthesis gene cluster and the Chlorogloeopsis gene cluster allows us to predict which genes are important in the biosynthesis of scytonemin. Since scyA to scyF are conserved across all of the strains described above, and are either unknown in function or putatively assigned a function, we expect that these six genes will provide the most useful information for determining the scytonemin biosynthetic pathway. Additionally, we have reason to associate the N. punctiforme genes NpF5232 to NpF5236 with the biosynthesis of scytonemin, and it is likely that the response regulator (NpF1278) and sensor kinase (NpF1277) upstream from the cluster are involved in regulating this system.
Furthermore, protein sequence data from several of the genes in the cluster provide us with clues regarding scytonemin biosynthesis and localization. While ScyA and ScyB are predicted to act in the cytoplasm during the preliminary stages of scytonemin biosynthesis, a working model suggests periplasmic compartmentalization of the later biosynthetic stages. Overall, our analyses have increased our understanding of scytonemin biosynthesis and will facilitate the construction of more direct and efficient hypotheses for future experiments. In addition, as scytonemin has been documented as having anti-inflammatory and antiproliferative properties, our work also helps those working on the biomedical potential of scytonemin and related compounds. This study constitutes a step forward in understanding the biosynthesis of secondary metabolites in bacteria and contributes a novel example of a biosynthetic pathway for a microbial indole-alkaloid. We hope that our contributions to understanding secondary metabolite biosynthesis in cyanobacteria will ultimately lead to the discovery of additional natural products and the pathways by which they are synthesized.
Axenic stock cultures of each strain were maintained on plates solidified with 1.5% Noble agar. N. punctiforme was grown in Allen and Arnon medium (AA) prepared at full strength for solid media or diluted four-fold for liquid media (AA/4). Anabaena, Nodularia, and Chlorogloeopsis cultures were grown in BG-11, while Lyngbya was grown in a 1:1 mixture of BG-11 and ASN-III supplemented with 10 μg L-1 vitamin B12. Cultures were grown in sterile flasks, under constant white light, at an intensity of 7 W m-2 provided by cool-white fluorescent tubes (General Electric), while shaking at 25°C.
The amino acid sequence of each protein involved in the biosynthesis of scytonemin from N. punctiforme was used in a BLASTp analysis in order to find orthologs in GenBank. Orthologous genes were mapped to establish their arrangement in the genomes of the strains harboring them. To determine whether or not these strains were capable of producing scytonemin, cultures were grown from stocks in liquid cultures and acclimated to white light only (10 W m-2) for three days, followed by exposure to white light supplemented with UVA for five continuous days. The UVA was provided by 20-W black-light fluorescent tubes (General Electric) at an intensity of 10 W m-2 with a spectral output of 365 nm, as previously determined. In some cases, the UVA intensity was gradually increased over the course of several days to acclimate more sensitive strains to 10 W m-2 of UVA. Additionally, a control culture for each strain was set under white light only. Following UVA exposure, the cells were harvested and the lipid-soluble pigments were extracted from whole cells in acetone. Extracts were analyzed on a commercial spectrophotometer for absorption from 350 nm to 750 nm; a strong absorption peak at 384 nm indicated that scytonemin had accumulated in the cells. Cultures were also observed microscopically for changes in extracellular pigmentation.
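The spectrophotometric criterion described above (a strong absorption peak at 384 nm in a 350 to 750 nm scan) can be expressed as a simple check. The Python sketch below is a hypothetical illustration, not the procedure used in this study: the function name `has_peak_near`, the 10 nm search window, and the minimum rise of 0.05 absorbance units are arbitrary assumptions, and the two spectra are synthetic curves generated only to exercise the function.

```python
import math


def has_peak_near(wavelengths_nm, absorbances, target_nm=384.0,
                  window_nm=10.0, min_rise=0.05):
    """Return True if the scan has an interior maximum within window_nm of
    target_nm that rises at least min_rise above both window edges.
    Wavelengths are assumed to be sorted in ascending order."""
    in_window = [(w, a) for w, a in zip(wavelengths_nm, absorbances)
                 if abs(w - target_nm) <= window_nm]
    if len(in_window) < 3:
        return False
    peak_w, peak_a = max(in_window, key=lambda pair: pair[1])
    first_w, first_a = in_window[0]
    last_w, last_a = in_window[-1]
    # A maximum sitting on the window edge is a slope, not a peak.
    is_interior = first_w < peak_w < last_w
    return (is_interior
            and peak_a - first_a >= min_rise
            and peak_a - last_a >= min_rise)


# Synthetic scans from 350 to 750 nm in 2 nm steps (illustrative values only).
wavelengths = list(range(350, 751, 2))
induced = [0.2 + 0.5 * math.exp(-((w - 384) / 12.0) ** 2) for w in wavelengths]
control = [0.2 for _ in wavelengths]

print(has_peak_near(wavelengths, induced))
print(has_peak_near(wavelengths, control))
```

In practice, other pigments extracted in acetone also absorb in this range, so a check of this kind would complement, rather than replace, comparison against the white-light control and microscopic observation.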
Total DNA was extracted from cultures of Chlorogloeopsis using a PCI (phenol:chloroform:isoamyl alcohol) extraction protocol. Presence of DNA in the extracts was confirmed on ethidium bromide-stained 1% agarose gels and quantified with a Nanodrop spectrophotometer (Thermo Fisher Scientific). The DNA was used as template for PCR with primers based on N. punctiforme sequences that were designed to bridge adjacent genes in the cluster. This approach was taken in order to capture the sequences of the corresponding genes and their flanking non-coding regions in Chlorogloeopsis. For PCR, 20 ng of Chlorogloeopsis DNA was used in 50 μL reactions consisting of 1 μM of each specific primer, 5 μL 10× Ex Taq DNA polymerase buffer, 4 μL dNTP mixture (2.5 mM each), and 1.25 units Ex Taq DNA polymerase (all from Takara Bio Inc.). N. punctiforme genomic DNA was the positive control, while the negative control had no template DNA. PCR was done in a Bio-Rad iCycler Thermal Cycler with the following parameters: 95°C for 5 min, then 35 cycles of 95°C for 1 min, 55°C for 1 min, and 72°C for 1 min, followed by an extension at 72°C for 10 min. Products were confirmed on 1% agarose gels and the band of the expected size for each sample was excised using a sterile scalpel. The PCR products were purified using the QIAquick Gel Extraction Kit (Qiagen Sample and Assay Technologies) and sequenced commercially (Applied Biosystems). Sequences were used in a BLASTn analysis against the N. punctiforme genomic database (http://www.jgi.doe.gov) to verify that the correct region had been amplified. Gene sequences were used to construct the genomic arrangement of the scytonemin gene cluster in Chlorogloeopsis. Nucleotide sequences were submitted to GenBank under accession numbers FJ601359 to FJ601364 and FJ605302 to FJ605317.
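As a worked check of the PCR setup described above, the final concentrations and the programmed cycling time can be derived from the stated volumes. The Python sketch below uses only the numbers given in the text; the variable names are our own, and the cycling time counts programmed hold times only, ignoring temperature ramping between steps (an assumption of this illustration).

```python
REACTION_UL = 50.0  # total reaction volume in microliters

# dNTPs: 4 uL of a mixture at 2.5 mM each, diluted into 50 uL.
dntp_final_mM = 4.0 * 2.5 / REACTION_UL

# Buffer: 5 uL of a 10x stock diluted into 50 uL.
buffer_final_x = 5.0 * 10.0 / REACTION_UL

# Primer: 1 uM in 50 uL; micromolar times microliters gives picomoles.
primer_pmol = 1.0 * REACTION_UL

# Template: 20 ng in 50 uL.
template_ng_per_ul = 20.0 / REACTION_UL

# Cycling: 5 min initial step, 35 cycles of three 1 min steps,
# and a 10 min final extension (hold times only).
cycling_min = 5 + 35 * (1 + 1 + 1) + 10

print(f"dNTP: {dntp_final_mM:.2f} mM each")             # 0.20 mM each
print(f"buffer: {buffer_final_x:.0f}x")                 # 1x
print(f"primer: {primer_pmol:.0f} pmol each")           # 50 pmol each
print(f"template: {template_ng_per_ul:.1f} ng/uL")      # 0.4 ng/uL
print(f"programmed cycling time: {cycling_min} min")    # 120 min
```

The result is a standard 1× buffer with 0.2 mM of each dNTP and a program of about two hours before ramping time is added.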
We would like to thank Scott Bingham and the Arizona State University DNA Laboratory staff for their assistance.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.