Divergence of metabolites in three phylogenetically close Monascus species (M. pilosus, M. ruber, and M. purpureus) based on secondary metabolite biosynthetic gene clusters

Background Species of the genus Monascus are considered to be economically important and have been widely used in the production of yellow and red food colorants. In particular, three Monascus species, namely, M. pilosus, M. purpureus, and M. ruber, are used for food fermentation in the cuisine of East Asian countries such as China, Japan, and Korea. These species have also been utilized in the production of various kinds of natural pigments. However, there is a paucity of information on the genomes and secondary metabolites of these strains. Here, we report the genomic analysis and secondary metabolites produced by M. pilosus NBRC4520, M. purpureus NBRC4478 and M. ruber NBRC4483, which are NBRC standard strains. We believe that this report will lead to a better understanding of red yeast rice food. Results We examined the diversity of secondary metabolite production in three Monascus species (M. pilosus, M. purpureus, and M. ruber) at both the metabolome level by LCMS analysis and at the genome level. Specifically, M. pilosus NBRC4520, M. purpureus NBRC4478 and M. ruber NBRC4483 strains were used in this study. Illumina MiSeq 300 bp paired-end sequencing generated 17 million high-quality short reads in each species, corresponding to 200 times the genome size. We measured the pigments and their related metabolites using LCMS analysis. The colors in the liquid media corresponding to the pigments and their related metabolites produced by the three species were very different from each other. The gene clusters for secondary metabolite biosynthesis of the three Monascus species also diverged, confirming that M. pilosus and M. purpureus are chemotaxonomically different. M. ruber has similar biosynthetic and secondary metabolite gene clusters to M. pilosus. The comparison of secondary metabolites produced also revealed divergence in the three species. Conclusions Our findings are important for improving the utilization of Monascus species in the food industry and industrial field. However, in view of food safety, we need to determine if the toxins produced by some Monascus strains exist in the genome or in the metabolome.


Background
Species of the genus Monascus are economically important because they have been widely used in the production of yellow and red food colorants. In particular, M. pilosus, M. purpureus, and M. ruber are commonly used for food fermentation in the cuisine of East Asian countries including China, Japan, and Korea [1][2][3]. These species are also utilized to produce various kinds of natural pigments (reviewed in [4,5]), including yellow pigments such as ankaflavin, monascin and rubropunctatin, orange pigments such as monascorubrin, purple pigments such as rubropunctamin and monascorubramin, and red pigments such as monascorubramine, N-glucosylmonascorubramine, N-glucosylrubropunctamine, Nglutarylmonascorubramine and N-glutarylrubropunctamin. One example is the traditionally fermented rice that contains at least 6 pigments from Monascus spp., including rubropunctatin, monascorbrin, rubropunctamin, monascorubramin, ankaflavin, and monascin [6]. The prediction of biosynthetic pathways for structurally diverse azaphilone pigments has recently been reported [7]. Recently, azaphilone pigment has been found to be produced by Penicillium marneffei and Talaromyces atroroseus [8,9].
M. pilosus is a well-known fungus that produces several bioactive metabolites, such as monacolins K and L, as well as several pigments that are related with biological activities including anti-obesity, regulation of lipid metabolism, and Alzheimer's disease at the in vitro and in vivo levels [8]. M. purpureus contains unsaturated fatty acids, sterols, monacolin and azaphilone pigments. It has been reported that these compounds are effective in lowering cholesterol, as well as in the treatment of diabetes, cardiovascular diseases, and some cancers [10]. M. ruber contains monacolin and azaphilone pigment. Recent studies have investigated M. ruber as an alternative to nitrite substitutes in meat processing [11].
The complete genome sequence of the industrial strain M. purpureus YY-1 is already available [12]. Here, we determined the draft genome sequences of M. ruber and M. pilosus in order to compare their different phenotypes. Understanding the diversity of the secondary metabolites produced by these species at the genome level is crucial for their industrial applications. We analyzed M. pilosus NBRC4520, M. purpureus NBRC4478, and M. ruber NBRC4483 to determine the diversity of the pigments based on metabolome data and secondary metabolite-related gene clusters such as monacolins, citrinin and azaphilone pigment. Several pigments are synthesized by PKS (poly ketide synthase) enzyme and NRPS (nonribosomal peptide) enzyme systems, which are encoded by large gene clusters in the genome. Comparison of such gene clusters between the three species will provide new insights into the potential production of novel pigments.
Monascus species produce a multitude of compounds, including polyketides, unsaturated fatty acids, and phytosterols. Monacolins, especially monacolin K, inhibit 3hydroxy-3-methylglutaryl-coenzyme A reductase, which is the rate-limiting step in cholesterol biosynthesis. These compounds found in red yeast rice prevent high cholesterol levels that causes atherosclerosis [13]. Hence, it is expected that metabolites related to Monascus pigments can contribute to human health. However, citrinin was found as an undesirable contaminant in red yeast rice [14]. Specifically, citrinin has been reported to be nephrotoxic and must be strictly controlled.
Thus, it is important to define the diversity of biosynthetic pathways responsible for secondary metabolite production in economically important species such as Monascus. Red yeast rice is used in many foods around the world. Foods using red yeast rice must contain substances such as monacolin K that contribute to health, but must not contain citrinin, which causes nephrotoxicity. Therefore, further studies of genes involved in the synthesis of secondary metabolites by Monascus need to be carried out. In the present study, we determined the genome sequences of M. pilosus, M. purpureus, and M. ruber. The phylogenetic and chemotaxonomic differences between these three species were characterized by analyzing the gene clusters associated with secondary metabolites.

Cultivation of Monascus and LCMS analysis
Monascus species can produce several types of azaphilones, including nitrogenated azaphilones, N-glucosyl azaphilones, amino acid derivative azaphilones, and citrinins [4]. We cultivated the three Monascus species to compare the production of the pigments and their related metabolites using both agar and liquid medium, i.e., potato dextrose agar (PDA) and potato dextrose liquid (PDL) medium, which is the most frequently used culture medium for Monascus growth and metabolite production [15]. As shown in Fig. 1a, distinct colony shapes and colors could be observed on PDA and in PDL among M. pilosus, M. ruber, and M. purpureus. In order to quantify the difference of the pigment contents, we analyzed the medium using LC-MS and identified 14 pigments in total. Figure 1b showed the concentration of 14 metabolites quantified in three biological replicates in the three species as a heatmap with two-dimensional hierarchical clustering to display their similarity. This result confirmed that not all these metabolites are synthesized in the three species. The numbers of pigments commonly identified in each species are summarized as the Venn diagram (Fig. 1c). Dehydromonacolin K, rubropunctatin, monascin, and ankaflavin 2 were commonly produced by all three Monascus species. Of the three species, M. pilosus produced the greatest number of pigments (12 in total; Fig. 1c). Ten pigments, except monascorubramine, were produced by M. pilosus, while ankaflavin 1 and rubropunctamine were only produced by M. pilosus. Citrinin, a mycotoxin with nephrotoxic activity in mammals [16], was only produced by M. purpureus.

Biosynthetic pathway of detected metabolites
In order to understand the difference of the metabolites produced in these three species, we investigated their biosynthesis pathways in detail and illustrated in Fig. 2. It has been identified that the biosynthesis of the precursor of these metabolites, 1Hisochromenes are started from malonyl-CoA, in Penicillium marneffei and M. ruber [17,18]. Citrinin polyketide synthase (PKS) converts the PKS-bound product citrinin [17]. The biosynthetic pathway from malonyl-CoA to monacolins was also determined in Aspergillus terreus, M. ruber, and M. purpureus [19,20]. It should be noted that PKS-bound products are acted upon by two different types of PKS enzymesone is an enzyme to produce Monascus azaphilone pigments, which corresponds to the pathway from malonyl-CoA to 1H-isochromenes and nitrogenated azaphilones [17], and the other is citrinin polyketide synthase, which corresponds to the pathway from  PKS-bound product to citrinin [21]. We also added the amount of the metabolites associated with each precursor as color bars. Among the three Monascus species, all eight metabolites related to 1Hisochromenes were only detected in M. pilosus. As for the other pathways, while citrinin was observed only in M. purpureus, Dehydromonacolin K, which is a precursor for monacolin K production, was detected in all three species. These resutls showed that the production of secondary metabolites are distinct among these three Monascus species. Azaphilone pigments related to 1H-isochromenes and nitrogenated azaphilones were observed among the three species, but azaphilone pigments related to ankaflavin 1 and rubropunctatin were observed in M. pilosus, while monascorubramine was detected in the other two species. Azaphilone pigments related to ankaflavin 1 and rubropunctatin were observed in M. pilosus, while monascorubramine was detected in the other two species.
Diversity and classification of secondary metabolites produced by the three Monascus species Fungal polyketides are the largest and most structurally diverse type of secondary metabolites, ranging from simple aromatic compounds to complex macrocyclic lactones [4]. Azaphilones are pigments with pyronequinone structures containing a highly oxygenated bicyclic core and a chiral quaternary center and produced mainly by Monascus species [21]. To examine the metabolite divergence in the three Monascus species, we examined the diversity of azaphilones in the abovementioned five major metabolic groups. Metabolites belonging to these groups are synthesized by PKS enzymes [19,20]. Similar studies have been performed for 373 fungi azaphilones [4], monacolins [5,[22][23][24][25], and monaphilones [26]. We examined a total of 74 secondary metabolites (Fig. 3a) and identified the ubiquitous and speciesspecific metabolites in the five groups using Venn diagrams. Results showed that the three Monascus species can produce unique 1H-isochromene compounds. M.

Comparison of secondary metabolite gene clusters in the three Monascus species
In order to measure genetic and evolutional similarity of the enzymes related in these three species, we have analyzed their whole genome sequences and annotated the sequences of the enzymes related in the biosysthesis pathways of these metabolites. Illumina MiSeq 300 bp paired-end sequencing generated 17 million high-quality short reads in individual Monascus species, which corresponds to 200 times the genome size. We can expect that the genome size for each species is around 24 Mb, since in the literature, the genome sizes reported for M. purpureus YY-1 and M. ruber NRRP 1597 were 24.1 Mb [12] and 24.8 Mb (https://genome.jgi.doe.gov/Monru1 /Monru1.info.html), respectively. About 94.7% of the short reads satisfied the sequencing error rate (Q20 = 1%). There were a total of 8891 genes identified in M. purpureus, 8645 in M. ruber, and 8647 in M. pilosus. These numbers are similar to the number of genes (7491) identified in M. purpureus YY-1 [12].
Furthermore, to evaluate the clusters of the enzymes related with the secondary metabolic pathways we analyzed their genome sequences using antiSMASH 4.0 [31], and we obtained 24 gene clusters for M. pilosus, 48 for M. ruber, and 20 for M. purpureus. Subsequently, we constructed a dendrogram using the BLAST alignment scores of the 92 gene clusters (Fig. 4), obtaining 22 groups with multiple gene clusters and 32 singletons. Gene clusters with identical gene organization were merged into a single group. The gene-coding regions were predicted using BLASTX search against Uni-ProtKB/Swiss-Prot database (E-value < 1.0 × 10 − 10 ).
We also obtained 54 groups of gene clusters ( Table 1) that are displayed as Venn diagrams (Fig. 5). Gene clusters were characterized as those for secondary metabolites or ABC transporters based on similarity to known genes [32]. ABC transporters are associated with PKS and NRPS gene clusters in several fungi and are responsible for the export of the corresponding secondary metabolites [33,34].
Taking gene organization in gene clusters into consideration, only two gene clusters were common in the three species. In all, 85% of the identified secondary metabolic biosynthetic gene clusters were assigned to each Monascus species. Thus, the organization of genes in the genome of each Monascus species is highly diverse (Fig. 5a1). Otherwise, if gene-clusters are classified based on DNA sequence similarity, 16 (30%) were found to be common in the three species. M. ruber has 27 intrinsic gene clusters, while M. pilosus and M. purpureus only have two and three intrinsic gene clusters, respectively ( Fig. 5a2), indicating that the number of gene clusters in M. ruber is different from the other two species. Classification for individual gene clusters was conducted using antiSMASH, which identified 8 Type I polyketide biosynthetic systems (T1PKS), 8 non-ribosomal peptide biosynthesis systems (NRPS), three complex systems of T1PKS and NRPS (T1PKS-NRPS), three terpenoid Venn diagrams of Monascus-specific metabolites reported by previous studies. Each Venn diagram classifies metabolies into reported species using a total of 74 previously reported secondary metabolites (a),citrinins (b1),monaphilones (b2),1H-isochromenes (b3),nitrogenated azaphilones (b4),and monacolins (b5), specific to Monascus-species [17,[27][28][29][30] biosynthetic systems, 27 putative fatty acid biosynthetic systems, and five others (Fig. 5b). The results of the present study strongly suggest that the three Monascus species have greatly diverged gene clusters; thus, they should be regarded as different species based on chemical taxonomy associated with the production of secondary metabolites.
We further analyzed the differences in the Monascus species at the DNA level by comparing the 8144-bp region where a Monascus azaphilone pigment biosynthetic gene cluster was localized [35]. The sequence analysis was performed for three M. ruber strains (NRRP 1596, JF83291.6, and NBRC 4483), two M. purpureus strains (NRRP 1597 and NBRC 4478), and two M. pilosus strains (NCBI and NBRC 4520), which identified nucleotide differences, including substitutions and insertions/ deletions, at 276 bp positions ( Table 2). In total, 275 positions (99.6%), except the 5194th nucleotide position, were different between the strains of (i) M. ruber (NRRP 1596 and NBRC 4483) and M. pilosus (NBRC 4520 and NCBI) and strains of (ii) M. purpureus (NRRP 1597 and NBRC 4478) and M. ruber (JF83291.6). It should be noted that NRRP 1597 and NBRC 4483 were assigned to group (i) and JF8329.6 to group (ii). M. pilosus is treated as a synonym of M. ruber in the concatenated phylogeny based on the ITS, BenA, CaM LSU, and RPB2 gene regions of 46 relative strains [36]. However, the evidence obtained in this study indicates that M. pilosus and M. purpureus are different, while M. ruber has similar biosynthetic gene clusters for citrinin, monacolin K, and Monascus azaphilone pigments with M. pilosus and M. purpureus.

Comparison of citrinin biosynthetic gene clusters in the three Monascus species
The metabolite analysis using LC-MS shown in Fig. 1b, it was shown that citrinin was produced only by M. purpureus. This raises the question of whether the biosynthetic pathway of citrinin can be regulated negatively via PD culture and whether this pathway exists in M. pilosus and M. ruber genomes. Previous studies have reported that the production of citrinin by M. ruber can be changed by altering the medium and/or culture conditions [37,38]. To address this issue, we compared the peptide sequences of citrinin biosynthetic genes in the three Monascus species with those in M. purpureus reported by Shimizu et al. [39] and Chen et al. [40]. There are six genes associated with the citrinin biosynthetic pathway, consisting of

ID Type of GCs
Pi Ru Pu Secondary metabolic pathways detected in DNA sequence homology ATP-binding cassettes Identity level The type of GCs was determined using antiSMASH software. Pi, Ru, and Pu represent the cluster ID in Fig. 4. Identical gene organization is denoted by red numbers. The secondary metabolic pathways represent the secondary metabolite information based on DNA sequence homology. Type of ATP-binding cassettes detected in individual groups is represented as ATP-binding cassettes. Gene-cluster groups with both identical gene organization and high DNA sequence similarity are denoted by '**' and groups with only high DNA sequence similarity are denoted by '*'.  pksCT (encoding polyketide synthase, PksCT/CitS), mpl1 (a serine hydrolase, CitA), mpl2 (an iron II oxidase, CitB), mpl7 (an oxidreductase, CitC), mpl4 (encoding an aldehyde dehydrogenase, CitD), and mpl6 (a short chain dehydrogenase, CitE). Table 3 shows the homologous regions of individual genes that aligned with the reference genes (E-value < 10 − 44 ). Two proteins, CitB and CitS, were shorter in M. ruber and M. pilosus than the reference sequences. However, the same proteins were conserved in M. purpureus compared with the reference sequences. It suggests that the citrinin biosynthetic genes, mpl2 (CitB), and pksCT (CitS) were incomplete in M. pilosus NBRC 4520 and M. ruber NBRC 4483, and consequently, citrinin production is blocked in these species.

Comparison of monacolin biosynthetic gene clusters in the three Monascus species
Monacolin K was first isolated from the medium of M. ruber [41] and its biosynthesis pathway was determined, in M. pilosus (BCRC 387072), composed of nine enzymes that have a high level of homology with genes in the monacolin K biosynthetic gene cluster of Aspergillus terreus [42]. Three strains of M. purpureus, specifically NRRP 1596, YY-1 (an industrial strain), and KACC 42430 (a laboratory strain), lack an intact monacolin K gene cluster [43]. In the present study, we also examined the monacolin gene clusters in three Monascus species. As shown in Table 4

Discussion
The three Monascus species examined in the present study are commonly used for food fermentation in the cuisine of East Asian countries [1][2][3]. Citrinin, a nephrotoxic agent, was reportedly produced in M. purpureus but not in M. pilosus [30,44,45]. This is corroborated by the present results from the metabolome and genome analyses revealing that citrinin biosynthetic pathways in M. pilosus were incomplete compared with those from M. purpureus. The three Monascus species can produce ubiquitous and species-specific pigment-related compounds (Figs. 2 and 3). Analysis of gene-organization revealed 54 greatly diverged gene clusters in the three Monascus species studied (Fig. 5a2) Table 2 suggests that there may some M. ruber strains which can be related with each clade. The mycotoxin citrinin is produced by various Penicillium, Aspergillus, and Monascus species [40,44,45]. Previously studied M. purpureus strains (ATCC 16365 in Java, 16379 in Taiwan, IFO 30873, and DSM 1379 by [40,47]; YY1 by [14] can produce citrinin as a secondary metabolite. However, among the Monascus species, two M. pilosus strains (BCRC 38072 in Taiwan by [40]; NBRC 4520 in this study) cannot produce citrinin. Interestingly, several previously reported M. ruber strains, particularly ATCC 16246, 16378, 16366, 18199, 16371, and 18199 by Chen et al. [40], AUMC 4066 (CBS109.07) and AUMC 5705 by Moharram et al. [48], NRRP 1597 by Kwon [43], and NBRC 4483 in this study, lack citrinin production activities, but other strains, such as Tiegh by Ostry et al. [47] and ATCC 96218 by Hajjaj et al. [38] have the potential to produce this secondary metabolite. Thus, M. ruber can be classified into citrinin-producing and non-citrinin producing types. Based on the comparison of citrinin biosynthetic proteins, the former type might correspond to M. purpureus strains and the latter to M. pilosus strains.
In the analysis of the monacolin K gene cluster, four M. purpureus strains, specifically NRRP 1596, YY-1, KACC 42430 [43], and NBRC 4478 (in this study), lack an intact monacolin K gene cluster. By contrast, M. pilosus NBRC 4520 and M. ruber NBRC 4483 have a complete set of monacolin K gene clusters. Thus, it should be noted that M. pilosus NBRC 4520 and M. ruber NBRC 4483 can produce monacolin K but lack a complete set of citrinin biosynthetic gene clusters.
The classification of strains according to the two clade groups designated as (i) M. pilosus and (ii) M. purpureus may play an important role in the food industry and industrial field through the improved utilization of Monascus species. However, in view of food safety, we need to confirm whether the toxins produced by some Monascus strains exist in the genome or metabolome. Metabolites are generally classified into primary metabolites that are essential for growth and reproduction and secondary metabolites that are usually involved in mechanisms for ecological adaptation but are not essential to regular cellular processes. Metabolic pathways can be divided into  two types: one is the general pathway shared by most fungi and the other are specialized pathways that have evolved in response to specific ecologies of certain lineages and are consequently more narrowly distributed at the taxonomic level. The citrinin pathway belongs to the former as it is present in many Penicillium, Aspergillus, and Monascus species [30,44,45]. However, the biosynthetic gene cluster of Monascus azaphilone pigments is limited in the Monascus genera. The biosynthetic process of secondary metabolites forms a cluster or non-clustered gene organization that is integral to the entire spectrum of fungal ecological strategies (e.g., saprotrophic, pathogenic, and symbiotic). Gene duplication (GD) is often implicated in the evolution of fungal metabolism (Floudas et al., 2012). A second source of metabolic gene innovation in fungi is horizontal gene transfer (HGT), which includes xenobiotic catabolism [49], toxin production [50], and degradation of plant cell walls [51]. GD and HGT were more frequently found in clustered genes than in their non-clustered counterparts [52]. In the biosynthetic gene clusters of Monascus azaphilone pigments and citrinin, the common trends in the strains of the three Monascus species are explained by the suggested M. pilosus and M. purpureus clades, whereas M. ruber has either M. pilosus or M. purpureus characteristics. Monascus-specific diverged pigments may have evolved because of GD and HGT events, resulting in the creation of clustered genes in their genomes; thus, a large number of gene clusters was observed (Table 1). Chemotaxonomy, including pigment production, is the most useful way to study the divergence of Monascus genera. Here, we compared the PKS responsible for the biosynthesis of azaphilone pigment from three Monascus species (M. pilosus, M. purpureus and M. ruber) and six strains. More genome sequences of Monascus species will need to be determined to better understand the production of secondary metabolites in these organisms.

Conclusions
In this study, the complete genome sequences of M. pilosus NBRC 4520, M. purpureus NBRC 4478, and M. ruber NBRC 4483 were obtained. Three biosynthetic gene clusters, specifically monacolin K, citrinin, and  azaphilone pigments that are involved in secondary metabolism, were analyzed and compared. The grouping of strains according to the two clade groups, designated as (i) M. pilosus and (ii) M. purpureus, may play an important role in the food industry and industrial field through the improved utilization of Monascus species. The PKS genes responsible for the biosynthesis of azaphilone pigment from the three species were compared. This genome-based analysis showed M. ruber could not be clearly grouped as a species. However, in view of food safety, further studies are needed to confirm whether the toxins produced by some Monascus strains originate from the genome and not from the metabolome.

Strains, culture conditions, and metabolite detection
Three Monascus species, specifically, M. pilosus NBRC 4520, M. purpureus NBRC 4478, and M. ruber NBRC 4483, were obtained from the National Institute of Technology and Evaluation in Japan. The three species were cultured in potato dextrose liquid medium at 30°C for 7 days with 140 rpm shaking in TAITC BR-23FP. A solution of 10 mg freeze-dried PDL medium added with 1 mL methanol was sonicated for 30 min to extract secondary metabolites. The extracted metabolites were measured using a Shimadzu LCMS-8040 system (Shimadzu, Kyoto, Japan) with 300 mm ODS MonoBis columns (Kyoto Monotech Co., Ltd., Kyoto, Japan). Each metabolite was estimated based on the m/z of each peak, referring to the m/z of the metabolites previously reported in Monascus. We measured three replicates for each species and applied two-dimensional hierarchical clustering to visualize their similarity, using the Euclidian distance of their profiles of the observed pigments concentration as a measure of similarity score and applying the Ward method.

Genome sequencing and assembly
We isolated genomic DNA from the three species individually and sequenced them using Illumina MiSeq paired-end libraries (300 bp read each end and 500 inserts). Approximately 8.5 million reads (around 5 Gb) for each sample were obtained and assembled using ABySS 2.0 de novo assembler [28]. We obtained 5000 to 10,000 assembled scaffolds for each samples and the N50 value of the total scaffolds were 133 Kb. The accumulated total length of the assembled contigs was 24.8 M bp, which is close to that of M. purpureus NRRP 1596 genome (ATCC 16365) with 23.4 Mb [12] and M. ruber NRRP 1597 (ATCC 13692) with 24.9 Mb [12]. The raw read sequences and the assembled contig sequences are deposited to the DNA Data Bank of Japan (DDBJ) and available under accession numbers DRX224643, DRX224644, DRX224645, respectively. To identify the gene-coding regions, the nucleotide sequence of the assembled scaffolds was annotated using DIAMOND, a high throughput BLASTX compatible sequence alignment algorithm [53]. The assembled sequences were also analyzed by BLASTed using the library of the whole UniProtKB/Swiss-Prot database [54]. Annotated genes of M. purpureus NRRP 1596 and M. ruber NRRP 1597 for were used for validation, with a cutoff E-value <1E-10. We further analyzed the genomes using antiSMASH pipeline [55] to extract the functional gene clusters such as PKS, in each Monascus species.