Comparative genomics and functional study of lipid metabolic genes in Caenorhabditis elegans

Background Animal models are indispensable to understand the lipid metabolism and lipid metabolic diseases. Over the last decade, the nematode Caenorhabditis elegans has become a popular animal model for exploring the regulation of lipid metabolism, obesity, and obese-related diseases. However, the genomic and functional conservation of lipid metabolism from C. elegans to humans remains unknown. In the present study, we systematically analyzed genes involved in lipid metabolism in the C. elegans genome using comparative genomics. Results We built a database containing 471 lipid genes from the C. elegans genome, and then assigned most of lipid genes into 16 different lipid metabolic pathways that were integrated into a network. Over 70% of C. elegans lipid genes have human orthologs, with 237 of 471 C. elegans lipid genes being conserved in humans, mice, rats, and Drosophila, of which 71 genes are specifically related to human metabolic diseases. Moreover, RNA-mediated interference (RNAi) was used to disrupt the expression of 356 of 471 lipid genes with available RNAi clones. We found that 21 genes strongly affect fat storage, development, reproduction, and other visible phenotypes, 6 of which have not previously been implicated in the regulation of fat metabolism and other phenotypes. Conclusions This study provides the first systematic genomic insight into lipid metabolism in C. elegans, supporting the use of C. elegans as an increasingly prominent model in the study of metabolic diseases.


Background
Lipids are hydrophobic molecules with thousands of species including fatty acids and fatty acids derivates, e.g., triglycerides, phospholipids, sphingolipids, sterols, and the like. Generally, lipids function as energy storage, components of biological membrane, and signaling molecules, and likewise play a variety of biological roles in almost all aspects of life: growth, development, reproduction, stresses resistance, aging and longevity, etc. [1]. Accordingly, the homeostasis of lipid metabolism must be tightly regulated. Misregulation of lipid metabolism is often associated with many human diseases, particularly metabolic diseases, such as obesity, type 2 diabetes (T2D), cardiovascular diseases (CVD), and non-alcoholic fatty liver (NAFL) [1,2], all of which are increasingly prevalent both in both developed and developing countries, such as China [3].
Previous studies have illustrated that the human systems which control appetite, energy partitioning, and the integration of lipid metabolic processes are highly complex and redundant [4,5]. Therefore, employing animal models is indispensable in further understanding the lipid metabolism and lipid metabolic diseases. In some situations, lower and less complex organisms are particularly useful in providing simpler and clearer views of the fundamental regulation of lipid metabolism as opposed to higher, more complex organisms. Chiefly due to the advantages of rapid generation time over the last decade, easy genetic manipulation and visualization of lipid droplets at whole animal level has made the tiny nematode, C. elegans an emerging model for exploring the biological functions of lipids and the genetic basis of fat storage regulation. To date, research on this organism has uncovered numerous findings in lipids metabolism and the associated fundamental regulatory mechanisms [6][7][8][9][10][11], thereby prompting further exploration of its utility as an animal model in these types of studies.
As the first multicellular organism to have its genome completely sequenced [12] and further deeply analyzed [13] C. elegans possesses for biosynthesis of polyunsaturated fatty acids [14], mono-methyl branched fatty acids [15], ceramides [16], phospholipids [17][18][19][20][21], and also the encoding of many lipid binding proteins and transporters [22]. This organism, however, lacks some key genes involved in the biosynthesis of sterol, forcing the worm to obtain it from its diet [23,24]. To date, however, how many potential lipid metabolic genes and pathways do exist in the C. elegans genome is not known. Likewise and more importantly for its application as an animal model, the genomic and functional conservation of lipid metabolism from C. elegans to humans is unclear, limiting C. elegans' utility for in depth exploration of the fundamental mechanisms of lipid metabolism and human metabolic diseases.
In the present study, we analyzed genes involved in lipid metabolism in the C. elegans genome by comparative genomics. We first constructed a database containing 471 lipid metabolic genes from the C. elegans genome, and further assigned most of these into different lipid metabolic pathways. Afterward, we systematically compared these genes in the human, Drosophila, mouse, and rat genomes. Ultimately, we took the advantage of RNA-mediated interference (RNAi) to disrupt the expression of most lipid metabolic genes, allowing us to explore the biological roles these play in C. elegans.

Results
Database construction of C. elegans lipid metabolic genes To date, the KEGG database (http://www.genome.jp/kegg/) has collected 168 lipid metabolic genes from the C. elegans genome; this may be far from a complete listing, given the diversity and complexity of lipids in any organism. To garner a full list of lipid metabolic genes from the C. elegans genome, we retrieved lipid metabolic genes by manually searching the WormBase website (www.wormbase.org) and published papers collected from NCBI PubMed, as well as the C. elegans orthologs of 407 human lipid metabolic genes and 168 lipid genes from the KEGG database [22,[25][26][27][28]. Eventually, 471 lipid metabolic genes were retrieved from the C. elegans genome including 147 orthologs of human lipid metabolic genes and 156 genes collected from the literature and WormBase (Figure 1 and Additional file 1: Table S1). In doing so, we constructed the first complete database of lipid metabolic genes of C. elegans, covering most potential lipid metabolic genes that could be found out in C. elegans genome at present (Figure 1 and Additional file 1: Table S1).
The biochemical function of the 471 lipid metabolic genes was further annotated with Ensembl BioMart (http:// asia.ensembl.org/biomart/martview/dd89899865f812aff4dcd 9ad0dfede49), though the 147 human orthologs and 168 lipid genes from KEGG were already annotated (http://www.genome.jp/kegg/). All 471 genes were also checked again in WormBase (Additional file 1: Table S1). Fuxrthermore, the function annotation obtained from ClueGo [29] showed that 471 lipid genes in C. elegans were not only involved in distinct lipid metabolic processes, but also play multiple roles in response to oxidative stress, aging and longevity, as well as other biological processes (Additional file 2: Figure S1).
Most of the 471 lipid genes in this database are predicted to encode specific enzymes that play roles in defined reactions, though experimental biochemical evidences for this conclusion may be lacking. Some genes predicted to encode the same or similar enzymes were classified into a family. However, they may actually have same, distinct, or no biological conversion/function, as well as may involve into one or more reactions (Table 1). For example, 7 fatty acid desaturase genes were present in the C. elegans genome, but they display completely different preference of substrates except fat-6 and fat-7 [14]. Cytochrome P450 enzymes are a superfamily of haem-containing monooxygenases with extraordinary diversity in their biochemical function across all kingdoms of life. Within the C. elegans genome, 81 full length cytochrome P450 (CYP) genes have been found, but the exact biological functions of the vast majority of CYPs are largely unknown as of yet [30]. Only 38 of the 81 CYPs may play a role in lipid metabolism, based on our findings from KEGG, WormBase, and the published literature. Another super family is lipase, consisting of 66 genes (Table 1), which may encode hormone-sensitive lipase, phospholipase A2, B1, C, D,   Figure 2).
To gain a clearer picture of lipid metabolic pathways in C. elegans, all 16 pathways were further integrated into a lipid metabolism network ( Figure 2). Additionally, 303 lipid genes including 147 human orthologs and 156 genes reported in the literature were assigned into the 16 pathways based on their annotated or reported biological function ( Figure 2). However, due to a lack of both clear evidence of the precise biological function and detailed information, some genes could not be assigned into any pathway, even though their paralogs were assigned in the same pathway. Subsequently, we constructed the first lipid metabolism network with 16 pathways containing 471 lipid genes from C. elegans ( Figure 2).
Only 35 of the 81 CYPs could be assigned into different lipid metabolic pathways ( Figure 2). Based on KEGG annotation, cyp-33E2 may function in fatty acid metabolism, though it was reported to be involved in the biosynthesis of eicosanoides derived from eicosapentaenoic acid (EPA) [31]. Therefore, we reassigned cyp-33E2 into the arachidonic acid metabolism pathway ( Figure 2). Despite cyp-35B1 being shown as a target of insulin/ IGF-I-like signaling (IIS) [32], and Cyp-31A2 and Cyp-31A3 being involved in the production of lipids required for eggshell formation [33], their real catalyzing reactions were unclear enough that they could not be assigned into any lipid metabolic pathway. Research on animal models has brought numerous powerful and valuable insights to both basic human biology and disease pathology [34][35][36]. However, under many circumstances, animal models fall short of our expectations, going so far as to result in contradictory or opposite findings. These discrepancies may be due in part to the changes in the conservation of genetically controlled pathways from human to model animals [36][37][38]. This reality spurred us to inquire about the conservation of lipid metabolism that exists between C. elegans and humans, as well as mice [39,40], rats [41,42], and Drosophila [43][44][45][46], all of which are commonly used as models to understand the mechanisms and potential therapies for human metabolic diseases.
All 471 C. elegans lipid genes were used to search for orthologs in the human, mouse, rat and Drosophila genomes using Ensembl BioMart (Methods). Eventually, 581, 585, 563, and 428 lipid genes respectively were obtained in the human, mouse, rat, and Drosophila genomes (Additional file 3: Table S2 and Additional file 4: Figure S2). Comparison of these lipid genes revealed that 237 genes are conserved in all five organisms. Over 70% of C. elegans lipid genes have orthologs in humans, mice, and rats (Table 2). Conversely, 72.12% (419/581) human lipid genes are conserved in C. elegans (Table 2), suggesting a high conservation of lipid metabolism genes between C. elegans and humans, as well as between C. elegans and both mice and rats.

Conservation of C. elegans lipid metabolic genes in human disease
An imbalance of major lipid signaling pathways has previously been implicated in obesity, type 2 diabetes, atherosclerosis, hypertension, non-alcoholic fatty liver disease, etc. [1,47]. To analyze the relationship between lipid genes and human metabolic diseases, human disease genes were retrieved from the OMIM database (http://www.ncbi.nlm. nih.gov/omim/). At present, the OMIM database has collected 19,994 MIM IDs including genes, phenotypes, and locus which we downloaded from the newly updated NCBI FTP. Using Ensembl Biomart ID Mapping, we found 15,181 genes that could be considered human disease genes. Comparison between humans and the other four organisms revealed that mice, rats, C. elegans, and Drosophila have 13259, 12717, 5397, and 5889 respective orthologs, in which 5085 genes are conserved in all five organisms ( Figure 3A). Furthermore, 1264 of these 15181 genes may be considered metabolic disease genes that were filtered within the database by keywords 'Obesity, Diabetes, Metabolic disease, and others major obesity related diseases'. Similarly, 1152, 1051, 638, and 632 orthologs of human metabolic disease genes could be found among mouse, rat, C. elegans, and Drosophila, respectively, in which 346 orthologous genes are conserved in all five organisms ( Figure 3B).
In humans, 97 of the 581 lipid genes are overlapped with metabolic disease genes ( Figure 3C). Similarly, 94 of the 471 C. elegans lipid genes are orthologs of human metabolic disease genes ( Figure 3D). Crucially, all 94 C. elegans lipid genes associated with metabolic diseases are completely overlapped with 97 human metabolic disease genes.
In mice, rats, and Drosophila, respectively there are 128, 107, and 97 lipid genes associated with metabolic diseases, with 71 genes conserved across all five species (Additional file 5: Table S3). Moreover, 327 of the 471 lipid metabolism genes in C. elegans are also orthologs of human disease genes ( Figure 3E), implying that great number of lipid genes are also involved in other disease processes aside from metabolic syndrome.

Investigated of the biological functions of lipid metabolic genes in C. elegans by RNAi
Among the 471 C. elegans lipid metabolic genes, only a small number have been biochemically and functionally characterized, demonstration that they participate into growth, development, dauer formation, aging and longevity, stress responses, and the like. However, the biological functions of most lipid metabolic genes remain unknown. The current Argilent C. elegans RNAi library covers about 86% of the known and predicted genes, providing a powerful tool to further investigate the biological functions of the lipid genes. We opted to take advantage of RNA-mediated  interference (RNAi) to disrupt the expression of each lipid gene and investigate easily visible phenotypes including growth, sterility, and fat storage, indicated by Nile Red staining of fixation [48].
Of the 471 lipid genes in C. elegans, 356 have available RNAi strains. Of those with available strains, RNAi knockdown of 21 genes consistently displayed remarkable phenotypes with significantly altered fat storage, growth defect, lethal, or sterility (Table 3), while10 of these 21 genes significantly affect fat storage. Inactivation of sams-1, mboa-7, and C05D11.7 led to large lipid droplet size (Figure 4). LET-767 has been reported to be a major 3-ketoacyl-CoA reductase required for the production of monomethyl branched and long chain fatty acids [49]. DHS-16 is a novel 3-hydroxysteroid dehydrogenase regulating the production of dafachronic acids (DAs) [50]. In C. elegans, RNAi knockdown of dsh-16 and let-767 slightly but obviously increased the size of lipid droplets (Figure 4).
RNAi inactivation of lpin-1, sptl-1, acs-1, cyp-29A2 and fat-6, meanwhile, resulted in significantly decreased fat storage and a smaller size of lipid droplets. Additionally, inactivation of 9 genes led to slower growth ( Figure 5). pod-2 and fasn-1 encode acetyl-CoA carboxylase and fatty acid synthase, respectively, and are predicted to catalyze the first and second step in de novo fatty acid biosynthesis [PATH: Cel00061]. Disruption of either of these two key genes by RNAi definitely led to both lethality and larval arrest ( Figure 5). Meanwhile, inactivation of 13 other genes resulted in an egg laying variant (Table 3).

Discussion
Understanding the lipid metabolism pathways and the involved genes is critical to further study of the biological functions of lipids and the mechanisms of metabolic diseases. A well-known animal model with a sequenced genome, like C. elegans, can provide a systemic genomic review for the whole lipid metabolism network. Several research groups have bioinformatically or functionally characterized one or several lipid metabolism pathways in C. elegans [14,16,22,26,68,[74][75][76]. In our study, we created a database of 471 C. elegans lipid metabolism genes retrieved from the KEGG data, WormBase, and the published literature ( Figure 1 and Additional file 1: Table S1). According to their annotation and real biological functions, these 471 genes could be designed into 16 lipid metabolism pathways integrated into a network (Figure 2). To our knowledge, ours is the first study to systematically analyze the lipid genes, metabolism pathways, and their biological functions in the animal model C. elegans. Remarkably, aside from the 471 C. elegans lipid genes, we additionally found 581, 585, 563, and 428 respective lipid genes in the human, mouse, rat, and Drosophila genomes. Comparison of these genes revealed that 237 lipid genes are conserved in all five species. As such, this work provides useful information regarding lipid metabolism that will aid research utilizing C. elegans and other organisms as animal models. Among the 471 C. elegans lipid genes, 345 (73.25%) have orthologs in humans. Conversely, 380 of the 581 (65.4%) human lipid genes had orthologs in C. elegans. Notably, of 97 human lipid genes related to metabolic diseases, 94 have orthologs in C. elegans and 71 genes are conserved in the human, mouse, rate, C. elegans and Figure 4 Inactivation of 10 lipid genes by RNAi dramatically changed fat storage in C. elegans. Late L4 worms were fixed with paraformaldehyde and stained with Nile Red [48]. Images were captured using identical settings and exposure time for each image. Inactivation of sams-1, mboa-7, and C05D11.7 led to increased lipid size; RNAi knockdown of dsh-16 and let-767 slightly though noticeably increased the size of lipid droplets in C. elegans; Meanwhile, RNAi inactivation of lpin-1, sptl-1, acs-1, cyp-29A2 and fat-6 resulted in significantly decreased fat storage and comparatively smaller size of lipid droplets. Bar, 10 μm. Arrow indicates the lipid droplet.
Drosophila genomes, suggesting that lipid genes involved into metabolic diseases are highly conserved among model organisms. Altogether, these results indicate that C. elegans and humans share high conservation in terms of lipid genes and their roles in metabolic diseases, suggesting that C. elegans is likely a useful and insightful model for use not only in research on the fundamental biology of lipid metabolism, but also exploring the mechanisms of human metabolic diseases.
Although mboa-7 affects PUFA incorporation into PI [18], dhs-16 regulates dafachronic acids metabolism [50], and sptl-1 is involved in the biosynthesis of ceramide [16], they have not been previously implicated in fat storage. RNAi inactivation of mboa-7 and dhs-16 displayed increased lipid droplets size (Figure 4), whereas RNAi inactivation of sptl-1decreased lipid droplet size. The fat content of all three genes should likewise be measured by other methods (e.g., thin-layer chromatography and gas chromatography (TLC/GC)) in future studies. acs-1 affects the incorporation of monomethyl branched chain fatty acid (C17iso) into phospholipids [56] and was reported to increase fat mass [57]. RNAi of acs-1, however, indeed decreased the density and size of lipid droplet (Figure 4). Inactivation of let-767 by RNAi led to slower growth ( Figure 5), but once the worms reached the late L4 or early young adult stage, they had increased lipid droplets size ( Figure 4). Additionally, RNAi of K12H4.5, kat-1, and F25B4.6 resulted in slower growth ( Figure 5), but fat storage looked normal once the worm entered into late L4 or early young adulthood. Several members of cytochrome P450 superfamily were reported to functionally affect lipid metabolism [31,33,85,86]. We only found that RNAi of cyp-29A2 significantly decreased fat storage with small lipid droplets as compared with the control (Figure 4). Consequently, these lipid genes should potentially be considered as a starting point to further investigate their role in the regulation of fat storage.

Conclusions
We created a database of 471 C. elegans lipid metabolism genes that could be designed into 16 lipid metabolism pathways integrated into a network. Additionally, 581, 585, 563, and 428 respective lipid genes in the human, mouse, rat, and Drosophila genomes were also retrieved. Over 70% of C. elegans lipid genes have human orthologs, with 237 of 471 C. elegans lipid genes being conserved in humans, mice, rats, and Drosophila, of which 71 genes are specifically related to human metabolic diseases. Furthermore, RNA-mediated interference (RNAi) of 21 lipid genes affects fat storage, development, and reproduction. This study provides the first genomic view on lipid metabolism in C. elegans, supporting the use of C. elegans as an increasingly prominent model in the study of metabolic diseases.

Retrieval of C. elegans lipid metabolic genes and database construction
Most of lipid metabolism genes used in this study were initially downloaded from the KEGG pathway database (http://www.genome.ad.jp/kegg/pathway.html), which has to date collected 168, 407, 383, 333, and 170 lipid metabolic genes respectively from the C. elegans, human, mouse, rat, and Drosophila genomes. Of the 407 human lipid genes present in the C. elegans genome, 364 were orthologs. Another 156 lipid metabolism genes in C. elegans were manually taken from WormBase (www. wormbase.org) and the published literature, mainly retrieved from PubMed (http://www.ncbi.nlm.nih.gov/ pubmed/). Any overlapping genes were excluded with Perl program manually and online using the Venny program (http://bioinfogp.cnb.csic.es/tools/venny/ index. html). In total, 471 C. elegans lipid genes were been retrieved. The BioMart Interface (http://asia.ensembl.org/ biomart/martview/53ac7015dbd78f9a5c2f0a3479fa4dc4) was applied to standardize all gene IDs.
Identification of lipid metabolic genes in human, mouse, rat, and Drosophila We used 471 C. elegans lipid metabolic genes to search for orthologs in the human, mouse, rat and Drosophila genomes by Ensembl BioMart and found 370, 389, 375, and 352 orthologs in human, mouse, rat, and Drosophila, respectively ( Table 2). The KEGG database has collected 407 human lipid metabolic genes to date; after combining these and 370 human orthologs of C. elegans lipid genes, the overlapped genes were discarded. Within the human genome, 581 lipid genes have been found, and we used these in reversely to search for orthologs in C. elegans, mouse, rat, and Drosophila genomes (419, 539, 517, and 340 genes, respectively (Table 2)). The KEGG database contains 383, 333, and 170 lipid metabolic genes for the mouse, rat, and Drosophila genomes, respectively. After combining both sets of lipid genes and discarding any overlapped gene in each organism. In total, 585, 563, and 428 lipid genes could be identified in mouse, rat, and Drosophila genomes, respectively, with the additional 581 lipid genes within the human genome (Additional file 3: Table S2 and Additional file 4: Figure S2).

Annotation of lipid metabolic genes in C. elegans
The 168 lipid metabolic genes we retrieved from the KEGG database were already annotated. The remaining 303 lipid genes were annotated via Ensembl BioMart (http:// asia.ensembl.org/biomart/martview/53ac7015dbd78f9a5c2f 0a3479fa4dc4), Swiss-Prot/TrEMBL (http://www.expasy. org), and Gene Ontology (http://www.geneontology.org/). However, since the annotation databases are incomplete, if a gene could not be annotated from a database, they were annotated by hand using previously published papers (primarily retrieved from PubMed) and WormBase (http:// www.wormbase.org/). Following annotation, the biological functions of all 471 genes were manually back checked via WormBase.
The biological processes, KEGG path, gene cellular localization, and molecular function annotation of the 471 genes were processed through ClueGo [29] plug-in of Cytoscape [87].
Retrieval and comparison of disease associated genes in humans, mice, rats, C. elegans, and Drosophila From the NCBI database (ftp://ftp.ncbi.nih.gov/gene/ DATA/), 19994 MIM disease ID genes, phenotypes, and locus were downloaded from NCBI ftp. In total, 15181 genes could be mapped by Ensembl Biomart ID Mapping and were therefore considered human disease genes. All metabolic disease genes were searched from OMIM database (http://www.ncbi.nlm.nih.gov/omim/) using the following keywords: Obesity, Diabetes, Metabolic disease, Sleep apnea, Atherosclerosis, Coronary disease, Hyperlipidemia, and Hypertension. After excluding redundant genes, some further 1264 human metabolic disease genes were identified in our database.
Orthologs information of human gene inferred by Ensembl (release 66) ortholog detection pipeline was obtained from Ensemble's BioMart interface (http://asia. ensembl.org/biomart/martview/53ac7015dbd78f9a5c2f0a34 79fa4dc4). Specifically, we retrieved Ensembl identifiers of orthologs to human genes together with identifiers of their translated protein products. Details of the ortholog detection pipeline are described at http://aug2007.archive. ensembl.org/info/data/compara/homology_method.html. Briefly, gene families were identified from all sequences in the database by WU-Blastp and Smith-Waterman searches, followed by construction of a phylogenetic tree for each gene family to identify orthologous and paralogous relationships between gene pairs. All orthologs of lipid metabolism genes, human disease genes, and human metabolic genes were then compared by using VENNY, an interactive tool for comparing lists with Venn diagrams (http://bioinfogp.cnb.csic.es/tools/venny/ index.html).
Worm strains, growth condition, and RNA interference Standard protocols were used to maintain the C. elegans N2 strain growing at 20°C with E. coli OP50 for food, unless specifically noted. RNAi was performed by feeding bacterial strains from the Ahringer C. elegans RNAi library, obtained from Gene Services (Source BioScience) [88,89]. The empty vector L4440 in the Ahringer library E. coli HT115 strain was used as the negative control for RNAi experiments. The current Ahringer C. elegans RNAi library covers about 86% of both known and predicted genes. In our C. elegans lipid metabolism genes database, only 356 genes have available RNAi bacterial strains. Generally, eggs were isolated from gravid adults using hypochlorite treatment and hatched in M9 buffer overnight, and then synchronized L1 worms plated onto NGM plates seeded with E. coli strain HT115 carrying the RNAi clone of a specific lipid gene. Worms were occasionally checked after 48 hr. RNAi of 356 genes was carried out for four biological rounds.
Nile Red staining of fixed nematodes to visualize fat storage Late L4s or young adult nematodes with one or two eggs within their uterus were washed off growth plates, fixed and then stained with Nile Red as described previously [48,90]. All images were captured using identical settings and exposure time.

Growth rate and sterility analysis
Growth rate analysis was conducted in line with our previous report [90]. All images were captured using identical settings and exposure times. Worms which clearly laid fewer eggs were recorded as having a "egg laying defect", and those with no eggs were recorded as "sterile".

Additional files
Additional file 1: Table S1. C. elegans lipid metabolic gene database. 471 lipid metabolism genes were retrieved from the C. elegans genome. Different colors were used to highlight the origin of the various lipid genes. The metabolism pathways of each gene mainly come from the KEGG database.
Additional file 2: Figure S1. Annotation of biological processes of 471 C. elegans lipid metabolism genes using ClueGO. The chart displays part of significant enrichment analysis of Gene Ontology molecular function in C. elegans lipid metabolic gene database. The x axis stands for the amount molecular function terms in Gene Ontology. One star denotes P < 0.05, while two stars denote P < 0.01.