Transcriptome-wide studies on a variety of organisms have recently been conducted on a large scale, following the revolution brought about by next-generation sequencers. This study reports, for the first time, whole-transcriptome sequencing (on an Illumina GAIIx sequencer) and analysis of C. pictus leaves, undertaken to understand the molecular signatures related to the plant's anti-diabetic principles. We obtained about 3.2 Gb of raw sequence data, which was processed and de novo assembled into contigs and further into transcripts. De novo assemblies are highly dependent on k-mer length. In general, plant assemblies are difficult owing to complex gene content, higher ploidy, and higher rates of repeats and heterozygosity. Longer k-mers are advantageous in distinguishing repeats from real overlaps, are more accurate, and in general suit the assembly of highly expressed transcripts, while shorter k-mers are preferred for the assembly of lowly expressed genes. To balance the higher accuracy of longer k-mers against the better assembly of lowly expressed genes with shorter k-mers, we ran multiple assemblies to arrive at an optimal k-mer length. Specific care was taken to remove adapters and low-quality sequences from the reads so that a high-quality assembly was obtained (Table 1). The N50 value of the assembled data was comparable to that of other plant transcriptome assemblies, indicating a high-quality assembly (Table 2).
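For readers unfamiliar with the N50 statistic used above to compare assemblies, the following is a minimal Python sketch of how it is commonly computed from a list of contig or transcript lengths. The function name `n50` and the toy lengths are illustrative assumptions; this is not the pipeline used in the study, which reports its values in Table 2.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/transcript lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with hypothetical lengths (not data from this study).
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

Running the same function on the length distributions produced at each k-mer setting is one simple way to compare candidate assemblies, although N50 alone does not measure correctness.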
Complete and accurate transcriptome assembly in plants is difficult and is limited by the currently available de novo assembly tools. Hence, in our study, a single transcript might be present redundantly as multiple isoforms or in multiple fragments, and some transcripts might have been lost during assembly owing to low coverage. For instance, 4-coumarate-CoA ligase is present redundantly in multiple copies, whereas transcripts encoding lycopene cleavage dioxygenase, an important component of the bixin biosynthetic pathway, were not observed at all. Nonetheless, once more efficient assembly tools with improved algorithms are developed, the publicly available raw data can be re-used to create a better transcriptome assembly. We attempted not only to characterize the transcriptome computationally, but also to derive molecular clues to the medicinal properties of the plant. We were able to relate the anti-diabetic property of the plant to its genetic makeup. Interpreting high-throughput data is challenging, and we have suggested ways to analyse and interpret a plant transcriptome. It has been estimated that 15 to 25% of the plant genome specifies pathways of natural product biosynthesis. The high number of C. pictus transcripts annotated to secondary metabolite pathways is a clear indication of the genetic complexity of the species.
Our primary focus has been to understand the transcripts involved in the biosynthesis of the anti-diabetic principles. The surprisingly high number of transcripts corresponding to bixin, norbixin and geraniol indicates the possible involvement of these active constituents in the plant's anti-diabetic activities (Figure 3). The presence of the transcript for putative norbixin methyltransferase further supports these findings (Figure 8). Bixa orellana (annatto) is currently reported to be the sole source of the natural pigment bixin, but our findings on the presence of significant levels of bixin in C. pictus leaves suggest that the leaves could be used as an alternative source of bixin for commercial supply. Bixin and norbixin from annatto have been reported to activate Peroxisome Proliferator-Activated Receptor α (PPARα), which in turn stimulates adipocyte differentiation and increases insulin-dependent glucose uptake in differentiated 3T3-L1 adipocytes. The identification of bixin synthase transcripts in our annotations was corroborated by HPLC results suggesting the presence of bixin (Figure 9). Geraniol activates both PPARγ and PPARα, thereby improving hyperlipidemia and glucose uptake. Abscisic acid (ABA) is another notable terpenoid observed in our transcript annotations; it has anti-diabetic, anti-inflammatory, anti-obesity and immuno-modulatory properties. ABA was observed to be an endogenous stimulator of insulin release from human pancreatic islets. ABA is also known to significantly increase the expression of PPAR and its associated genes CD36 and aP2.
An earlier report states that the administration of an aqueous extract of C. pictus leaves to rats significantly reduced the levels of triglycerides and cholesterol, along with a reduction in glucose. Treatment of cells with methyl tetracosanoate purified from C. pictus resulted, at 18 hours, in PPARα expression equivalent to that obtained with rosiglitazone (50 µM), and the methanolic extracts exhibited anti-diabetic as well as anti-adipogenic activity. It is possible that the reduction in the levels of glucose, triglycerides and cholesterol occurred through the activation of both the PPARγ and PPARα pathways by ABA, bixin, norbixin or geraniol. These terpenoids might act as insulin sensitizers in a way similar to thiazolidinedione drugs. Ginger (Zingiber officinale), a taxonomically closely related species, has been shown to be effective against the development of cataract, a diabetic complication, in rats through its anti-glycating potential. C. pictus is also reported to be an anti-glycation agent, which might be due to the presence of geraniol and farnesene derivatives (geranylgeranyl, farnesylacetone, geranylgeranyl octadecanoato, geranylgeranyl formiate and geranylgeranyl acetate), which were observed to inhibit glycation and Advanced Glycation End-product (AGE) formation, thereby inhibiting certain diabetic complications. Aldose reductase, an enzyme of the polyol pathway, is involved in diabetic complications, and docking studies show that citral (a mixture of geraniol, geranial and neral) as well as geraniol inhibit aldose reductase activity. The frontline anti-diabetic drug metformin, also known as dimethylbiguanide, was developed from a plant-based molecule from Galega officinalis. The leads reported here for the first time from C. pictus might also emerge as powerful anti-diabetic and anti-glycation agents if researched further. Validation at the biochemical, cellular and pharmacological levels will supplement the transcriptomic observations.
Reactive oxygen species (ROS) are beneficial to the organism: they are involved in signalling pathways and are toxic to pathogens. However, an increase in ROS is observed in many metabolic disorders and is harmful. Oxidative stress and an increase in ROS commonly accompany type II diabetes mellitus (DM). In fact, ROS have been shown to have a causal role in insulin resistance, and a decrease in ROS suppressed insulin resistance. Hence, it is common to find that anti-diabetic herbal remedies are also potential anti-oxidants. The anti-oxidant properties of C. pictus have already been reported. ROS may play a role in either cell proliferation or cell death, depending on the intensity and location of the oxidative burst and on the anti-oxidant activities present. In cancer cells, increased constitutive oxidative stress supports tumor growth and protects the tumor from pro-apoptotic signals, promoting tumor progression. A reduction in oxidative stress leads to tumor suppression. C. pictus has also been shown to have anti-oxidant as well as antitumor properties. A number of secondary metabolites reported in this study correspond to the anti-oxidant and antitumor properties of C. pictus leaves. Compounds classified as anti-oxidants generally reduce oxidative stress, but under certain conditions they act as pro-oxidants. For instance, under non-physiological conditions, although norbixin, a precursor of bixin, was able to protect DNA from damage by ROS, it might also create circumstances that amplify the damaging oxidative signal unless some other anti-oxidant comes to the defence. This leads us to suggest that a single isolated compound might not have the desired effect and might even turn out to be toxic, promoting DNA damage as a pro-oxidant. Hence, a combination of plant compounds at optimal dosage is probably necessary for a beneficial effect on a system.
C. pictus plants are known for their excellent insect resistance. They are also reported to have anti-microbial properties. These properties are supported by the secondary metabolite pathway annotations. It should be noted that plants generally produce secondary metabolites in minimal quantities, in contrast to primary metabolites. The fragmentation of mRNAs during library preparation could lead to the loss of all or part of some important genes if their expression is very low. Low expression also means that adequate sequence coverage will not be available and the fragmented sequences might not be assembled into complete transcripts. Hence, we chose to include any pathway hit in the annotation, even if only a few of its enzymes were captured in sequencing. For instance, lycopene cleavage dioxygenase, which converts lycopene to bixin aldehyde, was cloned in Escherichia coli and subsequently activated the bixin biosynthetic pathway. In our study, we did not observe transcripts corresponding to lycopene cleavage dioxygenase, whereas transcripts corresponding to the other two enzymes, bixin aldehyde dehydrogenase and norbixin carboxyl methyltransferase, were observed. One possibility is that the transcript was not expressed at adequate levels and was lost during cDNA fragmentation before sequencing or during de novo assembly. The other possibility is the presence of an alternate precursor for bixin biosynthesis. At this stage, these are the only explanations we can offer for the missing transcripts.
Critical annotations from GO (Figure 4) and KOG (Figure 5) provided evidence of signal transduction mechanisms, resistance properties, DNA-binding functions and defense mechanisms. Pfam annotations (Figure 6) abounded with protein kinase domains. There is evidence that C. pictus initiates an insulin secretory response by increasing Ca2+ influx through voltage-gated calcium channels (VGCC) in mouse and human islet cell cultures. In human granulocytes, ABA has been shown to bind to the plasma membrane through a pertussis toxin (PTX)-sensitive receptor-G protein complex, which leads to an increase in cAMP, activation of protein kinase, and phosphorylation of the ADPRC CD38 with cADPR overproduction, eventually leading to an increase in Ca2+ levels. The presence of ABA biosynthesis transcripts (Figure 3) in the pathway annotations of the present study could be functionally correlated with the anti-diabetic activity of C. pictus, possibly through activation of protein kinases.
The expression study gives us some clues about the assembly. The transcripts with the lowest expression values could either be novel genes of interest with very low copy numbers or mis-assemblies that found no similarity in the sequence databases. Beyond annotating the data, we have also mined it for other information such as SNPs and SSRs, which will be invaluable, especially because C. pictus is a non-model plant with no genome sequence available. The reported SNPs and SSRs could be used as molecular markers for the construction of genetic linkage maps in the future. Substantial oxalate content and oxalate oxidase activity were reported in fresh leaf extracts. The annotation results, however, did not pick up oxalate oxidase or oxaloacetate acetylhydrolase (the enzyme involved in the conversion of oxaloacetate to oxalate) among our transcripts. Our analysis indicates only the presence of malate dehydrogenase, the enzyme involved in the conversion of malate to oxaloacetate.
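To illustrate what mining transcripts for SSRs involves, the following is a minimal Python sketch that scans a nucleotide sequence for tandemly repeated short motifs using a regular expression. The function name `find_ssrs`, the motif sizes (2 to 6 bp) and the minimum of four repeats are assumptions chosen for illustration; the study's own SSR tool and thresholds are not described in this section.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each simple sequence repeat.

    Thresholds are illustrative assumptions, not the study's settings.
    Limitation: a homopolymer run such as AAAAAAAA is reported with the
    unit "AA", because single-base units are not searched for here.
    """
    seq = sequence.upper()
    # Lazy {m,n}? prefers the shortest repeat unit; \1{k,} requires the
    # captured unit to recur at least k more times immediately after it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence (hypothetical, not from the C. pictus assembly).
print(find_ssrs("TTGAGAGAGAGACC"))
```

In practice each assembled transcript would be passed through such a scan, and the flanking sequence around each hit would be used for primer design when developing markers.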