Figure 1 | BMC Genomics

Figure 1

From: Codon-triplet context unveils unique features of the Candida albicans protein coding genome

Schematic representation of the bioinformatics system. Gene sequences were downloaded from genome databases (Table 1), filtered to eliminate false open reading frames (ORFs), and stored in a local database. Each ORF was then processed by counting all codon-triplets except the first and the last, which have specific translation initiation and termination contexts. The counts were stored in a 3-dimensional 61 × 61 × 61 matrix and saved as a Microsoft Access database file. The processed data were then analyzed using Weka-3 data mining tools [19] and direct database queries. This methodology allowed us to handle very large data sets and to identify differences in codon-triplet context between fungal species. These differences were finally subjected to statistical analyses.
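
The counting step described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard stop codons (TAA, TAG, TGA), an alphabetical ordering of the 61 sense codons, a sliding window that advances one codon at a time, and a sparse `Counter` keyed by `(i, j, k)` in place of a dense 61 × 61 × 61 array. The names `count_codon_triplets`, `SENSE_CODONS`, and `CODON_INDEX` are hypothetical.

```python
from collections import Counter
from itertools import product

# Assumption: standard stop codons; the remaining 61 codons are the sense codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
SENSE_CODONS = [
    codon
    for codon in ("".join(bases) for bases in product("ACGT", repeat=3))
    if codon not in STOP_CODONS
]
# Map each sense codon to an index 0..60 (alphabetical order is an arbitrary choice).
CODON_INDEX = {codon: i for i, codon in enumerate(SENSE_CODONS)}


def count_codon_triplets(orf, counts=None):
    """Add the codon-triplet counts of one in-frame ORF to `counts`.

    `counts` is a sparse stand-in for the 61 x 61 x 61 matrix: the key
    (i, j, k) holds the number of times codons i, j, k occur consecutively.
    The first and the last triplet of the ORF are skipped.
    """
    if counts is None:
        counts = Counter()
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    n_windows = len(codons) - 2  # number of 3-codon windows in the ORF
    # Windows 0 and n_windows - 1 are the first and last triplets: skip both.
    for start in range(1, n_windows - 1):
        window = codons[start:start + 3]
        try:
            key = tuple(CODON_INDEX[codon] for codon in window)
        except KeyError:
            continue  # window contains a stop or ambiguous codon
        counts[key] += 1
    return counts


# Usage: accumulate counts over several ORFs into one shared table.
totals = Counter()
for orf in ["ATGGCTGCTGCTGCTTAA", "ATGAAAGCTGCTGCTAAATGA"]:
    count_codon_triplets(orf, totals)
```

A sparse table keeps memory use proportional to the number of triplets actually observed. A dense array of 61 × 61 × 61 = 226,981 cells would serve equally well when every cell is needed for downstream statistics.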
