Fig. 2. From: Functional annotation of a divergent genome using sequence and structure-based similarity

The annotated genome of V. necatrix.

(a) Pie chart summarizing the functional annotation output using a combination of sequence and structure-based hits and experimental data. Compared to ProtNLM or eggNOG (yellow, marked by black dashed lines), our complementary approach improved the genome annotation by an additional 319 final curated gene functions, here shown in yellow. Further, 107 experimentally solved protein structures (black) from the PDB are listed as structural matches. The 220 genes that have homologs in other microsporidia but are of unknown function are shown in dark grey. Light grey represents 928 hypothetical V. necatrix genes that have no matches to the known genes of other microsporidia.

(b) Approximate localization of the rDNA genes 16S/23S (blue) and 5S (green) on the 12 chromosomes of the two predominant pseudo-haplotypes 1 (black) and 2 (grey). The inset depicts one rDNA in shades of blue (light blue for the 16S, dark blue for the 23S) and one 5S gene in green. The internal transcribed spacer (ITS) is shown in yellow.

(c) Structure-based network of highly abundant protein-fold families encoded by our V. necatrix genome. AlphaFold-predicted protein models were analyzed for structural relatedness in a Foldseek all-against-all search. Structural similarity is represented by the TM score, which serves as the similarity measure for the protein network graph generated in Gephi (v0.9.2). Each node represents a protein colored according to its fold family. Proteins whose outline and fill colors are inverted relative to the main cluster have an additional common domain besides the one unifying the main cluster, i.e., Clp R domain-containing proteins and actin(-like) proteins. Connecting lines indicate structural relatedness between proteins, and thicker lines indicate greater structural similarity.
PTP6, polar tube protein 6; RBL, ricin B lectin; MCM, minichromosome maintenance; Serpin-type protein, serine-protease inhibitor type protein; MULE domain, Mutator-like elements domain; Tr-type G domain, translation-type guanosine-binding domain; SP, signal peptide; Clp R domain, caseinolytic protease repeat domain; AAA+, ATPases associated with diverse cellular activities.