Table 5. Gene set quality measurements, including deviation of protein size from the group median and maximal bit score per species in pairwise comparisons within the arthropod orthology groups. The bit score reflects both gene model artefacts, when alternative gene sets within a species are compared, and evolutionary divergence. Protein sizes may be more evolutionarily conserved, and may detect artefacts both across and within species (see footnote a).

From: OGS2: genome re-annotation of the jewel wasp Nasonia vitripennis

Gene set         Average homology   Protein size deviation   Percent shorter than 2 standard
                 bit score          from median              deviations from median
Nasonia OGS2     727.6              −7.7                     3.2
Nasonia NCBI     722.3              −7.8                     2.7
Nasonia OGS1.2   683.5              −12.7                    4
Apis             733.9              −0.3                     2.4
Harpegnathos     694.3              −30                      7.3
Tribolium        552                −26.1                    4.5
Drosophila       508.7              54.5                     1.3
a. For each orthology group, the median protein size across all genes from all species in the group is determined. Then, for each species gene set, the maximal BLASTp bit score of a gene within that group is recorded as metric #1, and the difference between the protein size of that maximal match and the group median is recorded as metric #2. These metrics are averaged over all groups per species and reported as average bit score, average size deviation, and percentage of size outliers (more than 2 standard deviations below the median size). These gene set quality measurements are provided by the Evigene scripts: “” and “”. Partial gene models are a common artefact of draft gene sets, indicated by both a negative deviation from group median sizes and a larger percentage of outliers. A similar calculation is part of the OrthoDB methodology [108].
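The calculation described in the footnote can be sketched in a few lines of Python. This is an illustrative reading of the footnote, not the Evigene scripts themselves. The function name `gene_set_quality` and the input layout (one tuple of group, species, bit score and protein size per gene) are hypothetical. The footnote does not say how the standard deviation is obtained, so the sketch assumes it is the population standard deviation of protein sizes within each orthology group.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def gene_set_quality(records):
    """Summarise gene set quality per species.

    records: iterable of (group_id, species, bit_score, protein_size),
    one tuple per gene (hypothetical input layout).
    """
    # Collect the genes of each orthology group.
    groups = defaultdict(list)
    for group_id, species, bit_score, size in records:
        groups[group_id].append((species, bit_score, size))

    # species -> list of (best bit score, size deviation, is short outlier)
    per_species = defaultdict(list)
    for members in groups.values():
        sizes = [size for _, _, size in members]
        group_median = median(sizes)
        # Assumption: the outlier threshold uses the within-group spread.
        group_sd = pstdev(sizes)

        # Keep the highest-scoring gene of each species in this group.
        best = {}
        for species, bit_score, size in members:
            if species not in best or bit_score > best[species][0]:
                best[species] = (bit_score, size)

        for species, (bit_score, size) in best.items():
            deviation = size - group_median  # metric #2
            is_outlier = group_sd > 0 and deviation < -2 * group_sd
            per_species[species].append((bit_score, deviation, is_outlier))

    # Average the per-group metrics for each species.
    summary = {}
    for species, rows in per_species.items():
        summary[species] = {
            "avg_bit_score": mean(r[0] for r in rows),
            "avg_size_deviation": mean(r[1] for r in rows),
            "pct_short_outliers": 100.0 * sum(r[2] for r in rows) / len(rows),
        }
    return summary
```

Under these assumptions, a gene set with many partial models would show a negative `avg_size_deviation` and a higher `pct_short_outliers`, which is the pattern the footnote describes for draft gene sets.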