Figure 3

Fingerprint of seven Rickettsia genomes. BLAST hit weights of all protein-coding genes in the seven Rickettsia genomes are plotted. (A-C) Kernel density functions of the self, close and distal weights. The x-axis represents the weight of each gene. The y-axis represents the probability density of genes with the corresponding weight in the genomes. In this example, rule 1 (close weight < cutoff) and rule 2 (distal weight ≥ cutoff) were applied. The close and distal cutoffs computed under the conservative criterion are indicated by dashed lines, and the cutoff values are given in each panel. (D) A scatter plot of the distal weight against the close weight, showing the clustering pattern of the genes. Each dot represents one gene. Genes predicted to be HGT-derived are enclosed in a red rectangle. (E) A zoomed-in view of the left part of the plot in panel D. Genes that fall within the atypical region of the close weight distribution are colored on a blue-red scale according to the density-based silhouette (dbs), a measure of confidence that a gene belongs to the atypical cluster of genes (red = high confidence). The close cutoff used in the subsequent analyses is indicated by a dashed line.
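
As an illustration of how the two rules in the caption combine, the following minimal Python sketch flags a gene when its close weight is below the close cutoff and its distal weight is at or above the distal cutoff. The function name, the gene identifiers, the weights and the cutoff values are all hypothetical and are not taken from the figure; the cutoff computation itself (the conservative criterion) is not reproduced here.

```python
def flag_putative_hgt(weights, close_cutoff, distal_cutoff):
    """Return gene IDs that satisfy rule 1 and rule 2 together.

    weights: dict mapping gene ID -> (close_weight, distal_weight).
    Rule 1: close weight < close_cutoff.
    Rule 2: distal weight >= distal_cutoff.
    """
    flagged = []
    for gene_id, (close_w, distal_w) in weights.items():
        if close_w < close_cutoff and distal_w >= distal_cutoff:
            flagged.append(gene_id)
    return flagged


# Hypothetical weights and cutoffs, for illustration only.
example_weights = {
    "geneA": (0.2, 3.1),  # low close weight, high distal weight
    "geneB": (5.4, 2.8),  # typical close weight, fails rule 1
    "geneC": (0.1, 0.0),  # low close weight but no distal hits, fails rule 2
    "geneD": (1.0, 1.5),  # close weight equals the cutoff, fails rule 1
}
hits = flag_putative_hgt(example_weights, close_cutoff=1.0, distal_cutoff=1.5)
print(hits)
```

In panel D, genes selected this way would correspond to those inside the red rectangle: left of the close cutoff and at or above the distal cutoff.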