Taxa counting vs. minimal distance. Taxa counts for the leading SPs as a function of the minimal distance d for (A) SYV 220.127.116.11 and (B) SYQ 18.104.22.168, restricted to fused strings that are full protein candidates. Based on the Swiss-Prot statistics displayed in Fig. 4, we estimate from (B) that 400 of the total count may be due to different strains, which exhibit distances d ≤ 2. When comparing with Fig. 4, note that the latter starts with a bin of zero difference between sequences, whereas here the first bin refers to differences greater than or equal to 1.