Statistical properties of bias and precision of the estimate D caused by alignment-free issues only.
(a) Mean and (b) coefficient of variation (CV) of D between two species as a function of genome size. The k-mer lengths are k = 9 (red), 11 (orange), 13 (green), 15 (blue) and 17 (black). The true distance between species is d = 0.1. In (b), the dashed line is the approximate CV of D calculated assuming that all mutations were identified. (c) Average number of topological mistakes generated by AAF from simulated sequences on the phylogeny depicted in Fig. 1b, with different ancestral genome lengths and different k. One hundred simulations were performed for each length, using ancestral genomes taken from random starting positions on a 1.9 Mbp sequence of the rabbit genome from Prasad et al. (2008).