From: An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data

Statistical properties of the estimate D under the combined effects of assembly-free and alignment-free issues. (a) Mean and (b) standard deviation of the estimate D from simulations with incomplete coverage and 1% sequencing error. The true distance between species is d = 0.1, and k-mer lengths are k = 9 (red), 11 (orange), 13 (green), 15 (blue) and 17 (black), with solid and dashed lines corresponding to no filtering and filtering, respectively. (c) Average number of topological mistakes made by AAF on sequences simulated along the phylogeny depicted in Fig. 1b. One hundred simulations were performed, each using 80 kbp, taken from a random starting position, of a 1.9 Mbp sequence of the rabbit genome from Prasad et al. (2008). For each simulated sequence, reads of length 76 were simulated, and each read dataset was analyzed using different k-mer lengths.
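The read simulation and k-mer filtering described in this caption can be illustrated with a minimal Python sketch. It is not the authors' AAF code. The function names (`simulate_reads`, `count_kmers`, `kmer_set`), the coverage value, the uniform substitution-error model, the random stand-in genome, and the reading of "filtering" as dropping k-mers seen only once are all assumptions made for illustration. The sketch stops short of computing D, whose formula the caption does not give.

```python
import random
from collections import Counter


def simulate_reads(genome, read_len=76, coverage=5.0, error_rate=0.01, rng=None):
    """Sample reads at random positions and add substitution errors.

    The coverage value and the uniform substitution model are assumptions.
    """
    rng = rng or random.Random(0)
    n_reads = int(coverage * len(genome) / read_len)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        read = list(genome[start:start + read_len])
        for i, base in enumerate(read):
            if rng.random() < error_rate:
                # Replace the base with one of the three other bases.
                read[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(read))
    return reads


def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_set(reads, k, filter_singletons=False):
    """Return the distinct k-mers; optionally drop those seen only once.

    Treating "filtering" as singleton removal is an assumption.
    """
    min_count = 2 if filter_singletons else 1
    return {kmer for kmer, c in count_kmers(reads, k).items() if c >= min_count}


# Hypothetical stand-in for a simulated sequence (random bases, 2 kbp).
genome = "".join(random.Random(1).choice("ACGT") for _ in range(2000))
reads = simulate_reads(genome, rng=random.Random(2))

for k in (9, 11, 13, 15, 17):
    unfiltered = kmer_set(reads, k)
    filtered = kmer_set(reads, k, filter_singletons=True)
    print(k, len(unfiltered), len(filtered))
```

Under these assumptions, filtering removes most k-mers created by isolated sequencing errors, but at low coverage it also discards genuine k-mers that happened to be sampled only once. The caption's comparison of filtered and unfiltered curves concerns this kind of trade-off.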

