Fig. 4 | BMC Genomics

Fig. 4

From: The performance of coalescent-based species tree estimation methods under models of missing data

Impact of missing data on the internode distance matrix used by ASTRID. This figure shows the distance (denoted ∥E∥_2) between the additive internode matrix D_T, computed from the true species tree with unit branch lengths, and the internode distance matrix D_A, computed by ASTRID from true gene trees, as a function of the amount of missing data; see the main text for additional details. Each column represents a different level of incomplete lineage sorting (ILS): panel (a) shows datasets with low ILS, panel (b) shows datasets with high ILS, and panel (c) shows datasets with very high ILS. Lines represent the average over 20 replicate datasets, and filled regions indicate the standard error. Line color indicates the number of genes: datasets with 50, 200, and 1000 genes are shown in blue, orange, and green, respectively. Solid lines represent the M_iid model of missing data, and dashed lines represent the M_clade model of missing data. Note that datasets with 55% and 95% of genes with clade-based missing data had 34% and 59% total missing data, respectively. Datasets shown here have deep speciation events; results for datasets with estimated gene trees, as well as for recent speciation, are shown in Additional file 1.
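
The quantity plotted in this figure can be illustrated with a minimal sketch. This is not ASTRID's code and not the authors' analysis script; it rests on several assumptions. Trees are given as undirected edge lists (a hypothetical input format), the internode distance between two taxa is taken to be the number of edges on the path between them, each entry of D_A is averaged over the gene trees that contain both taxa, and the error is measured as the Euclidean norm of E = D_A - D_T, with each taxon pair counted once. The exact definition of the norm is given in the main text of the article. The function names `leaf_distances`, `averaged_internode_matrix`, and `error_norm` are illustrative only.

```python
from collections import deque
from math import sqrt


def leaf_distances(edges):
    """Path length in edges (unit branch lengths) between every pair of leaves."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    # Leaves are the nodes of degree 1.
    leaves = sorted(n for n, nb in adj.items() if len(nb) == 1)
    dist = {}
    for src in leaves:
        # Breadth-first search gives edge counts from src to every node.
        seen = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + 1
                    queue.append(nxt)
        for dst in leaves:
            if src < dst:
                dist[(src, dst)] = seen[dst]
    return dist


def averaged_internode_matrix(gene_trees):
    """Average each pairwise distance over the gene trees containing both taxa."""
    totals, counts = {}, {}
    for edges in gene_trees:
        for pair, d in leaf_distances(edges).items():
            totals[pair] = totals.get(pair, 0) + d
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


def error_norm(d_a, d_t):
    """Euclidean norm of E = D_A - D_T over taxon pairs present in both."""
    shared = d_a.keys() & d_t.keys()
    return sqrt(sum((d_a[p] - d_t[p]) ** 2 for p in shared))


# Hypothetical four-taxon example: species tree ((A,B),(C,D)), unrooted.
species_tree = [("A", "u"), ("B", "u"), ("u", "v"), ("C", "v"), ("D", "v")]
# One gene tree matches the species tree; the other groups A with C.
discordant_tree = [("A", "u"), ("C", "u"), ("u", "v"), ("B", "v"), ("D", "v")]

d_t = leaf_distances(species_tree)
d_a = averaged_internode_matrix([species_tree, discordant_tree])
print(error_norm(d_a, d_t))
```

A gene tree with missing taxa is represented simply by an edge list that omits those leaves, so it contributes only to the pairs it contains. A taxon pair that appears in no gene tree has no entry in the averaged matrix, and this sketch skips it when computing the norm; that is a simplification made here, not a description of how ASTRID treats such entries.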
