
Table 6. Comparisons with LASSO and DAMBE on real-world datasets: mean NRF values and p-values obtained in the statistical testing of PhyloMissForest samples over the alternative approaches. Lower NRF values denote better quality. N/A denotes scenarios where DAMBE was not able to find any suitable solution.

From: PhyloMissForest: a random forest framework to construct phylogenetic trees with missing data

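| Dataset | % Missing | NRF: PhyloMissForest | NRF: LASSO | NRF: DAMBE | p-value vs. LASSO | p-value vs. DAMBE |
|---|---|---|---|---|---|---|
| 9x9 | 5% | 3.33 | 9.17 | 23.61 | 0.00 | 0.00 |
| | 10% | 3.50 | 8.33 | 29.17 | 0.00 | 0.00 |
| | 15% | 9.67 | 14.17 | 41.67 | 0.25 | 0.00 |
| | 20% | 10.67 | 13.33 | 38.54 | 0.35 | 0.00 |
| | 25% | 15.00 | 16.67 | 39.58 | 0.53 | 0.01 |
| | 30% | 14.00 | 17.50 | 36.11 | 0.48 | 0.01 |
| | 35% | 17.83 | 15.00 | N/A | 0.44 | N/A |
| | 40% | 19.83 | 21.67 | N/A | 0.58 | N/A |
| | 45% | 24.33 | 20.17 | N/A | 0.35 | N/A |
| | 50% | 25.00 | 22.83 | N/A | 0.63 | N/A |
| | 55% | 23.83 | 29.17 | N/A | 0.35 | N/A |
| | 60% | 33.17 | 30.00 | N/A | 0.68 | N/A |
| 37x37 | 5% | 3.94 | 7.35 | 3.19 | 0.01 | 0.79 |
| | 10% | 5.56 | 7.50 | 6.72 | 0.06 | 0.54 |
| | 15% | 10.15 | 9.71 | 10.54 | 0.91 | 0.96 |
| | 20% | 10.88 | 10.74 | N/A | 0.53 | N/A |
| | 25% | 15.29 | 12.50 | N/A | 0.11 | N/A |
| | 30% | 17.56 | 15.00 | N/A | 0.19 | N/A |
| | 35% | 18.56 | 16.76 | N/A | 0.25 | N/A |
| | 40% | 20.24 | 18.24 | N/A | 0.48 | N/A |
| | 45% | 25.59 | 21.32 | N/A | 0.00 | N/A |
| | 50% | 24.74 | 22.21 | N/A | 0.14 | N/A |
| | 55% | 27.47 | 24.12 | N/A | 0.63 | N/A |
| | 60% | 31.53 | 30.15 | N/A | 0.53 | N/A |
| 55x55 | 5% | 2.73 | 20.58 | N/A | 0.00 | N/A |
| | 10% | 5.38 | 21.63 | N/A | 0.00 | N/A |
| | 15% | 7.04 | 22.79 | N/A | 0.00 | N/A |
| | 20% | 8.67 | 22.31 | N/A | 0.00 | N/A |
| | 25% | 13.02 | 24.90 | N/A | 0.00 | N/A |
| | 30% | 14.52 | 26.73 | N/A | 0.00 | N/A |
| | 35% | 14.02 | 27.21 | N/A | 0.00 | N/A |
| | 40% | 17.79 | 30.38 | N/A | 0.00 | N/A |
| | 45% | 19.23 | 28.94 | N/A | 0.00 | N/A |
| | 50% | 24.02 | 33.27 | N/A | 0.00 | N/A |
| | 55% | 26.79 | 35.10 | N/A | 0.00 | N/A |
| | 60% | 29.73 | 35.10 | N/A | 0.00 | N/A |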

  1. Bold values in the “NRF scores” columns denote the best NRF scores in the comparison, while in the p-values columns they refer to p-values denoting statistically significant improvements
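
As a reading aid, the rule that the lowest NRF score is the best one in each row can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three rows are copied from the table, `None` stands in for N/A, and the names `ROWS` and `best_method` are hypothetical helpers that do not come from the paper or from any PhyloMissForest code.

```python
from typing import Optional

# Three rows copied from the table; None stands for an N/A entry.
# Columns: dataset, % missing, PhyloMissForest, LASSO, DAMBE (mean NRF).
ROWS = [
    ("9x9", 5, 3.33, 9.17, 23.61),
    ("9x9", 35, 17.83, 15.00, None),
    ("37x37", 5, 3.94, 7.35, 3.19),
]


def best_method(scores: dict[str, Optional[float]]) -> str:
    """Return the method with the lowest available NRF score (hypothetical helper)."""
    # Drop methods that produced no solution (N/A) before comparing.
    available = {name: nrf for name, nrf in scores.items() if nrf is not None}
    return min(available, key=available.get)


for dataset, missing, pmf, lasso, dambe in ROWS:
    scores = {"PhyloMissForest": pmf, "LASSO": lasso, "DAMBE": dambe}
    print(f"{dataset} {missing}%: {best_method(scores)}")
```

A lower mean NRF does not by itself indicate a statistically significant difference; the p-value columns carry that information, so the sketch only reproduces the "best score" comparison.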