
Table 4. Results on six plant genomes. We tested three tools on one model organism, A. thaliana, and five important crops of varying genome size and repeat content.

From: LtrDetector: A tool-suite for detecting long terminal repeat retrotransposons de-novo

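| Genome | Tool | Total | TP | GT | FP | Sensitivity | Precision | F1 | Time (hr:min:sec) | Memory (GB) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A. thaliana | LTR_Finder | 399 | 106 | 248 | 0 | 0.427 | 1.000 | 0.599 | 0:30:46 | 0.86 |
| A. thaliana | LTRharvest | 2301 | 180 | 248 | 6 | 0.726 | 0.968 | 0.829 | 0:01:08 | 0.24 |
| A. thaliana | LtrDetector | 1714 | 187 | 248 | 9 | 0.754 | 0.954 | 0.842 | 0:04:02 | 4.45 |
| O. sativa | LTR_Finder | 5324 | 1163 | 1760 | 14 | 0.661 | 0.988 | 0.792 | 5:19:03 | 0.95 |
| O. sativa | LTRharvest | 9761 | 1392 | 1760 | 182 | 0.791 | 0.884 | 0.835 | 0:03:09 | 0.34 |
| O. sativa | LtrDetector | 7343 | 1442 | 1760 | 119 | 0.819 | 0.924 | 0.868 | 0:15:30 | 5.31 |
| S. bicolor | LTR_Finder | 11734 | 4219 | 6565 | 67 | 0.643 | 0.984 | 0.778 | 10:43:26 | 1.62 |
| S. bicolor | LTRharvest | 22700 | 4476 | 6565 | 502 | 0.682 | 0.899 | 0.776 | 0:04:23 | 0.60 |
| S. bicolor | LtrDetector | 24682 | 5285 | 6565 | 214 | 0.805 | 0.961 | 0.876 | 1:10:47 | 6.1 |
| G. max | LTR_Finder | 12141 | 1748 | 3130 | 7 | 0.558 | 0.996 | 0.716 | 25:18:21 | 1.88 |
| G. max | LTRharvest | 29016 | 2171 | 3130 | 20 | 0.694 | 0.991 | 0.816 | 0:09:35 | 0.48 |
| G. max | LtrDetector | 25537 | 2542 | 3130 | 12 | 0.812 | 0.995 | 0.894 | 0:43:06 | 6.11 |
| Z. mays | LTR_Finder | 60860 | 11411 | 16839 | 13 | 0.678 | 0.999 | 0.807 | 111:21:37 | 12.36 |
| Z. mays | LTRharvest | 101943 | 11244 | 16839 | 102 | 0.668 | 0.991 | 0.798 | 0:15:00 | 2.36 |
| Z. mays | LtrDetector | 116923 | 13122 | 16839 | 71 | 0.779 | 0.995 | 0.874 | 5:53:08 | 9.62 |
| H. vulgare | LTR_Finder | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| H. vulgare | LTRharvest | 207016 | 4378 | 9164 | 492 | 0.478 | 0.899 | 0.624 | 1:33:29* | 5.12 |
| H. vulgare | LtrDetector | 213367 | 6824 | 9164 | 199 | 0.745 | 0.972 | 0.843 | 17:24:04** | 14.15 |
| Total (excluding H. vulgare) | LTR_Finder | 90458 | 18647 | 28542 | 101 | 0.653 | 0.995 | 0.789 | 153:13:13 | |
| Total (excluding H. vulgare) | LTRharvest | 165721 | 19463 | 28542 | 812 | 0.682 | 0.960 | 0.797 | 0:33:15 | |
| Total (excluding H. vulgare) | LtrDetector | 176197 | 22578 | 28542 | 425 | 0.791 | 0.982 | 0.876 | 8:06:33 | |
| Total (including H. vulgare) | LTRharvest | 372737 | 23841 | 37706 | 1304 | 0.632 | 0.948 | 0.759 | 2:06:44 | |
| Total (including H. vulgare) | LtrDetector | 389564 | 29402 | 37706 | 624 | 0.780 | 0.979 | 0.868 | 25:30:37 | |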
Parameters used for each tool are listed in the “Implementation” section. Neither LTR_Finder nor LTRharvest supports multi-threading, so we used an additional utility to run each of them in parallel; this keeps the time comparison fair, since our tool, LtrDetector, is concurrent by default. Total is the number of proposed LTR-RTs, TP is the number of true positives, GT is the number of elements in the ground truth, and FP is the number of false positives. Sensitivity, Precision, and F1 are defined by Eqs. 1, 2, and 3. We report all measures for each genome and in total.

Note: Results for LTR_Finder are unavailable for the Hordeum vulgare (barley) genome. Its memory demands repeatedly crashed the computer when run on four cores, and a subsequent single-core trial did not finish within two weeks of run time (2 of 7 chromosomes completed). All trials were run on four cores unless otherwise noted.

* LTRharvest was run on one thread for H. vulgare.
** LtrDetector was run on three threads for H. vulgare.
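
Eqs. 1, 2, and 3 are not reproduced on this page. The sketch below assumes the standard definitions (sensitivity = TP / GT, precision = TP / (TP + FP), F1 = harmonic mean of the two), which reproduce the tabulated values when applied to the TP, GT, and FP columns. The `Counts` class and the function names are illustrative only and are not part of LtrDetector.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Raw counts for one tool on one genome (hypothetical helper type)."""
    tp: int  # true positives (TP column)
    gt: int  # elements in the ground truth (GT column)
    fp: int  # false positives (FP column)


def sensitivity(c: Counts) -> float:
    # Assumed definition: fraction of ground-truth elements recovered.
    return c.tp / c.gt


def precision(c: Counts) -> float:
    # Assumed definition: fraction of evaluated predictions that are correct.
    return c.tp / (c.tp + c.fp)


def f1(c: Counts) -> float:
    # Harmonic mean of sensitivity and precision.
    s, p = sensitivity(c), precision(c)
    return 2 * s * p / (s + p)


# Counts copied from the table: LtrDetector on A. thaliana,
# and the LtrDetector total excluding H. vulgare.
ltrdetector_athaliana = Counts(tp=187, gt=248, fp=9)
ltrdetector_total_excl = Counts(tp=22578, gt=28542, fp=425)

for label, c in [("A. thaliana", ltrdetector_athaliana),
                 ("Total (excl. H. vulgare)", ltrdetector_total_excl)]:
    print(f"{label}: {sensitivity(c):.3f} {precision(c):.3f} {f1(c):.3f}")
# A. thaliana: 0.754 0.954 0.842
# Total (excl. H. vulgare): 0.791 0.982 0.876
```

Note that the precision denominator is TP + FP, not the Total column: under these assumed definitions, proposed elements that are neither counted as TP nor FP do not enter the calculation.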