
Table 1 Comparison between automated and human annotation of TEs

From: Automated paleontology of repetitive DNA with REANNOTATE

| label | repeat (a): manual name | repeat (a): reference | hits (b) | nests (c): manual | nests (c): REANNOTATE | K ± s.d. (× 10^3) (d): manual | K ± s.d. (× 10^3) (d): REANNOTATE | time ± s.d. (Mya) (e): manual | time ± s.d. (Mya) (e): REANNOTATE | type |
|---|---|---|---|---|---|---|---|---|---|---|
| a | Ji-6 | PREM2_ZM | 2 | 2 | 2 | ... | - | ... | - | LTR |
| b | Tekay | TEKAY_ZM | 1 | 1 | 1 | ... | - | ... | - | LTR |
| c | Rle | REINA | 1 | 0 | 0 | - | - | - | - | LTR |
| d | Cinful-2 | CINFUL2_ZM | 2 | 0 | 0 | - | - | - | - | LTR |
| e | Milt | 00081 | 3 | 0 | 0 | 20.3 ± 5.5 | > 2.4 ± 1.4 | 1.56 ± 0.42 | > 0.18 ± 0.15 | LTR |
|  | * | 00081 | 1 | 0 | 0 | - | - | - | - | LTR |
| f | Opie-2 | OPIE2_ZM | 3 | 1 | 1 | 2.4 ± 1.6 | 2.4 ± 1.4 | 0.18 ± 0.11 | 0.18 ± 0.15 | LTR |
| g | Fourf | 00098 | 5 | 0 | 0 | 18.1 ± 4.1 | 18.2 ± 4.1 | 1.39 ± 0.32 | 1.40 ± 0.44 | LTR |
| h | Huck-2 | HUCK1 | 3 | 1 | 1 | 12.3 ± 2.9 | 15.3 ± 3.1 | 0.95 ± 0.22 | 1.18 ± 0.34 | LTR |
| i | Victim | 00093 | 6 | 0 | 0 | 31.4 ± 19 | 30.7 ± 18 | 2.42 ± 1.44 | 2.36 ± 1.92 | LTR |
| j | Ji-2 | PREM2_ZM | 1 | 1 | 1 | - | < 31 ± 18 | - | < 2.4 ± 1.9 | LTR |
| k | Ji-3 | PREM2_ZM | 5 | 1 | 1 | 24.2 ± 4.8 | 24.7 ± 4.7 | 1.86 ± 0.37 | 1.90 ± 0.51 | LTR |
| l | Opie-3 | OPIE2_ZM | 3 | 2 | 2 | 6.4 ± 2.3 | 6.4 ± 2.3 | 0.49 ± 0.18 | 0.49 ± 0.25 | LTR |
| m | Ji-5 | PREM2_ZM | 1 | 2 | 2 | - | < 25 ± 5 | - | < 1.9 ± 0.5 | LTR |
| n | Ji-4 | PREM2_ZM | 3 | 1 | 1 | 21.1 ± 4.2 | 20.8 ± 4.1 | 1.62 ± 0.32 | 1.60 ± 0.44 | LTR |
| o | Reina | REINA | 4 | 0 | 0 | 27.0 ± 9.8 | 26.4 ± 9.4 | 2.08 ± 0.75 | 2.03 ± 1.02 | LTR |
| p | Cinful-1 | CINFUL1/2_ZM | 4 | 1 | 1 | 3.4 ± 2.4 | 3.4 ± 2.4 | 0.26 ± 0.18 | 0.26 ± 0.26 | LTR |
| q | Kake-1 | 00243 | 2 | 1 | 1 | ... | - | ... | - | LTR |
| 1 | Angela_F2-2 | ANGELA1_TM | 2 | 1 | 0 | - | - | - | - | LTR |
| 2 | RIRE2 (rice) | SABRINA2_TM | 1 | 0 | 0 | - | - | - | - | LTR |
| 3 | Sabrina_F2-2 | SABRINA2_TM | 4 | 0 | 0 | 25.9 ± 4.2 | 26.6 ± 4.2 | 1.99 ± 0.32 | 2.04 ± 0.46 | LTR |
|  |  | SABRINA3_TM | 1 | - | 1 | - | < 27 ± 4 | - | < 2.0 ± 0.5 | LTR |
|  |  | SABRINA_HV | 1 | - | 1 | - | < 27 ± 4 | - | < 2.0 ± 0.5 | LTR |
| 4 | Nusif_F2-1 | NUSIF1_TM | 1 | 1 | 1 | - | < 27 ± 4 | - | < 2.0 ± 0.5 | LTR |
| 5 | RIRE2 (rice) | RIRE2 | 1 | 0 | 0 | - | - | - | - | LTR |
| 6 | MITE 1-4 | THALOS_HV | 1 | 0 | 0 | - | - | - | - | MITE |
| 7 | MITE 2-5 | TREP220 | 1 | 0 | 0 | - | - | - | - | MITE |
| 8 | Veju_F2-1 | VEJU1_TM | 3 | 0 | 0 | 10.8 ± 5.5 | 10.8 ± 5.4 | 0.83 ± 0.42 | 0.83 ± 0.59 | LTR |
| 9 | Claudia_F2-1 | CLAUDIA1_TM | 3 | 0 | 0 | - | > 41 ± 6 | - | > 3.2 ± 0.6 | LTR |
| 10 | Latidu F2-1 | LATIDU2_TM | 3 | 1 | 1 | 13.3 ± 5.3 | 13.1 ± 5.4 | 1.01 ± 0.41 | 1.01 ± 0.58 | LTR |
| 11 | Wham F2-1 | WHAM3_TM | 3 | 1 | 1 | 40.6 ± 5.6 | 41.4 ± 5.6 | 3.12 ± 0.43 | 3.18 ± 0.60 | LTR |
| 12 | Fatima_F2-1 | FATIMA_TM | 6 | 0 | 0 | - | > 31 ± 4 | - | > 2.4 ± 0.4 | LTR |
| 13 | Sukkula_F2-1 | SUKKULA3_TM | 1 | 1 | 1 | 29.9 ± 2.6 | - | 2.30 ± 0.20 | - | LTR |
|  |  | SUKKULA3_TM | 4 | 1 | 1 | 29.9 ± 2.6 | 30.7 ± 3.5 | 2.30 ± 0.20 | 2.36 ± 0.37 | LTR |
| 14 | Angela_F2-3 | ANGELA1_TM | 2 | 2 | 2 | - | < 31 ± 4 | - | < 2.4 ± 0.4 | LTR |
| 15 | Angela_F2-1 | ANGELA1_TM | 3 | 2 | 2 | 19.9 ± 3.4 | 19.9 ± 3.4 | 1.53 ± 0.26 | 1.53 ± 0.37 | LTR |
| 16 | Sabrina_F2-1 | SABRINA3_TM | 2 | 0 | 0 | - | - | - | - | LTR |
| 17 | Wis_F2-1 | WIS4_TM | 3 | 0 | 0 | 58.1 ± 6.0 | 57.0 ± 6.0 | 4.47 ± 0.46 | 4.38 ± 0.64 | LTR |
| 18 | Sabrina_G1-1 | SABRINA1_TM | 3 | 0 | 0 | 55.8 ± 6.1 | > 39 ± 5 | 4.29 ± 0.47 | > 3.0 ± 0.6 | LTR |
|  |  | SABRINA1_TM | 3 | 0 | 0 | 55.8 ± 6.1 | - | 4.29 ± 0.47 | - | LTR |
| 19 | Wham_G1-2 | WHAM2_TM | 5 | 1 | 1 | 39.1 ± 5.5 | 39.1 ± 5.4 | 3.01 ± 0.42 | 3.01 ± 0.58 | LTR |
| 20 | Sabrina_G1-2 | SABRINA2_TM | 4 | 2 | 2 | 34.7 ± 4.8 | 35.9 ± 4.9 | 2.67 ± 0.37 | 2.76 ± 0.52 | LTR |
| 21 | Wham_G1-1 | WHAM1_TM | 3 | 3 | 3 | 32.2 ± 4.9 | 31.6 ± 4.8 | 2.48 ± 0.38 | 2.43 ± 0.52 | LTR |
| 22 | Miuse_G1-1 | MIUSE1_TM | 1 | 2 | 2 | - | < 39 ± 5 | - | < 3.0 ± 0.6 | LINE |
| 23 | Latidu_G1-1 | LATIDU2_TM | 3 | 1 | 1 | - | - | - | - | LTR |
| 24 | Eway_G1-1 | EWAY1_TM | 3 | 0 | 0 | 0 | 73.1 ± 18 | 0 | 5.62 ± 1.87 | LTR |
| 25 | MITE 4A-10 | TREP216 | 1 | 0 | 0 | - | - | - | - | MITE |
| 26 | MITE 4A-4B | TREP216 | 1 | 0 | 0 | - | - | - | - | MITE |
| 27 | Barbara | BARBARA_TM | 2 | 0 | 0 | - | - | - | - | LTR |
| 28 | Angela_G1-1 | ANGELA6_TM | 2 | 0 | 1 | - | - | - | - | LTR |
  1. Manual annotation results of maize [45] and diploid wheat [51] sequences are shown in italics. REANNOTATE results are shown in regular font style. Only elements spanning sequences that were annotated as TEs both in the manual annotation and in the input (REPEAT MASKER) to the automated re-annotation are listed. In the first column, letters indicate maize elements and correspond to labels in Figure 3C; numbers indicate wheat elements and correspond to labels in Figure 4.
  2. (a) Uppercase names correspond to reference element sequences in REP BASE UPDATE (RU); numbers correspond to reference sequences in the TIGR ZEA REPEAT DATABASE. Rows without an entry for the manually annotated repeat name indicate that REANNOTATE constructed multiple models (one model per row) corresponding to a single element in the manual annotation. For instance, Sabrina_F2-2 corresponds to three automated models because the RU reference elements SABRINA2_TM, SABRINA3_TM and SABRINA_HV are closely related, and (parts of) each were the best matches (annotated by REPEAT MASKER) to different segments of Sabrina_F2-2.
  3. (b) Number of similarity hits reported by REPEAT MASKER that were defragmented into a single element model by REANNOTATE.
  4. (c) Number of repetitive elements nesting a given element. The first wheat element listed (Angela_F2-2) was annotated in [51] as inserted into a TE sequence with no detectable similarity to reference elements in RU (and therefore absent from the input REPEAT MASKER annotation); the last wheat element (Angela_G1-1) was annotated by REANNOTATE as interrupting a fragment of an element homologous to CLAUDIA1_TM, which is not present in the manual annotation.
  5. (d) Estimated number of nucleotide substitutions between intra-element LTRs. (*) REANNOTATE did not date Milt because the 3' LTR is in inverse orientation relative to the rest of the element: one element model was built from the Milt 5' LTR and internal region, and a separate model from the 3' LTR. Eway G1-1 was originally annotated as having identical LTRs, but they are in fact quite divergent. (...) Elements Ji-6, Tekay and Kake-1 were dated in the original annotation, but these elements are truncated at the ends of the available 160 kb of contiguous sequence re-annotated here.
  6. (e) Estimated time of insertion (million years ago), obtained with the substitution rate for the adh loci of grasses [66]. The standard deviations computed by REANNOTATE are larger than in the manual annotation: in the latter the variance in time was propagated from the variance in K, whilst REANNOTATE additionally accounts for the Poisson variance (stochasticity) in the accumulation of nucleotide substitutions.
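
The conversion from K to insertion time described in the last footnote can be sketched as follows. This is a minimal illustration, not REANNOTATE's code. It assumes the usual LTR dating relation T = K / (2r), and a rate of r = 6.5 × 10^-9 substitutions per site per year, the value commonly quoted for grass adh loci. The table itself does not state the rate, and the function name `insertion_time_mya` is hypothetical. Only the simple propagation of the s.d. of K is shown, which matches the manual-annotation values; the extra Poisson term used by REANNOTATE is not reproduced because its exact form is not given here.

```python
# Assumed rate (substitutions per site per year) for grass adh loci;
# not stated in the table, so treat it as an assumption.
ADH_RATE = 6.5e-9


def insertion_time_mya(k, k_sd, rate=ADH_RATE):
    """Convert LTR divergence K (and its s.d.) to insertion time in Mya.

    Both LTRs accumulate substitutions independently after insertion,
    hence the factor of 2 in T = K / (2 * rate).
    """
    scale = 1.0 / (2.0 * rate) / 1e6  # years -> million years
    return k * scale, k_sd * scale


# Manual values for Milt: K = 20.3e-3 ± 5.5e-3
t, sd = insertion_time_mya(20.3e-3, 5.5e-3)
print(f"{t:.2f} ± {sd:.2f} Mya")  # prints "1.56 ± 0.42 Mya"
```

With these assumptions the sketch reproduces the manual time column for Milt (1.56 ± 0.42 Mya) from its K value of 20.3 ± 5.5 (× 10^3).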