
Table 1 Comparison between automated and human annotation of TEs

From: Automated paleontology of repetitive DNA with REANNOTATE

 

| | repeat^a (manual) | repeat^a (REANNOTATE) | hits^b | nests^c (manual) | nests^c (REANNOTATE) | K ± s.d. (× 10³)^d (manual) | K ± s.d. (× 10³)^d (REANNOTATE) | time ± s.d. (Mya)^e (manual) | time ± s.d. (Mya)^e (REANNOTATE) | type |
|---|---|---|---|---|---|---|---|---|---|---|
| a | Ji-6 | PREM2_ZM | 2 | 2 | 2 | ... | - | ... | - | LTR |
| b | Tekay | TEKAY_ZM | 1 | 1 | 1 | ... | - | ... | - | LTR |
| c | Rle | REINA | 1 | 0 | 0 | - | - | - | - | LTR |
| d | Cinful-2 | CINFUL2_ZM | 2 | 0 | 0 | - | - | - | - | LTR |
| e | Milt | 00081 | 3 | 0 | 0 | 20.3 ± 5.5 | > 2.4 ± 1.4 | 1.56 ± 0.42 | > 0.18 ± 0.15 | LTR |
| | * | 00081 | 1 | 0 | 0 | - | - | - | - | LTR |
| f | Opie-2 | OPIE2_ZM | 3 | 1 | 1 | 2.4 ± 1.6 | 2.4 ± 1.4 | 0.18 ± 0.11 | 0.18 ± 0.15 | LTR |
| g | Fourf | 00098 | 5 | 0 | 0 | 18.1 ± 4.1 | 18.2 ± 4.1 | 1.39 ± 0.32 | 1.40 ± 0.44 | LTR |
| h | Huck-2 | HUCK1 | 3 | 1 | 1 | 12.3 ± 2.9 | 15.3 ± 3.1 | 0.95 ± 0.22 | 1.18 ± 0.34 | LTR |
| i | Victim | 00093 | 6 | 0 | 0 | 31.4 ± 19 | 30.7 ± 18 | 2.42 ± 1.44 | 2.36 ± 1.92 | LTR |
| j | Ji-2 | PREM2_ZM | 1 | 1 | 1 | - | < 31 ± 18 | - | < 2.4 ± 1.9 | LTR |
| k | Ji-3 | PREM2_ZM | 5 | 1 | 1 | 24.2 ± 4.8 | 24.7 ± 4.7 | 1.86 ± 0.37 | 1.90 ± 0.51 | LTR |
| l | Opie-3 | OPIE2_ZM | 3 | 2 | 2 | 6.4 ± 2.3 | 6.4 ± 2.3 | 0.49 ± 0.18 | 0.49 ± 0.25 | LTR |
| m | Ji-5 | PREM2_ZM | 1 | 2 | 2 | - | < 25 ± 5 | - | < 1.9 ± 0.5 | LTR |
| n | Ji-4 | PREM2_ZM | 3 | 1 | 1 | 21.1 ± 4.2 | 20.8 ± 4.1 | 1.62 ± 0.32 | 1.60 ± 0.44 | LTR |
| o | Reina | REINA | 4 | 0 | 0 | 27.0 ± 9.8 | 26.4 ± 9.4 | 2.08 ± 0.75 | 2.03 ± 1.02 | LTR |
| p | Cinful-1 | CINFUL1/2_ZM | 4 | 1 | 1 | 3.4 ± 2.4 | 3.4 ± 2.4 | 0.26 ± 0.18 | 0.26 ± 0.26 | LTR |
| q | Kake-1 | 00243 | 2 | 1 | 1 | ... | - | ... | - | LTR |
| 1 | Angela_F2-2 | ANGELA1_TM | 2 | 1 † | 0 | - | - | - | - | LTR |
| 2 | RIRE2 (rice) | SABRINA2_TM | 1 | 0 | 0 | - | - | - | - | LTR |
| 3 | SabrinaF_2-2 | SABRINA2_TM | 4 | 0 | 0 | 25.9 ± 4.2 | 26.6 ± 4.2 | 1.99 ± 0.32 | 2.04 ± 0.46 | LTR |
| | | SABRINA3_TM | 1 | - | 1 | - | < 27 ± 4 | - | < 2.0 ± 0.5 | LTR |
| | | SABRINA_HV | 1 | - | 1 | - | < 27 ± 4 | - | < 2.0 ± 0.5 | LTR |
| 4 | Nusif_F2-1 | NUSIF1_TM | 1 | 1 | 1 | - | < 27 ± 4 | - | < 2.0 ± 0.5 | LTR |
| 5 | RIRE2 (rice) | RIRE2 | 1 | 0 | 0 | - | - | - | - | LTR |
| 6 | MITE 1-4 | THALOS_HV | 1 | 0 | 0 | - | - | - | - | MITE |
| 7 | MITE 2-5 | TREP220 | 1 | 0 | 0 | - | - | - | - | MITE |
| 8 | Veju_F2-1 | VEJU1_TM | 3 | 0 | 0 | 10.8 ± 5.5 | 10.8 ± 5.4 | 0.83 ± 0.42 | 0.83 ± 0.59 | LTR |
| 9 | Claudia_F2-1 | CLAUDIA1_TM | 3 | 0 | 0 | - | > 41 ± 6 | - | > 3.2 ± 0.6 | LTR |
| 10 | Latidu F2-1 | LATIDU2_TM | 3 | 1 | 1 | 13.3 ± 5.3 | 13.1 ± 5.4 | 1.01 ± 0.41 | 1.01 ± 0.58 | LTR |
| 11 | Wham F2-1 | WHAM3_TM | 3 | 1 | 1 | 40.6 ± 5.6 | 41.4 ± 5.6 | 3.12 ± 0.43 | 3.18 ± 0.60 | LTR |
| 12 | Fatima_F2-1 | FATIMA_TM | 6 | 0 | 0 | - | > 31 ± 4 | - | > 2.4 ± 0.4 | LTR |
| 13 | Sukkula_F2-1 | SUKKULA3_TM | 1 | 1 | 1 | 29.9 ± 2.6 | - | 2.30 ± 0.20 | - | LTR |
| | | SUKKULA3_TM | 4 | 1 | 1 | 29.9 ± 2.6 | 30.7 ± 3.5 | 2.30 ± 0.20 | 2.36 ± 0.37 | LTR |
| 14 | Angela_F2-3 | ANGELA1_TM | 2 | 2 | 2 | - | < 31 ± 4 | - | < 2.4 ± 0.4 | LTR |
| 15 | Angela_F2-1 | ANGELA1_TM | 3 | 2 | 2 | 19.9 ± 3.4 | 19.9 ± 3.4 | 1.53 ± 0.26 | 1.53 ± 0.37 | LTR |
| 16 | Sabrina_F2-1 | SABRINA3_TM | 2 | 0 | 0 | - | - | - | - | LTR |
| 17 | Wis_F2-1 | WIS4_TM | 3 | 0 | 0 | 58.1 ± 6.0 | 57.0 ± 6.0 | 4.47 ± 0.46 | 4.38 ± 0.64 | LTR |
| 18 | Sabrina_G1-1 | SABRINA1_TM | 3 | 0 | 0 | 55.8 ± 6.1 | > 39 ± 5 | 4.29 ± 0.47 | > 3.0 ± 0.6 | LTR |
| | | SABRINA1_TM | 3 | 0 | 0 | 55.8 ± 6.1 | - | 4.29 ± 0.47 | - | LTR |
| 19 | Wham_G1-2 | WHAM2_TM | 5 | 1 | 1 | 39.1 ± 5.5 | 39.1 ± 5.4 | 3.01 ± 0.42 | 3.01 ± 0.58 | LTR |
| 20 | Sabrina_G1-2 | SABRINA2_TM | 4 | 2 | 2 | 34.7 ± 4.8 | 35.9 ± 4.9 | 2.67 ± 0.37 | 2.76 ± 0.52 | LTR |
| 21 | Wham_G1-1 | WHAM1_TM | 3 | 3 | 3 | 32.2 ± 4.9 | 31.6 ± 4.8 | 2.48 ± 0.38 | 2.43 ± 0.52 | LTR |
| 22 | Miuse_G1-1 | MIUSE1_TM | 1 | 2 | 2 | - | < 39 ± 5 | - | < 3.0 ± 0.6 | LINE |
| 23 | Latidu_G1-1 | LATIDU2_TM | 3 | 1 | 1 | - | - | - | - | LTR |
| 24 | Eway_G1-1 ‡ | EWAY1_TM | 3 | 0 | 0 | 0 ‡ | 73.1 ± 18 | 0 ‡ | 5.62 ± 1.87 | LTR |
| 25 | MITE 4A-10 | TREP216 | 1 | 0 | 0 | - | - | - | - | MITE |
| 26 | MITE 4A-4B | TREP216 | 1 | 0 | 0 | - | - | - | - | MITE |
| 27 | Barbara | BARBARA_TM | 2 | 0 | 0 | - | - | - | - | LTR |
| 28 | Angela_G1-1 | ANGELA6_TM | 2 | 0 | 1 † | - | - | - | - | LTR |

  1. Manual annotation results of maize [45] and diploid wheat [51] sequences are shown in italics. REANNOTATE results are shown in regular font style. Only elements spanning sequences that were annotated as TEs both in the manual annotation and in the input (REPEAT MASKER) to the automated re-annotation are listed. In the first column, letters indicate maize elements and correspond to labels in Figure 3C; numbers indicate wheat elements and correspond to labels in Figure 4.
  2. ^a Uppercase names correspond to reference element sequences in REP BASE UPDATE (RU); numbers correspond to reference sequences in the TIGR ZEA REPEAT DATABASE. Rows without an entry for the manually annotated repeat name indicate that REANNOTATE constructed multiple models (one model per row) corresponding to a single element in the manual annotation. For instance, Sabrina_F2-2 corresponds to three automated models because (parts of) three closely related RU reference elements, SABRINA2_TM, SABRINA3_TM and SABRINA_HV, were the best matches (annotated by REPEAT MASKER) to different segments of Sabrina_F2-2.
  3. ^b Number of similarity hits reported by REPEAT MASKER that were defragmented into a single element model by REANNOTATE.
  4. ^c Number of repetitive elements nesting a given element. (†) The first wheat element listed was annotated in [51] as inserted into a TE sequence with no detectable similarity to reference elements in RU (and therefore absent from the input REPEAT MASKER annotation); the last wheat element was annotated by REANNOTATE as interrupting a fragment of an element homologous to CLAUDIA1_TM, which is not present in the manual annotation.
  5. ^d Estimated number of nucleotide substitutions between intra-element LTRs. (*) REANNOTATE did not date Milt because the 3' LTR is in inverse orientation relative to the rest of the element: one element model was built from the Milt 5' LTR and internal region, and another from the 3' LTR. (‡) Eway G1-1 was originally annotated as having identical LTRs, but they are in fact quite divergent. (...) Elements Ji-6, Tekay and Kake-1 were dated in the original annotation, but these elements are truncated at the ends of the available 160 kb of contiguous sequence re-annotated here.
  6. ^e Estimated time of insertion (million years ago), obtained with the substitution rate for the adh loci of grasses [66]. The standard deviations computed by REANNOTATE are larger than those in the manual annotation: in the latter, the variance in time was propagated from the variance in K alone, whilst REANNOTATE additionally accounts for the Poisson variance (stochasticity) in the accumulation of nucleotide substitutions.
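
The relation between the K and time columns can be illustrated with a minimal sketch. It assumes the commonly used conversion T = K / (2r), and a rate r of 6.5 × 10⁻⁹ substitutions per site per year for the grass adh loci; neither the formula nor the rate value is stated in the table notes, and the function name `insertion_time_mya` is hypothetical. Under these assumptions the sketch reproduces the manual-annotation time values (s.d. propagated from K only). It does not model the additional Poisson term that widens the REANNOTATE standard deviations.

```python
# Assumption: substitution rate for the adh loci of grasses, in substitutions
# per site per year. The table notes cite [66] but do not give the value.
ADH_RATE = 6.5e-9


def insertion_time_mya(k_x1000, k_sd_x1000, rate=ADH_RATE):
    """Convert LTR divergence K (tabulated as K x 10^3) to insertion time in Mya.

    The factor 2 reflects that both LTRs accumulate substitutions
    independently after insertion. The s.d. is propagated from K only.
    """
    scale = 1e-3 / (2.0 * rate) / 1e6  # (K x 10^3) -> million years
    return k_x1000 * scale, k_sd_x1000 * scale


# Manual-annotation K values for Milt (row e) and Fourf (row g).
for name, k, sd in [("Milt", 20.3, 5.5), ("Fourf", 18.1, 4.1)]:
    t, t_sd = insertion_time_mya(k, sd)
    print(f"{name}: {t:.2f} ± {t_sd:.2f} Mya")
# Milt: 1.56 ± 0.42 Mya
# Fourf: 1.39 ± 0.32 Mya
```

Both printed values agree with the manual-annotation time column for rows e and g, which suggests the assumed rate is consistent with the table.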