
Table 2 Validation of the deRIP technique comparing homology of majority- and deRIP-consensus sequences with non-RIP-affected sequences.

From: In silico reversal of repeat-induced point mutation (RIP) identifies the origins of repeat families and uncovers obscured duplicated genes

  

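| Repeat class | Hit accession | Blastn homology: majority consensus e-value | Blastn homology: majority consensus bitscore | Blastn homology: deRIP consensus e-value | Blastn homology: deRIP consensus bitscore | Blastn homology: deRIP improvement factor | Needleman-Wunsch global alignment: majority consensus percent identity | Needleman-Wunsch global alignment: deRIP consensus percent identity | Needleman-Wunsch global alignment: deRIP improvement to percent identity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (A) Comparisons to active transposon sequences | | | | | | | | | |
| Elsa | AJ277966 | 1.00E-51 | 216 | 1.00E-121 | 381 | 1.8 X | 69.2% | 73.1% | 3.9% |
| Molly | AJ488502 | 7.00E-07 | 66 | 3.00E-86 | 329 | 5.0 X | 72.3% | 77.5% | 5.2% |
| Pixie | AJ488503 | 5.00E-07 | 66 | 2.00E-28 | 137 | 2.1 X | 72.5% | 75% | 2.5% |
| (B) Comparisons to RIP-protected rDNA array consensus (Figure 1: region 4) | | | | | | | | | |
| Long, non-rDNA array repeats > 1 kb | | 0 | 12800 | 0 | 17220 | 1.3 X | 89.5% | 94.0% | 4.5% |
| Short, non-rDNA array repeats < 1 kb | | 3.00E-10 | 58 | 1.00E-27 | 122 | 2.1 X | 46.2% a | 45.6% a | -0.6% |
| RIP-mutated terminal rDNA array repeat (Figure 1: region 3) | | 0 | 8258 | -- | -- | -- | 85.8% | -- | -- |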

a Needleman-Wunsch global alignment was performed using a sub-region of long rDNA repeats corresponding to the short rDNA repeat consensus

Blastn hits and pairwise global percent identities to non-RIP-affected sequences were compared between the majority consensus and deRIP consensus versions of each repeat family. (A) The transposons Elsa, Molly and Pixie of S. nodorum SN15 were compared to active copies from an alternate strain. In all three cases the deRIP consensus matched the active transposon better than the majority consensus did, as shown by the 'deRIP improvement' factor and by the difference in percent identity for the global alignments. DeRIP improvement is a measure of how much better the deRIP consensus matched the hit compared to the majority consensus. A deRIP improvement > 1 indicates that the repeat family was derived from the hit or a related homolog, but was subsequently mutated by RIP. (B) RIP-protected copies of the S. nodorum rDNA repeat are located within a tandem array (Figure 1). RIP-susceptible copies were grouped by size into long (> 1 kb) and short (< 1 kb) categories and compared to the RIP-protected copies. Homology between the RIP-protected repeats in the rDNA array and the long RIP-susceptible non-rDNA array repeats was improved by deRIP. The rDNA array also contains one RIP-affected repeat at its terminus, which shows a level of homology to the rDNA array similar to that of the majority consensus of the long non-rDNA array repeats.
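The legend does not give a formula for the deRIP improvement factor. The sketch below assumes it is the ratio of the deRIP-consensus bitscore to the majority-consensus bitscore, an inference that reproduces every tabulated factor to one decimal place, and it takes the improvement to percent identity as the simple difference between the two global-alignment identities. The function names and the `ROWS` layout are illustrative only; the numbers are transcribed from the table.

```python
# Values transcribed from Table 2:
# (majority bitscore, deRIP bitscore, majority % identity, deRIP % identity)
ROWS = {
    "Elsa": (216, 381, 69.2, 73.1),
    "Molly": (66, 329, 72.3, 77.5),
    "Pixie": (66, 137, 72.5, 75.0),
    "Long non-rDNA array repeats": (12800, 17220, 89.5, 94.0),
    "Short non-rDNA array repeats": (58, 122, 46.2, 45.6),
}


def improvement_factor(majority_bitscore, derip_bitscore):
    """ASSUMED definition: deRIP bitscore divided by majority bitscore."""
    return derip_bitscore / majority_bitscore


def identity_gain(majority_pct, derip_pct):
    """Difference in global percent identity, in percentage points."""
    return derip_pct - majority_pct


def summarise(rows):
    """Return {repeat class: (factor, identity gain)}, rounded to 1 decimal."""
    return {
        name: (
            round(improvement_factor(maj_bits, derip_bits), 1),
            round(identity_gain(maj_pct, derip_pct), 1),
        )
        for name, (maj_bits, derip_bits, maj_pct, derip_pct) in rows.items()
    }


if __name__ == "__main__":
    for name, (factor, gain) in summarise(ROWS).items():
        print(f"{name}: {factor:.1f} X, {gain:+.1f} points")
```

Under this assumption a factor above 1 simply means the deRIP consensus scored a higher blastn bitscore against the hit than the majority consensus did. The short non-rDNA array repeats show that the two measures can disagree: the bitscore ratio is 2.1 while the global identity changes by -0.6 points.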