Table 3 Assessment and quantification of reassembly of duplicate BAC sequences

From: Gene duplication and paleopolyploidy in soybean and the implications for whole genome sequencing

| Assembly number | Parameters | Total # contigs | # contigs (> 100) [a] | % Coverage of old contigs [b] | % Identity to old contigs [c] | % Coverage +103e11 [d] | % Identity +103e11 [d] |
|---|---|---|---|---|---|---|---|
| 1 | standard | 551 | 44 | 98.52% | 99.07% | 98.44% | 97.39% |
| 2 | revise_greedy | 2538 | 45 | 91.41% | 99.08% | 92.74% | 98.43% |
| 3 | forcelevel 5 | 2140 | 40 | 96.13% | 99.21% | 95.56% | 98.52% |
| 4 | minmatch 30 | 2184 | 50 | 94.77% [e] | 98.92% [e] | 95.51% | 97.91% |
| 5 | forcelevel 3 | 2326 | 43 | 98.40% | 98.60% | 97.74% | 97.96% |
| 6 | forcelevel 5 minmatch 30 | 1781 | 43 | 88.75% [e] | 99.18% [e] | 86.17% | 98.04% |
| 7 | forcelevel 3 minmatch 30 | 1950 | 46 | 93.38% [f] | 99.18% [f] | | |

[a] Total number of contigs that contain greater than 100 sequence traces.
[b] Total length of the resulting contigs (not including any overlapping regions) divided by the length of the originally assembled BAC.
[c] Percent identity as calculated from Vmatch.
[d] Recalculated percent coverage and percent identity to include contigs containing traces from gmw1-103e11; these contigs did not meet the 80% sequence identity cutoff for Vmatch.
[e] One contig from gmw1-103e11 met the cutoff criteria of 80% sequence identity for Vmatch and was included in this estimation. The second contig was included in the +103e11 calculations.
[f] This parameter set matches the parameter set that was determined to give the best reassembly of gmw1-103e11 as a single BAC reassembly. Both resulting contigs met the 80% sequence identity cutoff for Vmatch and are included in these averages.
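The coverage definition in footnote b (contig length with overlapping regions counted once, divided by the length of the original BAC) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `percent_coverage`, the half-open interval convention, and the example coordinates are all assumptions made for illustration, and the intervals are hypothetical rather than taken from the table.

```python
def percent_coverage(intervals, bac_length):
    """Percent of a reference BAC covered by contig alignments.

    intervals  -- iterable of (start, end) positions on the reference,
                  half-open [start, end); this convention is an assumption.
    bac_length -- length of the originally assembled BAC in bases.
    Overlapping regions are counted only once, as in footnote b.
    """
    if bac_length <= 0:
        raise ValueError("bac_length must be positive")

    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Disjoint from the running interval: bank it and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the running interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start

    return 100.0 * covered / bac_length


# Hypothetical alignment coordinates, not data from the paper.
example = [(0, 40_000), (35_000, 90_000), (95_000, 100_000)]
print(f"{percent_coverage(example, 100_000):.2f}%")
```

Merging sorted intervals before summing is what keeps the 5,000-base overlap between the first two hypothetical contigs from being counted twice.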