
Table 3 Assessment and quantification of reassembly of duplicate BAC sequences

From: Gene duplication and paleopolyploidy in soybean and the implications for whole genome sequencing

| Assembly number | Parameters | Total # contigs | # contigs (> 100) (a) | % Coverage of old contigs (b) | % Identity to old contigs (c) | % Coverage +103e11 (d) | % Identity +103e11 (d) |
|---|---|---|---|---|---|---|---|
| 1 | standard | 551 | 44 | 98.52% | 99.07% | 98.44% | 97.39% |
| 2 | revise_greedy | 2538 | 45 | 91.41% | 99.08% | 92.74% | 98.43% |
| 3 | forcelevel 5 | 2140 | 40 | 96.13% | 99.21% | 95.56% | 98.52% |
| 4 | minmatch 30 | 2184 | 50 | 94.77% (e) | 98.92% (e) | 95.51% | 97.91% |
| 5 | forcelevel 3 | 2326 | 43 | 98.40% | 98.60% | 97.74% | 97.96% |
| 6 | forcelevel 5 minmatch 30 | 1781 | 43 | 88.75% (e) | 99.18% (e) | 86.17% | 98.04% |
| 7 | forcelevel 3 minmatch 30 | 1950 | 46 | 93.38% (f) | 99.18% (f) | | |

  1. a Total number of contigs that contain greater than 100 sequence traces
  2. b Total length of the resulting contigs (not including any overlapping regions) divided by the length of the originally assembled BAC
  3. c Percent identity as calculated from Vmatch
  4. d Recalculated percent coverage and percent identity to include contigs containing traces from gmw1-103e11; these contigs did not meet the 80% sequence identity cutoff for Vmatch
  5. e One contig from gmw1-103e11 met the cutoff criteria of 80% sequence identity for Vmatch and was included in this estimation. The second contig was included in the +103e11 calculations
  6. f This parameter set is the one that was determined to give the best reassembly of gmw1-103e11 when it was reassembled as a single BAC. Both resulting contigs met the 80% sequence identity cutoff for Vmatch and are included in these averages.
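
The coverage definition in footnote b (total contig length, not counting overlapping regions, divided by the length of the originally assembled BAC) can be illustrated with a minimal Python sketch. The source does not describe how the calculation was actually carried out, so the following are assumptions made for illustration only: each contig has already been placed on the original BAC as a half-open `(start, end)` interval, and the function name `percent_coverage` and its arguments are hypothetical.

```python
def percent_coverage(intervals, bac_length):
    """Percent of the original BAC covered by contigs, counting overlaps once.

    intervals  -- hypothetical (start, end) placements of contigs on the
                  original BAC, half-open, in base pairs
    bac_length -- length of the originally assembled BAC in base pairs
    """
    if bac_length <= 0:
        raise ValueError("bac_length must be positive")

    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Disjoint from the running interval: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return 100.0 * covered / bac_length


# Made-up example: two overlapping contigs and one separate contig on a
# 1000 bp BAC. The 100 bp overlap is counted only once.
print(percent_coverage([(0, 400), (300, 700), (800, 1000)], 1000))  # 90.0
```

Sorting the intervals and merging them in a single pass is one simple way to avoid counting overlapping regions twice; the coordinates in the example are invented and do not correspond to any assembly in the table.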