Table 1 Melon genome sequence improvement metrics

From: Interspecific and intraspecific gene variability in a 1-Mb region containing the highest density of NBS-LRR genes found in the melon genome

Whole genome

| Metric | v3.5 | Post-PAGIT¹ |
|---|---|---|
| Scaffolds | 1,599 | 1,599 |
| Contigs | 29,865 | 20,406 |
| Length (bp) | 375,485,313 | 375,516,019 |
| Length, no Ns (bp)² | 336,097,046 | 337,325,315 |
| Stretches of Ns³ (#N > 1) | 28,306 | 20,295 |
| Stretches of Ns³ (#N = 20)⁴ | 9,133 | 2,581 |
| Number of Ns | 39,388,267 (10.5%) | 38,190,704 (10.2%) |
| Length of scaffolds (bp) | 361,983,232 | 362,182,953 |
| Length of scaffolds, no Ns (bp)² | 322,595,151 | 323,992,316 |
| Length of contigs (bp) | 13,502,081 | 13,333,066 |
| Length of contigs, no Ns (bp)² | 13,501,895 | 13,332,999 |
| Illumina unmapped reads (%)⁵ | 18.3 | 13.9 |
| Corrected sequence errors⁶ | | |
| 1 bp substitution errors | | 53,771 |
| 1-3 bp insertion errors | | 46,659 |
| 1-3 bp deletion errors | | 121,178 |

4235-4331 fragment

| Metric | v3.5 | Post-PAGIT¹ | Final |
|---|---|---|---|
| Length (bp) | 1,118,599 | 1,120,734 | 1,066,373 |
| Stretches of Ns (#N > 1) | 120 | 84 | 27 |
| Number of Ns | 202,018 (18%) | 192,656 (17.1%) | 57,617 (5.4%) |
| Average N-stretch size (bp) | 1,656 | 2,007 | 1,859 |

  1. Immediately after the prinseq processing step, which followed the iCORN and redundancy-removal steps.
  2. Only A, T, C, and G are counted; Ns are excluded.
  3. Contiguous strings of Ns.
  4. Most of the 20-bp gaps that the assembler produced while building the reference assembly mark adjacent contigs that would overlap were it not for short, low-quality or incorrect sequences at their ends. These gaps are good targets for IMAGE, the gap-closing software in the PAGIT toolkit, which trims end sequence from contigs while attempting to extend them with re-sequencing data and to overlap adjacent contigs.
  5. Reads left unmapped after aligning the set of DHL92 high-quality Illumina paired-end (PE) reads (see Methods section).
  6. Based on the output reports of the iCORN software.
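
The gap metrics in this table follow directly from the definitions in footnotes 2 and 3. As a minimal sketch of how such numbers could be derived from a sequence string, the Python below counts non-N bases, N stretches longer than 1 bp, and N stretches of exactly 20 bp. The function name `n_metrics` and its output keys are hypothetical, chosen here for illustration; the source does not say which tool or script the authors used to compute these values.

```python
import re


def n_metrics(seq: str) -> dict:
    """Summarise gap (N) content of one sequence.

    Hypothetical helper for illustration only; not the authors' method.
    """
    seq = seq.upper()
    # Lengths of every contiguous run of Ns (footnote 3).
    runs = [m.end() - m.start() for m in re.finditer(r"N+", seq)]
    total_n = sum(runs)
    return {
        "length": len(seq),
        # Only A, C, G and T are counted (footnote 2).
        "length_no_n": sum(seq.count(base) for base in "ACGT"),
        "n_count": total_n,
        "n_percent": round(100 * total_n / len(seq), 1) if seq else 0.0,
        # Runs longer than a single N, as in the "#N > 1" rows.
        "stretches_gt1": sum(1 for r in runs if r > 1),
        # Runs of exactly 20 Ns, as in the "#N = 20" row.
        "stretches_eq20": sum(1 for r in runs if r == 20),
    }


if __name__ == "__main__":
    # Toy sequence: one 20-N gap, one single N, one 3-N gap.
    toy = "ACGT" + "N" * 20 + "AC" + "N" + "GT" + "NNN" + "A"
    print(n_metrics(toy))
```

Applied to a whole assembly, the same function would be run per scaffold or contig and the results summed. The same arithmetic can be checked against the whole-genome rows above, where total length minus the number of Ns equals the no-N length (375,485,313 − 39,388,267 = 336,097,046).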