Table 4 Annotation and SNP discovery using Roche 454 reads of genomic DNA of Ae. tauschii AL8/78 as reference sequences

From: Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

| Category | No. of reads(a) | Length of reads in Mb (%) | Predicted single-copy reads | Length of single-copy reads in Mb | No. of contigs and singletons | Length of contigs and singletons in Mb | No. of SNPs | Nucleotides/SNP | No. of annotated genes |
|---|---|---|---|---|---|---|---|---|---|
| Genes | | | | | | | | | |
| Characterized | 948,379 | 380.0 (8.0%) | 734,848 | 298.8 | 378,152 | 153.4 | 153,787 | 997 | 32,307 |
| Uncharacterized | 285,529 | 113.7 (2.3%) | 109,158 | 44.0 | 45,570 | 18.1 | 41,844 | 432 | |
| Sub total | 1,233,908 | 493.7 (10.3%) | 844,066 | 342.8 | 423,722 | 171.5 | 195,631 | 876 | 32,307 |
| Repetitive sequences | | | | | | | | | |
| Characterized repeats | 7,121,948 | 2,818.1 (59.0%) | | | | | | | |
| Repeat junctions(b) | 347,811 | 156.9 (3.3%) | 200,564 | 89.3 | 200,564 | 89.3 | 145,907 | 612 | |
| Sub total | 7,121,948 | 2,818.1 (59.0%) | | | | | | | |
| Uncharacterized sequences | 3,929,444 | 1,460.7 (30.7%) | 2,398,762 | 891.2 | 1,236,912 | 271.5 | 155,580 | 1745 | |
| Total | 12,285,300 | 4,772.5 (100%) | 3,443,392 | 1,323 | 1,861,198 | 532.3 | 497,118 | 1070 | 32,307 |
(a) Number of reads after removing chloroplast and mitochondrial reads and artificial replicates using the cd-hit-454 program [27] at 98% alignment identity and 90% sequence coverage.
(b) Repeat junctions are identified from characterized repeat sequences using the repeat junction annotation pipeline program (Table S1 in Additional file 1).
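
The Nucleotides/SNP column appears to follow from two other columns: the length of contigs and singletons (converted from Mb to bp) divided by the number of SNPs. The sketch below checks that reading against the tabulated values. The formula and the truncation to a whole number are assumptions inferred from the figures, not a method stated in the table, and the function name `nucleotides_per_snp` is hypothetical.

```python
# (length of contigs and singletons in Mb, number of SNPs), copied from the table
rows = {
    "Genes, characterized": (153.4, 153_787),
    "Genes, uncharacterized": (18.1, 41_844),
    "Genes, sub total": (171.5, 195_631),
    "Repeat junctions": (89.3, 145_907),
    "Uncharacterized sequences": (271.5, 155_580),
    "Total": (532.3, 497_118),
}


def nucleotides_per_snp(length_mb, n_snps):
    """Assumed derivation: assembled length in bp per SNP, truncated to an integer."""
    return int(length_mb * 1_000_000 / n_snps)


derived = {name: nucleotides_per_snp(mb, snps) for name, (mb, snps) in rows.items()}

for name, value in derived.items():
    print(f"{name}: {value}")
```

Because the lengths are reported to only 0.1 Mb, agreement with the table should be read as a consistency check rather than confirmation of the exact calculation used.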