
Table 2 Primary classification of the 454 sequencing reads

From: A transcriptional sketch of a primary human breast cancer by 454 deep sequencing

| Set description | Number of reads |
| --- | --- |
| Total (unfiltered) | 251,262 |
| Mapping to the genome, 70% coverage, high stringency | 194,806 |
| Subset with a single match on the genome at 98% identity and 98% coverage (98.98.1 dataset)¹ | 132,113 |
| Subset with a single match on the genome and 100% coverage of the alignment² | 114,427 |
| Subset of 98.98.1 dataset matching with max 6 errors (mismatches + indels) and 90% coverage on UCSC all_mrna and RefSeq – canonical transcripts dataset | 59,632 |
| Subset of 98.98.1 dataset matching inside a UCSC Known Gene (Intragenic dataset, intronic + exonic transcripts) | 118,840 |
| Matching with max 6 errors (mismatches + indels) and 90% coverage to the Human ORESTES EST dataset (764,587 sequences) | 68,396 |
  1. Reference dataset
  2. 87% of the reference dataset. This set was used for genomic classification (Table 3)
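
The 87% figure in the second footnote can be checked directly from the read counts in the table. The following is a minimal sketch of that arithmetic, not part of the original analysis; the dictionary keys and the `percent` helper are illustrative names chosen here, and only the counts themselves come from the table.

```python
# Read counts copied from Table 2.
# The keys are illustrative labels (assumptions), not names used in the paper.
counts = {
    "total_unfiltered": 251_262,
    "mapped_to_genome": 194_806,
    "reference_98_98_1": 132_113,
    "single_match_full_coverage": 114_427,
    "canonical_transcripts": 59_632,
    "intragenic": 118_840,
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Express each subset of the 98.98.1 dataset as a share of that reference set.
reference = counts["reference_98_98_1"]
subsets = ("single_match_full_coverage", "canonical_transcripts", "intragenic")
pct_of_reference = {name: percent(counts[name], reference) for name in subsets}

for name, pct in pct_of_reference.items():
    print(f"{name}: {pct:.1f}% of the 98.98.1 dataset")
```

Rounding the first value to the nearest whole percent reproduces the figure quoted in the footnote. The same helper can be applied to the total read count to see what fraction of all reads ends up in the reference dataset.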