
Table 2 Primary classification of the 454 sequencing reads

From: A transcriptional sketch of a primary human breast cancer by 454 deep sequencing

| Set description | Number of reads |
| --- | --- |
| Total (unfiltered) | 251,262 |
| Mapping to the genome, 70% coverage, high stringency | 194,806 |
| Subset with a single match on the genome at 98% identity and 98% coverage (98.98.1 dataset)¹ | 132,113 |
| Subset with a single match on the genome and 100% coverage of the alignment² | 114,427 |
| Subset of 98.98.1 dataset matching with at most 6 errors (mismatches + indels) and 90% coverage on UCSC all_mrna and RefSeq (canonical transcripts dataset) | 59,632 |
| Subset of 98.98.1 dataset matching inside a UCSC Known Gene (Intragenic dataset, intronic + exonic transcripts) | 118,840 |
| Matching with at most 6 errors (mismatches + indels) and 90% coverage to the Human ORESTES EST dataset (764,587 sequences) | 68,396 |

  1. Reference dataset.
  2. 87% of the reference dataset. This set was used for genomic classification (Table 3).
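
The proportions implied by Table 2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the read counts are copied from the table, while the dictionary keys and the `percent_of` helper are hypothetical names introduced here, not part of the authors' pipeline.

```python
# Read counts copied from Table 2. The short labels are ad hoc names
# chosen for this sketch, not identifiers used by the authors.
READ_COUNTS = {
    "total_unfiltered": 251_262,
    "mapped_to_genome": 194_806,
    "reference_98_98_1": 132_113,
    "single_match_full_coverage": 114_427,
    "canonical_transcripts": 59_632,
    "intragenic": 118_840,
    "orestes_est": 68_396,
}


def percent_of(count: int, base: int) -> float:
    """Return count as a percentage of base, rounded to one decimal place."""
    return round(100 * count / base, 1)


total = READ_COUNTS["total_unfiltered"]
reference = READ_COUNTS["reference_98_98_1"]

# Share of the unfiltered reads that falls into each set.
for label, count in READ_COUNTS.items():
    print(f"{label:28s} {count:>8,d}  {percent_of(count, total):5.1f}% of total")

# Footnote 2: the full-coverage subset relative to the reference dataset.
full_coverage_share = percent_of(
    READ_COUNTS["single_match_full_coverage"], reference
)
print(f"full-coverage subset: {full_coverage_share}% of the reference dataset")
```

The last figure, 114,427 out of 132,113 reads, rounds to the 87% quoted in footnote 2.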