A transcriptional sketch of a primary human breast cancer by 454 deep sequencing

BMC Genomics

Table 2 Primary classification of the 454 sequencing reads

Set Description	Number of reads
Total (unfiltered)	251,262
Mapping to the genome, 70% coverage, high stringency	194,806
Subset with a single match on the genome at 98% identity and 98% coverage (98.98.1 dataset) ¹	132,113
Subset with a single match on the genome and 100% coverage of the alignment ²	114,427
Subset of 98.98.1 dataset matching with max 6 errors (mismatches + indels) and 90% coverage on UCSC all_mrna and RefSeq – canonical transcripts dataset	59,632
Subset of 98.98.1 dataset matching inside a UCSC Known Gene (Intragenic dataset, intronic + exonic transcripts)	118,840
Matching with max 6 errors (mismatches + indels) and 90% coverage to the Human ORESTES EST dataset (764,587 sequences)	68,396

¹ Reference dataset
² 87% of the reference dataset. This set was used for genomic classification (Table 3)
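
The percentage quoted in footnote 2 can be checked directly from the read counts in Table 2. The following is a minimal sketch in Python. The names (`READ_COUNTS`, `percent_of`) are illustrative and do not come from the original paper. Apart from footnote 2, the choice of denominator for each row is an assumption made here for illustration.

```python
# Read counts copied from Table 2. The dictionary keys are shorthand labels
# chosen for this sketch, not names used by the authors.
READ_COUNTS = {
    "total_unfiltered": 251_262,
    "mapped_70pct_coverage": 194_806,
    "reference_98_98_1": 132_113,
    "single_match_full_coverage": 114_427,
    "canonical_transcripts": 59_632,
    "intragenic": 118_840,
    "orestes_est": 68_396,
}


def percent_of(count: int, base: int) -> float:
    """Return count as a percentage of base."""
    if base <= 0:
        raise ValueError("base must be a positive read count")
    return 100.0 * count / base


if __name__ == "__main__":
    reference = READ_COUNTS["reference_98_98_1"]

    # Footnote 2: reads with a single match and 100% alignment coverage,
    # expressed relative to the reference (98.98.1) dataset.
    share = percent_of(READ_COUNTS["single_match_full_coverage"], reference)
    print(f"single match, full coverage: {share:.1f}% of reference")

    # Assumption: the two rows described as subsets of the 98.98.1 dataset
    # are expressed relative to that same dataset.
    for label in ("canonical_transcripts", "intragenic"):
        share = percent_of(READ_COUNTS[label], reference)
        print(f"{label}: {share:.1f}% of reference")
```

The first printed value rounds to the 87% given in footnote 2. The other percentages are derived here and are not stated in the table.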

ISSN: 1471-2164