
Table 1 Sequences, alignments and SNPs: summary of data generated and analyzed in this work

From: A genomic scale map of genetic diversity in Trypanosoma cruzi

| Description | Number |
|---|---|
| Sequences | |
| CL-Brener Reference (CDS); TcVI | 25,013 |
| Mapped CDS from Sylvio X10 genome; TcI | 4,918 |
| Mapped transcripts from TcI transcriptome | 3,486 |
| Mapped reads from Esmeraldo cl3 shotgun; TcII § | 134,424 |
| Mapped Expressed Sequence Tags (ESTs) | 13,968 |
| Mapped misc GenBank sequences (mRNAs, CDS) | 2,038 |
| Alignments | |
| Total | 7,639 |
| Containing two reference coding sequences | 5,447 |
| SNPs | |
| Total | 325,355 |
| With P > 0.70 † | 305,993 |
| In good sequence neighborhood * | 302,390 |
| P > 0.70 AND good seq neighborhood †* | 288,957 |
|   Synonymous †* | 125,455 |
|   Non-synonymous †* | 162,820 |
|   Nonsense †* | 113 |
|   Non-coding †* | 569 |
|   Triallelic †* | 2,990 |
|   Tetra-allelic †* | 10 |
|   Average SNP density †* | 2.4 per 100 bp |
| Indels | |
| Total | 28,316 |
| With P > 0.70 † | 11,007 |
| In good sequence neighborhood * | 10,523 |
| P > 0.70 AND good seq neighborhood †* | 1,480 |

  1. § Reads in which at least 50 bp matched the reference with ≥ 95% identity.
  2. † SNPs with a probability > 0.70, as assigned by PolyBayes [34].
  3. * The SNP is located in a 10 bp window containing < 2 other SNPs.
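
The two SNP filters defined in the footnotes (probability > 0.70 and a "good sequence neighborhood") can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `filter_snps`, the `(position, probability)` input format, and the placement of the 10 bp window (the half-open interval from 5 bp upstream to 5 bp downstream of the SNP) are all assumptions made for illustration. The sketch also assumes that every candidate SNP counts as a neighbor regardless of its own probability, and that positions within one alignment are unique.

```python
from bisect import bisect_left


def filter_snps(snps, min_prob=0.70, window=10, max_other_snps=1):
    """Keep SNPs with probability > min_prob and a good sequence neighborhood.

    snps: list of (position, probability) tuples from a single alignment
          (hypothetical input format; positions assumed unique).
    A SNP has a good neighborhood when its window holds at most
    max_other_snps other SNPs, i.e. fewer than 2 with the default.
    """
    positions = sorted(pos for pos, _ in snps)
    half = window // 2
    kept = []
    for pos, prob in snps:
        # Assumed window: the half-open interval [pos - half, pos + half).
        lo = bisect_left(positions, pos - half)
        hi = bisect_left(positions, pos + half)
        other_snps = hi - lo - 1  # subtract the SNP itself
        if prob > min_prob and other_snps <= max_other_snps:
            kept.append((pos, prob))
    return kept


example = [(10, 0.90), (12, 0.95), (13, 0.80),  # dense cluster
           (40, 0.99),                          # isolated, high probability
           (60, 0.50),                          # isolated, low probability
           (100, 0.90), (104, 0.60)]            # one neighbor each
print(filter_snps(example))
```

In this example the three clustered SNPs at positions 10 to 13 fail the neighborhood criterion, and the SNPs at positions 60 and 104 fail the probability threshold, so only positions 40 and 100 are retained.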