
Table 5 Comparison of Illumina read mapping efficacy using clinical isolates derived from different lineages using Bowtie2 and SAMtools

From: Construction of a virtual Mycobacterium tuberculosis consensus genome and its application to data from a next generation sequencer

i) Comparison of the numbers of mapped and unmapped reads to the H37Rv sequence or consensus sequence

Mapping stringency*: local or end-to-end. The last two columns give the subtraction of ratio (%), consensus sequence (CS) minus H37Rv.

| Isolate | Lineage | Measure | H37Rv, local | H37Rv, end-to-end | Consensus sequence, local | Consensus sequence, end-to-end | CS minus H37Rv (%), local | CS minus H37Rv (%), end-to-end |
|---|---|---|---|---|---|---|---|---|
| F092 | EAI | mapped | 681561 | 664376 | 684952 | 667250 | | |
| | | unmapped | 22261 | 39446 | 18870 | 36572 | | |
| | | ratio (%) | 96.837 | 94.395 | 97.319 | 94.804 | 0.482 | 0.408 |
| J156 | EAI | mapped | 1680156 | 1650866 | 1689673 | 1658917 | | |
| | | unmapped | 40162 | 69452 | 30645 | 61401 | | |
| | | ratio (%) | 97.665 | 95.963 | 98.219 | 96.431 | 0.553 | 0.468 |
| F038 | Haarlem, LAM, X etc. | mapped | 1024873 | 997301 | 1029625 | 1000714 | | |
| | | unmapped | 75113 | 102685 | 70361 | 99272 | | |
| | | ratio (%) | 93.171 | 90.665 | 93.603 | 90.975 | 0.432 | 0.310 |
| F070 | Haarlem, LAM, X etc. | mapped | 858126 | 840921 | 861393 | 843463 | | |
| | | unmapped | 27822 | 45027 | 24555 | 42485 | | |
| | | ratio (%) | 96.860 | 94.918 | 97.228 | 95.205 | 0.369 | 0.287 |
| J073 | Haarlem, LAM, X etc. | mapped | 1534315 | 1503494 | 1537891 | 1504256 | | |
| | | unmapped | 11979 | 42800 | 8403 | 42038 | | |
| | | ratio (%) | 99.225 | 97.232 | 99.457 | 97.281 | 0.231 | 0.049 |
| J147 | Haarlem, LAM, X etc. | mapped | 847807 | 836269 | 849747 | 836489 | | |
| | | unmapped | 11775 | 23313 | 9835 | 23093 | | |
| | | ratio (%) | 98.630 | 97.288 | 98.856 | 97.313 | 0.226 | 0.026 |
| F081 | other non-Beijing | mapped | 1004912 | 974425 | 1008107 | 976556 | | |
| | | unmapped | 43978 | 74465 | 40783 | 72334 | | |
| | | ratio (%) | 95.807 | 92.901 | 96.112 | 93.104 | 0.305 | 0.203 |
| J020 | other non-Beijing | mapped | 1081365 | 1062537 | 1085065 | 1065704 | | |
| | | unmapped | 11293 | 30121 | 7593 | 26954 | | |
| | | ratio (%) | 98.966 | 97.243 | 99.305 | 97.533 | 0.339 | 0.290 |
| J027 | other non-Beijing | mapped | 751633 | 741219 | 754861 | 744254 | | |
| | | unmapped | 5259 | 15673 | 2031 | 12638 | | |
| | | ratio (%) | 99.305 | 97.929 | 99.732 | 98.330 | 0.426 | 0.401 |
| F022 | Ancestral Beijing | mapped | 1162270 | 1143243 | 1166545 | 1147960 | | |
| | | unmapped | 26600 | 45627 | 22325 | 40910 | | |
| | | ratio (%) | 97.763 | 96.162 | 98.122 | 96.559 | 0.360 | 0.397 |
| J090 | Ancestral Beijing | mapped | 490815 | 484340 | 492424 | 486326 | | |
| | | unmapped | 5153 | 11628 | 3544 | 9642 | | |
| | | ratio (%) | 98.961 | 97.655 | 99.285 | 98.056 | 0.324 | 0.400 |
| J002 | Ancestral Beijing | mapped | 736473 | 727044 | 739288 | 730539 | | |
| | | unmapped | 5757 | 15186 | 2942 | 11691 | | |
| | | ratio (%) | 99.224 | 97.954 | 99.604 | 98.425 | 0.379 | 0.471 |
| J029 | Modern Beijing | mapped | 953792 | 936476 | 957539 | 941221 | | |
| | | unmapped | 10220 | 27536 | 6473 | 22791 | | |
| | | ratio (%) | 98.940 | 97.144 | 99.329 | 97.636 | 0.389 | 0.492 |
| F076 | Modern Beijing | mapped | 532526 | 519473 | 534742 | 522431 | | |
| | | unmapped | 16374 | 29427 | 14158 | 26469 | | |
| | | ratio (%) | 97.017 | 94.639 | 97.421 | 95.178 | 0.404 | 0.539 |
| J111 | Modern Beijing | mapped | 719693 | 708076 | 722895 | 712304 | | |
| | | unmapped | 14341 | 25958 | 11139 | 21730 | | |
| | | ratio (%) | 98.046 | 96.464 | 98.482 | 97.040 | 0.436 | 0.576 |
 

ii) Comparison of mapping frequency ratio (%) among the MTBC lineages

| | EAI | Haarlem, LAM, X etc. | other non-Beijing |
|---|---|---|---|
| Haarlem, LAM, X etc. | ns | - | - |
| other non-Beijing | ns | ns | - |
| Beijing | P < 0.05 | ns | ns |
    
  1. In this analysis, the CS based on 13 M. tuberculosis strains (Table 1) was used as the consensus sequence. i) After mapping with Bowtie2 [42] against H37Rv or the CS, the idxstats command of SAMtools [43] was used to calculate the mapping efficacy (Table 5). Read mapping with Bowtie2 was run in both local and end-to-end modes, with all other parameters left at their default values. Significant differences in mapping frequencies were assessed using multiple comparisons of proportions tests [44]. For all isolates, the mapping frequency differed significantly between H37Rv and the CS as the reference (p < 0.0001). For both mapping modes, the ratio of mapped to total reads was calculated, and the difference in mapping frequency between the consensus and H37Rv sequences was obtained by simple subtraction of these ratios.
  2. ii) Based on the differences in mapping frequency in i), the mapping frequencies of the MTBC lineages were compared using Mann–Whitney U tests (ns, not significant). The comparison between Beijing and EAI isolates showed a significant difference (p < 0.05) in mapping frequencies relative to the consensus and H37Rv sequences; H37Rv belongs to the Haarlem, LAM, X etc. lineage (lineage 4).
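
The ratio and subtraction described in note 1 can be reproduced with a few lines of Python. The block below is a minimal sketch, not the authors' script. It assumes the usual tab-separated `samtools idxstats` layout (reference name, sequence length, mapped reads, unmapped reads, with a final `*` line). The function names, reference names and sequence lengths are hypothetical placeholders, and the way the counts are distributed over the lines is illustrative. Only the totals come from the table, namely isolate F092 in local mode.

```python
def parse_idxstats(text):
    """Sum the mapped and unmapped columns of idxstats-style text."""
    mapped = unmapped = 0
    for line in text.strip().splitlines():
        # Columns: reference name, length, mapped reads, unmapped reads.
        _name, _length, n_mapped, n_unmapped = line.split("\t")
        mapped += int(n_mapped)
        unmapped += int(n_unmapped)
    return mapped, unmapped


def mapping_ratio(mapped, unmapped):
    """Ratio (%) of mapped reads to total reads."""
    return 100.0 * mapped / (mapped + unmapped)


# Illustrative inputs: names and lengths are placeholders; the totals are
# the F092 local-mode counts from the table.
h37rv_local = "H37Rv\t1\t681561\t0\n*\t0\t0\t22261\n"
cs_local = "CS\t1\t684952\t0\n*\t0\t0\t18870\n"

ratio_h37rv = mapping_ratio(*parse_idxstats(h37rv_local))
ratio_cs = mapping_ratio(*parse_idxstats(cs_local))

# Simple subtraction of the ratios (CS minus H37Rv), in percentage points.
difference = ratio_cs - ratio_h37rv

print(f"H37Rv: {ratio_h37rv:.3f}%  CS: {ratio_cs:.3f}%  diff: {difference:.3f}")
```

For these counts the rounded ratios agree with the F092 row of the table (96.837, 97.319 and a difference of 0.482).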