Table 6 Comparison of Illumina read mapping efficacy using clinical isolates derived from different lineages using

From: Construction of a virtual Mycobacterium tuberculosis consensus genome and its application to data from a next generation sequencer

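Mapping stringency* (Ambiguity, Medium, Strict) is shown in parentheses in the column headers; the last three columns give the subtraction of ratio (%), Consensus − H37Rv.

| Isolate | Lineage | Reads | H37Rv (Ambiguity) | H37Rv (Medium) | H37Rv (Strict) | Consensus sequence (Ambiguity) | Consensus sequence (Medium) | Consensus sequence (Strict) | Consensus − H37Rv (Ambiguity) | Consensus − H37Rv (Medium) | Consensus − H37Rv (Strict) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| F092 | EAI | mapped | 676219 | 676007 | 674779 | 677079 | 676941 | 676114 | | | |
| | | unmapped | 20275 | 20487 | 21715 | 19415 | 19553 | 20380 | | | |
| | | ratio (%) | 97.089 | 97.059 | 96.882 | 97.212 | 97.193 | 97.074 | 0.123 | 0.134 | 0.192 |
| J156 | EAI | mapped | 1656675 | 1656191 | 1652024 | 1661496 | 1660887 | 1657869 | | | |
| | | unmapped | 37151 | 37635 | 41802 | 32330 | 32939 | 35957 | | | |
| | | ratio (%) | 97.807 | 97.778 | 97.532 | 98.091 | 98.055 | 97.877 | 0.285 | 0.277 | 0.345 |
| F038 | Haarlem, LAM, X etc. | mapped | 985717 | 985256 | 978713 | 986844 | 986318 | 980541 | | | |
| | | unmapped | 43761 | 44222 | 50765 | 42634 | 43160 | 48937 | | | |
| | | ratio (%) | 95.749 | 95.704 | 95.069 | 95.859 | 95.808 | 95.246 | 0.109 | 0.103 | 0.178 |
| F070 | Haarlem, LAM, X etc. | mapped | 847048 | 846879 | 844774 | 847486 | 847257 | 845485 | | | |
| | | unmapped | 21212 | 21381 | 23486 | 20774 | 21003 | 22775 | | | |
| | | ratio (%) | 97.557 | 97.537 | 97.295 | 97.607 | 97.581 | 97.377 | 0.05 | 0.044 | 0.082 |
| J073 | Haarlem, LAM, X etc. | mapped | 1511361 | 1511205 | 1508937 | 1512926 | 1512725 | 1511328 | | | |
| | | unmapped | 25005 | 25211 | 27479 | 23490 | 23691 | 25088 | | | |
| | | ratio (%) | 98.372 | 98.359 | 98.211 | 98.471 | 98.458 | 98.367 | 0.099 | 0.099 | 0.156 |
| J147 | Haarlem, LAM, X etc. | mapped | 835484 | 835319 | 834077 | 835192 | 834976 | 834227 | | | |
| | | unmapped | 14694 | 14859 | 16101 | 14986 | 15202 | 15951 | | | |
| | | ratio (%) | 98.272 | 98.252 | 98.106 | 98.237 | 98.212 | 98.124 | −0.034 | −0.04 | 0.018 |
| F081 | other non-Beijing | mapped | 984775 | 984303 | 981542 | 986321 | 986079 | 983479 | | | |
| | | unmapped | 31067 | 31539 | 34300 | 29521 | 29763 | 32363 | | | |
| | | ratio (%) | 96.942 | 96.895 | 96.623 | 97.094 | 97.07 | 96.814 | 0.152 | 0.175 | 0.191 |
| J020 | other non-Beijing | mapped | 1061748 | 1061460 | 1059770 | 1063613 | 1063346 | 1062167 | | | |
| | | unmapped | 22918 | 23206 | 24896 | 21053 | 21320 | 22499 | | | |
| | | ratio (%) | 97.887 | 97.861 | 97.705 | 98.059 | 98.034 | 97.926 | 0.172 | 0.174 | 0.221 |
| J027 | other non-Beijing | mapped | 741045 | 740904 | 739801 | 742516 | 742329 | 741694 | | | |
| | | unmapped | 13721 | 13862 | 14965 | 12250 | 12437 | 13072 | | | |
| | | ratio (%) | 98.182 | 98.163 | 98.017 | 98.377 | 98.352 | 98.268 | 0.195 | 0.189 | 0.251 |
| F022 | Ancestral Beijing | mapped | 1140373 | 1140107 | 1137877 | 1145147 | 1144862 | 1143607 | | | |
| | | unmapped | 32673 | 32939 | 35169 | 27899 | 28184 | 29439 | | | |
| | | ratio (%) | 97.215 | 97.192 | 97.002 | 97.622 | 97.597 | 97.49 | 0.407 | 0.405 | 0.488 |
| J090 | Ancestral Beijing | mapped | 2087545 | 2086983 | 2082777 | 2095411 | 2094879 | 2092249 | | | |
| | | unmapped | 47551 | 48113 | 52319 | 39685 | 40217 | 42847 | | | |
| | | ratio (%) | 97.773 | 97.747 | 97.55 | 98.141 | 98.116 | 97.993 | 0.368 | 0.37 | 0.444 |
| J002 | Ancestral Beijing | mapped | 725501 | 725308 | 724182 | 727822 | 727702 | 727147 | | | |
| | | unmapped | 13427 | 13620 | 14746 | 11106 | 11226 | 11781 | | | |
| | | ratio (%) | 98.183 | 98.157 | 98.004 | 98.497 | 98.481 | 98.406 | 0.314 | 0.324 | 0.401 |
| J029 | Modern Beijing | mapped | 935765 | 935598 | 934129 | 939368 | 939185 | 938425 | | | |
| | | unmapped | 21607 | 21774 | 23243 | 18004 | 18187 | 18947 | | | |
| | | ratio (%) | 97.743 | 97.726 | 97.572 | 98.119 | 98.1 | 98.021 | 0.376 | 0.375 | 0.449 |
| F076 | Modern Beijing | mapped | 523546 | 523438 | 522478 | 526300 | 526150 | 525618 | | | |
| | | unmapped | 17480 | 17588 | 18548 | 14726 | 14876 | 15408 | | | |
| | | ratio (%) | 96.769 | 96.749 | 96.572 | 97.278 | 97.25 | 97.152 | 0.509 | 0.501 | 0.58 |
| J111 | Modern Beijing | mapped | 703968 | 703761 | 702412 | 708028 | 707765 | 707024 | | | |
| | | unmapped | 17244 | 17451 | 18800 | 13184 | 13447 | 14188 | | | |
| | | ratio (%) | 97.609 | 97.58 | 97.393 | 98.172 | 98.135 | 98.033 | 0.563 | 0.555 | 0.639 |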

ii) Comparison of mapping frequency ratio (%) among the MTBC lineages

| | EAI | Haarlem, LAM, X etc. | other non-Beijing |
|---|---|---|---|
| Haarlem, LAM, X etc. | ns | - | - |
| non-Beijing | ns | ns | - |
| Beijing | P<0.05 | P<0.01 | P<0.05 |

      
  1. In this analysis, the CS based on 13 M. tuberculosis strains (Table 1) was used as the consensus sequence. i) The effects on mapping efficacy were tested for three combinations of five parameters: mismatch cost, insertion cost, deletion cost, matching length and similarity. *Mapping stringency was defined as Ambiguity, with values for mismatch cost, insertion cost, deletion cost, matching length and similarity of 2, 2, 2, 0.5, and 0.8, respectively; Medium, with values of 2, 3, 3, 0.5, and 0.8, respectively; and Strict, with values of 3, 3, 3, 0.5, and 0.95, respectively. For each stringency setting, the ratio of mapped to total reads was calculated, and the difference in mapping frequency between the consensus and H37Rv sequences was obtained by simple subtraction. Significant differences in mapping frequencies were assessed using multiple comparisons of proportions tests [44]. For all isolates, mapping frequencies differed significantly between the H37Rv and CS references (p < 0.0001).
  2. ii) Using the differences in mapping frequency from i), the MTBC lineages were compared using Mann–Whitney U tests. The comparison between the Haarlem, LAM, X etc. group and the Beijing group showed the greatest difference (p < 0.01) in the change in mapping frequency between the consensus and H37Rv references.
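
The ratio and subtraction described in footnote 1 can be sketched as follows. This is a minimal illustration, assuming that "total reads" means mapped plus unmapped reads; the function name mapping_ratio is hypothetical and not taken from the article. The read counts are those listed for isolate F092 at the first stringency setting.

```python
def mapping_ratio(mapped: int, unmapped: int) -> float:
    """Percentage of reads that mapped, out of all reads (mapped + unmapped).

    Assumption: total reads = mapped + unmapped, as the table layout suggests.
    """
    return 100.0 * mapped / (mapped + unmapped)


# Read counts for isolate F092, first stringency column, copied from the table.
h37rv_ratio = mapping_ratio(mapped=676219, unmapped=20275)
consensus_ratio = mapping_ratio(mapped=677079, unmapped=19415)

# The table reports the consensus ratio minus the H37Rv ratio.
difference = consensus_ratio - h37rv_ratio

print(f"H37Rv ratio:       {h37rv_ratio:.3f} %")
print(f"Consensus ratio:   {consensus_ratio:.3f} %")
print(f"Consensus - H37Rv: {difference:.3f}")
```

Rounded to three decimals, these values match the F092 entries in the table (97.089, 97.212 and 0.123), which supports reading the ratio as mapped / (mapped + unmapped).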