
Table 1 Comparison of the discrepancies occurring between the current and Brown et al. whole genome sequencing of C. autoethanogenum

From: Whole genome sequence and manual annotation of Clostridium autoethanogenum, an industrially relevant bacterium

| Position | Insertion | Gene | Homopolymer length (CLAU) | Homopolymer length (CLJU) | Amino acid length (CLAU) | Amino acid length (BRO) | Sequence identity (BRO) | Sequence identity (CAUT) | Sequence identity (CLJU) |
|---|---|---|---|---|---|---|---|---|---|
| 46129 | T | CAETHG_0051 | 6 | 6 | 412 | 412 | 119/367 | NF | 412/412 |
| 283331 | C | CAETHG_0263 | 5 | 5 | 370 | 370ᵃ | NF | 370/370 | 369/370 |
| 627984 | C | CAETHG_0567 | 2 | 2 | 521 | 245 | 231/233 | NF | 521/521 |
| 656810 | T | CAETHG_0595 | 6 | 6 | 722 | 279 | 269/269 | 722/722 | 717/722 |
| 928129 | C | CAETHG_0862 | 5 | 5 | 293 | 250 | 249/249 | NF | 293/293 |
| 985484 | C | CAETHG_0915 | 4 | 4 | 688 | 688 | NF | NF | 688/688 |
| 1106176 | A | CAETHG_1030 | 6 | 6 | 172 | 126 | 109/109 | NF | 172/172 |
| 1457002 | C | CAETHG_1363 | 6 | 6 | 296 | 254 | 249/249 | 294/295 | 292/296 |
| 1603900 | T | CAETHG_1501 | 8 | 8 | 401 | 401 | NF | NF | 401/401 |
| 1620246 | T | CAETHG_1521 | 6 | NF | 323 | 316 | 315/315 | 323/323 | 310/323 |
| 2222019 | T | CAETHG_2078 | 8 | 8 | 445 | 326 | 325/325 | NF | 444/445 |
| 2352969 | T | CAETHG_2212, CAETHG_2213 | 2 | 2 | 416 | 202 | None | 416/416 | 414/416 |
| 2596835 | G | CAETHG_2429 | 7 | 7 | 400 | 382 | 378/378 | 400/400 | 400/400 |
| 2683087 | C | CAETHG_2503 | 4 | 4 | 640 | 615 | 601/605 | 640/640 | 639/640 |
| 2805023 | A | CAETHG_2601, CAETHG_2602 | 7 | AAAGAAA | 370 | 141 | 138/138 | 370/370 | 328/366 |
| 2852812 | T | CAETHG_2647 | 8 | NF | 470 | 314 | 314/314 | 469/470 | NF |
| 3076804 | A | CAETHG_2840 | 8 | 8 | 635 | 487 | 482/483 | 635/635 | 635/635 |
| 3396986 | G | CAETHG_3132, CAETHG_3133 | 5 | 5 | 160 | 152 | 149/149 | 160/160 | 160/160 |
| 3468796 | G | CAETHG_3212 | 5 | 5 | 271 | 291 | 270/271 | 270/271 | 270/271 |
| 3752592 | G | CAETHG_3500 | 5 | 5 | 459 | 418 | 413/415 | 459/459 | 459/459 |
| 3786709 | T | CAETHG_3531 | 6 | NF | 144 | 64 | 64/64 | 144/144 | NF |
| 3877937 | A | CAETHG_3599 | 3 | 3 | 270 | 74 | 181/182 | 270/270 | 269/270 |
| 3994749 | G | CAETHG_3707 | 6 | 6 | 261 | 176 | 172/177 | NF | 261/261 |
| 4180142 | T | CAETHG_3902 | 5 | 5 | 359 | 99 | 94/95 | NF | 359/359 |

This table lists the discrepancies found when the current Illumina sequence (CLAU) is mapped against the published Brown et al. sequence (BRO). The Insertion column describes the mutation occurring in the CLAU genome compared to the BRO genome. Homopolymer length is the number of identical bases occurring consecutively at the site of the discrepancy. Amino acid length is the annotated protein length of the gene in which the discrepancy occurs. Sequence identity is relative to our C. autoethanogenum genome sequence, as determined by protein BLAST searches against the NCBI database. CLAU, C. autoethanogenum finished genome sequence in the present study; CLJU, C. ljungdahlii DSM 13528 finished genome sequence (GCA_000143685.1); BRO, Brown et al. C. autoethanogenum finished genome sequence (GCA_000484505.1); CAUT, Bruno-Barcena et al. C. autoethanogenum draft genome sequence (GCA_000427255.1); NF, not found.

ᵃ Indicates that the protein codes for multiple stop codons.
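
The homopolymer length defined in the table note can be illustrated with a short sketch. This is not the procedure used in the study; it is a minimal illustration that assumes the sequence is held in memory as a plain string and that the site is given as a 0-based index (the Position column is assumed to hold 1-based genome coordinates, so a conversion would be needed). The function name `homopolymer_length` and the example sequence are hypothetical.

```python
def homopolymer_length(seq: str, pos: int) -> int:
    """Return the length of the run of identical bases that includes index pos.

    Assumptions (not from the source): seq is a plain nucleotide string,
    pos is a 0-based index, and comparison is case-insensitive.
    """
    if not 0 <= pos < len(seq):
        raise IndexError("pos is outside the sequence")

    base = seq[pos].upper()

    # Walk left while the preceding base matches the base at pos.
    start = pos
    while start > 0 and seq[start - 1].upper() == base:
        start -= 1

    # Walk right while the following base matches the base at pos.
    end = pos
    while end + 1 < len(seq) and seq[end + 1].upper() == base:
        end += 1

    return end - start + 1


# Made-up sequence for illustration only: six consecutive T bases at indices 3-8.
example = "ACGTTTTTTGCA"
print(homopolymer_length(example, 5))  # 6
```

Long runs of a single base are a common source of insertion or deletion calls that differ between sequencing platforms, so comparing the run length at each discrepant site across assemblies, as the table does, is one way to check such calls.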