Table 1 Comparison of the discrepancies occurring between the current and Brown et al. whole genome sequencing of C. autoethanogenum

From: Whole genome sequence and manual annotation of Clostridium autoethanogenum, an industrially relevant bacterium

| Position | Insertion | Gene | Homopolymer length (CLAU) | Homopolymer length (CLJU) | Amino acid length (CLAU) | Amino acid length (BRO) | Sequence identity (BRO) | Sequence identity (CAUT) | Sequence identity (CLJU) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 46129 | T | CAETHG_0051 | 6 | 6 | 412 | 412 | 119/367 | NF | 412/412 |
| 283331 | C | CAETHG_0263 | 5 | 5 | 370 | 370ᵃ | NF | 370/370 | 369/370 |
| 627984 | C | CAETHG_0567 | 2 | 2 | 521 | 245 | 231/233 | NF | 521/521 |
| 656810 | T | CAETHG_0595 | 6 | 6 | 722 | 279 | 269/269 | 722/722 | 717/722 |
| 928129 | C | CAETHG_0862 | 5 | 5 | 293 | 250 | 249/249 | NF | 293/293 |
| 985484 | C | CAETHG_0915 | 4 | 4 | 688 | 688 | NF | NF | 688/688 |
| 1106176 | A | CAETHG_1030 | 6 | 6 | 172 | 126 | 109/109 | NF | 172/172 |
| 1457002 | C | CAETHG_1363 | 6 | 6 | 296 | 254 | 249/249 | 294/295 | 292/296 |
| 1603900 | T | CAETHG_1501 | 8 | 8 | 401 | 401 | NF | NF | 401/401 |
| 1620246 | T | CAETHG_1521 | 6 | NF | 323 | 316 | 315/315 | 323/323 | 310/323 |
| 2222019 | T | CAETHG_2078 | 8 | 8 | 445 | 326 | 325/325 | NF | 444/445 |
| 2352969 | T | CAETHG_2212, CAETHG_2213 | 2 | 2 | 416 | 202 | None | 416/416 | 414/416 |
| 2596835 | G | CAETHG_2429 | 7 | 7 | 400 | 382 | 378/378 | 400/400 | 400/400 |
| 2683087 | C | CAETHG_2503 | 4 | 4 | 640 | 615 | 601/605 | 640/640 | 639/640 |
| 2805023 | A | CAETHG_2601, CAETHG_2602 | 7 | AAAGAAA | 370 | 141 | 138/138 | 370/370 | 328/366 |
| 2852812 | T | CAETHG_2647 | 8 | NF | 470 | 314 | 314/314 | 469/470 | NF |
| 3076804 | A | CAETHG_2840 | 8 | 8 | 635 | 487 | 482/483 | 635/635 | 635/635 |
| 3396986 | G | CAETHG_3132, CAETHG_3133 | 5 | 5 | 160 | 152 | 149/149 | 160/160 | 160/160 |
| 3468796 | G | CAETHG_3212 | 5 | 5 | 271 | 291 | 270/271 | 270/271 | 270/271 |
| 3752592 | G | CAETHG_3500 | 5 | 5 | 459 | 418 | 413/415 | 459/459 | 459/459 |
| 3786709 | T | CAETHG_3531 | 6 | NF | 144 | 64 | 64/64 | 144/144 | NF |
| 3877937 | A | CAETHG_3599 | 3 | 3 | 270 | 74 | 181/182 | 270/270 | 269/270 |
| 3994749 | G | CAETHG_3707 | 6 | 6 | 261 | 176 | 172/177 | NF | 261/261 |
| 4180142 | T | CAETHG_3902 | 5 | 5 | 359 | 99 | 94/95 | NF | 359/359 |
This table shows the discrepancies that occur when the current Illumina sequence (CLAU) is mapped against the published Brown et al. sequence (BRO). The Insertion column describes the mutation occurring in the CLAU genome compared with the BRO genome. Homopolymer length is the number of identical bases occurring consecutively at the site of the discrepancy. Amino acid length is the annotated protein length of the gene in which the discrepancy occurs. Sequence identity is given relative to our C. autoethanogenum genome sequence, from protein BLAST searches against the NCBI database. CLAU, C. autoethanogenum finished genome sequence in the present study; CLJU, C. ljungdahlii DSM 13528 finished genome sequence (GCA_000143685.1); BRO, Brown et al. C. autoethanogenum finished genome sequence (GCA_000484505.1); CAUT, Bruno-Barcena et al. C. autoethanogenum draft genome sequence (GCA_000427255.1); NF, not found. ᵃ Indicates protein codes for multiple stop codons.
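
The two derived quantities in the table, homopolymer length and sequence identity, can be illustrated with a minimal Python sketch. The function names, the 0-based indexing, and the toy sequence are assumptions made for illustration only; the toy sequence is not taken from the genome. The source does not state whether the reported homopolymer length counts the inserted base, so the sketch simply measures the run containing a given index in whatever sequence it is handed.

```python
from typing import Optional


def homopolymer_length(seq: str, pos: int) -> int:
    """Length of the run of identical bases containing 0-based index pos."""
    base = seq[pos]
    start = pos
    # Walk left while the preceding base matches.
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    # Walk right while the following base matches.
    while end + 1 < len(seq) and seq[end + 1] == base:
        end += 1
    return end - start + 1


def identity_percent(cell: str) -> Optional[float]:
    """Convert a 'matches/aligned' table cell to a percentage.

    Cells without a fraction (for example 'NF' or 'None') give None.
    """
    if "/" not in cell:
        return None
    matches, aligned = cell.split("/")
    return 100.0 * int(matches) / int(aligned)


# Hypothetical toy sequence with a run of six T bases at indices 3 to 8.
toy = "ACGTTTTTTAG"
run = homopolymer_length(toy, 5)
full = identity_percent("412/412")
partial = identity_percent("119/367")
missing = identity_percent("NF")
```

Here `run` is the length of the T run in the toy sequence, and `partial` expresses the BRO identity for CAETHG_0051 (119/367) as a percentage, which makes a truncated or frameshifted match easier to compare with a full-length one such as 412/412.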