Genetic heterogeneity revealed by sequence analysis of Mycobacterium tuberculosis isolates from extra-pulmonary tuberculosis patients
- Sarbashis Das†1,
- Tanmoy Roychowdhury†1,
- Parameet Kumar2,
- Anil Kumar2,
- Priya Kalra2,
- Jitendra Singh3,
- Sarman Singh3,
- HK Prasad2 (corresponding author) and
- Alok Bhattacharya1, 4 (corresponding author)
© Das et al.; licensee BioMed Central Ltd. 2013
Received: 4 September 2012
Accepted: 3 June 2013
Published: 17 June 2013
Tuberculosis remains a major public health problem. Clinical tuberculosis manifests often as pulmonary and occasionally as extra-pulmonary tuberculosis. The emergence of drug resistant tubercle bacilli and its association with HIV is a formidable challenge to curb the spread of tuberculosis. There have been concerted efforts by whole genome sequencing and bioinformatics analysis to identify genomic patterns and to establish a relationship between the genotype of the organism and clinical manifestation of tuberculosis. Extra-pulmonary TB constitutes 15–20 percent of the total clinical cases of tuberculosis reported among immunocompetent patients, whereas among HIV patients the incidence is more than 50 percent. Genomic analysis of M. tuberculosis isolates from extra pulmonary patients has not been explored.
The genomic DNA of 5 extra-pulmonary clinical isolates of M. tuberculosis, derived from cerebrospinal fluid and lymph node fine needle aspirates (FNAC) / biopsies, was sequenced. A next generation sequencing (NGS) approach was employed to identify Single Nucleotide Variations (SNVs), and computational methods were used to predict their consequences for gene function. Analysis of the distribution of SNVs led to the finding that there are mixed genotypes in patient isolates and that many SNVs are likely to influence either gene function or gene expression. The phylogenetic relationship between the isolates correlated with their origin. In addition, insertion sites of IS elements were identified, and their distribution revealed variation in the number and position of the elements in the 5 extra-pulmonary isolates compared to the reference M. tuberculosis H37Rv strain.
The results suggest that NGS is able to identify small variations in the genomes of M. tuberculosis isolates, including changes in IS element insertion sites. Moreover, variations in isolates of M. tuberculosis from non-pulmonary sites were documented. The analysis of our results indicates genomic heterogeneity in the clinical isolates.
Keywords: Extra-pulmonary tuberculosis, Next-generation sequencing, Genetic heterogeneity, Single nucleotide variations, Insertion elements, Phylogeny, Spoligotyping
Tuberculosis is a public health challenge. It is estimated that one third of the human population harbors M. tuberculosis; however, only approximately 20% of these infected individuals go on to develop clinical tuberculosis. Infection with M. tuberculosis usually results in pulmonary tuberculosis, but it can also manifest at extra-pulmonary sites, as tuberculous meningitis, endometritis, lymphadenitis, pleuritis, etc. [2, 3]. In India, around 15–20 percent of tuberculosis cases among immuno-competent adults have been reported to occur at extra-pulmonary sites, whereas among HIV co-infected patients the incidence increases to more than 50%. The classical route of infection is by inhalation of infectious droplets (respiratory route); however, infection can occasionally occur via alternate routes such as skin abrasions and open wounds. Extra-pulmonary tuberculosis has always been recognized as a sequel to primary pulmonary infection [6–8]. How exactly it occurs, and what facilitates it, remains an enigma. The present study was designed to sequence and analyze extra-pulmonary isolates in order to identify genomic patterns and features of M. tuberculosis isolated from patients with extra-pulmonary tuberculosis.
Genomic variations in M. tuberculosis have been studied using a number of different methods, such as spoligotyping and variable-number tandem repeats (VNTRs). These studies have shown variations among different clinical isolates of M. tuberculosis. None of these approaches gives a complete picture of variations at the whole genome level, as each of the methods has its limitations. For example, the spoligotyping pattern is determined by the strain(s) present in the sample under investigation, and it is difficult to determine whether the observed pattern is due to a dominant strain or is a collective pattern of all the strains present in the sample. Therefore, spoligotyping is not a reliable way to establish mixed infection. On the other hand, MIRU-VNTR has been considered useful for detecting mixed infection since it is based on allelic variation. However, it is possible to have different genotypes with the same VNTR pattern, particularly in the case of closely related isolates. The efficacy of the DNA extraction used in these assays can also be hampered by the presence of clumps / aggregates of mycobacteria in clinical samples / cultures.
The genome of M. tuberculosis strain H37Rv was sequenced 15 years ago using the standard approach pioneered by Cole et al. The analysis of the assembled sequences suggested that the genome size is 4.4 Mb, encoding about 4000 genes. The genome analysis also showed that there are a number of repeat families, particularly the PPE and PGRS families of genes. These repeat families of proteins may have a role in pathogenesis. Subsequently, a number of different species and isolates of the M. tuberculosis complex have been sequenced. On comparative analysis with other mycobacterial species, the MTB complex clustered separately, showing a high degree of sequence identity among this group of mycobacterial species. Different isolates showed polymorphisms at the level of single nucleotides, number of repeats at a given locus, indels and synteny. Attempts have been made to map polymorphisms that are correlated with some of the phenotypes, such as drug resistance. Though some correlations have been found, a clear-cut causal relationship has not been established so far. With the introduction of next generation sequencing, genome sequences of several isolates have become available, and it is now possible to identify genetic markers for specific phenotypes. It has also become possible to identify evolving patterns in genomes. For example, “hotspot” and “coldspot” regions have been identified using statistical methods and sequence information from a large number of isolates. However, most of the genome data available are from M. tuberculosis isolates derived from pulmonary tuberculosis patients. It is therefore relevant that isolates derived from extra-pulmonary tuberculosis patients should also be analyzed at the genome level.
Results and discussion
Sequencing of M. tuberculosis isolates from extra-pulmonary tuberculosis patients
Table 1 Description of clinical isolates used in the study and basic statistics of the sequencing data
Recoverable row labels and entries: Lymph node from biopsy; Average read length; Total reads aligned after filtering; Total reference length; % Total reference covered; % Reads aligned with reference; Optimized average read depth
Spoligotype patterns of the five extra-pulmonary clinical isolates
DNA was extracted from the isolates as described before and subjected to nucleotide sequencing using NGS technology (Illumina GA-IIx). Gross statistics derived from the sequence data are also shown in Table 1. In general, the number of short reads was more than 3 million for each isolate, with an average length of 72 nucleotides. We used the complete genome sequence of M. tuberculosis H37Rv strain (NC_000962.2) as the reference sequence. The short reads were aligned to the reference genome as described in “Methods”. Genome coverage varied from 82% to about 92% for the different isolates, and unaligned reads were between 3 and 16%, suggesting that the DNA preparations from these isolates were of good quality.
Sequence annotation and isolate comparison
The M. tuberculosis H37Rv genome has been reported to encode 3988 protein encoding genes. Nearly 84% of these genes displayed more than 90% coverage based on our alignments in all strains except F99 (54% coverage). Genes with less than 10% coverage and/or read depth below 10 were considered as “missing genes” (see Additional file 1: Table S1). The total number of predicted missing genes in the five isolates varied from 17 (AC544) to 74 (F99). Some of these genes (5 in total) were missing in all five isolates. Since genomic deletion can result in genes missing from an isolate, further analysis was carried out to identify large deletions in the isolates using Pindel, which detects breakpoints of large deletions using paired-end data. Among the missing genes, 62 were predicted by Pindel to be due to genomic deletions. Out of these 62 genomic deletions, 23 and 3 are from F99 and LN8 respectively (see Additional file 1: Table S1). In general, a gene missing in one isolate was found in another. One example is a gene of the LysR family, which activates divergent transcription of linked target genes or unlinked regulons with diverse functions; it was missing in isolate AC74. Yet another example of a missing gene is malonyl CoA-acyl carrier protein transacylase, which was absent in all isolates except isolate AC544. This is an essential gene for the transfer of the malonyl group from coenzyme-A to the acyl carrier protein AcpM, and it is therefore required for biosynthesis of the cell wall in M. tuberculosis. In the case of F99, several trans-membrane protein coding genes, such as Rv2272 and Rv2273, were missing. Although we have considered repeat regions by realigning short reads with multiple matches, a number of prophage proteins (phiRV2) were not mapped. Prophage proteins are quite often present in and around repeat regions and show polymorphisms with respect to their positions among different isolates. For example, two types of prophage proteins are present in M. tuberculosis H37Rv and CDC1551, but absent in Mycobacterium bovis. A total of 68 (38%) of the missing genes are from the PE-PGRS gene family.
Analysis of single nucleotide variations
Mixed genotypes could also arise because patients can be re-infected / super-infected by different strains, or because rapid genomic changes take place in a subset of cells in patients. Recent investigations have suggested that several pulmonary isolates of M. tuberculosis display clonal heterogeneity [28, 29]. A heterogeneous population does offer a selective advantage for survival of the tubercle bacilli within the hostile microenvironment of the host. These alterations may facilitate in vivo dissemination / migration of the tubercle bacilli from the pulmonary infectious foci to other organs, as has been speculated to occur in the infected host [30, 31].
Functional classification and enrichment study of genes having major SNVs
Study of insertion sequence elements (IS elements)
IS elements have been extensively used as markers for strain identification in M. tuberculosis due to their high numerical and positional polymorphism (for a review see [41–44]). IS elements can influence gene expression depending upon the sites of insertion; for example, IS6110 increases the expression of neighboring genes which are involved in virulence. In this study an attempt was made to identify IS elements, particularly IS6110, from the short read sequence data of the isolates and to derive their distribution across the genomes. The strategy used has been described in “Methods”. The results are shown in Additional file 10: Table S9.
The total number of IS elements in the H37Rv strain was found to be 56, including 16 copies of IS6110. Our studies revealed that the number of copies of IS6110 varied from 5 to 19 in different isolates and that their positions differed from those seen in M. tuberculosis H37Rv.
Table 3 Primer sequences and standardized PCR parameters
Columns: Coordinate of IS element (w.r.t. H37Rv) present in strain; Primers flanking the coordinate of IS element; Standardized annealing temp. / MgCl2 concentration; Amplicon size (bp)
Coordinates of IS elements listed in the table:
- 3480373, + strand, in AC74
- 1527976, + strand, in AC544
- 1356494, + strand, in AC74; 1356495, + strand, in F85
- 2550014-2551368, + strand, in H37Rv
- 1541952-1543306, - strand, in H37Rv
- 1657017, + strand, in F85
- 2229658, - strand, in F99
- Multiple copies present in all
SNV based phylogenetic analysis
Our major finding from the analysis of NGS data of extra-pulmonary isolates of M. tuberculosis is the detection of genomic heterogeneity in the isolates. The computational approach we used can identify mixed genotypes even when one of the genotypes is represented at a low level. We have further analyzed the functional significance of the SNVs identified, using different approaches such as COG. NGS data were also used to identify IS elements and their insertion sites among the different isolates. Some of these predictions were validated by experiments. The phylogenetic relationship among the isolates is consistent with their origin.
Mycobacterium strains and genomic DNA isolation
The clinical isolates of M. tuberculosis used in this study have been maintained in the TB immunology laboratory, Department of Biotechnology, All India Institute of Medical Sciences (AIIMS), New Delhi, India. These isolates were obtained from clinical samples derived from patients diagnosed with extra-pulmonary tuberculosis. The isolates were obtained from: (1) cerebrospinal fluid (CSF) of patients clinically diagnosed as cases of tubercular meningitis; (2) fine needle aspirates / biopsies from patients with lymphadenopathy with discharging sinus / abscess formation (Table 1). All samples, after due processing, were inoculated on Lowenstein-Jensen (LJ) slants for primary isolation. Separate single colonies were propagated on LJ media, whereas in the case of confluent growth, the samples were processed and re-inoculated on fresh LJ slants to obtain single colonies. Genomic DNA was extracted from each of the single colony isolates using a standard protocol. DNA was purified using a QIAGEN column and the DNA preparation kit from Illumina, as per the manufacturers’ instructions. For library preparation, the genomic DNA was fragmented in the range of 100 to 800 bases. The resulting fragmented DNA was cleaned up using QIAquick columns (QIAGEN). The size distribution was checked by running aliquots of the samples on Agilent Bioanalyzer 7500 Nano chips. Illumina adapters were ligated to each fragment. Fragments of ~300 bases were separated using gel electrophoresis and sequenced at both ends using an Illumina GAII sequencer. Sequencing depth in these strains varied from 50 to 200x, with an average read length of 72 bases.
Pre-processing and mapping of Short-reads with reference genome
Sequencing error increases towards the end of each read, so trimming of short reads is a vital step that can improve the quality of mapping of short reads to the reference genome and the identification of single nucleotide variations. Trimming of the short reads was done depending on the average Phred quality score per base in each strain. The average size of reads after trimming was 60 nucleotides. Reads containing non-ATGC characters were also filtered out before alignment.
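The trimming and filtering step can be illustrated with a minimal Python sketch. It is not the authors' script: the Phred offset of 33, the mean-quality cut-off of 20 and the function names are all assumptions made for illustration (GA-IIx era files often used an offset of 64). The sketch computes the mean quality at each read position across all reads, cuts every read at the first position where that mean falls below the cut-off, and drops reads that contain characters other than A, C, G and T.

```python
from typing import List, Tuple

PHRED_OFFSET = 33  # assumption: Sanger-style encoding; older Illumina files may use 64


def mean_quality_per_position(quals: List[str], offset: int = PHRED_OFFSET) -> List[float]:
    """Average Phred score at each read position across all reads."""
    length = max(len(q) for q in quals)
    totals = [0] * length
    counts = [0] * length
    for q in quals:
        for i, ch in enumerate(q):
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def trim_length(mean_quals: List[float], min_mean_q: float = 20.0) -> int:
    """Number of leading positions to keep (cut at the first low-quality position)."""
    for i, q in enumerate(mean_quals):
        if q < min_mean_q:
            return i
    return len(mean_quals)


def preprocess(reads: List[Tuple[str, str]], min_mean_q: float = 20.0) -> List[Tuple[str, str]]:
    """Trim (sequence, quality) pairs to a common length and drop non-ACGT reads."""
    if not reads:
        return []
    keep = trim_length(mean_quality_per_position([q for _, q in reads]), min_mean_q)
    cleaned = []
    for seq, qual in reads:
        seq, qual = seq[:keep], qual[:keep]
        if seq and set(seq) <= set("ACGT"):
            cleaned.append((seq, qual))
    return cleaned
```

Trimming all reads of a strain to one length mirrors the per-strain, per-position averaging described above; a per-read sliding-window trimmer would be a different technique.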
M. tuberculosis H37Rv (NC_000962.2) was used as the reference genome for mapping of the short reads. The short reads of each strain were separately mapped to the reference genome using Bowtie version 0.12.7. To make the alignment more stringent, we used two criteria: (a) a maximum of two mismatches was allowed in the seed region of the reads and the maximum sum of mismatch qualities across the alignment was less than 70, and (b) all reads that map at multiple sites in the reference genome were disallowed. The reference genome coverage of the strains ranged from 86 to 94% with a minimum read depth of 10. Short reads were also aligned following less stringent criteria that allow reads to map at multiple positions, which helped in the identification of repeat genes.
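The two alignment modes can be sketched as Bowtie 1 command lines built in Python. The exact command used in the study is not given in the text, so this is an assumption: the stated criteria appear to correspond to the Bowtie 1 options `-n` (maximum mismatches in the seed), `-e` (maximum sum of mismatch qualities) and `-m 1` (suppress reads with more than one reportable alignment). The index and file names are hypothetical.

```python
from typing import List


def bowtie_command(index: str, mates1: str, mates2: str, out_sam: str,
                   unique_only: bool = True) -> List[str]:
    """Build a Bowtie 1 paired-end command line (argument list, not executed here)."""
    cmd = ["bowtie", "-n", "2", "-e", "70"]  # seed mismatches and mismatch-quality sum
    if unique_only:
        cmd += ["-m", "1"]  # stringent mode: discard reads that align at multiple sites
    cmd += ["-S", index, "-1", mates1, "-2", mates2, out_sam]  # SAM output
    return cmd


# Hypothetical file names, for illustration only
stringent = bowtie_command("H37Rv_index", "isolate_1.fastq", "isolate_2.fastq", "isolate.sam")
relaxed = bowtie_command("H37Rv_index", "isolate_1.fastq", "isolate_2.fastq",
                         "isolate_multi.sam", unique_only=False)
```

The argument list could be passed to `subprocess.run` once a Bowtie index for NC_000962.2 has been built; the relaxed variant simply omits `-m`, so reads with several valid positions are still reported.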
Identification of major and minor SNVs
SNVs were identified from the alignment using our in-house Perl scripts and were classified into two classes, major SNVs and minor SNVs. The characteristics of major SNVs are: (a) a consensus base ratio (i.e. the number of nucleotides other than the reference divided by the total number of nucleotides at a position) of more than 0.9, (b) minimum and maximum read depths of 10 and 3 x (average read depth) respectively, (c) support for the consensus base from reads aligning in both forward and reverse directions, and (d) absence of any other variation in a 3 bp window. Similarly, minor SNVs have: (a) a consensus base ratio of less than 0.9, (b) more than 10% of the reads at a position showing the same nucleotide change while the rest of the short reads have the same nucleotide as the reference, and (c) support for the variable base from at least one overlapping paired-end read. Detailed results are listed in Additional file 2: Table S2.
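The classification rules above can be restated as a short Python sketch. This is not the in-house Perl code, which is not shown in the text. The `Site` record, its field names and the two precomputed flags (`variant_nearby` for the 3 bp window test, `overlap_pair_support` for the paired-end test) are hypothetical, and applying the depth bounds only to major SNVs is an assumption based on how the criteria are listed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Site:
    ref: str                            # reference base at this position
    fwd: Dict[str, int]                 # base -> count on forward-strand reads
    rev: Dict[str, int]                 # base -> count on reverse-strand reads
    variant_nearby: bool = False        # another variation inside the 3 bp window
    overlap_pair_support: bool = False  # variant seen in an overlapping read pair


def classify(site: Site, avg_depth: float) -> Optional[str]:
    """Return 'major', 'minor' or None for one aligned position."""
    bases = set(site.fwd) | set(site.rev)
    total = {b: site.fwd.get(b, 0) + site.rev.get(b, 0) for b in bases}
    depth = sum(total.values())
    alts = {b: n for b, n in total.items() if b != site.ref and n > 0}
    if depth == 0 or not alts:
        return None
    ratio = sum(alts.values()) / depth  # consensus base ratio: non-reference / total
    alt = max(alts, key=alts.get)       # most frequent non-reference base
    if ratio > 0.9:
        ok = (10 <= depth <= 3 * avg_depth
              and site.fwd.get(alt, 0) > 0
              and site.rev.get(alt, 0) > 0
              and not site.variant_nearby)
        return "major" if ok else None
    if ratio < 0.9:
        ok = (len(alts) == 1                 # all other reads match the reference
              and alts[alt] / depth > 0.10
              and site.overlap_pair_support)
        return "minor" if ok else None
    return None
```

For example, a position with 98 of 100 reads carrying the same non-reference base on both strands would be called major, while a position where 20 of 100 reads carry one alternative base (with paired-end support) would be called minor.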
Genetic heterogeneity model using simulated short reads
Normally, bacterial genome sequencing is carried out using cells derived from a single colony. If the culture is pure, it is expected that all the members of the population will display identical variations as compared to the reference genome, with a consensus base ratio near 1. On the other hand, if the population contains more than one type of bacteria, then some members will show variations while the rest will be identical to the reference sequence. These are rare variations. The relative level of a rare variant depends on the nature of the population, that is, on the level of mixing of the different genotypes. Here we tried to model the major and minor variations in a population using short read data generated by simulation.
Let the number of reads generated from CDC1551 be “m”. Then 80% of the genome will have approximately 0.8 x m reads. If (0.8 x m)/9 reads from 80% of H37Rv are mixed with these, 80% of the M. tuberculosis genome will be represented by reads from CDC1551 and H37Rv in a 9:1 ratio. Similarly, 15, 3, 2 and 1% of the genome were represented by reads from CDC1551 and H37Rv in 8:2, 7:3, 6:4 and 5:5 ratios respectively. All the reads were then aligned with the reference M. tuberculosis CDC1551 using Bowtie 0.12.7. After the alignment, randomly generated SNVs would show up as major variations, while SNVs unique to H37Rv would show up as rare variants.
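The mixing arithmetic can be worked through in a few lines of Python. This is an illustrative sketch, not the simulation code used in the study: the function names and the example value of m are hypothetical. For a genome segment of fraction f mixed at a CDC1551:H37Rv ratio of a:b, the number of H37Rv reads to add is f x m x b/a, and a site that differs only in H37Rv is then expected to show a consensus base ratio of b/(a + b).

```python
from fractions import Fraction
from typing import List, Tuple

# (percent of genome, CDC1551 parts, H37Rv parts), as listed in the text
SEGMENTS: List[Tuple[int, int, int]] = [
    (80, 9, 1), (15, 8, 2), (3, 7, 3), (2, 6, 4), (1, 5, 5),
]


def h37rv_reads(m: int, percent: int, cdc: int, h37rv: int) -> Fraction:
    """H37Rv reads needed so that the segment holds cdc:h37rv reads."""
    return Fraction(percent, 100) * m * Fraction(h37rv, cdc)


def expected_minor_ratio(cdc: int, h37rv: int) -> Fraction:
    """Expected fraction of reads carrying an H37Rv-specific base."""
    return Fraction(h37rv, cdc + h37rv)


m = 900_000  # hypothetical number of CDC1551 reads
for percent, cdc, h37rv in SEGMENTS:
    n = h37rv_reads(m, percent, cdc, h37rv)
    r = expected_minor_ratio(cdc, h37rv)
    print(f"{percent}% of genome at {cdc}:{h37rv} -> add {float(n):.0f} H37Rv reads, "
          f"expected minor ratio {float(r):.2f}")
```

With these ratios the simulated rare variants are expected to fall at consensus base ratios of 0.1, 0.2, 0.3, 0.4 and 0.5, well below the 0.9 cut-off used for major SNVs.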
The distribution of consensus base ratio in extra-pulmonary isolates and that in simulated data are plotted together (See Figure 5).
Identification of missing genes and insertion elements
Missing genes were identified from the alignment that allowed reads to map at multiple positions, by identifying in each isolate regions with a minimum read depth below 10 and gene coverage of less than 10%. These regions were either missed during sequencing or are deleted in the genome of the isolate concerned. Large deletions in the isolates were identified using Pindel. As the sequencing depth normally achieved is around 100 for the whole genome, there is little chance of missing these regions during sequencing.
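A minimal Python sketch of the missing-gene screen is given below. It is an illustration under stated assumptions, not the authors' code: the per-base depth list, the gene coordinate dictionary and the function names are hypothetical, coverage is taken to mean the percentage of gene positions covered by at least one read, depth is taken as the mean over the gene, and the two thresholds are combined with "or", following the "and/or" wording used in the Results.

```python
from typing import Dict, List, Sequence, Tuple


def gene_stats(depth: Sequence[int], start: int, end: int) -> Tuple[float, float]:
    """Percent of positions with at least one read, and mean depth, for a
    gene given as a 1-based inclusive interval."""
    window = depth[start - 1:end]
    covered = sum(1 for d in window if d > 0)
    return 100.0 * covered / len(window), sum(window) / len(window)


def missing_genes(depth: Sequence[int], genes: Dict[str, Tuple[int, int]],
                  min_cov_pct: float = 10.0, min_depth: float = 10.0) -> List[str]:
    """Names of genes whose coverage or mean depth falls below the thresholds."""
    flagged = []
    for name, (start, end) in genes.items():
        cov_pct, mean_depth = gene_stats(depth, start, end)
        if cov_pct < min_cov_pct or mean_depth < min_depth:
            flagged.append(name)
    return flagged
```

Genes flagged this way are only candidates; as described above, a separate breakpoint-based tool (Pindel) was used to decide which of them reflect genuine genomic deletions.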
The total number of major SNVs identified in the five isolates was 4550; of these, 2989 are present in coding regions and in only one of the isolates (unique SNVs). We used all unique SNV positions to calculate the distance between the isolates using pairwise alignment. A distance-based approach (“Neighbor-Joining”) was used to generate the phylogenetic relationship (see Figure 8). Branch length indicates divergence distance.
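As an illustration of the distance step only, the following Python sketch counts the SNV positions at which two isolates differ. It is an assumption about how such a distance could be computed, not the authors' procedure, and it does not build the Neighbor-Joining tree itself. The input layout (isolate name mapped to a dictionary of position and variant base) and the isolate names in the example are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def snv_distances(snvs: Dict[str, Dict[int, str]]) -> Dict[Tuple[str, str], int]:
    """Pairwise count of SNV positions at which two isolates differ.

    A position absent from an isolate's dictionary is taken to carry the
    reference base.
    """
    distances = {}
    for a, b in combinations(sorted(snvs), 2):
        positions = set(snvs[a]) | set(snvs[b])
        distances[(a, b)] = sum(
            1 for pos in positions if snvs[a].get(pos) != snvs[b].get(pos)
        )
    return distances


# Hypothetical SNV calls for three isolates
example = {"iso1": {10: "A", 20: "G"}, "iso2": {10: "C", 30: "T"}, "iso3": {}}
print(snv_distances(example))
```

The resulting pairwise matrix is the kind of input a Neighbor-Joining implementation expects.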
Validation of IS element prediction
DNA extraction from the isolates was carried out as described before. Briefly, a single colony of M. tuberculosis was picked and suspended in 100 μl of 0.1% Triton X-100. The suspension was boiled in a dry bath at 90°C for 45 min and centrifuged at 10,000 rpm for 10 min. The supernatant was used as template DNA in PCRs.
Primers were designed targeting the regions flanking the predicted co-ordinates of the region wherein IS6110 has been predicted to be present. The insertion of IS element would yield a larger PCR amplicon, compared with its absence at the targeted site. The primers used at various co-ordinates with the standardized PCR parameters have been described in Table 3. Briefly amplifications using primer panels were carried out for 35 cycles with 5 min initial and 1 min cyclic denaturation at 95°C; 45 sec annealing at standardized temperature and 2 min cyclic and 7 min final extension at 72°C. For PCR amplification using IS6110 internal primers (Panel H), 30 cycles of denaturation at 95°C for 30 sec, annealing at 65°C for 30 sec and extension at 72°C for 45 sec was carried out. PCRs were set up with reagents obtained from Fermentas AB, Vilnius, Lithuania, using a thermocycler (Applied Biosystems, USA). The amplicons were analyzed on 1.5% agarose gel against a 100 bp DNA Ladder (Thermo Scientific).
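The logic of the flanking-primer assay can be sketched in Python. The IS6110 length used here is derived from the H37Rv coordinates listed for one copy (2550014-2551368, i.e. 1355 bp); the empty-site amplicon sizes, the tolerance and the function names are hypothetical, and the few extra bases of target-site duplication at an insertion are ignored.

```python
from typing import Optional

IS6110_LENGTH_BP = 2551368 - 2550014 + 1  # 1355 bp, from the H37Rv coordinates


def expected_amplicon_bp(empty_site_bp: int, insertion_present: bool) -> int:
    """Expected product size for primers flanking a predicted insertion site."""
    return empty_site_bp + (IS6110_LENGTH_BP if insertion_present else 0)


def call_insertion(observed_bp: int, empty_site_bp: int,
                   tolerance_bp: int = 100) -> Optional[bool]:
    """True if the band matches the site with IS6110, False if it matches the
    empty site, None if it matches neither within the tolerance."""
    if abs(observed_bp - empty_site_bp) <= tolerance_bp:
        return False
    if abs(observed_bp - expected_amplicon_bp(empty_site_bp, True)) <= tolerance_bp:
        return True
    return None
```

For a hypothetical primer pair giving a 400 bp product on an empty site, a band near 1755 bp would therefore support the predicted insertion. The tolerance reflects how coarsely band sizes can be read from an agarose gel against a 100 bp ladder.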
Spoligotyping of M. tuberculosis
The spoligotyping kit was purchased from Isogen Life Science (De Meern, Netherlands). The kit contained the hybridized membrane for spoligotyping, a mini-blotter, the biotin-labeled primers DRa (5’-GGT TTT GGG TCT GAC GAC-3’) and DRb (5’-CCG AGA GGG GAC GGA AAC-3’), and controls. Primer DRa was biotin labeled; hence the amplified PCR product was biotinylated. Detection was done with the streptavidin-POD (peroxidase) conjugate and the chemiluminescence (ECL) detection system, which was purchased from Roche Applied Science (Mannheim, Germany). The whole procedure was performed according to the manufacturer’s instructions. Detailed procedures were described previously by Kamerbeek et al. 1997. The DNA of the different strains was amplified with primers DRa and DRb, and the amplified products were hybridized with membranes containing the oligonucleotide probes.
Forward (DRa) and reverse (DRb) primers, dNTP, 10X buffer, Taq and DNA template were mixed together and added to 50 μl double-distilled water. PCR reaction was performed using Taq polymerase under the recommended conditions namely 96°C for 3 min, then at 96°C for 1 min, 55°C for 1 min and 72°C for 1 min. This procedure was followed for 30 cycles and final extension was for 10 min at 72°C.
The biotin labeled PCR product was loaded to a mini-blotter for hybridization with the membrane containing the oligo-nucleotide probes. Mini-blotter setup was incubated at 60°C for 60 min. The membrane was then washed at 60°C with 2X SSPE/0.5% SDS for 10 min, followed by incubation with 2X SSPE/0.5% SDS containing 2.5 μl streptavidin-biotin at 42°C for 60 min. Finally, the membrane was washed twice with 2X SSPE/0.5%SDS at 42°C for 60 min and twice with 2X SSPE for 5 min.
Detection of hybrid DNA with ECL
The membrane was incubated with the ECL detection system for 1 min and then covered with a transparent plastic film. The membrane was then placed in a cassette and exposed to X-ray film.
We thank the Department of Biotechnology, Government of India, for financial support, and the Council of Scientific & Industrial Research, India, for a research fellowship to S. Das.
- WHO: Tuberculosis. 2013, Reviewed February 2013: Fact sheet No. 104, http://www.who.int/mediacentre/factsheets/fs104/en/index.html; WHO Media Centre
- Fanning A: Tuberculosis: 6. Extrapulmonary disease. CMAJ: Canadian Medical Association journal = journal de l’Association medicale canadienne. 1999, 160: 1597-1603.
- Golden MP, Vikram HR: Extrapulmonary tuberculosis: an overview. Am Fam Physician. 2005, 72: 1761-8.
- Sharma SK, Mohan A: Extrapulmonary tuberculosis. Indian J Med Res. 2004, 120: 316-53.
- Krishnan N, Robertson BD, Thwaites G: The mechanisms and consequences of the extra-pulmonary dissemination of Mycobacterium tuberculosis. Tuberculosis (Edinb). 2010, 90: 361-366. 10.1016/j.tube.2010.08.005.
- Varma TR: Genital tuberculosis and subsequent fertility. International journal of gynaecology and obstetrics: the official organ of the International Federation of Gynaecology and Obstetrics. 1991, 35: 1-11. 10.1016/0020-7292(91)90056-B.
- Gupta N, Sharma JB, Mittal S, Singh N, Misra R, Kukreja M: Genital tuberculosis in Indian infertility patients. International journal of gynaecology and obstetrics: the official organ of the International Federation of Gynaecology and Obstetrics. 2007, 97: 135-8. 10.1016/j.ijgo.2006.12.018.
- Abebe M, Lakew M, Kidane D, Lakew Z, Kiros K, Harboe M: Female genital tuberculosis in Ethiopia. International journal of gynaecology and obstetrics: the official organ of the International Federation of Gynaecology and Obstetrics. 2004, 84: 241-6. 10.1016/j.ijgo.2003.11.002.
- Brown T, Nikolayevskyy V, Velji P, Drobniewski F: Associations between Mycobacterium tuberculosis strains and phenotypes. Emerg Infect Dis. 2010, 16: 272-80. 10.3201/eid1602.091032.
- Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gordon SV, Eiglmeier K, Gas S, Barry CE, Tekaia F, Badcock K, Basham D, Brown D, Chillingworth T, Connor R, Davies R, Devlin K, Feltwell T, Gentles S, Hamlin N, Holroyd S, Hornsby T, Jagels K, Krogh A, McLean J, Moule S, Murphy L, Oliver K, Osborne J, et al: Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature. 1998, 393: 537-44. 10.1038/31159.
- Vishnoi A, Roy R, Prasad HK, Bhattacharya A: Anchor-based whole genome phylogeny (ABWGP): a tool for inferring evolutionary relationship among closely related microorganisms [corrected]. PLoS One. 2010, 5: e14159. 10.1371/journal.pone.0014159.
- Bharti R, Das R, Sharma P, Katoch K, Bhattacharya A: MTCID: a database of genetic polymorphisms in clinical isolates of Mycobacterium tuberculosis. Tuberculosis (Edinb). 2012, 92: 166-172. 10.1016/j.tube.2011.12.001.
- Das S, Yennamalli RM, Vishnoi A, Gupta P, Bhattacharya A: Single-nucleotide variations associated with Mycobacterium tuberculosis KwaZulu-Natal strains. J Biosci. 2009, 34: 397-404. 10.1007/s12038-009-0046-y.
- Das S, Duggal P, Roy R, Myneedu VP, Behera D, Prasad HK, Bhattacharya A: Identification of Hot and Cold spots in genome of Mycobacterium tuberculosis using Shewhart Control Charts. Sci Rep. 2012, 2: 297.
- Brudey K, Driscoll JR, Rigouts L, Prodinger WM, Gori A, Al-Hajoj SA, Allix C, Aristimuño L, Arora J, Baumanis V, Binder L, Cafrune P, Cataldi A, Cheong S, Diel R, Ellermeier C, Evans JT, Fauville-Dufaux M, Ferdinand S, Garzelli C, Gazzola L, Gomes HM, Guttierez MC, Hawkey PM, Van Helden PD, Kadival GV, Kreiswirth BN, Kremer K, Kubin M, Garcia de Viedma D, et al: Mycobacterium tuberculosis complex genetic diversity: mining the fourth international spoligotyping database (SpolDB4) for classification, population genetics and epidemiology. BMC Microbiol. 2006, 6: 23. 10.1186/1471-2180-6-23.
- Kumar P, Sen MK, Chauhan DS, Katoch VM, Singh S, Prasad HK: Assessment of the N-PCR assay in diagnosis of pleural tuberculosis: detection of M. tuberculosis in pleural fluid and sputum collected in tandem. PLoS One. 2010, 5: e10220. 10.1371/journal.pone.0010220.
- Ye K, Schulz MH, Long Q, Apweiler R, Ning Z: Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics (Oxford, England). 2009, 25: 2865-2871. 10.1093/bioinformatics/btp394.
- Schell MA: Molecular biology of the LysR family of transcriptional regulators. Annu Rev Microbiol. 1993, 47: 597-626. 10.1146/annurev.mi.47.100193.003121.
- Ghadbane H, Brown AK, Kremer L, Besra GS, Fütterer K: Structure of Mycobacterium tuberculosis mtFabD, a malonyl-CoA:acyl carrier protein transacylase (MCAT). Acta Crystallogr Sect F Struct Biol Cryst Commun. 2007, 63 (Pt 10): 831-5.
- Hatfull GF, Bibb LA: Integration and excision of the Mycobacterium tuberculosis prophage-like element, phiRv1. Mol Microbiol. 2002, 45: 1515-1526. 10.1046/j.1365-2958.2002.03130.x.
- Ioerger TR, Feng Y, Ganesula K, Chen X, Dobos KM, Fortune S, Jacobs WR, Mizrahi V, Parish T, Rubin E, Sassetti C, Sacchettini JC: Variation among genome sequences of H37Rv strains of Mycobacterium tuberculosis from multiple laboratories. J Bacteriol. 2010, 192: 3645-53. 10.1128/JB.00166-10.
- Henry I, Sharp PM: Predicting gene expression level from codon usage bias. Mol Biol Evol. 2007, 24: 10-2.
- Andersson GE, Sharp PM: Codon usage in the Mycobacterium tuberculosis complex. Microbiology. 1996, 142: 915-925. 10.1099/00221287-142-4-915.
- Midha M, Prasad NK, Vindal V: MycoRRdb: a database of computationally identified regulatory regions within intergenic sequences in mycobacterial genomes. PLoS One. 2012, 7: e36094. 10.1371/journal.pone.0036094.
- Vindal V, Ashwantha Kumar E, Ranjan A: Identification of operator sites within the upstream region of the putative mce2R gene from mycobacteria. FEBS Lett. 2008, 582: 1117-1122. 10.1016/j.febslet.2008.02.074.
- Kendall SL, Burgess P, Balhana R, Withers M, Ten Bokum A, Lott JS, Gao C, Uhia-Castro I, Stoker NG: Cholesterol utilization in mycobacteria is controlled by two TetR-type transcriptional regulators: kstR and kstR2. Microbiology. 2010, 156: 1362-1371. 10.1099/mic.0.034538-0.
- Festa RA, Jones MB, Butler-Wu S, Sinsimer D, Gerads R, Bishai WR, Peterson SN, Darwin KH: A novel copper-responsive regulon in Mycobacterium tuberculosis. Mol Microbiol. 2011, 79: 133-48. 10.1111/j.1365-2958.2010.07431.x.
- Warren RM, Victor TC, Streicher EM, Richardson M, Beyers N, Gey van Pittius NC, Van Helden PD: Patients with active tuberculosis often have different strains in the same sputum specimen. Am J Respir Crit Care Med. 2004, 169: 610-614. 10.1164/rccm.200305-714OC.
- Shamputa IC, Rigouts L, Eyongeta LA, El Aila NA, Van Deun A, Salim AH, Willery E, Locht C, Supply P, Portaels F: Genotypic and phenotypic heterogeneity among Mycobacterium tuberculosis isolates from pulmonary tuberculosis patients. J Clin Microbiol. 2004, 42: 5528-36. 10.1128/JCM.42.12.5528-5536.2004.
- Shaheen R, Subhan F, Tahir F: Epidemiology of genital tuberculosis in infertile population. JPMA The Journal of the Pakistan Medical Association. 2006, 56: 306-9.
- MacLean A: Dewhurst’s textbook of Obstetrics and Gynaecology For Postgraduates. Dewhurst’s textbook of Obstetrics and Gynaecology For Postgraduates. 1995, London: Blackwell Science Ltd, 562-567. 5
- Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV: The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001, 29: 22-8. 10.1093/nar/29.1.22.
- Doerks T, Van Noort V, Minguez P, Bork P: Annotation of the M. tuberculosis hypothetical orfeome: adding functional information to more than half of the uncharacterized proteins. PLoS One. 2012, 7: e34302. 10.1371/journal.pone.0034302.
- Mukhopadhyay S, Balaji KN: The PE and PPE proteins of Mycobacterium tuberculosis. Tuberculosis (Edinb). 2011, 91: 441-447. 10.1016/j.tube.2011.04.004.
- Akhter Y, Ehebauer MT, Mukhopadhyay S, Hasnain SE: The PE/PPE multigene family codes for virulence factors and is a possible source of mycobacterial antigenic variation: perhaps more?. Biochimie. 2012, 94: 110-6. 10.1016/j.biochi.2011.09.026.
- Li Y, Miltner E, Wu M, Petrofsky M, Bermudez LE: A Mycobacterium avium PPE gene is associated with the ability of the bacterium to grow in macrophages and virulence in mice. Cell Microbiol. 2005, 7: 539-48.
- Brunham RC, Plummer FA, Stephens RS: Bacterial antigenic variation, host immune response, and pathogen-host coevolution. Infect Immun. 1993, 61: 2273-6.
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005, 102: 15545-50. 10.1073/pnas.0506580102.
- Huang DW, Sherman BT, Lempicki RA: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009, 37: 1-13. 10.1093/nar/gkn923.
- Huang DW, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009, 4: 44-57.
- Eisenach KD: Use of an insertion sequence for laboratory diagnosis and epidemiologic studies of tuberculosis. Ann Emerg Med. 1994, 24: 450-3. 10.1016/S0196-0644(94)70182-2.
- Gunisha P, Madhavan HN, Jayanthi U, Therese KL: Polymerase chain reaction using IS6110 primer to detect Mycobacterium tuberculosis in clinical samples. Indian J Pathol Microbiol. 2001, 44: 97-102.
- Sankar S, Ramamurthy M, Nandagopal B, Sridharan G: An appraisal of PCR-based technology in the detection of Mycobacterium tuberculosis. Mol Diagn Ther. 2011, 15: 1-11. 10.1007/BF03257188.
- Warren RM, Van Helden PD, Gey van Pittius NC: Insertion element IS6110-based restriction fragment length polymorphism genotyping of Mycobacterium tuberculosis. Methods in molecular biology (Clifton, NJ). 2009, 465: 353-370. 10.1007/978-1-59745-207-6_24.
- Safi H, Barnes PF, Lakey DL, Shams H, Samten B, Vankayalapati R, Howard ST: IS6110 functions as a mobile, monocyte-activated promoter in Mycobacterium tuberculosis. Mol Microbiol. 2004, 52: 999-1012. 10.1111/j.1365-2958.2004.04037.x.
- Qu W, Hashimoto S-I, Morishita S: Efficient frequency-based de novo short-read clustering for error trimming in next-generation sequencing. Genome Res. 2009, 19: 1309-15. 10.1101/gr.089151.108.
- Langmead B: Aligning short sequencing reads with Bowtie. Current protocols in bioinformatics. 2010, 32: 11.7.1-11.7.14.
- Li H, Ruan J, Durbin R: Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 2008, 18: 1851-8. 10.1101/gr.078212.108.
- Kent WJ: BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002, 12: 656-664.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.