The main purpose of the present study was to analyze the changes in gene content of the B. pertussis population in the Netherlands and five other countries between 1949 and 2008. This analysis was conducted using microarray-based CGH in order to gain insight into the dynamics of the B. pertussis population. Microarray-based CGH has been widely used to assess genome variability among bacterial species or closely related bacteria. Given that sequencing of strains on a large scale is still time-consuming and laborious, CGH may resolve the problem to some extent by exploiting the available genome sequence information. This method can supply additional information about genome composition and also provides the opportunity to analyze many more unsequenced strains on a genomic scale. Even though this technique has several limitations, we believe that the use of CGH technology, combined with confirmatory PCRs, allows sufficient assessment of the genetic diversity and gene content among B. pertussis strains. The high-throughput capability of microarrays enabled us to analyze more than 170 strains. One limitation of microarray-based analysis is that only genes represented on the microarray can be detected. In this study, we used a microarray based on the sequence of the B. pertussis Tohama I strain, supplemented with all of the extra genes found in Bpp strain 12822 and Bb strain RB50. The Tohama I strain was shown not to be a good representative of B. pertussis strains. Full genome DNA sequencing of three other B. pertussis strains by Bouchez et al. showed that these strains contained genes not found in the Tohama I strain, and that these novel genes originated from the Bpp or Bb species. Since B. pertussis does not appear to acquire any new genetic material [1, 42, 49], we expect that if novel genes are present in the B. pertussis strains analyzed in this study, they are likely to originate from Bpp or Bb species.
By using a Bordetella pan-microarray, most of the genes present in the B. pertussis strains will be covered. Future DNA sequencing of additional B. pertussis strains will demonstrate whether this assumption is justified. Indeed, in this study the analysis of 171 B. pertussis strains identified 12 genes that are new to the B. pertussis species and originate from the B. parapertussis/B. bronchiseptica species.
CGH analysis of a large series of B. pertussis strains, isolated in six different countries, has resulted in an estimate of the core genome composition of B. pertussis. This analysis has also indicated the degree and nature of the genome flexibility between strains. A total of 3,282 genes were identified as belonging to the core genome. These genes were found in every strain analyzed to date (this study and [42–45]), and appear to be essential to the lifestyle of this bacterium. The accessory (variably present) portion (589 genes) of the B. pertussis genome corresponds to about 15% of the pan-B. pertussis genome. The genes lost, which are assumed not to be essential for B. pertussis survival in the human host, are enriched for genes involved in transport and binding, hypothetical genes and genes encoding unclassified functions. Accessory genes were confined to certain regions (RDs) in the chromosome, mostly flanked by an IS element (ISE) on at least one side of the RD. The IS481 element, which is abundantly present (238 copies) in the genome of the B. pertussis Tohama I strain, may be involved in the process of gene loss by facilitating homologous recombination between these perfect repeats within the genome. Previous studies have shown that repeats can promote gene deletion [50, 51], or can result in large-scale chromosomal rearrangements leading to disruption of the ancestral gene order.
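The partition into core and accessory genes described above can be illustrated with a minimal Python sketch. It assumes that CGH calls have already been reduced to a set of detected genes per strain; the strain and gene names are hypothetical toy data, and the function name `split_core_accessory` is introduced here for illustration only and is not part of the analysis pipeline used in this study. The final lines reproduce the accessory fraction from the gene counts reported in the text.

```python
from typing import Dict, Set, Tuple


def split_core_accessory(presence: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split genes into core (present in every strain) and accessory (the rest).

    `presence` maps a strain name to the set of genes detected in that strain.
    """
    if not presence:
        raise ValueError("at least one strain is required")
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)          # genes seen in at least one strain
    core = set.intersection(*gene_sets)    # genes seen in every strain
    return core, pan - core


# Toy data: hypothetical strain and gene names, not taken from the study.
toy = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g3"},
    "strainC": {"g1", "g2", "g4"},
}
core, accessory = split_core_accessory(toy)
print(sorted(core), sorted(accessory))

# Accessory fraction computed from the counts reported in the text.
core_n, accessory_n = 3282, 589
accessory_fraction = accessory_n / (core_n + accessory_n)
print(f"{accessory_fraction:.1%}")  # about 15% of the pan-genome
```

With the reported counts, the pan-genome comprises 3,282 + 589 = 3,871 genes, so the accessory portion is 589/3,871, consistent with the value of about 15% given above.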
Three main forces have been found to shape genome evolution: gene gain, gene loss and gene change. Gene loss seems to be an important event in the genomic evolution of the B. pertussis species. B. pertussis probably evolved from a B. bronchiseptica-like ancestor by large-scale gene loss. Much of this loss is most likely due to ISE-mediated deletion events and/or ISE-mediated rearrangements, which reshaped the genome, presumably to benefit from increased virulence expression. In the evolution from B. bronchiseptica to B. pertussis, the genes that were lost or inactivated are generally those involved in membrane transport, small molecule metabolism, regulation of gene expression and synthesis of surface structures. It seems that gene loss began during the evolution from B. bronchiseptica and is continuing during the evolution of B. pertussis. This conclusion is supported by the results presented in this study, which illustrate a progressive decrease in genome size over time in strains isolated in different countries. Moreover, the genes that were lost have functions similar to those of the genes lost in the evolution from B. bronchiseptica to B. pertussis (see above). Gene loss in B. pertussis has been previously reported for Finnish and French B. pertussis strains. However, to our knowledge, such rapid gene loss over a period of 60 years has not yet been described. The loss of genetic material is a dynamic, ongoing process that is not specific to one country, but is observed in several areas of the world. In other examples of bacterial gene loss, the process has been described as a progressive purging of unnecessary genes from the genome. Bacteria appear to prefer gene deletions, which could account for a general drive to lose DNA. It is generally assumed that the deletion with the least negative fitness effect on the bacterium will be selected.
However, gene loss can also benefit the pathogen, because some gene products are detrimental to a pathogenic lifestyle. For example, loss of the cadA gene from Shigella has been correlated with an increase in Shigella pathogenicity. Additionally, the preferential loss of bacterial cell surface determinants has been shown to result in an increase in virulence by reducing the number of targets that could be recognized by the human immune system.
An intriguing question is what other factors influence the size and content of bacterial genomes. The introduction of vaccination against pertussis has been associated with changes in the B. pertussis population [12, 36]. We therefore investigated whether several decades of pertussis vaccination could have influenced the size and genomic content of the B. pertussis population. To this end, B. pertussis strains from countries with different pertussis vaccination histories were analyzed. Our collection of B. pertussis isolates was divided into time periods delimited by significant events observed in the epidemiology of pertussis. In order to gain more insight into how the B. pertussis strains are related, we used the genomic content overview of the B. pertussis isolates to construct a phylogenetic structure. Recently, this method was used in a similar manner to decipher the microevolution of V. parahaemolyticus.
Clustering based on genomic content showed that B. pertussis strains isolated in different countries are mostly similar and did not reveal an association of specific strains with a particular geographic region. However, an analysis of the gene content of B. pertussis strains over time demonstrated a gradual change in gene content over a period of 60 years (Figure 5 and Table 3). In the Netherlands, strains that were found approximately 60 years ago can usually no longer be isolated today. In the last 15 years, strains with a different gene content (GC-type 2) have emerged in several countries. The increase of these strains has most likely caused or influenced the resurgence of pertussis in many countries in the last decade [10–12]. In general, strains have changed in the Netherlands, Sweden and Japan, somewhat irrespective of any difference in vaccination history. It is remarkable that the gene content of strains isolated during certain timeframes is similar for all countries, even in the countries with low vaccination coverage. These results suggest that the differences in vaccination history have little or no influence on this process.
On the other hand, in the Dutch B. pertussis population we have seen changes that seem to be influenced by alterations in herd immunity. For example, strains of GC-types 31-36, similar to vaccine strain 509, appear to have been completely removed from the population following the commencement of vaccination. In contrast, strains with gene content similar to that of the other vaccine strain, 134 (e.g. GC-type 1), have persisted until at least 2008. Additionally, shortly after a change in the vaccine dose in the Netherlands, we detected the expansion of a clone of B. pertussis strains, characterized by the absence of BP1698 and BP2167-80, and identified as GC-types 4 and 9. This clone, also characterized by MLVA types 130, 132, 134 and 137, was able to expand between 1978 and 1988, leading to a pertussis epidemic between 1983 and 1988. However, this clone disappeared again following restoration of the vaccine dose to the original level. Herd immunity was possibly lost because of a decline in vaccine-derived immunity, which was a result of a change in the vaccine composition. This deviation appears to have selected for a change in the composition of circulating B. pertussis strains. In 2007-2008, we found the expansion of another clone of B. pertussis strains with an additional deletion (e.g. BPP0822-27). It is not yet known what factor induced the expansion of these strains, although the introduction of the ACV in 2005 in the Netherlands may be involved. However, we detected a single strain with the same gene content for the first time in 1996. In Sweden, where vaccination was suspended for a longer period, we did not see the rise of new B. pertussis clones that were involved in epidemics. This may be due to the low number of isolates tested. During the last 15 years, the expansion in multiple countries of B. pertussis strains that carry the ptxP3 allele [18, 45] may also have been influenced by intensive pertussis vaccination for half a century.
Since pertussis has shifted to older age groups [11, 56] in immunized populations during the last decade, it has been suggested that B. pertussis has adapted to a host population with waning immunity, in order to maintain the bacterial reservoir among older hosts [18, 57]. Mooi et al. proposed that strains surviving better in these hosts are being selected for by vaccination of young infants. Recently, it was shown that the rise of strains carrying the ptxP3 allele, which are also characterized by a particular gene content (GC-type 2 and its derivatives) and seem to have a benefit in older hosts, has contributed to the resurgence of pertussis in the Netherlands. Thus, the size and gene content of the B. pertussis genome appear to be influenced by vaccination, via herd immunity. Immune pressure may select for certain strains with a particular advantage, which may be linked to specific gene content. The fact that we see the same or similar strains in different countries during certain time periods suggests that an important advantage for these strains may be their capability to spread throughout the immunized population. Importantly, strains with the same GC-type were found to possess different MLST types, which suggests that they are not a single clone.
In our previous study we found that all Dutch strains carrying the ptxP3 allele were characterized by the absence of the BP1948-66 gene cluster. In this study, we confirmed the similarity of the gene content of strains carrying the ptxP3 allele isolated in different countries. Moreover, we found that the ptxP allele is a good marker for strains that have similar gene content, since the ptxP2 and ptxP6 strains are also associated with the absence or presence of specific gene clusters (Figure 6). This indicates that strains with different ptxP types have different genetic backgrounds and form different lineages. Thus, these strains do not differ in only one point mutation, and specific properties of these strains can therefore also be related to this typical gene content. The ptxP1 strains are the most diverse in gene content, although most gene clusters are typically lost in one strain or a specific group of strains. The ptxP1 strains missing BP0910-33, BP1135-41, BPP0511-2 and BPP2338-47 have the most single-locus and double-locus variants, indicating that this is the older type. Strains with these characteristics are found in at least five different countries. The minimum spanning tree clearly showed that the ptxP3 clone emerged from the ptxP1 strains.
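The reasoning that the type with the most single-locus and double-locus variants is likely the older type can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to build the minimum spanning tree in this study: each gene content type is represented as a presence/absence profile over a fixed list of gene clusters, variants are counted by Hamming distance, and all type names, profiles and function names are hypothetical.

```python
from typing import Dict, Tuple

Profile = Tuple[int, ...]  # 1 = gene cluster present, 0 = absent


def hamming(a: Profile, b: Profile) -> int:
    """Number of gene clusters at which two profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same gene clusters")
    return sum(x != y for x, y in zip(a, b))


def variant_counts(profiles: Dict[str, Profile]) -> Dict[str, Tuple[int, int]]:
    """For each type, count its single-locus and double-locus variants."""
    counts = {}
    for name, profile in profiles.items():
        distances = [
            hamming(profile, other)
            for other_name, other in profiles.items()
            if other_name != name
        ]
        counts[name] = (distances.count(1), distances.count(2))
    return counts


# Toy profiles over four hypothetical gene clusters.
toy = {
    "typeA": (1, 1, 1, 1),
    "typeB": (0, 1, 1, 1),
    "typeC": (1, 0, 1, 1),
    "typeD": (0, 0, 1, 1),
    "typeE": (1, 1, 1, 0),
}
counts = variant_counts(toy)

# Rank by single-locus variants first, then double-locus variants.
likely_founder = max(counts, key=lambda name: counts[name])
print(likely_founder, counts[likely_founder])
```

In this toy example the type with the largest number of close variants is reported as the candidate older type. With real data, ties and unequal sampling of types would need to be taken into account before drawing such a conclusion.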