PorSignDB: A gene set collection characterizing a compendium of in vivo transcriptomic profiles
We first created PorSignDB, a collection of porcine gene signatures, using a systematic approach previously developed for inference of the immunologic gene signature collection ImmuneSigDB [13]. Specifically, we compiled a large gene expression compendium curated from 65 studies including 1069 unique samples. A total of 256 annotated gene sets were derived from 128 pairwise comparisons identifying genes induced and repressed in one phenotype versus another, annotated as ‘UP’ (PHENOTYPE1_VS_PHENOTYPE2_UP) and ‘DOWN’ (PHENOTYPE1_VS_PHENOTYPE2_DN) gene sets, respectively (Fig. 1a). To illustrate this, an example is given for a study comparing lymph nodes of pigs experimentally infected with Salmonella enterica Typhimurium versus those of uninfected pigs [14]. Upregulated genes (UP gene set) are highly expressed in the Salmonella-infected phenotype, while downregulated genes (DN gene set) are highly expressed in the uninfected phenotype (Fig. 1b). Gene Ontology (GO) biological process enrichment was performed for every gene set, providing an overview of the biological information captured in this signature database (Additional file 1). Gene set pairs where neither UP nor DN yielded a single significant GO term enrichment hit (Benjamini-Hochberg corrected p-value < 0.05) were discarded in order to retain only biologically meaningful gene sets.
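The filtering rule for gene set pairs can be sketched in Python as follows. This is a minimal illustration rather than the pipeline actually used: the function names `bh_adjust` and `keep_pair` are hypothetical, and the sketch assumes that the Benjamini-Hochberg correction is applied separately to the GO term p-values of each gene set.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def keep_pair(up_go_pvalues, dn_go_pvalues, alpha=0.05):
    """Keep a UP/DN pair unless neither set has a significant GO term.

    Assumption: each list holds the raw GO enrichment p-values of one
    gene set, corrected independently of the other set.
    """
    up_hit = any(p < alpha for p in bh_adjust(up_go_pvalues))
    dn_hit = any(p < alpha for p in bh_adjust(dn_go_pvalues))
    return up_hit or dn_hit
```

With this rule, a pair is retained as soon as either its UP or its DN set has at least one GO term with an adjusted p-value below 0.05.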
This approach has a number of advantages over ImmuneSigDB. First of all, ImmuneSigDB mainly covers in vitro samples. For PorSignDB, however, samples were predominantly derived from real-life patients or laboratory animals (900 in vivo and 157 primary ex vivo specimens out of a total of 1069). As a consequence, it constitutes a more natural description of the biological processes going on in real-life situations. In addition, while ImmuneSigDB only describes immune cell transcriptomics, the scope of PorSignDB is much wider because its samples were derived from a multitude of different tissues (Fig. 1c). Together, they describe host responses in an entire range of biological themes, with a major part stemming from studies on microbiology, gastroenterology and the cardiovascular system (Fig. 1d).
Of note, porcine genes and individual probes were mapped to Homo sapiens ortholog genes. Because many transcriptional programs are evolutionarily conserved, cross-species gene expression analysis can be applied successfully [15, 16]. Moreover, molecular signature databases are often human-oriented, and the porcine-to-human adaptation of PorSignDB thus facilitates its application to genomic expression data of any species.
To demonstrate the validity of the information contained in the PorSignDB gene sets, we examined a study in which healthy human lungs were exposed to either lipopolysaccharide (LPS) or saline infusion in vivo [17]. In this particular study, alveolar macrophages were obtained through bronchoalveolar lavage and their transcriptomes were profiled by microarray. We compared transcriptomic profiles of LPS-exposed macrophages with those of saline-exposed macrophages, and tested signatures from PorSignDB for their enrichment (induced or repressed) using Gene Set Enrichment Analysis (GSEA). Interestingly, PorSignDB also contains pairwise signatures of LPS-stimulated macrophages VS unstimulated macrophages, e.g. 2H_VS_0H_LPS_STIMULATION_BONE-MORROW_DERIVED_MACROPHAGES. Indeed, PorSignDB’s gene signatures of LPS-stimulated macrophages were highly induced (Fig. 1e, UP gene sets), while the pairwise gene signatures of unstimulated macrophages were repressed (Fig. 1e, DN gene sets). This shows that PorSignDB signatures can be reproduced in comparable human datasets.
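For readers less familiar with GSEA, the statistic behind calling a gene set ‘induced’ or ‘repressed’ can be sketched as a weighted running sum over a ranked gene list. The code below is a simplified illustration and not the GSEA software itself: it assumes that genes are already ranked by a differential expression score (for example LPS versus saline), uses a weight of 1, and omits the permutation-based significance and FDR estimation. The function name and the toy gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene identifiers sorted from most induced to most repressed.
    scores: ranking metric for each gene, in the same order.
    gene_set: collection of gene identifiers (e.g. an UP or DN signature).
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # score is undefined; report no enrichment
    running = 0.0
    best = 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss  # step down on a miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best
```

A positive score means that the gene set members cluster at the top of the ranked list (induced), while a negative score means that they cluster at the bottom (repressed).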
Next, we hypothesized that PorSignDB can be useful because it can label samples with the tissue-specific host responses that they resemble. In this way, its signatures may provide new insight into genomic data. As an example, we examined an RNA-seq dataset of a mouse myocardial infarction model. In this study, interferon regulatory factor 3 (IRF3) knockout mice (IRF3−/−) showed improved cardiac function and limited heart failure post myocardial infarction [18]. When comparing the myocardial transcriptomes of wild type (wt) with cardioprotective IRF3−/− knockout mice in GSEA, PorSignDB’s myocardial infarction tissue signatures were induced (Fig. 1f, UP), while non-infarcted healthy control heart tissue signatures were suppressed (Fig. 1f, DN). In other words, wt myocardial tissue was labeled as ‘infarcted’, while IRF3−/− knockout heart tissue was identified as ‘healthy control’, corroborating their respective phenotypes. The PorSignDB myocardial infarction signatures thus provide additional evidence of IRF3 as a driver of heart failure in response to myocardial infarction. This example demonstrates that PorSignDB can be applied to any mRNA sequencing platform, and is therefore not limited to the original Affymetrix porcine system microarray from which the gene sets were derived.
Finally, the presence of multiple “viral” and “bacterial” gene signatures in PorSignDB prompted the question whether these signatures are heterogeneous, or whether they represent a single similar “infection” readout. In order to investigate this, we calculated gene overlap between bacterial and viral gene signatures (Additional file 2). This analysis shows that only minor overlap exists. This argues that the majority of viral and bacterial-related signatures represent unique readouts of host responses. Similarly, the presence of Salmonella Typhimurium and Salmonella Choleraesuis gene sets raised the question of to what extent these molecular signatures share the same information. However, gene overlap through hypergeometric test did not yield any significant hit (Benjamini-Hochberg corrected p-value < 0.05) (Additional file 3), indicating that there is little redundancy between the Salmonella Typhimurium and Choleraesuis gene sets.
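The overlap test described above can be illustrated with a short Python sketch of a one-sided hypergeometric test. This is an assumption-laden example, not the analysis code behind Additional files 2 and 3: the function name `overlap_pvalue` is hypothetical, and the gene universe size has to be supplied by the user (for instance the number of genes measurable on the platform).

```python
from math import comb


def overlap_pvalue(set_a, set_b, universe_size):
    """P(overlap >= observed) when set_b is drawn at random from the universe.

    set_a, set_b: Python sets of gene identifiers.
    universe_size: total number of genes that could have been selected
    (an assumption to be chosen by the analyst).
    """
    k = len(set_a & set_b)  # observed overlap
    size_a, size_b = len(set_a), len(set_b)
    if max(size_a, size_b) > universe_size:
        raise ValueError("gene sets cannot be larger than the universe")
    total = comb(universe_size, size_b)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(size_a, i) * comb(universe_size - size_a, size_b - i)
        for i in range(k, min(size_a, size_b) + 1)
    )
    return tail / total
```

The resulting p-values for all pairs of gene sets would then still need a multiple testing correction, such as the Benjamini-Hochberg procedure mentioned in the text.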
The PorSignDB gene signatures are available as an online resource (http://www.vetvirology.ugent.be/PorSignDB/; Additional files 4 and 5) and can be used by systems biologists to deconvolute cellular circuitry in health and disease. As proof of concept, we employed this gene signature collection describing host responses in a wide variety of tissues to generate new insights in the multisystemic disease associated with PCV2.
PorSignDB reveals diametrically opposed physiological states in vivo in subclinical PCV2 and PMWS
We then leveraged PorSignDB to analyze a field study of pigs naturally affected with PMWS [11]. To compare transcriptomic profiles of PMWS lymph nodes with PCV2-positive but otherwise healthy lymph nodes, we tested signatures from PorSignDB for their enrichment (induced or repressed) in both classes using GSEA (Fig. 2a). We primarily focused on gene sets pertaining to microbiology. For robustness, we only retained signatures from pairwise comparisons when both upregulated (PHENOTYPE1_VS_PHENOTYPE2_UP) and downregulated (PHENOTYPE1_VS_PHENOTYPE2_DN) genes were significantly enriched (false discovery rate, FDR < 0.01). For example, UP genes in splenic tissue of “Streptococcus suis-infected pigs VS control pigs” are induced (Fig. 2b, left heatmap first row), while DN genes are suppressed (Fig. 2b, right heatmap first row).
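The retention criterion can be written down as a small filter. The sketch below is only an illustration: the function name, the input layout and the comparison names are hypothetical, and the FDR values would come from the GSEA output.

```python
def retain_pairs(gsea_fdr, threshold=0.01):
    """Return the comparisons whose UP and DN sets both pass the FDR cutoff.

    gsea_fdr maps a comparison name to a dict with the GSEA FDR of its
    UP and DN gene set, e.g. {"COMPARISON_A": {"UP": 0.001, "DN": 0.004}}.
    """
    return sorted(
        name
        for name, fdr in gsea_fdr.items()
        if fdr["UP"] < threshold and fdr["DN"] < threshold
    )
```

A comparison with only one significant gene set is dropped, which guards against signatures that are supported by a single direction of change.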
Overall, this analysis reveals that upregulated genes in “microbial challenge VS control” are induced while downregulated genes are suppressed. In other words, PMWS lymph nodes display transcriptomic reprogramming consistent with tissue responses to infectious agents. This observation is supported by previous findings that naturally occurring PMWS presents with concurrent infections [7]. Strikingly, two genomic infection signatures do not follow this pattern. First, the opposite behavior of the gene signature from Salmonella Typhimurium 21 days post inoculation (dpi) suggests that the Salmonella infection has already been cleared at this time point. This is indeed the case: at 21 dpi the bacterial load in these mesenteric lymph nodes was reduced to undetectable levels [19]. In contrast, S. Choleraesuis infection was sustained at 21 dpi, coinciding with persistently high bacterial abundance in mesenteric lymph nodes. Intriguingly, the second deviating gene signature originates from pigs that were subclinically infected with PCV2 (Fig. 2a, arrow). Unlike S. Typhimurium, this cannot be explained by pathogen clearance since these experimentally PCV2-infected pigs remained viremic throughout the original study [12]. Instead, pathogen-distressed host responses appear here to be repressed in lymph nodes with low-level subclinical PCV2 replication. Hence, highly expressed genes in “subclinical PCV2-infected VS uninfected” lymph nodes are suppressed, while lowly expressed genes are induced. Finally, the gene sets PMWS_VS_HEALTHY_UP and PMWS_VS_HEALTHY_DN serve as a positive control since they were derived from the data that were queried in this instance. PorSignDB signatures from other biological themes may provide additional clues into the alterations in lymph nodes that are subject to PMWS and could be explored further (Additional file 6, see also discussion).
Interestingly, the GO analysis of PorSignDB gene sets reveals that the subclinical PCV2 infection signature at 29 dpi (UP) constitutes a transcriptional program implicated in cell cycle progression (Additional file 1, gene set 33). On the other hand, the uninfected pairwise signature (DN) summarizes myeloid leukocyte activation implicated in the immune response (Additional file 1, gene set 34). In other words, this analysis suggests that upon PCV2 subclinical infection, cell cycle progression is promoted, while myeloid leukocyte immune responses are suppressed. To confirm these findings, these gene sets were interrogated in lymph nodes of pigs of the same study, but at other time points [12]. Intriguingly, the onset of both the induction of UP (GO enrichment: “Cell cycle progression”) and the suppression of DN (GO: “Myeloid leukocyte activation”) was immediate, robust, and persisted throughout all time points (all FDRs < 0.001; Fig. 2c). It should be noted that the gene signatures were derived from the 29 dpi time point, which thus serves as a positive control. We recall from Fig. 2b that this runs counter to PMWS patients, where UP is repressed and DN is induced (both FDRs < 0.001).
From these data, it can be concluded that subclinical PCV2 infection simulates pathogen-free tissue, upregulates cell cycle regulator genes and represses myeloid leukocyte activation genes implicated in the immune response. Moreover, these biological processes are reversed in PMWS patients, where cell cycle genes are suppressed and myeloid cell activation is induced.
A myeloid leukocyte mediated immune response signature predicts clinical outcome of PCV2
In an experimental setting, PCV2 alone does not lead to clinical signs. Additional superinfections or vaccination challenges are needed to produce PMWS [8]. Why extraneous immunostimulations trigger PMWS, however, remains poorly understood. A systems-level dissection of PCV2-affected lymphoid tissue may provide an explanation to this conundrum because it can determine which transcripts characterize PMWS, unbiased by previous knowledge. To this end, the PMWS field study data were divided over a training and a validation cohort, and 173 biomarker genes were selected from the training set using leave-one-out cross-validation (Fig. 3a, Additional file 7). Together, they reveal a molecular portrait of PCV2-associated lymphoid lesions. This ‘PCV2 disease signature’ is greatly induced in the validation cohort as shown by GSEA, meaning upregulation of PMWS marker genes and downregulation of Healthy marker genes (Fig. 3b). Interestingly, in mediastinal lymph nodes with subclinical PCV2 at 29 dpi, the disease signature is dramatically repressed when compared to lymph nodes of non-infected counterparts. This shows once more that in subclinical PCV2 the transcriptomic recalibration that goes hand in hand with PMWS is suppressed. To illustrate the fidelity of the PCV2 disease signature, individual samples were classified as either PMWS or healthy with the Nearest Template Prediction algorithm [20]. All samples of the validation set were correctly assigned (FDR < 0.05; Fig. 3c). Furthermore, all piglets from the experimental study, either PCV2 free or with subclinical PCV2, were correctly classified as Healthy. Only one sample failed to meet the < 0.05 FDR threshold (Fig. 3d). In addition, a Gene Ontology overrepresentation test indicated that the PMWS biomarker genes represent inflammatory responses and myeloid leukocyte immune activation (Additional file 8, Figure A).
Of note, this gene signature performs better than an RNMI-based signature (Additional file 8, Figure B-C), which is more suited for small sample sizes and was therefore applied for generating PorSignDB.
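The core idea of classifying a sample with Nearest Template Prediction can be sketched as follows. This is a simplified illustration, not the published implementation [20]: it assumes that expression values are already standardized per gene, it only computes the cosine distance of a sample to one binary template per class, and it leaves out the resampling step that yields the prediction p-value and FDR. The function names and the toy gene labels are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # no information: treat as orthogonal
    return 1.0 - dot / (norm_u * norm_v)


def nearest_template(sample, marker_classes):
    """Assign a sample to the class with the closest template.

    sample: dict mapping gene -> standardized expression value.
    marker_classes: dict mapping marker gene -> class label
    (for example "PMWS" or "Healthy").
    """
    genes = [g for g in marker_classes if g in sample]
    expression = [sample[g] for g in genes]
    distances = {}
    for label in sorted(set(marker_classes.values())):
        # Template: 1 for markers of this class, 0 for all other markers.
        template = [1.0 if marker_classes[g] == label else 0.0 for g in genes]
        distances[label] = cosine_distance(expression, template)
    predicted = min(distances, key=distances.get)
    return predicted, distances
```

A sample in which the PMWS marker genes are high and the Healthy marker genes are low lies closest to the PMWS template and is labeled accordingly.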
Interestingly, when probing the kinetics of the PCV2 disease signature in lymph nodes of pigs experimentally infected with PCV2, S. Typhimurium or S. Choleraesuis, it is clear that these two bacterial infections promote the disease signature. In contrast, in subclinical PCV2 it is consistently suppressed (Fig. 3e-g). In S. Typhimurium, the reversal of this clinical gene signature at 21 dpi coincides with the drop of bacterial load in the mesenteric lymph nodes to an almost undetectable level. This demonstrates from a systems approach that the infection has been virtually cleared at this time point, unlike in mesenteric lymph nodes upon S. Choleraesuis infection. In the latter, the persistence of the signature correlates with an enduring high bacterial lymph node colonization [19].
Taken together, PCV2-induced lymphoid depletion and granulomatous inflammation in PMWS patients can be summarized in a robust gene expression signature emblematic of myeloid leukocyte activation. This systems level analysis suggests that the initiation of a myeloid leukocyte mediated immune response is a pivotal event in the progression from subclinical PCV2 to PMWS.
Functional genomics identify regulatory network perturbations in PCV2 disease
It is becoming increasingly clear that PMWS and subclinical PCV2 represent two opposing adaptations of lymphoid tissue to circoviral infection. To understand how this tiny virus arranges this tour de force, the data sets covering both the PMWS field study [11] and the experimentally induced subclinical PCV2 at 29 dpi [12] were interrogated in GSEA with the Hallmark gene set collection [21]. This provides a very sensitive overview of alterations in a number of key regulatory networks and signaling pathways in both PMWS patients (Fig. 4a, leftmost column) and in pigs with persistent subclinical PCV2 (Fig. 4, second column). Since the molecular pathogenesis of PCV2 remains to this day mostly unexplored [10, 22], this may uncover several previously unknown network modifications. In lymphoid tissue of pigs with PMWS, many of the affected transcriptional networks echo key events in PCV2-associated lymphopathology such as blatant inflammatory activity (Hallmark gene set ‘Inflammatory response’) and caspase-mediated cell death (‘Apoptosis’). Increases in gene expression mediated by p53 (‘p53 pathway’), reactive oxygen species (‘ROS pathway’) and NF-κB (‘TNFα signaling through NFκB’) reflect findings that PCV2 promotes p53 expression [1, 2] and triggers NFκB activation through ROS [23, 24] (Fig. 4, left column). Previously unidentified altered networks [10, 22] include immunological programs (‘Interferon alpha response’ and ‘Interferon gamma response’), cell signaling cascades (‘IL2-STAT5 signaling’, ‘IL6-JAK-STAT3 signaling’, ‘KRAS signaling up’) and bioenergetics (‘Glycolysis’ and ‘Hypoxia’).
Consistent with previous results, subclinical PCV2 infection generally fails to reproduce the imbalances associated with PMWS. Only the transcriptomic programs downstream of interferon-α and interferon-γ are in line with subclinical infections, suggesting a direct viral effect on these immunological networks. It should also be noted that the ‘Hallmark G2M checkpoint’, which describes a transcriptional cell cycle program, is induced in subclinical PCV2, and repressed in PMWS patients. This corroborates the earlier finding that genes implicated in cell cycle progression are upregulated upon subclinical infection, but downregulated in PMWS patients (Fig. 2c).
Most programs are however unaffected or opposed to the changes occurring in PMWS, reaffirming the running thread that subclinical PCV2 and PMWS represent two opposed transcriptomic recalibrations of lymph node tissue.
IL-2 supplementation enables ex vivo modelling of PCV2 in primary porcine lymphoblasts
An increase in viral load in lymphoid tissue is a key characteristic of PMWS [3]. In the PMWS field study, PCV2 copy number was also significantly higher in the PMWS lymph nodes compared to their healthy counterparts as measured by qPCR and in situ hybridization [11]. The Hallmark analysis therefore shows that an increase in the amount of PCV2 occurs in an environment where IL-2 responsive genes are upregulated (Fig. 4a). Given the pivotal role of IL-2 in activated T-cells during immune response [25], IL-2 may indeed be a crucial factor in boosting subclinical PCV2 towards PMWS. Intriguingly, the IL2-STAT5 signaling network is suppressed in subclinical PCV2, but not in S. Choleraesuis and S. Typhimurium, where there is a persistent and transient induction respectively (Fig. 5a). Again, in S. Typhimurium, the reversal of the IL-2 signature coincides with bacterial clearance.
The impact of IL-2 on PCV2 replication cannot be faithfully demonstrated with traditional PK15 kidney cells. Because PCV2 has a tropism for lymphoblasts, these are the cells of choice. Our lab previously demonstrated that treatment of freshly harvested PBMCs with concanavalin A (ConA) coerces T-cells into mitosis, rendering them permissive for PCV2 [26]. Unfortunately, lymphoblast proliferation can only be maintained for a very short time, after which the cells lose viability and die of attrition. Indeed, when isolated lymphocytes are stimulated with ConA without IL-2, these cells start undergoing apoptosis even before the first passage at 72 h. However, supplementing ConA-stimulated lymphocytes with IL-2 generates continuously expanding primary porcine lymphoblasts (PPLs; Fig. 5b, c). These PPLs can be easily cultured, expanded and infected with PCV2 ex vivo, providing a bona fide target cell culture platform amenable to studying the PCV2 life cycle (Fig. 5d). To prove the beneficial effect of IL-2 on PCV2 replication, lymphocytes were freshly harvested from six individual pigs. IL-2 supplementation doubled PCV2 infection rates after 36 h, a timeframe amounting to a single round of replication (Fig. 5e). PCV2 titers in 5 out of 6 supernatants showed an increase upon IL-2 stimulation. A more sensitive method, measuring PCV2 genome copy numbers in cell culture supernatants, showed a significant increase upon IL-2 stimulation for all 6 lymphoblast cell strains (Fig. 5f, g).
STAT3 is a PCV2 host factor and a target for antiviral intervention
Since transcriptional networks of PMWS lymphoid tissue are subject to dramatic changes that correlate with fulminant PCV2 replication, counteracting these alterations can potentially harm the viral life cycle. Given the fierce induction of gene expression downstream of the IL6-JAK-STAT3 signaling cascade in PCV2 patients (Additional file 9, Figure A), STAT3 emerges as a druggable candidate host factor. Interestingly, STAT3 is a key regulator of inflammation often exploited by viruses with pathogenic consequences [27]. In a drug assay, treatment with the selective STAT3 inhibitor Cpd188 exhibits a dose-dependent effect on PCV2 infection in PPLs at 72 hpi (Fig. 6a). A cell viability assay reveals no toxicity, excluding non-specific adverse effects of the compound on infection (Fig. 6b). Chemical inhibition also displays a dose-dependent effect on PCV2 infection in PK15 cells (Additional file 9, Figure B-D). Thus, robust expression of STAT3 responsive genes is critical for PCV2, and hampering STAT3 activity represents an antiviral strategy (Fig. 6c).
A paracrine macrophage-lymphoblast communication axis exacerbates PCV2 infection
Finally, the PMWS field study dataset (Fig. 2a) [11] was queried in GSEA with ImmuneSigDB’s immunological gene signatures [13]. At first glance, this approach may seem incompatible as ImmuneSigDB describes single types of immune cells, while the PMWS data set covers complex lymph node tissues made up of multiple cell types. However, the main constituents of lymph nodes are immune cells, which are particularly affected by PMWS. It was therefore assumed that analyzing these data with ImmuneSigDB could yield valuable information on the biological processes going on inside these lymphoid organs. Indeed, comparing PMWS lymph nodes with healthy lymph nodes in GSEA revealed a striking suppression of lymphocyte gene expression and a powerful induction of signatures from monocytes and other myeloid cells (Fig. 7a, Additional file 10). This reflects the loss of lymphocytes and histiocytic replacement in PMWS lymph nodes. Together with the previous observation that a myeloid leukocyte activation signature can predict clinical outcome of PCV2 (Fig. 3), it raises the question to what extent infiltrating monocytes affect PCV2 replication. After maturation into macrophages, they may either dampen infection by destroying viral particles, or promote PCV2 in a paracrine fashion by releasing pro-inflammatory cytokines. To test the effect of intercellular communication between macrophages and lymphocytes, a co-culture experiment was set up. PCV2-infected PPLs were seeded in a porous insert, physically separated from a lower compartment with primary porcine macrophages (Fig. 7b). The latter were challenged with Porcine Reproductive and Respiratory Syndrome Virus (PRRSV), a virus that can experimentally trigger PMWS [8] (Fig. 7c).
The presence of non-infected macrophages had no significant effect on PCV2 lymphoblast infection levels, but when co-cultured with PRRSV-infected macrophages, a significant and consistent increase in PCV2 infection could be discerned (Fig. 7d). Importantly, PRRSV has an exclusive tropism for macrophages [28, 29], and cannot infect lymphoblasts (Additional file 11). This excludes an effect of secondary infection of PRRSV on PCV2 replication in these lymphoblasts. This experiment thus demonstrates the existence of a previously unknown axis of intercellular communication between macrophages and lymphoblasts exacerbating PCV2 replication.