A molecular signature of epithelial host defense: comparative gene expression analysis of cultured bronchial epithelial cells and keratinocytes

Background Epithelia are barrier-forming tissues that protect the organism against external noxious stimuli. Despite the similarity in function of epithelia, only a few common protective mechanisms that are employed by these tissues have been systematically studied. Comparative analysis of genome-wide expression profiles generated by means of Serial Analysis of Gene Expression (SAGE) is a powerful approach to yield further insight into epithelial host defense mechanisms. We performed an extensive comparative analysis of previously published SAGE data sets of two types of epithelial cells, namely bronchial epithelial cells and keratinocytes, in which the response to pro-inflammatory cytokines was assessed. These data sets were used to elucidate a common denominator in epithelial host defense. Results Bronchial epithelial cells and keratinocytes were found to have a high degree of overlap in gene expression. Using an in silico approach, an epithelial-specific molecular signature of gene expression was identified in bronchial epithelial cells and keratinocytes, comprising family members of keratins, small proline-rich proteins and proteinase inhibitors. Whereas some of the identified genes were known to be involved in inflammation, the majority of the signature represented genes that were previously not associated with host defense. Using polymerase chain reaction, expression of selected tissue-specific genes was validated. Conclusion Our comparative analysis of gene transcription reveals that bronchial epithelial cells and keratinocytes both express a subset of genes that is likely to be essential in epithelial barrier formation in these cell types. The expression of these genes is specific for bronchial epithelial cells and keratinocytes and is not seen in non-epithelial cells. 
We show that bronchial epithelial cells, similar to keratinocytes, express components that are able to form a cross-linked protein envelope that may contribute to an effective barrier against noxious stimuli and pathogens.


Background
Epithelial tissues in the mammalian airways and skin are among the largest organs and form the interface between the internal milieu of the host and the outside world. They not only protect the host against invading pathogens but also provide an effective barrier to noxious external (chemical and physical) stimuli and dehydration [1,2]. The effectiveness of the epithelial barrier is demonstrated by the rare incidence of severe infections of the lung or skin in healthy individuals. It has become clear that epithelia also play an active role in innate and adaptive immunity [3,4]. Epithelial tissues display three main mechanisms to protect the organism from infection. First, epithelial cells form an impermeable physical barrier which both prevents pathogen entry and minimizes dehydration. Second, epithelial cells are capable of producing defense molecules such as antimicrobial peptides and proteinase inhibitors. Finally, these cells are able to produce signaling molecules such as cytokines and chemokines. These molecules may attract or activate cells of the innate and adaptive immune system [5,6]. Interaction with cells of the immune system is mediated by adhesion molecules and cytokine receptors [7,8] that are present on epithelial cells.
Host defense mechanisms in epithelial cells are coordinated by a complex program of gene expression. High-throughput techniques such as Serial Analysis of Gene Expression (SAGE) [9] and DNA microarrays [10] have been developed to assess the expression of thousands of genes at the mRNA level in a single experiment. To delineate the barrier function of epithelial cells, the transcriptional change induced by pro-inflammatory cytokines was recently assessed by means of SAGE in two well-established culture models of epithelial inflammation using subcultures of primary bronchial epithelial cells [11] and primary keratinocytes [12]. These independent studies showed a marked overlap in gene families expressed in response to pro-inflammatory cytokines in both cell types. Upon cytokine exposure, genes associated with cytoskeletal architecture and epidermal barrier function in particular, such as keratins, S100 calcium-binding proteins and various antimicrobial proteinase inhibitors, were differentially expressed. These studies indicated that bronchial epithelial cells and keratinocytes might respond similarly to external influences to ultimately provide effective host protection. This is especially of interest because the epithelia of the skin and conducting airways are markedly different in morphology. The potential functional resemblance of these types of epithelia is also demonstrated by comparative analysis of genetic studies in patients with asthma and atopic dermatitis showing that similar patterns of gene expression may contribute to susceptibility to these diseases [13]. This prompted us to conduct a comparative analysis of our previously generated gene expression data from culture models of epithelial inflammation. The aim was to test the hypothesis that bronchial epithelial cells and keratinocytes employ similar mechanisms for providing effective host defense at these epithelia.
Therefore, in the present study, our previously generated SAGE data sets derived from bronchial epithelial cells [11] and keratinocytes [12] that were exposed to pro-inflammatory cytokines were compared to identify a common denominator in host defense in the different types of epithelial cells. SAGE libraries of resting and IL1β/TNFα-exposed primary bronchial epithelial cells (~28,000 tags in each library) were compared to SAGE libraries of resting and TNFα-exposed human primary keratinocytes (~13,000 tags in each library). The in silico method Tissue Preferential Expression (TPE) [14] was used for the recognition of putative cell-specific gene expression in these SAGE libraries. Previously, this method has been successfully applied to identify novel specific markers for disease [14,15]. To verify the in silico prediction of tissue-specific gene expression, polymerase chain reaction for seven target genes identified by the TPE algorithm was performed in a panel of nine different cell types, seven of which are normally present in the airways or lungs. The airway- and lung-derived NCI-H292 and A549 cell lines were included since these cell lines are frequently used to study epithelial cell function. We have identified and validated a signature of specific gene expression for bronchial epithelial cells and keratinocytes. The majority of genes in this signature was previously not associated with host defense or inflammation. These results indicate that epithelia of the airways and skin exploit unified host defense strategies to protect the host, despite their morphological differences.

Results
Transcriptional overlap between PBEC and KC and epithelial-specific gene expression upon cytokine exposure was characterized. By comparing the four SAGE libraries of primary bronchial epithelial cells (PBEC) and keratinocytes (KC), an overlap in tags of approximately 80% was observed, indicating a high similarity in the repertoire of genes expressed by these types of epithelial cells. Although remarkable commonalities were found in the gene families expressed by PBEC and KC, the repertoire of transcribed family members differed between the two cell types (table 2). To extract a pattern of genes that is specifically expressed in epithelial cells and is therefore likely to be involved in epithelial host defense, we explored which of the genes are preferentially expressed by PBEC and KC using the TPE algorithm. The scatter plot in figure 1 displays the individual tags observed in the cytokine-exposed PBEC and KC libraries. Each dot represents a single tag with the corresponding TPE values for PBEC and KC. In this analysis, four groups of tags were identified: epithelial non-specific tags (i), tags preferentially expressed by either PBEC (ii) or KC (iii) and tags that were preferentially expressed by both PBEC and KC (iv). The 30 tags in the latter group represent putative epithelial-specific genes because a TPE score ≥ 9 was observed in both PBEC and KC (table 3). Almost half of these tags corresponded to genes encoding keratins, small proline-rich proteins, kallikreins and proteinase inhibitors (table 3). Interestingly, the expression of a large proportion of these genes was found to be affected by cytokine exposure in PBEC or KC (or both) in the initial SAGE studies (as indicated by underlined tag numbers in table 3). A similar picture in preferential tag expression was obtained when using the libraries of resting PBEC and KC, since the majority of genes do not show an on/off expression profile upon stimulation with cytokines (data not shown).
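The grouping of tags into the four classes described above can be sketched as follows. This is a minimal illustration in Python, assuming that each tag carries one TPE value per library and that a missing value means the tag was absent from that library. The function name `classify_tag` and the example tags and values are hypothetical and are not taken from table 3; only the threshold of 9 comes from the text.

```python
from typing import Optional

# TPE threshold for preferential expression, as stated in the text
TPE_THRESHOLD = 9.0


def classify_tag(tpe_pbec: Optional[float], tpe_kc: Optional[float],
                 threshold: float = TPE_THRESHOLD) -> str:
    """Assign a tag to group i, ii, iii or iv from its two TPE values.

    None means that no TPE value could be calculated because the tag
    was absent from that library.
    """
    high_pbec = tpe_pbec is not None and tpe_pbec >= threshold
    high_kc = tpe_kc is not None and tpe_kc >= threshold
    if high_pbec and high_kc:
        return "iv"   # preferentially expressed by both PBEC and KC
    if high_pbec:
        return "ii"   # preferentially expressed by PBEC only
    if high_kc:
        return "iii"  # preferentially expressed by KC only
    return "i"        # epithelial non-specific


# Hypothetical TPE values, for illustration only
example_tags = {
    "TAG_A": (9.5, 10.2),
    "TAG_B": (9.1, 5.0),
    "TAG_C": (4.2, None),
    "TAG_D": (6.0, 9.0),
}

groups = {tag: classify_tag(*values) for tag, values in example_tags.items()}
```

In this sketch, only tags that land in group iv would be carried forward as candidate epithelial-specific tags.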
To validate this in silico TPE prediction analysis, expression of seven putative epithelial-specific genes by PBEC and KC was assessed by reverse transcriptase polymerase chain reaction (RT-PCR) in nine different cell types. Each cell type in the panel was exposed to medium alone or to IL1β/TNFα. KC were exposed to medium or TNFα alone to maintain comparability with the original SAGE experiment. In concordance with the SAGE data and TPE analysis, expression of SPRR2A was only observed in PBEC. On the other hand, CALML5 was expected to be expressed by KC alone. However, PBEC were shown to be positive for this transcript as well, and weak expression was observed in NCI-H292 cells. In the TPE analysis, KRT6A, SPRR1A, SPRR1B, IL1F9 and S100A2 all showed TPE values of ≥ 9 in both PBEC and KC libraries. The RT-PCR results in figure 2 demonstrate that preferential expression of SPRR1B was found in PBEC, KC and NCI-H292 cells, whereas moderate to weak expression was also detected in fibroblasts, HUVEC, HASM and monocytes. Expression of KRT6A was restricted to PBEC, KC and the bronchial epithelial cell line NCI-H292, and this transcript was not detected in any other cell type. Transcription of SPRR1A, IL1F9 and S100A2 was only detected in primary cultures of PBEC and KC and was completely absent in all other cell types.

Discussion
Comparative genomics approaches have the potential to gain additional insight into a biological process at the mRNA expression level by integrating and combining data obtained from similar model systems. Particularly, SAGE is excellent for this purpose since digital, scalable expression data is generated that allows comparison without the need for complex mathematical normalization methods. Although the SAGE libraries used in the present analysis were not initially intended for comparative genomic research, remarkable commonalities in epithelial-specific gene expression were found that related to host defense.
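Because SAGE yields tag counts, libraries of different depth can be brought to a common scale by simple proportional scaling. The sketch below illustrates this in Python. It is an assumption for illustration only: the source does not state which scaling was used, and the function name `scale_counts`, the target of 100,000 tags, and the example counts are hypothetical.

```python
from typing import Dict


def scale_counts(counts: Dict[str, int], per: int = 100_000) -> Dict[str, float]:
    """Scale raw tag counts to counts per `per` tags in the library."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {tag: n * per / total for tag, n in counts.items()}


# Hypothetical libraries sized like those described in the text
# (~28,000 tags for PBEC and ~13,000 tags for KC)
pbec_raw = {"TAG_1": 14, "OTHER": 27_986}
kc_raw = {"TAG_1": 13, "OTHER": 12_987}

pbec_scaled = scale_counts(pbec_raw)
kc_scaled = scale_counts(kc_raw)
```

In this made-up example the raw counts for `TAG_1` are nearly equal (14 and 13), yet after scaling the tag is twice as abundant in the smaller library, which is why counts are compared on a common scale rather than directly.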
The tissue preferential expression (TPE) algorithm was employed to recognize specific tag expression by PBEC and KC under inflammatory conditions (panel iv; figure 1, table 3). Experimental verification of selected epithelial-specific genes by RT-PCR showed a good correlation between the in silico approach and RT-PCR (figure 2). The PCR setup was designed to detect the presence or absence of the validation genes and was not intended to be quantitative. The observed discrepancies between SAGE and PCR results can be explained by the difference in detection sensitivity between the techniques: RT-PCR is far more sensitive than SAGE in detecting low-abundance transcripts.
The majority of tags of the molecular signature corresponded to genes encoding structural components of the cytoskeleton (keratins, small proline-rich proteins, elafin) and proteins involved in the assembly/disassembly (transglutaminase 1, kallikreins and matrix metalloproteinases) of the cornified cell envelope in keratinocytes (reviewed in [16]). Components of the cross-linked or cornified envelope are linked by transglutaminases (reviewed in [17]). The observation that bronchial epithelial cells express both components of a cross-linked envelope and the enzymes for its assembly and disassembly is relevant to our understanding of epithelial host defense in the airways. Additional support for this observation is provided by abundant transcription of genes that are known to be involved in cornification in skin, including the S100 calcium-binding proteins.
A disadvantage of the present study might be the differences in the type of cytokine exposure and the duration of the treatment. The opposite directional changes in expression observed in gene families (table 2) could be explained either by dissimilarities in the initial model systems or by the inherent differences between PBEC and KC. By using the TPE algorithm, highly cell-specific tag expression can be predicted largely independently of transcriptional levels because the more unique a tag is to a particular tissue, the less important is its level of expression. Therefore, we are confident that the signature of epithelial host defense that was extracted is representative for bronchial epithelial cells and keratinocytes.
Computational subtraction methods such as the TPE algorithm allow functional clustering of genes derived from large and complex genome-wide expression profiles without having full knowledge of the repertoire of genes involved in biological processes of interest. Although the identified molecular signature of host defense is characteristic for bronchial epithelial cells and keratinocytes, it would be of great interest to study whether this gene expression pattern is also applicable to other types of epithelial cells, a finding that would greatly enhance our understanding of epithelial defense strategies.

Conclusion
In summary, our comprehensive comparison of overlapping genes across bronchial epithelial cells and keratinocytes provides novel insights into epithelial host defense strategies, in particular those of the airway epithelium. Combining in silico and experimental approaches is valuable in accelerating the interpretation of genomics data and defining follow-up research. We identified an expression signature of genes that were specifically expressed by bronchial epithelial cells and keratinocytes. These genes are likely to fulfill an important function in epithelial host defense. Based on the present findings we propose that formation of a cross-linked protein envelope by bronchial epithelial cells is an effective host defense strategy of the mucosal epithelium in the human airways. This function would be analogous to the host defense function of cornifying keratinocytes. Finally, a better understanding of unified host defense strategies in different epithelia may lead to the identification of novel therapeutic targets for epithelial inflammatory disorders such as asthma and atopic dermatitis.

SAGE data
The previously published SAGE libraries that were compared in this study were derived from two models of epithelial inflammation using primary bronchial epithelial cells [11] and primary keratinocytes [12]. For tag mapping, after discarding tags occurring only once, the libraries were compared with NCBI's "reliable Unigene cluster to SAGE tag map" [28] and with SAGEgenie of the Cancer Genome Anatomy Project [29]. Both maps were based on Unigene build #171. Additionally, to enhance the reliability of tag identity, we included the virtual tag classification as used in SAGEgenie to assess the location of each tag within the corresponding transcript. Reliable tags can thus be discriminated from tags that are not isolated from the 3'-end, such as tags from internally primed transcripts and tags derived from internal NlaIII restriction sites [29,30]. In the data set comparisons, only three categories of tags were included: tags derived from the most 3' NlaIII restriction site, tags that matched undefined 3'-end transcripts, and tags for which no additional information was available. The last category may contain tags that correspond to novel transcripts.
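The tag selection described above can be sketched as a two-step filter. This is a minimal Python illustration under stated assumptions: the class labels `most_3prime`, `undefined_3prime`, `unknown` and `internal` are hypothetical stand-ins for the SAGEgenie virtual tag classification, and the function name `filter_tags` and the example tags are invented for illustration.

```python
from typing import Dict

# Hypothetical labels for the three tag categories that the text retains
KEEP_CLASSES = {"most_3prime", "undefined_3prime", "unknown"}


def filter_tags(counts: Dict[str, int], tag_class: Dict[str, str]) -> Dict[str, int]:
    """Drop tags occurring only once, then keep only accepted tag classes.

    Tags without an entry in `tag_class` are treated as "unknown",
    i.e. tags for which no additional information is available.
    """
    kept = {}
    for tag, n in counts.items():
        if n < 2:
            continue  # discard tags occurring only once
        if tag_class.get(tag, "unknown") in KEEP_CLASSES:
            kept[tag] = n
    return kept


# Hypothetical tags, for illustration only
raw_counts = {"AAAAAAAAAA": 5, "CCCCCCCCCC": 1, "GGGGGGGGGG": 3, "TTTTTTTTTT": 2}
classes = {
    "AAAAAAAAAA": "most_3prime",
    "CCCCCCCCCC": "most_3prime",
    "GGGGGGGGGG": "internal",
}

reliable = filter_tags(raw_counts, classes)
```

Here the singleton tag and the tag from an internal restriction site are removed, while the unannotated tag is retained as a possible novel transcript.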

TPE analysis
Epithelial-specific gene expression in PBEC and KC was identified using the Tissue Preferential Expression (TPE) algorithm [14]. The calculated TPE value is based both on the presence of a particular tag and on its level of expression in the SAGE library of interest in comparison to a panel of reference SAGE libraries derived from a range of different whole tissues. To allow calculation of TPE values, each of the PBEC and KC SAGE libraries as well as the reference libraries were normalized. Tags with TPE values of <4 were excluded from further analysis since these tags occur very frequently in other cell types as well (see supplement for a detailed description of the TPE algorithm). The threshold value for the TPE analysis that is indicative of tissue-specific expression was chosen very high to prevent possible false positives. A TPE value of ≥ 9 was considered indicative of tissue-specific expression.

Human lung fibroblasts (HFL-1)
HFL-1 (CCL-153, American Type Culture Collection) were cultured according to the supplier's recommendations. Prior to stimulation, cells were cultured overnight in serum-free medium.

Monocytes
CD14-purified monocytes were kindly provided by the Department of Nephrology (Leiden University Medical Center, Leiden, The Netherlands) and were resuspended in RPMI 1640 medium supplemented with 20 U/ml penicillin, 20 µg/ml streptomycin, 2 mM glutamine (all from Bio Whittaker) and 10% heat-inactivated FCS.

Figure 2 PCR verification of 7 potential preferentially expressed tags identified using the TPE algorithm. The expression of these seven genes was assessed both under resting conditions and after cytokine exposure in all cell types. On the right, the TPE values of each gene are listed for both PBEC and KC after cytokine exposure. Tags for which no TPE value could be calculated because of absence of the tag in the particular library are indicated by "not available" (N/A). The predicted preferential expression could be verified for all genes. The expression of six of these genes seems to be selective for epithelial cells only. SPRR1B is preferentially expressed by epithelial cells, but moderate to low levels of expression were also detected in other cell types. SPRR2A is preferentially expressed by PBEC only; no tags for SPRR2A were found in the KC libraries after TNFα exposure. Expression of CALML5 was observed by RT-PCR in both PBEC and KC, although no tags for this gene were found in the PBEC libraries.
Figure 1 TPE scatter plot of SAGE tags of PBEC and KC libraries after cytokine exposure. Reliable 3'-end tags with TPE > 4 and a tag frequency of ≥ 2 in at least one library were plotted. Tags with corresponding TPE values ≥ 9 in both libraries were considered to be potential epithelial cell-specific tags, as indicated by the threshold lines in the figure. A similar picture was obtained when TPE values of tags from the resting libraries were plotted.