Skip to main content

Relationship between gene expression and lung function in Idiopathic Interstitial Pneumonias



Idiopathic interstitial pneumonias (IIPs) are a group of heterogeneous, somewhat unpredictable diseases characterized by progressive scarring of the interstitium. Since lung function is a key determinant of survival, we reasoned that the transcriptional profile in IIP lung tissue would be associated with measures of lung function, and could enhance prognostic approaches to IIPs.


Using gene expression profiling of 167 lung tissue specimens with IIP diagnosis and 50 control lungs, we identified genes whose expression is associated with changes in lung function (% predicted FVC and % predicted DLCO) modeled as categorical (severe vs mild disease) or continuous variables while adjusting for smoking status and IIP subtype; false discovery rate (FDR) approach was used to correct for multiple comparisons. This analysis identified 58 transcripts that are associated with mild vs severe disease (categorical analysis), including those with established role in fibrosis (ADAMTS4, ADAMTS9, AGER, HIF-1α, SERPINA3, SERPINE2, and SELE) as well as novel IIP candidate genes such as rhotekin 2 (RTKN2) and peptidase inhibitor 15 (PI15). Protein-protein interactome analysis of 553 genes whose expression is significantly associated with lung function when modeled as continuous variables demonstrates that more severe presentation of IIPs is characterized by an increase in cell cycle progression and apoptosis, increased hypoxia, and dampened innate immune response. Our findings were validated in an independent cohort of 131 IIPs and 40 controls at the mRNA level and for one gene (RTKN2) at the protein level by immunohistochemistry in a subset of samples.


We identified commonalities and differences in gene expression among different subtypes of IIPs. Disease progression, as characterized by lower measures of FVC and DLCO, results in marked changes in expression of novel and established genes and pathways involved in IIPs. These genes and pathways represent strong candidates for biomarker studies and potential therapeutic targets for IIP severity.


There is substantial clinical heterogeneity in the clinical, radiologic, and histopathologic features within each subtype of idiopathic interstitial pneumonias (IIPs). For instance, all forms of IIP have a somewhat unpredictable prognosis and many but not all patients can progress to end stage lung disease [13]. While subtypes of IIPs differ in clinical, radiographic, and histopathologic presentation [13], the type of IIP often cannot be determined and many of the subtypes of IIP have overlapping clinical and laboratory features indicating that the current definitions remain too broad.

Idiopathic pulmonary fibrosis (IPF), by far the most common form of IIP, is histopathologically defined by the presence of the prototypical form of pulmonary fibrosis, usual interstitial pneumonia (UIP), a fibrosing interstitial pneumonia characterized by a pattern of heterogeneous, subpleural regions of fibrotic, and remodeled lung that often results in death within 2–3 years of diagnosis [4]. Other IIPs, such as respiratory bronchiolitis-associated interstitial lung disease (RB-ILD), are more cellular, occur earlier in life, and have a considerably lower mortality [5]. By contrast, idiopathic nonspecific interstitial pneumonia (iNSIP), a pattern of IIP that is more likely a syndrome than a disease, is most commonly characterized by interstitial fibrosis but in a more uniform pattern than UIP, and carries a better prognosis than IPF/UIP [6, 7]. These differences within the subtype and overlaps among subtypes create a distinct challenge to achieve an accurate diagnosis for an individual patient with this potentially life-threatening diagnosis.

The molecular mechanisms that account for the extreme heterogeneity in clinical presentation, radiologic patterns, histopathologic variation, and disease progression are largely unknown. We hypothesize that biological heterogeneity in IIPs will be reflected by gene expression patterns in lung tissue from patients with IIP, and gene expression patterns will change as a function of disease activity, progression, and severity. To test this hypothesis, we measured gene expression in lung tissue of patients with IIP, and correlated gene expression patterns with diffusing capacity of the lung for carbon monoxide (DLCO) and forced vital capacity (FVC).


Demographic characteristics of the Lung Tissue Research Consortium (LTRC) cohort

Table 1 summarizes demographic and clinical characteristics of the LTRC IIPs cohort and by subtypes of IIPs that include at least 10 individuals. Included in the table is the portion of the non-diseased control cohort used together with the LTRC cohort. Overall, the IIPs cohort is older than the controls. Within the IIP cohort, individuals within IPF/UIP group are the oldest followed by iNSIP, uncharacterized fibrosis, and RB-ILD. There are no significant differences across IIP subtypes in gender or racial distribution. Comparison of smoking histories reveals that approximately half of the individuals with IIP are former smokers, as compared to controls that are almost 50 % current smokers. Subjects with IPF/UIP and RB-ILD diagnosis reported higher pack-years compared to subjects with iNSIP, uncharacterized fibrosis, and controls; however, there is variability within each group and therefore there are no statistically significant differences among groups. We also compared St. George’s score, an indicator of overall lung health [8] in subtypes of IIP (no data available for controls) and did not find significant differences among groups. Finally, variables associated with lung function, % predicted pre-bronchodilator FVC and DLCO reveal better lung function in the RB-ILD group compared to other IIPs.

Table 1 Subject demographics and clinical characteristics of the derivation (LTRC) cohort by IIP subcategory

Identification of genes associated with the IIPs clinical subtype

To establish whether there are significant differences in expression in clinical subtypes of IIP, we first identified molecular profiles associated with clinically defined subtypes of IIP. For this initial analysis, we used an ANCOVA model that incorporates clinical subtype with at least 10 individuals per group (including controls), age, gender and smoking status as factors. Venn diagrams in Additional file 1: Figure S1 illustrate overlap of differentially expressed mRNAs amongst IIP categories versus controls using the 5 % FDR criterion alone or combined 5 % FDR and 2-fold change criteria. While the statistical interpretation of intersection-union testing among these groups is compromised by differences in the power of individual contrasts and by the common comparator group, a large proportion of the differentially expressed mRNA transcripts exhibit differences between control lung and two or more of the IIP subgroups. However, we also observed genes unique to each IIP subtype, especially in the IPF/UIP group which has the most differentially expressed transcripts overall as well as the most unique transcripts. This result is likely a combination of IPF/UIP being the largest group of IIPs examined in our study and the fact that IPF/UIP has the most remodeled lung of all IIPs. Examination of 29 genes in common to all IIPs using the 5 % FDR and 2 fold change criteria reveals genes involved in inflammation (IL1RL1, IL1R2, IL18R1, IL8RAP), extracellular matrix (ITGA10, FCN3), Wnt signaling (SFRP2), coagulation (alpha-1 antichymotrypsin or SERPINA3) and host defense (DEFA3) (Additional file 1: Table S1). Interestingly, defensin DEFA3 is one of the genes we identified as differentially expressed in peripheral blood of severe vs mild IPF/UIP categorized by differences in the DLCO [9]. We used the high degree of overlap among IIPs and biological relevance of genes identified in common to all IIPs as a rationale for inclusion of all IIP subtypes in the analysis of lung function variables as opposed to focusing only on IPF/UIP; however, we adjust for IIP subtype in all further analyses.

Identification of genes associated with DLCO

We next sought to identify genes whose expression is associated with the DLCO measurement. For this analysis, we used an ANOVA model that incorporates categories of disease based on predicted DLCO (see below), IIP subtype and smoking status, or an ANCOVA model that incorporates the % predicted DLCO measurement, IIP subtype and smoking status as factors. Although no significant difference in smoking status exist among IIPs due to high variability in each IIP subtype, differences in smoking histories in these individuals may influence gene expression and we therefore included them in the model. We did not include age and gender because they are accounted for in the calculation of % predicted lung function variables. By categorical analysis, 91 unique transcripts differentiate mild disease (DLCO ≥65 %; n = 33) from severe disease (DLCO ≤35 %; n = 40) at 5 % FDR (Additional file 1: Table S2). When DLCO is treated as a continuous variable (135 subjects with DLCO measurement available), 706 genes correlate with changes in % predicted DLCO at 5 % FDR (Additional file 1: Table S3). Six hundred fourteen genes not identified in the categorical analysis were found to be associated with changes in DLCO in the continuous analysis.

Identification of genes associated with FVC

We also identified gene expression changes associated with the FVC measurement, in an analogous manner to the analysis of DLCO. By categorical analysis, 681 genes differentiate mild disease (FVC ≥75 %; n = 50) from severe disease (FVC ≤40 %; n = 40) when categorized by % predicted FVC at 5 % FDR (Additional file 1: Table S4). Two thousand four hundred sixty seven genes correlate with changes in % predicted FVC in 164 subjects with available FVC data at 5 % FDR (Additional file 1: Table S5) with 1794 of these transcripts not overlapping with categorical analysis.

Transcriptional changes in common to decline in DLCO and FVC

To focus on genes that are the most likely to be involved in the pathogenesis of IIP, we identified differentially expressed transcripts in common to lower measures of DLCO and FVC. We first intersected 91 transcripts identified as significant in the categorical analysis of disease severity based on DLCO and 681 from the analogous analysis of FVC. Fifty eight transcripts that are in common to the two analyses are listed in Table 2. While a number of these transcripts, including ADAMTS family members (ADAMTS4, ADAMTS9) [10], AGER [11], HIF-1α [12, 13], serpin family members (SERPINA3, SERPINE2) [14], and selectin E (SELE) [15] have an established role in lung fibrosis, expression of novel IIP candidate genes such as rhotekin 2 (RTKN2) and peptidase inhibitor 15 (PI15) is also significantly altered in severe compared to mild disease. Dot plots shown in Fig. 1 demonstrate decrease in RTKN2 but an increase in SELE and PI15 with a decrease in lung function. We confirmed expression levels of these three genes in the same set of LTRC samples by qRT-PCR (Fig. 2).

Table 2 Differentially expressed genes (5 % FDR) in severe vs mild disease as characterized both by a decline in % predicted DLCO and FVC
Fig. 1

Categorical analysis of changes in gene expression based on lung disease severity categorized by % predicted DLCO (top) or % predicted FVC (bottom) for novel IIP candidate genes rhotekin 2 (RTKN2 - left), selectin E (SELE - middle), and peptidase inhibitor 15 (PI15 – right). Mild, moderate, and severe are defined as percent predicted DLCO (mild ≥65 %, moderate <65 % and >35 %, and severe ≤35 %), or FVC (mild ≥75 %, moderate <75 % and >40 %, and severe ≤40 %). p values for severe vs mild groups are shown in Table 2

Fig. 2

qRT-PCR validation of microarray data. Fold changes relative to control lungs for rhotekin 2 (black bars), selecting E (white bars) and PI15 (gray bars) in all IIPs and mild and severe disease based on either FVC or DLCO. Error bars represent standard deviations

Given that analysis of expression changes as a function of continuous decrease in lung function variables identified large number of additional genes, we use the overlap between transcripts associated with a decline in DLCO or FVC in the continuous analysis (553 unique transcripts) to identify canonical pathways and transcriptional networks that are likely to be important predictors of severity of IIPs. Canonical Pathways Analysis of the 553 genes identified hepatic fibrosis and hepatic stellate cell activation (P <1 × < 10−7), acute phase response signaling (P <1 × 10 −5), and HIF-1α signaling (P <1 × 10−4) as the most significantly enriched (Fig. 3a). A number of other pathways were identified (P <0.001) including IL17a, DC and NK crosstalk, LPS/IL-1 mediated inhibition of RXR function, leukocyte extravasation signaling, atherosclerosis signaling, and aryl hydrocarbon receptor signaling. Protein-protein interactome analysis using the InnateDB database of interactions suggests that more severe presentation of IIPs is characterized by an increase in cell cycle progression and apoptosis (genes centered around MYC), increased hypoxia (genes centered around HIF-1α) and dampened innate immune response (genes centered around TNF-α) (Fig. 3b). Network analysis in Ingenuity Pathway Analysis identified seven high scoring networks (score >25; Additional file 1: Figure S2) that largely recapitulate findings from InnateDB database.

Fig. 3

553 differentially expressed genes common to FVC and DLCO by continuous analysis were evaluated by (a) canonical pathways and (b) protein-protein interactome analysis. Pathways and their associated –log P value, as determined by the Fisher Exact Test in Ingenuity Pathway Analysis, are shown as the blue bars. The ratio of the number of genes mapping to these pathways relative to the total number of genes in the pathway is plotted as orange squares. The interactome was created using NetworkAnalyst [33] and the InnateDB PPI dataset by first importing all 553 genes and then reducing the network to zero-order interactome. The nodes are colored based on their correlation of expression and lung function variables (green are negative and red are positive correlations). The sizes of nodes are proportional to their betweenness centrality values

Validation of gene expression changes in an independent cohort

To test the generalizability of our findings, we examined expression of the candidate genes with the strongest associations with DLCO and FVC in an independent cohort of IIPs. Additional file 1: Table S6 provides characteristics of the replication (National Jewish Health or NJH) cohort. We primarily focused this analysis on the most pronounced transcriptional changes, 4 genes with >2 fold change from the categorical analysis of both FVC and DLCO (RTKN2, PI15, ADAMTS4, and SELE; Table 2). We used the same ANOVA models as in the derivation cohort, incorporating lung function variable, IIP subtype, and smoking, on these 4 genes to assess whether there are significant differences in expression between mild and severe disease, as defined by percent predicted DLCO (22 mild and 17 severe) or FVC (36 mild and 23 severe). The results of this analysis (Table 3) demonstrate that extent of transcriptional changes associated with lung function are dampened in this cohort compared to the LTRC cohort. However, all genes have p <0.05 for association with either FVC or DLCO. We also performed an analysis to validate 58 genes from Table 2 in the replication cohort in the same manner as previous analysis. 44/58 (76 %) genes have nominal p <0.05 in at least one of the analyses (Additional file 1: Table S7). We also considered lack of RB-ILD samples in the replication cohort as a counfounder of our validation analysis. This subtype of IIPs is characterized by better lung function than IPF and iNSIP; presence of RB-ILD cases with better lung function creates a wider range of values for FVC and DLCO in the LTRC cohort and may potentially allow for identification of more associations despite controlling for IIP subtype in the model. 49/58 (85 %) genes identified as significant in the categorical analysis of FVC and DLCO (Table 2) remain significant after removal of RB-ILD samples from the analysis. Similarly, of the 533 genes identified in common to DLCO and FVC when considered as continuous variables, 401 genes (75 %) remain significant in the analysis without RB-ILD. Therefore, RB-ILD samples do not significantly influence the results of our analysis beyond loss of power with fewer samples.

Table 3 Validation of gene expression changes from the LTRC IIP cohort in an independent cohort of IIPs (NJH cohort)

Localization of gene expression changes

To partially address the issue of cell specificity of gene expression profiles identified in whole lung tissue, we performed immunohistochemistry to localize expression of rhotekin 2, one of the genes with most prominent expression change associated with diseases progression and severity, in normal and IIP lung. In histologically normal lung tissue, rhotekin 2 is expressed in airway/bronchial epithelia, alveolar epithelia, alveolar macrophages, and submucosal glands (Additional file 1: Figure S3). We next examined different histopathological features (airways, honeycomb cysts and alveolar cysts) in lung tissue of IPF subjects with more compared to less severe disease. While we did not observe consistent differences across all subjects we examined, we observed a trend of lower expression of rhotekin 2 in IPF airways (Fig. 4a), honeycomb cysts (Fig. 4b), and alveolar cystic areas (Fig. 4c) of individuals with more severe disease. Based on our data, reduced mRNA levels of rhotekin 2 in severe IPF are most likely a combination of reduced expression in specific cells and loss of healthy airway epithelia that are replaced by honeycomb cysts [16]. Localization of rhotekin 2 in iNSIP lung tissue also demonstrated a decrease in expression in more severe disease (Additional file 1: Figure S4). These findings corroborate the findings of reduced rhotekin 2 expression at the mRNA level in more severe IIPs.

Fig. 4

Localization of rhotekin 2 expression in IPF lung of patients with mild (left panels) and severe (right panels) disease. Immunohistochemical staining of airways (a), honeycomb cysts (b), and alveolar cysts (c) lung tissue reveals some decrease in expression of rhotekin 2 in airway epithelial cells in patients with severe compared to mild disease accompanied by an increase in expression in hyperplastic alveolar type II cells in alveolar cysts. Top panels represent rhotekin 2 stained sections and bottom panels are corresponding tissue sections incubated with non-immune serum (negative primary antibody controls). Tissue sections were counterstained with hematoxylin. Images were taken at 10× and 40× (inset) magnifications


Results of our investigation of global gene expression changes in lungs of patients with IIPs demonstrate that, at the molecular level, there are more commonalities than differences among different subtypes of IIPs. On the other hand, the extent of disease, as characterized by lower measures of FVC and DLCO, is associated with marked changes in expression of novel and established genes and pathways involved in IIPs.

Our findings that lung tissue from IPF, iNSIP, RB-ILD, and uncharacterized fibrosis share many common expression patterns supports the concept that, despite differences in clinical presentation, these diseases may be related etiologically and pathogenically. Three publications [1719] indicate that IPF/UIP and iNSIP, presumed distinct clinical-pathologic processes, may be related etiologically and pathogenically. Two of these studies [17, 19] have used transcriptional profiles of affected lung tissue to create IIP molecular signatures, and their findings suggest that dissimilar histological patterns may be related biologically.

In addition to some differences in expression among different IIP subtypes, we identified more pronounced changes in gene expression associated with more advanced disease. This conclusion is supported by previous reports [20, 21]. Selman et al. identified 437 differentially expressed genes in rapid compared to slow progressors, including overexpression of genes involved in morphogenesis, oxidative stress, migration/proliferation, and genes from fibroblasts/smooth muscle cells [20]. Our group also demonstrated differential expression (p <0.05 and >5 fold change) of 191 transcripts in individuals with progressive IPF (based on changes in DLCO and FVC over 12 months) compared to slow progressors [21]. While only less than 10 % of the genes in Table 2 (AGER, C11orf9, NFIL3, PSAT1, RTKN2, and SERPINE2) were identified by these earlier studies, additional overlap was observed when we considered gene families (for example GADD45A in the present study and GADD45B in our earlier publication), suggesting that the present study was successful at confirming published observations but also identifying novel gene transcripts. Two potential explanations for limited overlap are the fact that technologies and genome annotations have changed (our earlier study used the Serial analysis of gene expression [SAGE] as opposed to microarrays) and sample size (our current study has considerably larger sample size than previous publications).

Among the gene transcripts most strongly associated with disease severity are rhotekin 2 (RTKN2), selecting E (SELE), and peptidase inhibitor 15 (PI15). Rhotekin 2 is a member of a family of proteins containing a Rho-binding domain that are target peptides for the Rho-GTPases and are important in lymphocyte development and function [22]; however rhotekin 2 is ubiquitously expressed [23]. Genetic variants in rhotekin 2 have been associated with rheumatoid arthritis (RA) and activation of the NF-κB pathway in Japanese [24]. These known roles for rhotekin 2, the observation of decreased expression of rhotekin 2 in more severe disease, both in our earlier publication [21] and in the current study, and localization of expression to the lung epithelium in the current study, point to this gene as a candidate that warrants further functional investigation in IIPs. Selectin E is a member of the selectin family of cell adhesion molecules and plays a central role in adhesion of leukocytes to the endothelium [25]. Its expression inhibits bleomycin-induced lung fibrosis [15] but the soluble form of the protein product is increased in serum of patients with pulmonary fibrosis [26]. Finally, PI15 is a trypsin inhibitor that is developmentally regulated in the lung mesenchyme [27] but otherwise is not well characterized; its role in development is consistent with recapitulation of developmental pathways in lung injury that is a hallmark of IIPs [28].

One of the strengths of the present study is the use of two independent cohorts of IIP for derivation and validation. However, we were only able to validate the most pronounced transcriptional changes identified in the derivation cohort. One possible explanation for this is that overall the NJH cohort has milder disease, an observation we have previously made [29]. We also considered the possibility that lack of RB-ILD cases in the replication cohort contributed to limited validation but ruled this out by showing that exclusion of RB-ILD cases from the derivation cohort did not substantially influence the results. A notable weaknesses of the present study in that, analogous to earlier publications, we have used whole lung tissue with the mixture of cells. Immunohistochemistry analysis of rhotekin 2, one of the genes with strongest correlation with disease severity in our study, demonstrates that reduced expression may be a combination of reduced expression in specific cells and loss of healthy airway epithelia that are replaced by honeycomb cysts. Further future studies will be needed to determine which of the genes identified by our analysis are differentially regulated in specific cell populations. Another weakness of our study is cross-sectional design in which we identify gene expression changes associated with a decline in lung function across a cohort of different individuals. A study of longitudinal design with measurements of gene expression and lung function over time in same individuals will be needed to validate these findings. However, the finding that rhotekin 2 expression was associated with more rapid disease progression in our earlier study [21] supports the notion that genes identified in our study have direct relevance to disease progression.


In summary, we identified commonalities and differences in gene among different subtypes of IIP. Disease progression, as characterized by lower measures of FVC and DLCO, results in marked changes in expression of novel and established genes and pathways involved in IIP. These genes and pathways represent strong candidates for biomarker studies and potential therapeutic targets for IIP severity.


Subjects and tissue samples

All human tissue was collected with appropriate ethical review for the protection of human subjects. Written informed consent was obtained for all subjects’ participation in research by the Lung Tissue Research Consortium and National Jewish Health ILD Research Program. The current study only used de-identified information and was determined to be non-human subject research by both National Jewish Health and Colorado Multiple Institution IRBs. ATS/ERS guidelines were followed for diagnosis of IIP subtypes. The LTRC IIP cohort was used to derive gene expression signatures. NJH IIP cohort was used to validate gene expression signatures. The control tissue cohort was split to provide control lung expression profiles for both derivation and validation stages.

Lung tissue specimens from lower (n = 121), upper (n = 31), and middle/lingula (n = 15) lobes from subjects with IIP (119 IPF, 17 iNSIP, 13 uncharacterized fibrosis, 11 RB-ILD, 4 DIP, and 3 COP) were obtained from the LTRC. The LTRC is a resource created by the NHLBI to provide human lung tissues and DNA to qualified investigators for use in research. The program enrolls donor subjects who are anticipating lung surgery, collects blood and extensive phenotypic data from the prospective donors, and then processes their surgical waste tissues for research use. Most donor subjects have fibrotic interstitial lung disease or COPD. Clinical data include clinical and pathological diagnoses, chest CT images, pulmonary function tests (spirometry, DLCO, and ABG), exposure (including cigarette smoking history) and symptom questionnaires (including Borg dyspnea scale), and family history of lung disease.

The NJH ILD cohort consists of 131 patients with biopsy-proven IIP (111 IPF/UIP, 12 iNSIP, and 8 uncharacterized fibrosis) that were clinically evaluated by investigators at National Jewish Health. All subjects in this cohort have undergone a standardized evaluation designed to provide a specific diagnosis. The evaluation included a standardized history focused on the presence of current or previous systemic disease; medications; tobacco and recreational drug use; familial lung disease; avocational, occupational, environmental, and accidental exposures. Additional testing includes serologic evaluation for evidence of systemic disease, chest radiography, pulmonary physiology (including lung volumes by body plethysmography, spirometry before and after inhaled bronchodilator, and diffusing capacity), pressure volume curves, and gas exchange with exercise (formal six-minute walk testing and/or cardiopulmonary exercise testing). Video assisted thorascopic (VAT) or open surgical lung biopsy was performed as clinically indicated. The diagnosis of IIP was established using the criteria defined in the ATS/ERS consensus statement [1, 2].

Control, non-diseased lung tissue from lower (n = 86), and middle (n = 4) lobes was obtained from International Institute for Advancement of Medicine, formerly Tissue Transformation Technologies (Ediston, NJ). All individuals had suffered brain death and were evaluated for organ transplantation before research consent. Informed consent was obtained at the time of transplant evaluation. All specimens failed regional lung selection criteria for transplantation. Subjects had to demonstrate no evidence of active infection or chest radiographic abnormalities, mechanical ventilation <48 h, PaO2/FiO2 ratio >200, and no past medical history of underlying lung disease or systemic disease that involves the lungs (e.g., rheumatoid arthritis). Lung samples were procured within 34 h after brain death (mean, 16.2 h; range, 4.5–33.25 h). The control cohort was divided to proportionally provide the same percentage controls to LTRC and NJH IIP cohorts; 50 controls were used with the LTRC cohort and 40 with the NJH cohort.

Microarray data generation

Total RNA including small RNA species was isolated from approximately 100 mg of snap-frozen lung tissue using the mirVana kit (AB/Ambion, Austin TX). RNA purity and concentration were determined by spectrophotometry, and RNA integrity was determined using the Bioanalyzer (Agilent, Santa Clara, CA). mRNA microarray target labeling was conducted using 300 ng of total RNA and the Message Amp II kit (AB/Ambion, Austin TX), hybridized to the Human Gene 1.0 ST Array (Affymetrix, Santa Clara, CA) and processed according to the manufacturer’s instructions. All microarray data met the quality control criteria established by the Tumor Analysis Best Practices Working Group [30] and are available in the Gene Expression Omnibus repository as GSE31962.

Microarray data analysis

Expression data from 217 mRNA arrays (LTRC cohort; 167 IIPs and 50 controls) were analyzed using ANOVA implemented in Partek (St Louis, MO). Intensity data were imported, log2-transformed, and quantile normalized using RMA [31], and expression levels were summarized on a transcript level using the mean value of all probesets mapping to a transcript. Non-expressed and invariant transcripts were removed using a median variance filter, corrected by a Benjamini-Hochberg false discovery rate (FDR) of 0.10 [32], resulting in a final dataset of 11950 transcript measurements across 217 samples. Differential expression of individual transcripts was identified using an ANOVA (for categorical analysis) or ANCOVA (for continuous analysis) model incorporating lung function variable (%predicted FVC or DLCO), the final clinical diagnosis of each subject and smoking status. We included age and smoking status in the model as there are significant differences in these variables between IIP and control groups. We considered the impact of several technical variables including array batch, RNA preservative, RNA quality (RIN) and anatomic location of the lung biopsy; minimal expression changes were associated with these variables and we therefore did not include them in the final model. NJH cohort mRNA expression profiles were collected and processed in the same manner as the LTRC cohort data, with the exception of the final filtering step; in this case, 11950 transcripts from the LTRC dataset were retained in the dataset. Pathway analysis was performed using the Ingenuity Pathway Analysis (IPA) database and software ( and the Fisher Exact Test to determine significant enrichments. We used the Network Analyst tool to generate the protein-protein interactome. NetworkAnalyst uses a comprehensive high-quality protein-protein interaction (PPI) database based on InnateDB [33]. The database contains manually curated protein interaction data from published literature as well as experimental data from several PPI databases including IntAct, MINT, DIP, BIND, and BioGRID. The database currently contains 14755 proteins and 145955 interactions for human, and 5657 proteins and 14491 interactions for mouse.

Quantitative RT-PCR

Primers for mRNA expression were designed using Primer-BLAST and are listed in Additional file 1: Table S8. RNA was normalized to a concentration of 100 ng/μL and reverse transcribed to cDNA using the Applied Biosystems High Capacity cDNA Reverse Transcription Kit. Each 20-μL PCR contained 15 ng cDNA, 0.5 μM final concentration of forward and reverse primers and 1× final concentration of the Power SYBR Green master mix. Real-time PCR was performed on an Applied Biosystems Viia 7 instrument using the following profile: 50 °C for 2 min, 95 °C for 10 min, and 40 cycles of 95 °C for 15 s, and 60 °C for 1 min. Dissociation curves were collected at the end of each run. ΔCT values were calculated relative to GAPDH, and ΔΔCT values were calculated by comparison among different groups of samples.


Standard immunohistochemical staining protocols were followed. Briefly, histological sections of normal and IPF lung were deparaffinized, blocked with hydrogen peroxide, followed by antigen retrieval in citrate buffer, and non-immune serum block. RTKN2 anti-rabbit antibody (Sigma, St. Louis MO; product number HPA037946) was added to histological sections at 1:200 final dilution and incubated overnight. Secondary antibody staining was performed with 3, 3′-diaminobenzidine (DAB) using the ImmPRESS kit (Vector Laboratories, Burlingame CA) and RTKN2 was visualized using the peroxidase substrate (ImmPact DAB kit). The sections were counterstained with hematoxylin. The primary antibody was replaced by non-immune serum for negative control slides.


  1. 1.

    King T, Costabel U, Cordier J-F, Dopico GA, Du Bois RM, Lynch DA, et al. American Thoracic Society. Idiopathic pulmonary fibrosis: diagnosis and treatment. International consensus statement. American Thoracic Society (ATS), and the European Respiratory Society (ERS). Am J Respir Crit Care Med. 2000;161(2 Pt 1):646–64.

    Google Scholar 

  2. 2.

    Travis WD, King TE, Bateman ED, Lynch DA, Capron F, Center D, et al. American Thoracic Society/European Respiratory Society International Multidisciplinary Consensus Classification of the Idiopathic Interstitial Pneumonias. This joint statement of the American Thoracic Society (ATS), and the European Respiratory Society (ERS) was adopted by the ATS board of directors, June 2001 and by the ERS Executive Committee, June 2001. Am J Respir Crit Care Med. 2002;165(2):277–304.

    Article  Google Scholar 

  3. 3.

    Demedts M, Costabel U. ATS/ERS international multidisciplinary consensus classification of the idiopathic interstitial pneumonias. Eur Respir J. 2002;19(5):794–6.

    CAS  Article  PubMed  Google Scholar 

  4. 4.

    King Jr TE, Tooze JA, Schwarz MI, Brown KR, Cherniack RM. Predicting survival in idiopathic pulmonary fibrosis: scoring system and survival model. Am J Respir Crit Care Med. 2001;164(7):1171–81.

    Article  PubMed  Google Scholar 

  5. 5.

    Wells AU, Nicholson AG, Hansell DM, du Bois RM. Respiratory bronchiolitis-associated interstitial lung disease. Semin Respir Crit Care Med. 2003;24(5):585–94.

    Article  PubMed  Google Scholar 

  6. 6.

    Daniil ZD, Gilchrist FC, Nicholson AG, Hansell DM, Harris J, Colby TV, et al. A histologic pattern of nonspecific interstitial pneumonia is associated with a better prognosis than usual interstitial pneumonia in patients with cryptogenic fibrosing alveolitis. Am J Respir Crit Care Med. 1999;160(3):899–905.

    CAS  Article  PubMed  Google Scholar 

  7. 7.

    Nicholson AG, Colby TV, du Bois RM, Hansell DM, Wells AU. The prognostic significance of the histologic pattern of interstitial pneumonia in patients presenting with the clinical entity of cryptogenic fibrosing alveolitis. Am J Respir Crit Care Med. 2000;162(6):2213–7.

    CAS  Article  PubMed  Google Scholar 

  8. 8.

    Chang JA, Curtis JR, Patrick DL, Raghu G. Assessment of health-related quality of life in patients with interstitial lung disease. Chest. 1999;116(5):1175–82.

    CAS  Article  PubMed  Google Scholar 

  9. 9.

    Yang IV, Luna LG, Cotter J, Talbert J, Leach SM, Kidd R, et al. The peripheral blood transcriptome identifies the presence and extent of disease in idiopathic pulmonary fibrosis. PLoS One. 2012;7(6):e37708.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  10. 10.

    Leeming DJ, Sand JM, Nielsen MJ, Genovese F, Martinez FJ, Hogaboam CM, et al. Serological investigation of the collagen degradation profile of patients with chronic obstructive pulmonary disease or idiopathic pulmonary fibrosis. Biomark Insights. 2012;7:119–26.

    PubMed Central  Article  PubMed  Google Scholar 

  11. 11.

    Englert JM, Hanford LE, Kaminski N, Tobolewski JM, Tan RJ, Fattman CL, et al. A role for the receptor for advanced glycation end products in idiopathic pulmonary fibrosis. Am J Pathol. 2008;172(3):583–91.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  12. 12.

    Kottmann RM, Kulkarni AA, Smolnycki KA, Lyda E, Dahanayake T, Salibi R, et al. Lactic acid is elevated in idiopathic pulmonary fibrosis and induces myofibroblast differentiation via pH-dependent activation of transforming growth factor-beta. Am J Respir Crit Care Med. 2012;186(8):740–51.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  13. 13.

    Ueno M, Maeno T, Nomura M, Aoyagi-Ikeda K, Matsui H, Hara K, et al. Hypoxia-inducible factor-1alpha mediates TGF-beta-induced PAI-1 production in alveolar macrophages in pulmonary fibrosis. Am J Physiol Lung Cell Mol Physiol. 2011;300(5):L740–52.

    CAS  Article  PubMed  Google Scholar 

  14. 14.

    Bauman KA, Wettlaufer SH, Okunishi K, Vannella KM, Stoolman JS, Huang SK, et al. The antifibrotic effects of plasminogen activation occur via prostaglandin E2 synthesis in humans and mice. J Clin Invest. 2010;120(6):1950–60.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  15. 15.

    Horikawa M, Fujimoto M, Hasegawa M, Matsushita T, Hamaguchi Y, Kawasuji A, et al. E- and P-selectins synergistically inhibit bleomycin-induced pulmonary fibrosis. Am J Pathol. 2006;169(3):740–9.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  16. 16.

    Seibold MA, Smith RW, Urbanek C, Groshong SD, Cosgrove GP, Brown KK, et al. The idiopathic pulmonary fibrosis honeycomb cyst contains a mucocilary pseudostratified epithelium. PLoS One. 2013;8(3):e58658.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  17. 17.

    Flaherty KR, Travis WD, Colby TV, Toews GB, Kazerooni EA, Gross BH, et al. Histopathologic variability in usual and nonspecific interstitial pneumonias. Am J Respir Crit Care Med. 2001;164(9):1722–7.

    CAS  Article  PubMed  Google Scholar 

  18. 18.

    Steele MP, Speer MC, Loyd JE, Brown KK, Herron A, Slifer SH, et al. The Clinical and Pathologic Features of Familial Interstitial Pneumonia (FIP). Am J Respir Crit Care Med. 2005;172(9):1146–52.

    PubMed Central  Article  PubMed  Google Scholar 

  19. 19.

    Yang IV, Burch LH, Steele MP, Savov JD, Hollingsworth JW, McElvania-Tekippe E, et al. Gene expression profiling of familial and sporadic interstitial pneumonia. Am J Respir Crit Care Med. 2007;175(1):45–54.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  20. 20.

    Selman M, Carrillo G, Estrada A, Mejia M, Becerril C, Cisneros J, et al. Accelerated variant of idiopathic pulmonary fibrosis: clinical behavior and gene expression pattern. PLoS One. 2007;2(5):e482.

    PubMed Central  Article  PubMed  Google Scholar 

  21. 21.

    Boon K, Bailey NW, Yang J, Steel MP, Groshong S, Kervitsky D, et al. Molecular phenotypes distinguish patients with relatively stable from progressive idiopathic pulmonary fibrosis (IPF). PLoS One. 2009;4(4):e5134.

    PubMed Central  Article  PubMed  Google Scholar 

  22. 22.

    Collier FM, Gregorio-King CC, Gough TJ, Talbot CD, Walder K, Kirkland MA. Identification and characterization of a lymphocytic Rho-GTPase effector: rhotekin-2. Biochem Biophys Res Commun. 2004;324(4):1360–9.

    CAS  Article  PubMed  Google Scholar 

  23. 23.

    Mabbott NA, Baillie JK, Brown H, Freeman TC, Hume DA. An expression atlas of human primary cells: inference of gene function from coexpression networks. BMC Genomics. 2013;14:632.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  24. 24.

    Myouzen K, Kochi Y, Okada Y, Terao C, Suzuki A, Ikari K, et al. Functional variants in NFKBIE and RTKN2 involved in activation of the NF-kappaB pathway are associated with rheumatoid arthritis in Japanese. PLoS Genet. 2012;8(9):e1002949.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  25. 25.

    Schmidt S, Moser M, Sperandio M. The molecular basis of leukocyte recruitment and its deficiencies. Mol Immunol. 2013;55(1):49–58.

    CAS  Article  PubMed  Google Scholar 

  26. 26.

    Hayashi S, Abe K, Matsuoka H, Goya S, Morishita H, Mori M, et al. Increased level of soluble E-selectin in the serum from patients with idiopathic pulmonary fibrosis. Inflammation. 2004;28(1):1–5.

    CAS  Article  PubMed  Google Scholar 

  27. 27.

    Kaplan F, Ledoux P, Kassamali FQ, Gagnon S, Post M, Koehler D, et al. A novel developmentally regulated gene in lung mesenchyme: homology to a tumor-derived trypsin inhibitor. Am J Physiol. 1999;276(6 Pt 1):L1027–36.

    CAS  PubMed  Google Scholar 

  28. 28.

    Selman M, Pardo A, Kaminski N. Idiopathic pulmonary fibrosis: aberrant recapitulation of developmental programs? PLoS Med. 2008;5(3):e62.

    PubMed Central  Article  PubMed  Google Scholar 

  29. 29.

    Yang IV, Coldren CD, Leach SM, Seibold MA, Murphy E, Lin J, et al. Expression of cilium-associated genes defines novel molecular subtypes of idiopathic pulmonary fibrosis. Thorax. 2013;68(12):1114–21.

    Article  PubMed  Google Scholar 

  30. 30.

    Tumor Analysis Best Practices Working Group. Expression profiling--best practices for data generation and interpretation in clinical trials. Nat Rev Genet. 2004;5(3):229–37.

    Article  Google Scholar 

  31. 31.

    Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003;31(4):e15.

    PubMed Central  Article  PubMed  Google Scholar 

  32. 32.

    Hunter L, Taylor RC, Leach SM, Simon R. GEST: a gene expression search tool based on a novel Bayesian similarity metric. Bioinformatics. 2001;17 Suppl 1:S115–22.

    Article  PubMed  Google Scholar 

  33. 33.

    Xia J, Benner MJ, Hancock RE. NetworkAnalyst--integrative approaches for protein-protein interaction network analysis and visual exploration. Nucleic Acids Res. 2014;42(Web Server issue):W167–74.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

Download references


This study was funded by the National Heart Lung and Blood Institute (HL095393, HL097163, HL099571, HL101715, and HL121572). The funding agency had no role in design, in the collection, analysis, and interpretation of data; in the writing of the manuscript; nor in the decision to submit the manuscript for publication.

Author information



Corresponding author

Correspondence to Ivana V. Yang.

Additional information

Competing interests

Dr. Coldren has a pending US Patent application serial no: 61/666,233. Dr. Schwartz reports expert testimony fees from Weitz and Luxenberg Law Firm, Brayton, Purcell Law Firm, Wallace and Graham Law Firm. Royalties, outside the submitted work. In addition, Dr. Schwartz has pending US patent application serial #: 61/666,233 and serial #: 60/992,079 pending. Dr. Yang received personal fees from Boehringer-Ingleheim, outside the submitted work. In addition, Dr. Yang has a pending US Patent application serial no: 61/666,233. Other authors declare no competing interests.

Authors’ contributions

MPS, IVY, and DAS designed the study and wrote the manuscript; MPS, CDC, TEF, MIS, and DAS developed the conceptual approaches to data analysis; LGL, CDC, CME, and IVY analyzed the data; MPS, SG, CC, GPC, KKB and MIS performed clinical and pathological phenotyping of lung specimens; EM, CH, and DH performed laboratory work. All authors read and approved the final manuscript.

David A. Schwartz and Ivana V. Yang shared senior authors.

Additional file

Additional file 1:

Supplemental Figures and Tables. (PDF 2199 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Steele, M.P., Luna, L.G., Coldren, C.D. et al. Relationship between gene expression and lung function in Idiopathic Interstitial Pneumonias. BMC Genomics 16, 869 (2015).

Download citation


  • False Discovery Rate
  • Idiopathic Pulmonary Fibrosis
  • Force Vital Capacity
  • Categorical Analysis
  • Usual Interstitial Pneumonia