

Plasma small ncRNA pair panels as novel biomarkers for early-stage lung adenocarcinoma screening




Lung cancer is a major cause of cancer-related mortality worldwide, and around two-thirds of patients have metastasis at diagnosis. Thus, detecting lung cancer at an early stage could reduce mortality. Aberrant levels of circulating small non-coding RNAs (small ncRNAs) are potential diagnostic or prognostic markers for lung cancer. We aimed to identify plasma small ncRNA pairs that could be used for early screening and detection of lung adenocarcinoma (LAC).


A panel of seven small ncRNA pair ratios could differentiate patients with LAC or benign lung disease from high-risk controls with an area under the curve (AUC) of 100.0%, a sensitivity of 100.0% and a specificity of 100.0% at the training stage (which included 50 patients with early-stage LAC, 35 patients with benign diseases and 29 high-risk controls) and an AUC of 90.2%, a sensitivity of 91.5% and a specificity of 80.4% at the validation stage (which included 44 patients with early-stage LAC, 32 patients with benign diseases and 51 high-risk controls). The same panel could distinguish LAC from high-risk controls with an AUC of 100.0%, a sensitivity of 100.0% and a specificity of 100.0% at the training stage and an AUC of 89.5%, a sensitivity of 85.4% and a specificity of 83.3% at the validation stage. Another panel of five small ncRNA pair ratios (different from the first) was able to differentiate LAC from benign disease with an AUC of 82.0%, a sensitivity of 81.1% and a specificity of 78.1% in the training cohort and an AUC of 74.2%, a sensitivity of 70.4% and a specificity of 72.7% in the validation cohort.


Several small ncRNA pair ratios were identified as markers capable of discerning patients with LAC from those with benign lesions or high-risk control individuals.


Lung cancer is one of the main causes of cancer-related deaths worldwide [1]. In the USA, the incidence of lung cancer was estimated to be the second highest among all cancers (224,390 new cases in 2016), and lung cancer was predicted to be the most important cause of cancer-related mortality (158,080 deaths in 2016) [2]. About 80% of all lung cancers are non-small cell lung cancers (NSCLC) [3], and the two most common NSCLCs are lung adenocarcinoma (LAC, about 50%) and squamous cell carcinoma (SqCC, about 30%) [4].

Detecting lung cancer at its early stages could reduce mortality rates by 10- to 50-fold [5], but about two-thirds of patients have metastasis at diagnosis. Low-dose computed tomography (LDCT) provides a non-invasive method to detect early-stage tumors, but the rate of false-positive diagnosis is high [6, 7]. Molecular biomarkers could represent a promising screening approach.

Small non-coding RNAs (small ncRNAs), including microRNAs (miRNAs), nucleolar RNAs and tRNAs, have been shown to repress or degrade specific transcripts involved in cell fate and proliferation, cell death, energy metabolism and tumorigenesis [8]. When circulating in plasma/serum, mature miRNAs form a miRNA-Argonaute-protein complex that ensures their stability [9]. Therefore, small ncRNAs can be measured non-invasively with remarkable stability and repeatability [10]. Thus, aberrant levels of circulating miRNAs could be potential diagnostic or prognostic markers in lung [11], colorectal [12], prostate [13] and breast [14, 15] cancers.

The normalization of data for plasma/serum small ncRNA levels measured using quantitative reverse transcription polymerase chain reaction (qRT-PCR) is challenging, and this is an obstacle to standardization of the measurements. For this reason, a ratio-based method is critical for the analysis of data regarding circulating small ncRNAs. Many researchers have chosen to ‘spike’ samples with a synthetic RNA sequence (like C. elegans miR-39 and miR-54, or plant miRNAs) in order to normalize qRT-PCR data for circulating miRNA levels [16,17,18]. However, synthetic miRNAs are not protected from endogenous RNase activity and are rapidly degraded [18, 19], and none have been established for quantification of miRNAs in the blood [20,21,22,23]. miR-16 is frequently used as a control [24], but elevated serum levels of miR-16 correlate with bone metastasis in patients with breast cancer [25]. To bypass the normalization issue, some studies have analyzed plasma miRNA values by looking at the reciprocal ratios of miRNAs in the same sample [26,27,28].

In the present study, ratios of miRNAs in the same sample were used to reduce experimental variation. Rather than directly comparing the level of a single small ncRNA between groups, the ratio of any two small ncRNAs was calculated for the same sample, and then the expression level ratio was compared between different groups. Since the two targets are simultaneously measured in the same sample under the same conditions, the relative expression level (calculated as a ratio) should reflect a true value for comparison between groups.
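The reasoning above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a sample-specific technical factor (for example extraction or reverse-transcription efficiency) scales every measured level in one sample by the same amount. The function name, the recovery factor and the numbers are hypothetical and are not taken from the study.

```python
def pair_ratio(levels, numerator, denominator):
    """Return the within-sample ratio of two small ncRNA levels."""
    return levels[numerator] / levels[denominator]


# Hypothetical levels for one plasma sample (arbitrary units).
true_levels = {"miR-22": 120.0, "miR-378": 40.0}

# Assumed technical factor that scales every measurement in this sample equally.
recovery = 0.5
measured_levels = {name: value * recovery for name, value in true_levels.items()}

true_ratio = pair_ratio(true_levels, "miR-22", "miR-378")
measured_ratio = pair_ratio(measured_levels, "miR-22", "miR-378")

# The single-marker level is distorted by the factor, but the ratio is not.
print(measured_levels["miR-22"], true_levels["miR-22"])
print(measured_ratio, true_ratio)
```

Under this assumption the common factor cancels in the ratio. A factor that affects the two targets differently would not cancel, so the sketch describes the idealized case only.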

Therefore, the aim of the present research was to perform a small ncRNA profiling study using next generation sequencing to measure whole genome-level small ncRNAs in plasma specimens from patients with early LAC, patients with benign lung lesions and high-risk controls.


Patient cohorts

For the training cohort, 1250 patients were enrolled at the Cancer Center of Rush University Medical Center (RUMC, Chicago, IL, USA) from March 2004 to October 2010. Among these patients, a sub-cohort of 114 patients (including 50 patients with early-stage [stage I or II] LAC, 35 patients with benign disease, and 29 high-risk individuals without lung disease) was selected for this pilot study. These patients had been followed up for at least two years and their diagnosis had not changed during follow-up.

LAC was staged according to the TNM Classification of Malignant Tumours, 6th edition. The inclusion criteria were: 1) disease confined to the chest without evidence of distant metastases; 2) no preoperative chemotherapy or radiotherapy within 1 year of the initial blood sampling; 3) a minimum of 2 years of clinical follow-up data; and 4) Caucasian.

Patients with benign lesions included participants with a range of non-neoplastic pulmonary disorders (e.g. granulomas, hamartomas and inflammatory lesions) as suggested by LDCT screening. All participants with benign diseases and the high-risk individuals without lung disease were followed up by annual LDCT and remained cancer-free for a minimum of 2 years.

For the validation stage, 127 individuals (including 44 patients with early-stage LAC, 32 patients with benign diseases and 51 individuals without lung disease) were recruited at the Lung Cancer Biospecimen Resource Network (LCBRN, University of Virginia, Charlottesville, VA, USA) between March 2014 and October 2014. Note that the 127 individuals in the validation cohort were not from the original 1250 individuals used for the training cohort.

The study was approved by the institutional review board of RUMC. All participants provided written informed consent. The training cohort was from RUMC, and the validation cohort was from the LCBRN. The study was conducted at the RUMC.

Collection of plasma samples

The plasma samples were collected and processed according to a standard protocol commonly used in many laboratories. All blood samples were collected using EDTA-anticoagulant tubes and centrifuged first at 4000 rpm for 10 min and then at 12,000 rpm for 15 min to completely remove cell debris. The supernatant (plasma) was stored at −80 °C until analysis. No vigorous shaking or mixing was allowed during the processing of the samples. All samples were collected when the diagnosis was first made.

Experimental strategy

To obtain an expression profile of plasma small ncRNAs that was specific for LAC, initial screening by Illumina next-generation sequencing and validation by qRT-PCR were used on an individual basis. The first step was to compare the profiles of the plasma expression ratios of small ncRNAs between participants. Then, specific small ncRNAs were tested.

RNA isolation, qRT-PCR and Illumina next-generation sequencing

RNA isolation was performed as described previously [29]. Total RNA, including miRNA, was isolated from plasma using the Qiagen miRNeasy Mini kit (Qiagen, Valencia, CA, USA) in accordance with the manufacturer’s protocol, with minor modifications. In brief, 0.5 mL of plasma was diluted 1:1 with RNase-free water (a total of 1 mL) to achieve full phase separation. QIAzol® LS Reagent (3 mL) was added to each sample. The sample (total of 4 mL) was mixed in a tube, vortexed for 10 s, and incubated at room temperature for 15 min to allow complete dissociation of the nucleoprotein complex. The homogenized solution was centrifuged at 12,000×g for 10 min at 4 °C. The supernatant was transferred, and 0.8 mL of chloroform was added. After mixing vigorously for 15 s, the sample was centrifuged at 12,000×g for 15 min. The upper aqueous phase was carefully transferred to a new collection tube, and 2.5× volume of ethanol was added. The sample was applied directly to a silica membrane, and the RNA was bound and cleaned with buffers provided by the manufacturer to remove impurities. The immobilized RNA was collected from the membrane with 16 μL of RNase-free water (pre-warmed at 80 °C).

Small ncRNAs were measured using TaqMan MicroRNA Reverse Transcription Kits (Applied Biosystems, Foster City, CA, USA) in accordance with the manufacturer’s protocol. Briefly, about 30 ng enriched RNA was reverse transcribed with a TaqMan MicroRNA Reverse Transcription Kit in a reaction volume of 15 μL. The expression levels of the small ncRNAs were quantified in triplicate by qRT-PCR using human TaqMan MicroRNA Assay Kits (Applied Biosystems) and an iPLEX 4 system (Eppendorf, Hauppauge, NY, USA).

Illumina next-generation sequencing was performed according to a method described previously [30]. Small RNA sequencing (smRNA-seq) was first performed to identify plasma miRNAs and some other circulating small ncRNAs in six samples pooled from 29 high-risk healthy individuals (there were 30 samples originally, but technical failure occurred in one case), 30 individuals with benign lesions and 30 patients with LAC. The samples were from the training cohort. The pooled samples were made using 500 μL from each individual. Around 20 million reads were obtained per sample, and about 90% of the reads aligned to the human genome.

For the library preparation, 6-μL volumes of the eluates from the plasma RNA isolation were used. Library preparation was performed using a minor modification of the Illumina protocol (Illumina, San Diego, CA, USA). A miRNA library was made from each RNA sample by adapter ligation, reverse transcription, and PCR amplification. Libraries were then pooled in batches of 12 samples of equal amounts and clustered with a concentration of 10.5 pmol in one lane for each single-read flow cell using cBot (Illumina). Sequencing (50 cycles) was performed on a HiSeq 2500 system (Illumina) using the primer sequences listed in Table 1. Demultiplexing of the raw sequencing data and generation of the FASTQ files were performed using CASAVA v1.8.2 (Illumina).

Table 1 Primer sequences of the small ncRNAs

Analysis of the smRNA-seq data

The 3′ sequencing adapter was removed from the FASTQ files by local alignment of the adapter to the sequenced reads. Cutadapt software was used to remove the 3′ adapter [31]. All sequences having a length < 15 bp after adapter removal were discarded.
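The trimming and length-filter step can be sketched in Python as follows. This is a simplified illustration using exact string matching, not a reimplementation of Cutadapt, which performs error-tolerant alignment. The adapter sequence, the `min_overlap` parameter and the example reads are assumptions introduced here; the source does not state the adapter sequence that was used.

```python
# Placeholder 3' adapter; the source does not state the adapter sequence used.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
MIN_LENGTH = 15  # reads shorter than this after trimming are discarded


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=8):
    """Remove the adapter and everything after it, if it is found in the read."""
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # The read may end inside the adapter: try progressively shorter prefixes.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def clean_reads(reads, adapter=ADAPTER, min_length=MIN_LENGTH):
    """Trim every read and keep those with at least `min_length` bases."""
    trimmed = (trim_3prime_adapter(read, adapter) for read in reads)
    return [read for read in trimmed if len(read) >= min_length]


reads = [
    "TAGCTTATCAGACTGATGTTGA" + ADAPTER,       # 22-nt insert, full adapter
    "TAGCTTATCAGACTGATGTTGA" + ADAPTER[:10],  # 22-nt insert, truncated adapter
    "ACGTACGTAC" + ADAPTER,                   # 10-nt insert, dropped after trimming
]
print(clean_reads(reads))
```

With 50-cycle reads, a typical mature miRNA insert leaves room for only part of the adapter, which is why the sketch also handles a truncated adapter at the 3′ end.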

The reads in each library were summarized to tags in a quantified FASTA format. The FASTA reads were then mapped to the genome under consideration with Bowtie [32, 33]. To eliminate the ambiguous mapping hits, only the uniquely mapped loci with the fewest alignment mismatches were reported, allowing for a maximum of two mismatches [34,35,36]. The clean reads were then re-mapped back to human small ncRNAs using Bowtie, the small ncRNA abundance was determined using Cufflinks software, and the annotation for each mapped locus was derived from ncRNA databases such as miRBase and Dfam [37, 38].

Selection of differentially expressed small ncRNA pairs

To explore the high-throughput smRNA-seq data generated for each pooled sample, multiple-step bioinformatics data analysis was performed including adapter trimming, quantification, alignment, and identification of miRNAs and other small ncRNA species. Five types of small ncRNA were identified, including miRNAs (mature miRNAs and pre-miRNAs), snoRNAs, tRNAs, rRNAs and scRNAs. The set of detectable small ncRNAs was narrowed down by requiring at least 50 copies of a small ncRNA in at least one of the pooled samples. Next, the ratios of any two small ncRNAs (except pre-miRNAs) were calculated in the same sample for all pooled samples, yielding on average about 333,336 ratios for each sample.
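The copy-number filter and the pairwise ratio step described above can be sketched as follows. The read counts, the pool names and the entry "miR-9999" are hypothetical, and the exclusion of pre-miRNAs is not modeled; the 50-copy threshold is the one stated in the text.

```python
from itertools import combinations
from math import comb

MIN_COPIES = 50  # detection threshold stated in the text


def detectable(counts_by_sample, min_copies=MIN_COPIES):
    """Names with at least `min_copies` in any one of the pooled samples."""
    names = set()
    for counts in counts_by_sample.values():
        names.update(name for name, n in counts.items() if n >= min_copies)
    return sorted(names)


def pair_ratios(counts, names):
    """Ratio a/b for every unordered pair (a, b) measured in one sample."""
    ratios = {}
    for a, b in combinations(names, 2):
        count_a, count_b = counts.get(a, 0), counts.get(b, 0)
        if count_a > 0 and count_b > 0:  # skip undefined ratios
            ratios[(a, b)] = count_a / count_b
    return ratios


# Hypothetical read counts for two pooled samples.
pooled = {
    "control_pool": {"miR-22": 200, "miR-378": 100, "sno-U57": 40, "miR-9999": 3},
    "LAC_pool": {"miR-22": 600, "miR-378": 90, "sno-U57": 60, "miR-9999": 5},
}
names = detectable(pooled)
lac_ratios = pair_ratios(pooled["LAC_pool"], names)
print(names, len(lac_ratios))

# n species give n * (n - 1) / 2 unordered pairs.
print(comb(817, 2))
```

As an arithmetic check, 817 species would give exactly 333,336 unordered pairs, which matches the reported average if each pair was counted once. The source does not state the number of species per sample, so this is an inference rather than a reported figure.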

To provide a list of differentially expressed small ncRNA pairs, differential expression analysis was performed with comparison of LAC and benign diseases vs. control (i.e. individuals without lung disease), LAC vs. control, and LAC vs. benign, based on a fold change ≥2 and corrected P-value ≤0.05.

Using this strategy, a list of apparent small ncRNA pairs that fulfilled all three criteria (50 copies, fold change ≥2 and corrected P-value ≤0.05) was obtained from the sequenced samples, and these small ncRNA pairs were considered as candidate plasma biomarkers for LAC (Additional file 1: Table S1).
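The fold-change and corrected P-value criteria can be sketched as a simple filter. The source does not name the multiple-testing correction, so the Benjamini-Hochberg procedure below is an assumption, as is the choice to accept a twofold change in either direction. The candidate pairs and their values are hypothetical.

```python
from math import log2


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_pairs(candidates, min_fold=2.0, alpha=0.05):
    """Keep pairs with at least a `min_fold` change and corrected p <= alpha."""
    adjusted = benjamini_hochberg([c["p"] for c in candidates])
    selected = []
    for cand, q in zip(candidates, adjusted):
        if abs(log2(cand["fold_change"])) >= log2(min_fold) and q <= alpha:
            selected.append(cand["pair"])
    return selected


# Hypothetical candidates: fold change between groups and raw p-value.
candidates = [
    {"pair": ("miR-22", "miR-378"), "fold_change": 3.0, "p": 0.01},
    {"pair": ("miR-423", "miR-378"), "fold_change": 2.5, "p": 0.04},
    {"pair": ("miR-126", "sno-U57"), "fold_change": 1.2, "p": 0.03},
    {"pair": ("miR-152", "sno-U57"), "fold_change": 0.4, "p": 0.50},
]
print(select_pairs(candidates))
```

In this toy input only the first pair passes both criteria: the second has a sufficient fold change but an adjusted P-value just above 0.05.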

To demonstrate that the selected candidates were not only clinically useful and applicable but also highly sensitive, specific and accurate for the differentiation of LAC from benign disease and no lung disease (i.e. controls), receiver-operating characteristic (ROC) curve analysis was performed and the small ncRNA pairs were selected as individual plasma small ncRNA pair biomarkers for the diagnosis of LAC if they met these criteria: 1) sensitivity > 80%; 2) specificity > 80%; and 3) area under the ROC curve (AUC) > 0.800.
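The three selection criteria can be made concrete with a short sketch. The scores and the threshold are hypothetical; the AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control, which is equivalent to the area under the empirical ROC curve.

```python
def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Call a sample positive when its score is at or above the threshold."""
    true_pos = sum(score >= threshold for score in scores_pos)
    true_neg = sum(score < threshold for score in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


def auc(scores_pos, scores_neg):
    """Fraction of (case, control) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for pos in scores_pos:
        for neg in scores_neg:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def meets_criteria(sens, spec, auc_value):
    """Criteria from the text: sensitivity and specificity > 80%, AUC > 0.800."""
    return sens > 0.80 and spec > 0.80 and auc_value > 0.800


# Hypothetical RATIO values for one small ncRNA pair.
cases = [2.1, 1.8, 2.5, 0.9, 1.7, 2.2]
controls = [0.5, 1.0, 0.7, 1.9, 0.4, 0.8]

sens, spec = sensitivity_specificity(cases, controls, threshold=1.5)
area = auc(cases, controls)
print(sens, spec, area, meets_criteria(sens, spec, area))
```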

Data were compared in terms of lesion characteristics using WEKA 3.7 software (University of Waikato) for modeling [39]. Support vector machine recursive feature elimination (SVM-RFE) and an SVM classification algorithm were used to rank individual apparent small ncRNA pairs according to their predictive power to discriminate between the three groups in the training stage, and 10-fold cross validation was used to estimate the performance of the predictive model.
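The partitioning behind 10-fold cross validation can be sketched as follows. This illustrates only how samples are split so that each one is held out exactly once; it does not reproduce WEKA, SVM-RFE or the SVM classifier, and in practice the samples would usually be shuffled or stratified by group first. The function name is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample.
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 114 samples, as in the training cohort described in the text.
folds = list(k_fold_indices(114, k=10))
print([len(test) for _, test in folds])
```

For each fold, a model would be fitted on the training indices and evaluated on the held-out indices, and the ten results would then be averaged.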

Identification of a panel of small ncRNA pairs as candidate biomarkers for early-stage LAC using qRT-PCR

Small ncRNAs were measured in the training and validation cohorts using TaqMan MicroRNA Assay Kits (Applied Biosystems), in accordance with the manufacturer’s protocol. Briefly, about 30 ng of enriched RNA was reverse transcribed with a TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems) in a 15-μL reaction volume. Expression levels of small ncRNAs were quantified in triplicate by qRT-PCR using human TaqMan MicroRNA Assay Kits (Applied Biosystems) and an iPLEX 4 system (Eppendorf). To bypass the normalization issue, we used the same ratio strategy described above to reduce experimental variation.

Statistical and bioinformatics analysis

The analysis was performed using SPSS 20.0 (IBM, Armonk, NY, USA). After the plasma concentrations of the small ncRNAs had been log2-transformed, Student’s t-test was used to compare mean small ncRNA ratios between the LAC, benign and control groups. The difference between two groups (group X vs. group Y) in the plasma miRNA ratio was analyzed using the equation: $\mathrm{RATIO}(X\ \mathrm{vs.}\ Y) = \operatorname{mean}\,\Delta CT_{X}(\mathrm{miR1/miR2}) - \operatorname{mean}\,\Delta CT_{Y}(\mathrm{miR1/miR2})$, where $\Delta CT_{\mathrm{GROUP}}(\mathrm{miR1/miR2}) = CT_{\mathrm{GROUP}}(\mathrm{miR2}) - CT_{\mathrm{GROUP}}(\mathrm{miR1})$. The fold change (FC) of group X/group Y was calculated as: $\mathrm{FC} = 2^{\mathrm{RATIO}}$. The chi-squared test was used to compare the distributions of the training and validation cohorts with regard to gender, race and tumor stage. Two-sided P-values < 0.05 were considered statistically significant.
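The RATIO and fold-change calculation can be sketched in Python. The sketch takes the per-sample quantity to be ΔCT(miR1/miR2) = CT(miR2) − CT(miR1), takes RATIO as the difference between the group means of ΔCT, and takes the fold change as 2 raised to the power RATIO. The CT values and function names are hypothetical.

```python
from statistics import mean


def delta_ct(ct_mir1, ct_mir2):
    """Delta CT (miR1/miR2) = CT(miR2) - CT(miR1) for one sample."""
    return ct_mir2 - ct_mir1


def ratio_and_fold_change(group_x, group_y):
    """Each group is a list of (CT_miR1, CT_miR2) tuples, one per sample."""
    mean_x = mean(delta_ct(ct1, ct2) for ct1, ct2 in group_x)
    mean_y = mean(delta_ct(ct1, ct2) for ct1, ct2 in group_y)
    ratio = mean_x - mean_y
    return ratio, 2 ** ratio


# Hypothetical CT values (miR1, miR2) for three samples per group.
group_x = [(24.0, 27.0), (25.0, 28.0), (23.0, 26.0)]  # delta CT = 3 in each
group_y = [(25.0, 26.0), (24.0, 25.0), (26.0, 27.0)]  # delta CT = 1 in each

ratio, fold_change = ratio_and_fold_change(group_x, group_y)
print(ratio, fold_change)
```

Because CT values are on a log2 scale and a lower CT means more template, CT(miR2) − CT(miR1) is the log2 of the miR1/miR2 abundance ratio under the usual assumption of perfect amplification efficiency. A RATIO of 2 therefore corresponds to a fourfold higher miR1/miR2 ratio in group X.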


Characteristics of the patients

There were no significant differences among the three groups in age, gender and smoking history (Table 2).

Table 2 Characteristics of the patients in the training and validation stages

We identified 342 miRNAs, 47 tRNAs, 19 snoRNAs, 3 rRNAs and 4 scRNAs in the six pooled samples. The small ncRNA pairs that fulfilled all three criteria in the training stage and were candidate biomarkers for LAC are listed in Additional file 1: Table S1. The ratios based on the sequencing data were found to be consistent with those from actual PCR data for the training and validation stages (Fig. 1). Data for each group describing the means and standard deviations for the expression ratios of the various small ncRNA pairs are presented in Additional file 1: Table S2. Furthermore, scatter plots comparing the expression ratio of each small ncRNA pair between groups are shown in Additional file 1: Figures S1–S3.

Fig. 1

Comparison of RATIO values for two panels of ncRNA pairs between sequencing data and qRT-PCR data for the training and validation stages. Upper graph: panel 1, lung adenocarcinoma (LAC) and benign disease (benign) vs. no lung disease (control); middle graph: panel 1, LAC vs. control; lower graph: panel 2, LAC vs. benign

A panel of small ncRNA pairs distinguished patients with LAC or benign disease from control individuals

In the training stage, a panel of seven small ncRNA pairs (designated Panel 1) was identified as a candidate panel for differentiating patients with early-stage LAC or benign disease from controls; this panel included miR-22/miR-378, miR-423/miR-378, miR-22/sno-U57, miR-126/sno-U57, miR-152/sno-U57, miR-423/sno-U57 and miR-22/sno-DR119 (Table 3). All seven small ncRNA pairs showed significantly increased RATIO values in the LAC+benign group compared with the control group (Table 3). Analysis of the predictive power of this panel for the diagnosis of early-stage lung disease revealed an AUC of 100.0%, a sensitivity of 100.0% and a specificity of 100.0% in the training stage (Table 4 and Fig. 2a).

Table 3 Panels of small ncRNA pairs that distinguished between individuals with lung adenocarcinoma, benign lung disease and no lung disease (controls)
Table 4 Predictive values of small ncRNA pair panels at the training and validation stages
Fig. 2

Receiver operating characteristic (ROC) curve analysis of small ncRNA pair panels for disease prediction in the training and validation stages. Shown are the area under the ROC curve (AUC) values of Panel 1 for lung adenocarcinoma (LAC) and benign vs. control (training: a; validation: b), Panel 1 for LAC vs. control (training: c; validation: d), and Panel 2 for LAC vs. benign (training: e; validation: f)

Panel 1 was further tested in the validation stage, which was independent of the training stage. The variations in the RATIO values of the seven small ncRNA pairs between groups were similar for the validation and training stages (Table 3). At the validation stage, the combination of these seven small ncRNA pair markers yielded a predictive power with a sensitivity of 84.3%, a specificity of 82.9% and an AUC of 90.2% (Table 4 and Fig. 2b).

As shown in Table 3, Panel 1 was able to distinguish the LAC group from the control group. All seven small ncRNA pairs had significantly higher RATIO values in the LAC group than in the control group (Table 3). The predictive power of Panel 1 for differentiating patients with early-stage LAC from controls had a sensitivity of 100.0%, a specificity of 100.0% and an AUC of 100.0% in the training stage (Table 4 and Fig. 2c) and a sensitivity of 81.8%, a specificity of 86.3% and an AUC of 89.5% in the validation stage (Table 4 and Fig. 2d).

A specific panel of small ncRNA pair biomarkers distinguished LAC from benign disease

A panel of 5 small ncRNA pair markers (Panel 2) was found to specifically separate LAC from benign lesions; this panel included miR-374a-5p/miR-126-5p, miR-374a-5p/miR-152-3p, miR-374a-5p/miR-378a-3p, miR-374a-5p/miR-423-5p and miR-374a-5p/tRNA-Thr-ACG. All five small ncRNA pairs had a significantly higher RATIO value in the LAC group than in the benign group (Table 3). In the training stage, this panel demonstrated predictive power with a sensitivity of 81.1%, a specificity of 78.1% and an AUC of 82.0% (Table 4 and Fig. 2e). In the validation stage, the sensitivity was 70.4%, the specificity was 72.7%, and the AUC was 74.2% (Table 4 and Fig. 2f). Thus, the ability of Panel 2 to differentiate between the LAC and benign groups was not as good as the ability of Panel 1 to differentiate between the LAC and control groups.


In the present study, profiling of plasma small ncRNA pairs in patients with and without LAC identified a distinct panel of seven small ncRNA pairs that could help to predict LAC at an early stage. To the best of our knowledge, this is the first report using next generation sequencing of plasma small ncRNA pairs (other than miRNAs) for the early detection of lung cancer. Plasma is an ideal sample on which to base the development of a quick, non-invasive blood test for the early diagnosis of LAC. In the present study, the false positive rates for distinguishing lung disease (LAC and benign disease) from controls and LAC from controls were lower than those reported for LDCT screening alone (13–17.1%) [6, 7]. The sensitivity, specificity and AUC of these small ncRNA panels may not be high enough to readily distinguish between LAC, benign disease and controls using the profiles alone, but this study suggests that these small ncRNA panels could be used with LDCT-based screening methods to distinguish patients with LAC from high-risk individuals, potentially improving the currently available approaches [6, 7].

miR-22 suppresses lung cancer cell progression [40] and is a predictive marker for pemetrexed-based chemotherapy [41]. miR-126 inhibits NSCLC proliferation [42], enhances the sensitivity of NSCLC to anticancer agents [43] and is associated with the prognosis of NSCLC [44]. miR-152 regulates metastasis of NSCLC [45]. miR-374a suppresses lung cancer cell proliferation [46] and is a prognostic marker for NSCLC [47]. miR-378 is a tumor suppressor in NSCLC [48] but could be involved in brain metastasis [49]. The possible involvement of miR-423-5p in lung cancer has not been reported before.

The results of this study showed a sensitivity of 84.3%, specificity of 82.9% and AUC of 90.2% for distinguishing patients with lung disease (LAC or benign disease) from controls. In a previous investigation, a panel of 16 ratios involving 13 different miRNAs correctly classified 16 of 19 patients, with a sensitivity of 84% and a specificity of 80% [26]. Furthermore, a miRNA signature classifier algorithm showed a sensitivity of 87% and a specificity of 81% for the detection of lung cancer, and when this classifier algorithm was combined with LDCT, it reduced the false positive rate from 19.4 to 3.7% [27]. Other research showed that a 10-miRNA biomarker profile had high AUC, sensitivity and specificity values for the detection of NSCLC (97, 93 and 90%, respectively) [18]. A study that assessed miRNA in sputum samples identified four miRNAs that distinguished patients with LAC from control individuals with a sensitivity of 80.6% and a specificity of 91.7% [50].

The present study is not without limitations. The sample size was relatively small and the participants were from only two centers (one center for each cohort). SqCC samples were not included. Only Caucasians were included, limiting the generalizability of the results. A panel of small ncRNA pairs was not identified that could distinguish the LAC group from the benign and control groups (considered together rather than separately). Other RNAs, such as lncRNAs, ceRNAs and circRNAs, were not considered. Formal assessments of the internal and external reproducibility of the measurements were not performed. However, the present study did show a similar pattern of qRT-PCR results at the training and validation stages (which used independent cohorts), and repeat qRT-PCR experiments in the same samples 3 months after the initial measurements yielded consistent findings (data not shown). Additional studies are necessary to confirm the results of this study before this technique can be used as a screening method.

In the present study, the samples were prospectively collected from patients who had at least 2 years of clinical follow-up without a change in status. This should ensure that the data accurately reflect the disease status at the time of collection and means that we can potentially predict the cancer 2 years before it occurs. Because of the difficulties in normalizing the levels of small ncRNAs, the use of a ratio-based method for circulating small ncRNAs is probably key to identifying small ncRNA biomarkers, and this strategy will be validated in a larger dataset of individuals with no lung disease (controls), benign lung disease and lung cancer. If successfully validated, this ratio strategy could then be applied in the clinical setting, enabling the use of circulating small ncRNA biomarkers for the early detection of cancer in the future.


Several small ncRNA pair ratios were identified as markers capable of discerning patients with LAC from those with benign lesions or high-risk control individuals.



LAC: lung adenocarcinoma

LCBRN: Lung Cancer Biospecimen Resource Network

NSCLC: non-small cell lung cancers

qRT-PCR: quantitative reverse-transcription polymerase chain reaction

RUMC: Rush University Medical Center

small ncRNAs: small non-coding RNAs

smRNA-seq: small RNA sequencing


  1. 1.

    Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65:87–108.

  2. 2.

    Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin. 2016;66:7–30.

  3. 3.

    Beadsmoore CJ, Screaton NJ. Classification, staging and prognosis of lung cancer. Eur J Radiol. 2003;45:8–17.

  4. 4.

    Perez-Moreno P, Brambilla E, Thomas R, Soria JC. Squamous cell carcinoma of the lung: molecular subtypes and therapeutic opportunities. Clin Cancer Res. 2012;18:2443–51.

  5. 5.

    Edwards BK, Brown ML, Wingo PA, Howe HL, Ward E, Ries LA, et al. Annual report to the nation on the status of cancer, 1975-2002, featuring population-based trends in cancer treatment. J Natl Cancer Inst. 2005;97:1407–27.

  6. 6.

    International Early Lung Cancer Action Program I, Henschke CI, Yankelevitz DF, Libby DM, Pasmantier MW, Smith JP, et al. Survival of patients with stage I lung cancer detected on CT screening. N Engl J Med. 2006;355:1763–71.

  7. 7.

    Bach PB, Jett JR, Pastorino U, Tockman MS, Swensen SJ, Begg CB. Computed tomography screening and lung cancer outcomes. JAMA. 2007;297:953–61.

  8. 8.

    Flynt AS, Lai EC. Biological principles of microRNA-mediated regulation: shared themes amid diversity. Nat Rev Genet. 2008;9:831–42.

  9. 9.

    Turchinovich A, Weiz L, Langheinz A, Burwinkel B. Characterization of extracellular circulating microRNA. Nucleic Acids Res. 2011;39:7223–33.

  10. 10.

    Mitchell PS, Parkin RK, Kroh EM, Fritz BR, Wyman SK, Pogosova-Agadjanyan EL, et al. Circulating microRNAs as stable blood-based markers for cancer detection. Proc Natl Acad Sci U S A. 2008;105:10513–8.

  11. 11.

    Cazzoli R, Buttitta F, Di Nicola M, Malatesta S, Marchetti A, Rom WN, et al. microRNAs derived from circulating exosomes as noninvasive biomarkers for screening and diagnosing lung cancer. J Thorac Oncol. 2013;8:1156–62.

  12. 12.

    Redova M, Sana J, Slaby O. Circulating miRNAs as new blood-based biomarkers for solid cancers. Future Oncol. 2013;9:387–402.

  13. 13.

    Bryant RJ, Pawlowski T, Catto JW, Marsden G, Vessella RL, Rhees B, et al. Changes in circulating microRNA levels associated with prostate cancer. Br J Cancer. 2012;106:768–74.

  14. 14.

    Ng EK, Li R, Shin VY, Jin HC, Leung CP, Ma ES, et al. Circulating microRNAs as specific biomarkers for breast cancer detection. PLoS One. 2013;8:e53141.

  15. 15.

    Cuk K, Zucknick M, Heil J, Madhavan D, Schott S, Turchinovich A, et al. Circulating microRNAs in plasma as early detection markers for breast cancer. Int J Cancer. 2013;132:1602–12.

  16. 16.

    Schetter AJ, Harris CC. Plasma microRNAs: a potential biomarker for colorectal cancer? Gut. 2009;58:1318–9.

  17. 17.

    Patz EF Jr, Pinsky P, Gatsonis C, Sicks JD, Kramer BS, Tammemagi MC, et al. Overdiagnosis in low-dose computed tomography screening for lung cancer. JAMA Intern Med. 2014;174:269–74.

  18. 18.

    Chen X, Hu Z, Wang W, Ba Y, Ma L, Zhang C, et al. Identification of ten serum microRNAs from a genome-wide serum microRNA expression profile as novel noninvasive biomarkers for nonsmall cell lung cancer diagnosis. Int J Cancer. 2012;130:1620–8.

  19. 19.

    Schwarzenbach H, Nishida N, Calin GA, Pantel K. Clinical relevance of circulating cell-free microRNAs in cancer. Nat Rev Clin Oncol. 2014;11:145–56.

  20. 20.

    Hu J, Wang Z, Liao BY, Yu L, Gao X, Lu S, et al. Human miR-1228 as a stable endogenous control for the quantification of circulating microRNAs in cancer patients. Int J Cancer. 2014;135:1187–94.

  21. 21.

    McDermott AM, Kerin MJ, Miller N. Identification and validation of miRNAs as endogenous controls for RQ-PCR in blood specimens for breast cancer studies. PLoS One. 2013;8:e83718.

  22. 22.

    Zheng G, Wang H, Zhang X, Yang Y, Wang L, Du L, et al. Identification and validation of reference genes for qPCR detection of serum microRNAs in colorectal adenocarcinoma patients. PLoS One. 2013;8:e83025.

  23. 23.

    Hu Z, Dong J, Wang LE, Ma H, Liu J, Zhao Y, et al. Serum microRNA profiling and breast cancer risk: the use of miR-484/191 as endogenous controls. Carcinogenesis. 2012;33:828–34.

  24. 24.

    Kroh EM, Parkin RK, Mitchell PS, Tewari M. Analysis of circulating microRNA biomarkers in plasma and serum using quantitative reverse transcription-PCR (qRT-PCR). Methods. 2010;50:298–301.

  25. 25.

    Ell B, Mercatali L, Ibrahim T, Campbell N, Schwarzenbach H, Pantel K, et al. Tumor-induced osteoclast miRNA changes as regulators and biomarkers of osteolytic bone metastasis. Cancer Cell. 2013;24:542–56.

  26. 26.

    Boeri M, Verri C, Conte D, Roz L, Modena P, Facchinetti F, et al. MicroRNA signatures in tissues and plasma predict development and prognosis of computed tomography detected lung cancer. Proc Natl Acad Sci U S A. 2011;108:3713–8.

  27. 27.

    Sozzi G, Boeri M, Rossi M, Verri C, Suatoni P, Bravi F, et al. Clinical utility of a plasma-based miRNA signature classifier within computed tomography lung cancer screening: a correlative MILD trial study. J Clin Oncol. 2014;32:768–73.

  28. 28.

    Fortunato O, Boeri M, Verri C, Conte D, Mensah M, Suatoni P, et al. Assessment of circulating microRNAs in plasma of lung cancer patients. Molecules. 2014;19:3038–54.

  29. 29.

    Chen H, Liu H, Zou H, Chen R, Dou Y, Sheng S, et al. Evaluation of plasma miR-21 and miR-152 as diagnostic biomarkers for common types of human cancers. J Cancer. 2016;7:490–9.

  30. 30.

    Wu X, Somlo G, Yu Y, Palomares MR, Li AX, Zhou W, et al. De novo sequencing of circulating miRNAs identifies novel markers predicting clinical outcome of locally advanced breast cancer. J Transl Med. 2012;10:42.

  31. 31.

    Chen C, Khaleel SS, Huang H, Wu CH. Software for pre-processing Illumina next-generation sequencing short read sequences. Source Code Biol Med. 2014;9:8.

  32. 32.

    Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 2012;9:357–9.

  33. 33.

    Hackenberg M, Rodriguez-Ezpeleta N, Aransay AM. miRanalyzer: an update on the detection and analysis of microRNAs in high-throughput sequencing experiments. Nucleic Acids Res. 2011;39:W132–8.

  34.

    Muller S, Rycak L, Winter P, Kahl G, Koch I, Rotter B. omiRas: a web server for differential expression analysis of miRNAs derived from small RNA-Seq data. Bioinformatics. 2013;29:2651–2.

  35.

    Humphreys DT, Suter CM. miRspring: a compact standalone research tool for analyzing miRNA-seq data. Nucleic Acids Res. 2013;41:e147.

  36.

    Williamson V, Kim A, Xie B, McMichael GO, Gao Y, Vladimirov V. Detecting miRNAs in deep-sequencing data: a software performance comparison and evaluation. Brief Bioinform. 2013;14:36–45.

  37.

    Kozomara A, Griffiths-Jones S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 2014;42:D68–73.

  38.

    Wheeler TJ, Clements J, Eddy SR, Hubley R, Jones TA, Jurka J, et al. Dfam: a database of repetitive DNA based on profile hidden Markov models. Nucleic Acids Res. 2013;41:D70–82.

  39.

    Witten IH, Frank E. Data mining: practical machine learning tools and techniques. 2nd ed. San Francisco: Morgan Kaufmann; 2005.

  40.

    Xin M, Qiao Z, Li J, Liu J, Song S, Zhao X, et al. miR-22 inhibits tumor growth and metastasis by targeting ATP citrate lyase: evidence in osteosarcoma, prostate cancer, cervical cancer and lung cancer. Oncotarget. 2016;7:44252–65.

  41.

    Franchina T, Amodeo V, Bronte G, Savio G, Ricciardi GR, Picciotto M, et al. Circulating miR-22, miR-24 and miR-34a as novel predictive biomarkers to pemetrexed-based chemotherapy in advanced non-small cell lung cancer. J Cell Physiol. 2014;229:97–9.

  42.

    Sun Y, Bai Y, Zhang F, Wang Y, Guo Y, Guo L. miR-126 inhibits non-small cell lung cancer cells proliferation by targeting EGFL7. Biochem Biophys Res Commun. 2010;391:1483–9.

  43.

    Zhu X, Li H, Long L, Hui L, Chen H, Wang X, et al. miR-126 enhances the sensitivity of non-small cell lung cancer cells to anticancer agents by targeting vascular endothelial growth factor A. Acta Biochim Biophys Sin Shanghai. 2012;44:519–26.

  44.

    Kim MK, Jung SB, Kim JS, Roh MS, Lee JH, Lee EH, et al. Expression of microRNA miR-126 and miR-200c is associated with prognosis in patients with non-small cell lung cancer. Virchows Arch. 2014;465:463–71.

  45.

    Zhang YJ, Liu XC, Du J, Zhang YJ. MiR-152 regulates metastases of non-small cell lung cancer cells by targeting neuropilin-1. Int J Clin Exp Pathol. 2015;8:14235–40.

  46.

    Wu H, Liu Y, Shu XO, Cai Q. MiR-374a suppresses lung adenocarcinoma cell proliferation and invasion by targeting TGFA gene expression. Carcinogenesis. 2016;37:567–75.

  47.

    Vosa U, Vooder T, Kolde R, Fischer K, Valk K, Tonisson N, et al. Identification of miR-374a as a prognostic marker for survival in patients with early-stage nonsmall cell lung cancer. Genes Chromosomes Cancer. 2011;50:812–22.

  48.

    Zhang GJ, Zhou H, Xiao HX, Li Y, Zhou T. MiR-378 is an independent prognostic factor and inhibits cell growth and invasion in colorectal cancer. BMC Cancer. 2014;14:109.

  49.

    Chen LT, Xu SD, Xu H, Zhang JF, Ning JF, Wang SF. MicroRNA-378 is associated with non-small cell lung cancer brain metastasis by promoting cell migration, invasion and tumor angiogenesis. Med Oncol. 2012;29:1673–80.

  50.

    Yu L, Todd NW, Xing L, Xie Y, Zhang H, Liu Z, et al. Early detection of lung adenocarcinoma in sputum by a panel of microRNA markers. Int J Cancer. 2010;127:2870–8.



Small ncRNA sequencing was conducted by City of Hope National Medical Center in California.


This project was supported by NIH grant 1R21CA164764, the Hawaii Community Foundation and the Bears Care Foundation (awarded to Youping Deng). This work was also supported by NIH grants 5P30GM114737, P20GM103466, 2U54MD007601 and U54MD007584, and by the Shenzhen Science and Technology Project (No. JCYJ20150402095058885).

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Author information

YHD, HLL and YPD carried out the studies, participated in collecting data and drafted the manuscript. JMA and YZ performed the statistical analysis. JAB provided biospecimens and participated in the study design. HKC carried out the studies, participated in collecting data and helped to revise the manuscript. BJ helped to revise the manuscript. XL, FY and JW helped to draft the manuscript. All authors read and approved the final manuscript.

Correspondence to Bin Jiang or Youping Deng.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the institutional review board of Rush University Medical Center. All participants provided written informed consent.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1

Table S1. All candidate ncRNA pairs for lung cancer prediction. Table S2. Mean and standard deviation values of the expression ratios of the various ncRNA pairs for each group. Figure S1. Scatter plots comparing the expression ratios of the seven small ncRNA pairs in Panel 1 between the LAC+benign group and the control group for the training and validation stages. Figure S2. Scatter plots comparing the expression ratios of the seven small ncRNA pairs in Panel 1 between the LAC group and the control group for the training and validation stages. Figure S3. Scatter plots comparing the expression ratios of the five small ncRNA pairs in Panel 2 between the LAC group and the benign group for the training and validation stages. (DOCX 557 kb)

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Dou, Y., Zhu, Y., Ai, J. et al. Plasma small ncRNA pair panels as novel biomarkers for early-stage lung adenocarcinoma screening. BMC Genomics 19, 545 (2018) doi:10.1186/s12864-018-4862-z



  • Small non-coding RNA
  • Cancer screening
  • Biomarkers
  • Lung cancer