Skip to main content

Systematic comparison of high-throughput single-cell RNA-seq methods for immune cell profiling

Abstract

Background

Elucidation of immune populations with single-cell RNA-seq has greatly benefited the field of immunology by deepening the characterization of immune heterogeneity and leading to the discovery of new subtypes. However, single-cell methods inherently suffer from limitations in the recovery of complete transcriptomes due to the prevalence of cellular and transcriptional dropout events. This issue is often compounded by limited sample availability and limited prior knowledge of heterogeneity, which can confound data interpretation.

Results

Here, we systematically benchmarked seven high-throughput single-cell RNA-seq methods. We prepared 21 libraries under identical conditions of a defined mixture of two human and two murine lymphocyte cell lines, simulating heterogeneity across immune-cell types and cell sizes. We evaluated methods by their cell recovery rate, library efficiency, sensitivity, and ability to recover expression signatures for each cell type. We observed higher mRNA detection sensitivity with the 10x Genomics 5′ v1 and 3′ v3 methods. We demonstrate that these methods have fewer dropout events, which facilitates the identification of differentially-expressed genes and improves the concordance of single-cell profiles to immune bulk RNA-seq signatures.

Conclusion

Overall, our characterization of immune cell mixtures provides useful metrics, which can guide selection of a high-throughput single-cell RNA-seq method for profiling more complex immune-cell heterogeneity usually found in vivo.

Background

Understanding the cellular diversity underlying immune responses is an important component of immunological research. Although techniques such as FACS and mass cytometry [1] are useful for studying cellular diversity according to well-characterized cell-surface-protein markers, the advent of single-cell RNA sequencing (RNA-seq) has expanded the power to characterize individual immune cells from a defined set of cell-surface markers to the entire transcriptome for last few years. These single-cell technologies have enabled immunologists to characterize inflammation [2] and immune responses to cancer [3,4,5,6,7], uncovering previously uncharacterized cellular diversity and cell-type specific transcriptional responses. As recent advances have increased cell throughput and lowered per-cell costs, the number of high-throughput single-cell RNA-seq techniques that can process more than a thousand cells per experiment has increased.

Several key factors, such as variable capture and amplification efficiencies during library preparation, impact the ability of single-cell RNA-seq techniques to accurately and comprehensively characterize immune-cell diversity. Mixtures of different cell sizes are particularly complex as small cells contain low total number of transcripts and therefore, are difficult to distinguish from ambient noise. The relatively small size and low mRNA content of immune cells may impact the performance of single-cell RNA-seq methods differently than was previously described using larger cells [8,9,10,11,12,13]. Immune cells constitute a broad range of cell types across various lineages, activation states, and cell sizes. Efficient recovery across these diverse cell types impacts the fidelity of cell-composition analyses. Methods that recover a larger fraction of cells in a cost-efficient manner benefit studies that sample tissues containing few immune cells. Also, increased sensitivity in detecting individual mRNA transcripts results in more comprehensive cellular profiles, which greatly advances the characterization of immune sub-types. A more complete picture of cellular transcriptional activity facilitates the identification of differentially-expressed (DE) marker genes and positively impacts the mapping of cells against reference immune cell signatures.

Previous benchmarking studies using somatic cell lines or peripheral blood mononuclear cells (PBMCs) reported that high-throughput single-cell RNA-seq methods generally enabled broader sampling of diverse populations at a lower per-cell cost. However, larger sample sizes come at the expense of lower mRNA detection sensitivity [8,9,10,11,12,13]. In this work, we extend previous findings with a focus on the application of high-throughput methods to immune-cell profiling. By using a defined mixture of four lymphocyte cell lines, we assess the performance of seven high-throughput methods using four commercially-available systems to address common concerns in immune-cell profiling. First, we examine library efficiency in terms of cell recovery and cell-assignable reads. Next, we assess mRNA detection sensitivity and the correlation of cellular profiles to immune cell signatures from bulk RNA-seq. Finally, we compare results across the lymphocyte cell lines and explore in-vivo variation of mRNA detection across peripheral blood mononuclear cells (PBMCs) in consideration of varying cell sizes and cellular mRNA contents. This study serves as useful guidelines for the selection of a suitable single-cell RNA-seq method to study immune cells.

Results

Design of single-cell RNA-seq benchmarking experiments

We benchmarked four commercially-available high-throughput single-cell systems: the Chromium [14] (10x Genomics), the ddSEQ (Illumina and Bio-Rad), the scRNA-Seq System running Drop-seq (Dolomite Bio) [15], and the ICELL8 cx (Takara Bio) [16] (Fig. 1). We tested three methods available for the Chromium (3′ v3, 3′ v2 and 5′ v1) as well as two methods for the ICELL8 (the official 3′ DE protocol and an alternate 3′ DE-UMI protocol). All methods tested perform mRNA end counting by tagging mRNA sequences with a barcode containing a cell identifier (CID) and a unique molecular identifier (UMI) with lengths that vary by method (Supplement Table 1).

Fig. 1
figure1

Overview of high-throughput single-cell benchmarking experiments. Experiments were performed using four immune cell lines to benchmark cell recovery, transcript detection sensitivity, concordance to bulk RNA-seq and differentially-expressed gene identification

All techniques, apart from ddSEQ, amplify full-length cDNA (Supplement Table 1) using a modified Smart-seq protocol [17, 18], which incorporates a 5′ PCR handle by employing a reverse transcriptase’s ability to switch templates at the end of a transcript. Full-length cDNA can be amplified with primers in the 5′ template-switch and 3′ poly-T oligonucleotides. Barcoded cDNA ends are further amplified after direct ligation or tagmentation to incorporate Illumina sequencing adapters. ddSEQ contains a single amplification step during adapter incorporation after second strand synthesis without amplification of full-length cDNA. Amplification bias introduced in the multiple rounds of PCR in these protocols, is mitigated by the incorporation of UMIs [19]. However, UMI counts are unreliable in the ICELL8 3′ DE protocol because cDNA is amplified in the presence of barcoding primers, potentially inflating UMI counts. The alternative ICELL8 3′ DE-UMI protocol is more robust for UMI counting since reverse transcription and cDNA amplification are uncoupled by an exonuclease digestion of barcoding primers.

We used a 1:1:1:1 mixture of four lymphocyte cell lines from two species (Fig. 1; Supplement Table 2): EL4 (mouse CD4+ T cells), IVA12 (mouse B cells), Jurkat (human CD4+ T cells), and TALL-104 (human CD8+ T cells). These cells also vary in morphology: TALL-104 cells (~ 5 μm diameter) are considerably smaller than the other cell types (~ 10 μm diameter). These cell lines are expected to have distinct expression profiles enabling the classification of each cell type. Usage of cells from two species allowed us to clearly identify cross-species doublet contamination to calculate capture rates of cell multiplets. To mirror typical single-cell sequencing runs and to ensure a comparison independent of sequencing limitations, we normalized the read depth of our libraries to ~ 50,000 reads per cell (Fig. 1; Supplement Figs. 1 and 2). Cells were identified and classified by correlating single-cell expression profiles to bulk RNA-seq.

Evaluation of cell capture and library efficiency

One important consideration for single-cell RNA-seq is the capture rate, or the fraction of cells recovered in the data relative to input. This is especially critical when working with precious samples with few cells. To identify recovered cells, we used the curve of the log-total count against the log-rank of each CID, which is equivalent to the transposed log-log empirical cumulative density plot of the total counts of each CID. The knee and inflection points in this curve typically define the transition between the cell-containing component and the ambient RNA component of the total count distribution. Here, we defined a recovered cell as a CID located above the inflection point (Supplement Fig. 2a). In our tests, we found that capture rates were slightly lower than, but tracked with theoretical rates (Fig. 2a; Table 1). As expected, we observed the highest rates with 10x Genomics methods, ranging from ~ 30 to ~ 80%, while ddSEQ and Drop-seq methods recovered < 2% of cells.

Fig. 2
figure2

Library-pool and cell-capture efficiencies: a Cell capture efficiency was measured by the number of cell identifiers (CIDs) above the inflection point of the rank ordered reads/CID plot (knee plot) relative to the number of cells loaded on the instrument. Horizontal lines indicate theoretical capture efficiency based on bead/cell loading concentrations or manufacturer’s guidelines. b Library pool efficiency was measured by the number of reads in CIDs above the inflection point

Table 1 Summary of average mRNA/gene detection sensitivities and capture rates for each single-cell RNA-seq method

In addition to the capture rate, we also quantified events capturing multiple cells in a single partition. This technical artifact impairs downstream data analysis, as artificial mixtures of transcriptomes may be interpreted wrongly as single cells. The extent of this issue is influenced by the quality of the single-cell suspension, cell health, and cell loading concentration. By counting CIDs with a significant fraction of both human and mouse transcripts, for all methods, we observed multiplet rates around the 5% we had targeted with our cell-loading concentrations (Table 1; Supplement Fig. 3a).

Another significant factor in efficiency is the fraction of reads that can be assigned to individual cells. Increased background noise in sequencing libraries results in wasted reads and unnecessarily increased sequencing costs. We observed the highest fraction of cell-associated reads for our ICELL8 experiments (> 90%), intermediate rates for 10x experiments (~ 50–75%) and the lowest rates for ddSEQ and Drop-seq (< 25%) (Fig. 2b; Supplement Tables 3 and 4). We also examined the genomic locations of aligned reads. About 75% of aligned bases of each library were mapped to exons and UTRs. Notably, the intergenic fraction was lowest in 10x samples, suggesting lower genomic contamination in these methods. (Supplement Fig. 3b). The ddSEQ method exhibited the greatest UTR bias. This is likely due to the longest read-length (150 bases) for ddSEQ of each tested technology.

10x 5′ v1 and 3′ v3 methods demonstrate the highest mRNA detection sensitivity

Because immune cells tend to have low levels of mRNA, the mRNA detection sensitivity, or the fraction of a cell’s transcriptome detectable, critically impacts downstream analyses. Single-cell RNA-seq methods are inherently prone to dropouts due to inefficiencies during library preparation resulting in false-negative gene-expression signals [15]. Although we performed library normalization to obtain a consistent read depth across all cells, we found that read distributions of individual cell types varied. Since EL4 cells demonstrated the highest consistency between read distributions across experiments (Supplement Fig. 1c), we focused our initial analysis on EL4 cells to minimize batch effects due to differential sequencing depths. We observed the highest detection of both transcripts and genes with at least one read count using 10x Genomics methods, with the highest levels seen in the 3′ v3 experiments (median 28,006 UMIs/4776 genes across all samples) followed by the 5′ v1 and 3′ v2 kits (25,988 UMIs/4470 genes and 21,570 UMIs/3882 genes, respectively) (Fig. 3a, b; Supplement Table 4). ddSEQ and Drop-seq experiments demonstrated similar detection rates (10,466 UMIs/3644 genes and 8,791 UMIs/3255 genes, respectively). UMI counts generated by the ICELL8 3′ DE method are unreliable due to residual barcoding primers during cDNA amplification, so we focused on gene detection sensitivity instead. We observed a significant drop in gene detection between the 3′ DE and 3′ DE-UMI methods (2849 and 1288 genes, respectively) and a low number of UMIs counted in the 3′ DE-UMI method ((2792 UMIs). This suggests that many transcripts are lost in the additional primer digestion and cleanup steps. Cross-contamination due to ambient RNA minimally impacted these UMI detection rates with average estimates of contamination calculated with DecontX [20] falling under 1% for UMI-based methods (Supplement Table 4). For the other three cell types, rankings of methods by absolute UMI- and gene-count distributions slightly differed from EL4 cells, likely due to greater variation in read depth across samples for these cell types (Supplement Figs. 1c and 4a).

Fig. 3
figure3

Transcript detection sensitivity: a Distributions of unique molecular identifiers (UMIs) and b genes detected in EL4 cells by sample are plotted. c Numbers of UMIs or d genes detected versus numbers of reads per cell for each cell type are plotted. e Accumulated average numbers of genes detected from aggregated data of subsamples up to 50 cells are plotted. f Dropout modeling (dropout rate versus FPKM of bulk sequencing) for EL4 cells by method are shown. A left-shifted curve indicates higher sensitivity, that is, fewer dropouts at lower expression levels. Sensitivity of methods for EL4 cells ranked in the following order: 10x 3′ v3 > 10x 5′ v1 > 10x 3′ v2 > ddSEQ > Drop-seq > ICELL8 3′ DE > ICELL8 3′ DE-UMI. Cells with high mitochondrial expression rates were excluded from this calculation

To account for varying read distributions across the four cell types, we compared the number of detected UMIs and genes relative to the total number of reads per cell. For EL4, IVA12 and Jurkat cells, we observed a similar trend across methods with regards to efficiency of transcript and gene detection (Fig. 3c, d). Again, 10x 3′ v3 (mean ± sd reads/UMI = 2.07 ± 0.52, reads/gene = 9.04 ± 2.65) and 5′ v1 chemistries (mean ± sd reads/UMI = 1.98 ± 0.19, reads/gene = 9.51 ± 2.68) were the most efficient, requiring fewer reads to detect a single UMI or gene. These methods are followed by 10x 3′ v2 (reads/UMI = 2.35 ± 0.33, reads/gene = 11.17 ± 3.03), ddSEQ (reads/UMI = 5.25 ± 1.14, reads/gene = 13.42 ± 3.89), Drop-seq (reads/UMI = 6.40 ± 1.42, reads/gene = 15.97 ± 5.62) and ICELL8 methods (3′ DE: reads/gene = 29.68 ± 41.48, 3’ DE-UMI: reads/UMI = 21.77 ± 5.50, reads/gene = 47.5 ± 17.91). This trend is largely mirrored in TALL-104 cells, albeit less distinct due to the low read depth obtained for those cells (Fig. 3c, d; Supplement Fig. 1c).

We further examined the number of genes with at least one sequenced read in pseudo-bulk populations. For this purpose, cells form each cell type were pooled and gene-expression measurements were merged. We observed similar trends with higher numbers of detected genes with the 10x 3′ v3, and 5′ v1 method for EL4, IVA12 and Jurkat cells (Fig. 3e). Although the ICELL8 3′ DE method had a low per-cell gene detection rate, when pooling more than 30 cells this method exhibited comparable levels of gene detection to 10x 3′ v2, ddSEQ and Drop-seq methods. This is likely due to the high false-negative rate of genes with overall low expression levels in the ICELL8 3′ DE method. The cumulative number of genes for TALL-104 cells was lower than the other cell types and the relative detection rates across methods did match trends seen in other cell types, possibly due to the low read depth and cell recovery for this cell type.

We also examined the ability of each method to detect genes at various expression levels by calculating the dropout rate, the conditional probability that a gene is not detected in a given cell. The dropout rate was modeled as a function of the expression level in bulk RNA-seq (FPKM) for each cell type. We used a nonlinear least square fit of the data that accounted for the activity of reverse transcriptase described by Michaelis-Menten kinetics [21,22,23]. Here, higher gene detection sensitivity as a function of fewer dropouts at lower expression levels, was indicated by left-shifted curves and lower Gene Detection 50 (GD50) value, the point at which this curve reached a detection probability of 0.5. The GD50 metric represented the expression level of a gene we would expect to be detected in half of the cells, and could help guide expectations of detection rates for genes of interest based on their expression in bulk RNA-seq. For EL4 cells, 10x Genomics methods were the most sensitive with 10x 3′ v3 having the lowest GD50 at 13.6 FPKM, followed by the 5′ v1 and 3′ v2 chemistries (16.8 FPKM and 20.2 FPKM, respectively). The ddSEQ and Drop-seq methods had comparable dropout rates (25.0 FPKM and 26.7 FPKM, respectively), while ICELL8 methods had the lowest sensitivity (37.9 FPKM/3′ DE and 112.1 FPKM/3′ DE-UMI) (Fig. 3f; Table 1). We observed similar trends across methods with the other three cell types, which had greater variance in read depth and transcript detection (Supplement Figs. 4b-d).

mRNA detection affects the fidelity of single-cell and pseudo-bulk transcriptomes

We next investigated how well single-cell expression recapitulates immune signatures from bulk RNA-seq. For this purpose, we correlated expression of a set of marker genes (defined using bulk RNA-seq data; see Methods) between bulk RNA-seq and single cells. In general, cells with more genes detected had a better concordance to bulk RNA-seq immune signatures (Supplement Fig. 5). We observed higher Pearson correlation coefficients for 10x 3′ v3, 5′ v1 and ddSEQ methods against EL4, IVA12 and Jurkat bulk RNA-seq expression signatures (Fig. 4a). ICELL8 3′ methods, with generally fewer genes detected, demonstrated the lowest correlation values. Overall, poorer correlation to TALL-104 bulk RNA-seq was in line with fewer transcripts and genes detected for this cell type in the single-cell data.

Fig. 4
figure4

Correlation to bulk RNA-seq: a Pearson correlation (r) of cell identifiers (CIDs) to bulk RNA-seq data using highly-expressed variable genes. Only r values above 0.2 were included in plot. b Average Pearson correlation using all genes for aggregated data of 50 subsamples of up to 50 cells are plotted

We further examined the correlation between pooled single-cell RNA-seq pseudo-bulk transcriptomes and bulk RNA-seq data using all genes. Averaging gene-expression profiles across single cells is commonly performed to compare data across experiments and is thought to resemble bulk data. For EL4, IVA12 and Jurkat, most methods began to plateau around a correlation value of r = 0.9 with a pool of 10–20 cells (Fig. 4b). The maximum correlation values were lower for ICELL8 3′ DE (r = 0.90 and 3′ DE-UMI methods (r = 0.81–0.90) compared to other methods (r=0.92–0.95), and correlation was generally lower for TALL-104 cells in all methods, suggesting that lower mRNA detection sensitivity not only affects data fidelity at a per-cell level but also impacts aggregated single-cell data. Although samples were prepared under identical conditions, we cannot rule out any effects of biological differences between samples. However, it is likely that higher variance in the detection of lowly expressed transcripts drives much of the difference in expression observed in single-cell and bulk RNA-seq, and aggregation across individual cells may not increase the correlation of expression for these lowly-expressed genes. Notably, our data indicates that detection sensitivity is not necessarily improved by pooling across single cells and results from such analyses should be interpreted cautiously.

Higher mRNA detection sensitivity improves identification of differentially-expressed genes

To assess the performance of differential expression analysis for each method, we focused on the two mouse cell types (EL4 and IVA12) because these cells had more similar sequencing depths compared to the two human cell types. We used the hurdle model proposed by Finak et al. [24] to identify differentially-expressed (DE) genes with an FDR < 10− 4 (Fig. 5a). For each DE analysis we sampled 199 cells, the lowest number of recovered cells by any method. Gene expression data was normalized by each cell’s library size (see Methods), which correlated highly to scaling factors derived by deconvolution from cell pools (mean +/− sd r =0.99 +/− 0.016) (Supplement Table 4) [25]. Over 3000 DE genes were identified in 10x Genomics methods, the highest among the methods tested, followed by Drop-seq (avg ~ 2700 genes) and ddSEQ (avg ~ 2800 genes), while the two ICELL8 methods had the fewest numbers of DE genes (avg ~ 1800 and ~ 1000 genes) (Fig. 5b; Table 1). We observed similar trends with two alternative commonly-used tests for differential expression, a Mann-Whitney-Wilcoxon test [26] and a likelihood ratio test with an negative binomial generalized linear model [26, 27] (Supplement Fig. 6a). Performing DE analysis using all the cells obtained in each method increased the number of genes passing the significance threshold due to the increased statistical power (Supplement Fig. 6b). When we considered the 5,868 genes that had more than a 1.5-fold difference in bulk RNA-seq data as a proxy for ground-truth expression differences, the trend remained the same (Fig. 5b; Supplement Figs. 6a, 6b; Table 1). To further evaluate the effectiveness of calling DE genes in terms of quantity and quality, we assessed recall and precision of each technology. Recall was calculated as the fraction of DE genes in bulk RNA-seq data that was identified as differentially expressed in the single-cell data. Precision was defined by the fraction of DE genes from single-cell data that were also differentially expressed in bulk RNA-seq data. We observed the highest recall rates for 10x Genomics methods (0.595 - 10x 5′ v1; 0.577 - 10x 3′ v3) (Table 1) indicating lower type II error rates for these methods with higher mRNA detection sensitivity. Precision was reasonably high across all methods (0.728–0.887) (Table 1), implying a moderately low number of falsely called DE genes overall.

Fig. 5
figure5

Differentially-expressed (DE) gene detection: a Fold change (FC) versus false discovery rate (FDR) calculated using a hurdle model (MAST) for mouse genes in EL4 vs IVA12 cells. Shown is a representative subsample of mouse cells (n=199) using the 10x 3′ v2 method demonstrating the criteria for declaring DE genes (FDR < 10− 4); DE genes are highlighted in red. b Number of significant DE genes calculated using MAST between EL4 and IVA12 cells by method. Error bars represent the 95% confidence interval. The total number of significant DE genes are plotted in red, the number of DE genes with > 1.5-fold difference in expression in bulk RNA-seq (5868 genes) are plotted in cyan. c Median bulk RNA-seq expression (FPKM) of all significant DE genes (red) or DE genes with > 1.5-fold difference (cyan). Error bars represent 95% confidence interval

In general, we observed that fold changes in single-cell data correlated well with gene expression differences in bulk RNA-seq data, especially for genes with higher expression levels (Supplement Fig. 6c). In contrast, genes with low expression correlated poorly with smaller fold changes observed in the single-cell data, consistent with higher dropout probabilities for lowly-expressed transcripts. Also, the distribution of FPKM values was generally higher for DE genes from single-cell data compared to genes with at least 1.5-fold changes in bulk RNA-seq (Supplement Figs. 6d, e), indicating that all methods exhibit a considerable type II error rate with respect to lowly-expressed genes. Furthermore, we found the lowest median FPKM in bulk RNA-seq for DE genes from the methods with the highest detection sensitivity, 10x 3′ v3 (median = 3.43 FPKM) and 10x 5′ v1 (median = 3.45 FPKM), and the highest median FPKM for the ICELL8 3′ DE-UMI method (median = 4.91 FPKM), which had the lowest transcript detection sensitivity (Fig. 5c; Supplement Figs. 6d, e).

Recovery of low-mRNA-content cells

Many immune single-cell experiments profile an undefined mixture of cell types that potentially vary in mRNA content. Efficient cell recovery across diverse cell types is important to accurately characterize the diversity of these populations. We next compared the differences in cell recovery between the four cell types included in our sample mixture. In particular, TALL-104 cells are smaller (5 μm diameter) than the other three cell-types (EL4/IVA12 - 11 μm, Jurkat - 10 μm diameter) and, in our hands, more difficult to culture, with viability rates under 80% and slow growth. Across all experiments, TALL-104 cells had the lowest distribution of reads, UMIs, and genes recovered (Supplement Figs. 1c and 4a), such that they were more susceptible to exclusion based on read or UMI thresholding of CIDs to distinguish cells from ambient noise.

We classified cells by correlating their expression profile to gene signatures from bulk RNA-seq. This enabled us to examine the recovery of each cell type with common thresholding points on the log-log curve of total reads or UMIs vs rank ordered CIDs. In droplet-based methods, thresholding removes a large fraction of CIDs that are derived from droplets containing a barcoded bead, but no cell. Two points are commonly used as thresholds: the knee point where the signed curvature is minimized and the inflection point where first derivative is minimized [28] within a given range of total reads or UMIs (Fig. 6a; Supplement Fig. 2a). While the fraction of classifiable cells on each side of these two thresholds varied across experiments, both thresholds were able to capture EL4, IVA12 and Jurkat cells (Fig. 6a). Notably, most TALL-104 cells would be removed by thresholding using the stringent knee point, with only one experiment having any TALL-104 cells above this threshold. While the more permissive inflection point performed better at capturing TALL-104 cells, all TALL-104 cells would be considered ambient noise using this threshold in samples (Fig. 6b).

Fig. 6
figure6

Cell recovery by cell identifier (CID) thresholding: a Example of using the transposed log-log empirical cumulative density plot of the total counts of each CID to identify cell-containing droplets . Common thresholding points, the ‘knee’ and the ‘inflection’ are indicated with arrows. The knee is the point at which the signed curvature is minimized, the inflection is the point at which the first derivative is minimized. b The fraction of cells above the knee or inflection are plotted. c Fraction of cells below mitochondrial rate threshold (listed in Supplement Table 4) relative to knee point. Samples are colored by cell sample mixture listed in Supplement Table 2

As low mRNA recovery might reflect poor cell health, e.g., due to mechanical stress in cell preparation, we also examined the fraction of cells below each threshold that had a high rate of mitochondrially encoded UMIs or reads, an indication of broken or poor-quality cells (Fig. 6c). Accordingly, a large fraction of cells removed by the knee point cutoff had a high mitochondrial rate. However, significant numbers of cells had reasonable mitochondrial rates, including many TALL-104 cells. This suggests that lower transcript recovery in TALL-104 cells is related, at least in part, to lower overall mRNA content and not cellular damage, although we cannot completely rule out other cell-quality issues that do not affect the mitochondrial rate. Overall, some cell populations could be lost when thresholding based on total UMI or read count is too stringent. It would be beneficial to include more cells at the initial CID selection step and filter cells more stringently in downstream analyses with other cell-quality criteria to avoid loss of cell populations with low mRNA content. Of note, this issue does not affect ICELL8 methods as all cell-related barcodes are known a priori when cell-containing wells are selected for processing.

mRNA detection sensitivity varies across heterogenous cell types

In heterogeneous populations, mRNA capture rates and read depths may vary across subpopulations. We explored differences in mRNA detection sensitivity across the four cell types in our samples. As it is common in single-cell profiling of mixed populations, we observed differences in read and UMI recovery across cell types in each method (Fig. 7a). When the entire data was used to model dropout rates, we found that cell types with the lowest read distributions, such as TALL-104 cells, had increased dropout probabilities and GD50 levels across all seven methods tested (Fig. 7b; Supplement Figs. 4b, d). We hypothesized that differences in dropout rates were predominantly driven by differences in mRNA detection rates and compared cells from each cell type with similar numbers of UMIs. Cells were separated into six quantile bins based on the number of UMIs (Fig. 7c) and dropout rates for each cell type were modeled. Because there were few TALL-104 cells recovered in many samples (Supplement Table 4), we focused on data from the 10x 3′ v3 method which had sufficient numbers of cells available for analysis. We found that with increasing number of total UMIs, GD50 values and dropout rates decreased. Notably, GD50 levels were similar across cell-types within a bin (Fig. 7d). Slight differences in GD50 were related to variation in mean number of UMIs for a particular cell type. TALL-104 cells, which fell into the two lowest bins due to the low numbers of transcripts detected, had similar dropout rates as other cell types in the same bin (Fig. 7d, e).

Fig. 7
figure7

Dropout rates by cell type: a Distribution of reads across cell types is plotted by method. b Dropout rate models for cell types are shown. c 10x 3′ v3 cells were binned by number of unique molecular identifiers (UMIs) and distributions of nUMIs for each cell type in each bin are plotted. d Gene Detection 50 (GD50) rates, expression level at 0.5 probability of the dropout model, are plotted for each cell type in 10x 3′ v3 experiments by bin. e Dropout models in each bin for EL4, IVA12 and Jurkat cells are plotted along with the model for TALL-104 cells in bin 1

To explore the range of mRNA capture rates across lymphocytes in vivo, we profiled a single PBMC sample using the 10x 3′ v3 and 5′ v1 methods. Cells were classified by mapping gene expression to a reference PBMC CITE-Seq dataset [27, 29] (Fig. 8a). We recovered 9,450 and 7,780 classified cells for the 10x 3′ v3 and 10x 5′ v1 methods, respectively. Of the classified cells, we identified a variety of cell types comprising PBMC cells including cells from the lymphoid (e.g. B and T cells) and myeloid lineages (e.g. dendritic cells and monocytes). We observed similar fractions of cell types in data from both methods (Supplement Figs. 7a, b). Cell classes were found to express expected cell-type-marker genes (Fig. 8b) and had distinct expression of DE genes (Supplement Fig. 7c). For instance, we observed enrichment of CD19, which encodes part of the BCR co-receptor, in B cells and CD3E, which encodes part of the TCR co-receptor in T cells (Fig. 8b). The median numbers of genes and UMIs detected across various cell types ranged from under 500 genes and 1,000 UMIs detected in platelets (3′ v3–333 genes/773 UMIs; 5′ v1–450 genes/941 UMIs) to over 3,500 genes and 18,000 UMIs in proliferating CD4 T cells (3’ v3–4,156 genes/ 18,469 UMIs; 5′ v1–3,737 genes/18,877 UMIs) (Fig. 8c, d). Similar to our mixed cell line benchmarking experiments, we observed a variation in read depth across cell types related to gene and UMI detection statistics (Supplement Fig. 7d).

Fig. 8
figure8

mRNA capture variation across peripheral blood mononuclear cells (PBMCs). a Single-cell data generated with the 10x 3′ v3 and 10x 5′ v1 chemistries were projected onto an annotated PBMC CITE-Seq reference dataset. b Violin plots of log normalized expression of common immune-cell markers. c nGenes and d nUMIs detected for each cell class from each method. CTL-cytotoxic, TCM – T central memory, TEM – T effector memory, Treg – T regulatory cells, dnT – double negative T, gdT – gamma delta T, MAIT - Mucosal associated invariant T, pDC - Plasmacytoid dendritic cell, ASDC - AXL+ dendritic cell, cDC – classical dendritic cell, MC – monocyte, Eryth-Erythrocyte, ILC - Innate lymphoid cell, HSPC - Hematopoietic stem and progenitor cell

Discussion

In this paper we explored several important quality metrics of single-cell RNA-seq methods: efficiency of cell recovery, library efficiency and mRNA detection sensitivity. High recovery of cells put into a system and minimal loss of reads due to noise are important, especially for limited samples with few cells. The differences in performance we observed across these methods are directly related to the design of cell and mRNA capture. To partition cells, the methods we tested either use microfluidics to generate nanoliter sized droplets or to partition cells on microwell chips. Ideally, those microreactors contain exactly one bead and one cell. In practice, however, the number of cells per microreactor approximately follows a Poisson distribution. While the loading probability of cells is similar across these methods, the distribution of barcoding oligonucleotides varies. The loading statistics of Drop-seq and ddSEQ follow a Poisson distribution, while 10x Chromium chips load beads in a sub-Poissonian fashion. The latter enables an increased theoretical capture rate of ~ 60%. Sparser loading of barcoding beads in ddSEQ and Drop-seq minimizes the occurrence of bead doublets, but at the expense of lower maximum recovery rates of ~ 3% and ~ 5% respectively. Oligonucleotide loading is tightly controlled on ICELL8 chips with the pre-printing of oligonucleotides, providing a priori knowledge of cell-related CIDs when coupled with cell imaging. The ability of the ICELL8 to selectively process a subset of wells, those containing cells identified by fluorescence imaging, greatly improves this method’s library efficiency compared to techniques that process all partitions. Accordingly, we observed the highest fraction of cell-related reads in ICELL8 libraries, especially compared to ddSEQ and Drop-seq methods with a large fraction of bead containing droplets lacking a cell and increased potential for ambient RNA. Quality of single-cell suspensions are also important factors to these metrics. Variable cell viability and inefficient cell quantification of our samples may negatively impact cell-capture and multiplet rates and explain the discrepancy between expected and observed rates.

In our experiments, we observed the highest mRNA-detection sensitivity in the 10x 3′ v3 and 5′ v1 methods, with the highest numbers of transcripts and genes detected and lower probabilities of gene dropouts at lower expression levels. Our results corroborated previous reports about the performance of some of the methods assessed in our work [10,11,12,13]. Here, we extended these findings by demonstrating an increased sensitivity of the more recent 5′ v1 and 3′ v3 methods, which also validated claims made by 10x Genomics. Further, we found ICELL8 methods had the lowest mRNA detection sensitivity of the methods tested for the assayed immune cell types. Of note, this is partly in disagreement with two papers reporting better performance of the ICELL8 3′ DE method relative to 10x 3′ v2 and Drop-seq [10, 30]. Differences to the performance we have observed may be related to cell types used in each study. For example, ICELL8 3′ DE detected significantly fewer genes per cell compared to 10x 3′ v2, ddSEQ and Drop-seq in B-cells in Mereu et al. [10], which is on par with our findings.

Gene detection rates may be increased by greater sequencing depths, particularly for lowly-expressed genes (Supplement Fig. 2b; Supplement Table 3). However, high-throughput methods aim to sequence many cells concurrently for a broad exploration of populations, at the expense of the completeness of individual transcriptional profiles. Here, libraries are not routinely sequenced to full saturation due to high sequencing costs. To be able properly assess mRNA detection sensitivity, we normalized samples to a common sequencing depth of ~ 50,000 reads per cell by downsampling raw reads. Additional iterations of this stochastic process showed little variation in the resulting analysis (Supplement Table 5), suggesting our normalization step did not introduce any technical bias. Notably, the resulting sequencing depth is typical for common high-throughput single-cell RNA-seq experiments. Therefore, our data can provide expectations for mRNA and gene detection rates in experiments with a similar sequencing depth using other immune cells.

Multiple aspects of single-cell RNA-seq protocols such as efficiencies in mRNA capture, reverse transcription, and cDNA amplification can affect the overall mRNA detection sensitivity. Efficient mRNA capture may be impacted by the template switch mechanism, as only first strand cDNAs which have successfully switched templates can be amplified. ddSEQ, the sole method we have tested that does not utilize template switching, is not as sensitive as the 10x Genomics methods, possibly due to other technical differences. Another source of inefficiency may arise in the reverse transcription cleanup step prior to cDNA amplification. We found that the addition of a primer digestion step to the ICELL8 3′ DE protocol in the 3′ DE-UMI method decreased the mRNA detection sensitivity. Additional improvements to mRNA capture such as improving oligonucleotide chemistry for mRNA capture and cDNA amplification may enhance mRNA detection sensitivity and improve single-cell RNA-seq techniques in the future.

Increasing the sensitivity of mRNA detection greatly benefits downstream analyses of immune profiling datasets. Sampling transcriptomes with high fidelity results in a greater likelihood of detecting rare transcripts for identifying DE genes at lower expression levels. In general, our results showed that expression profiles of cells with high mRNA content generated by methods with a high mRNA detection rate correlated well to bulk-RNA-seq data. Also, the number of DE genes as well as the overall correlation in fold-change differences to bulk RNA-seq improved with higher mRNA detection sensitivity. Here, all 10x Genomics methods, which had the highest mRNA detection sensitivity, exhibited a high correlation to bulk RNA-seq data as well as more DE genes with a lower range of expression levels in bulk data. Notably, our results reveal that the higher variance in the detection of lowly expressed transcripts commonly observed in techniques with lower sensitivity is not necessarily overcome by pooling across single cells when performing pseudo-bulk analyses. Strengthening the underlying mRNA detection sensitivity can improve downstream analyses to identify marker genes as well as classify subtle immune subtypes and cell states with small, but significant differences in gene expression, and can facilitate the identification of novel immune subpopulations [31].

Importantly, our data also provides insight into the performance of single-cell techniques across heterogenous populations of immune cells. Although the immune cell lines used in this study may differ from lymphocytes found in vivo, the standardized cell culture conditions for these cells helps reduce expression variability compared to primary cells and facilitated data analysis. Nonetheless, our results provide better guidance for immune profiling in contrast to the higher mRNA content cell lines such as carcinoma or stem cells commonly used in previous comparison papers [8, 11, 12]. The inclusion of small TALL-104 cells can allow us to assess the sensitivity of these methods in subpopulations with comparatively low mRNA levels. Relaxing CID filtering criteria based on total UMI or total read counts can improve recovery of TALL-104 cells for downstream analyses. Notably, smaller immune cells such as TALL-104 cells have a higher gene dropout rate that is related to the number of transcripts captured from a cell. Thus, additional quality metrics also need to be calibrated carefully for the identification of small immune cell types. Cell imaging on the ICELL8 cx to identify otherwise challenging cells of interest such as TALL-104, can also be used to recover populations of small cells. In general, TALL-104 cells exhibit lower mRNA detection rates and higher dropout rates. Thus, we can expect that other immune cell types with low mRNA content exhibit similar dropout rates as other immune cells with a comparable rate of transcript detection.

Cell-compositional bias may also arise during sample preparation. For example, density gradient purification of PBMCs may selectively reduce the number of granulocytes, or cell populations may be lost due to greater sensitivity to dissociation and purification induced stress. Furthermore, large cells are filtered from single-cell suspensions to avoid clogging narrow microfluidic channels. 10x Chromium microfluidics are ~ 50–60 μm in diameter, while Drop-seq channels are typically ~ 125 μm [15] and ICELL8 nozzles have a 125 μm bore size, and these systems can theoretically process cells up to that diameter in size. However, most immune cells are smaller than the 40 μm filter size commonly used during single-cell sample preparation and can be captured by a variety of systems.

We can translate the observation in our data in vitro that dropout rates and gene detection sensitivity is related to the total number of transcripts captured from a cell (Fig. 7) to the variety of immune cells found in vivo. In our 10x Genomics PBMC datasets, we observed a range of mRNA detection rates across various cell types with proliferating cells and hematopoietic stem cells (HSPCs) displaying the highest rates of UMIs and genes detected. Additionally, dendritic cells (DCs) and monocytes (MCs) displayed elevated UMI and gene detection rates relative to most T and B cells (Fig. 8c, d), similar to observations in other PBMC datasets [10]. Taken together, this suggests that these cells would have a lower dropout rate and more robust DE gene detection. These results can be further extended to immune cell populations, such as granulocytes, which have been excluded currently during sample preparation. It would be interesting to further explore the capabilities of various single-cell RNA-seq and sample preparation methods to assay other immune populations that are particularly difficult to survey.

Conclusions

Our comparison of data from seven high-throughput single-cell methods can help guide method selection for immune profiling experiments. Our data can provide reasonable predictions of transcript and gene detection rates for lymphocytes, as well as insight into performance across heterogenous immune cell populations with varying mRNA content. Our results suggest looser thresholding of CIDs in droplet-based methods can be beneficial to retain cell populations with low-mRNA content. Additionally, smaller cells such as TALL-104 cells have a higher gene-dropout rate that is related to the number of transcripts captured from a cell.

Each method we tested showed advantages that could benefit immune cell profiling. In this study, 10x Genomics methods had the highest cell recovery and mRNA detection sensitivity, making these techniques particularly suited to experiments with limited samples and experiments that require detection of genes with lower expression levels. Here, the performance was comparable between the 10x 5′ v1 chemistry and 3′ v3 methods, making the 5′ v1 chemistry an appropriate substitute when pairing gene expression analysis with TCR/BCR clonotyping. 10x Genomics and Illumina/Bio-Rad (ddSEQ) sell reagents in kits, facilitating adoption of these methods, but limiting customization of protocols. Takara Bio also sells reagent kits for the ICELL8, however, protocols on the instrument are customizable allowing for greater flexibility. Drop-seq is also an open system that is fully customizable and custom reagents such as target-capture oligonucleotide beads [32, 33] can be easily integrated into the protocol. The fluorescent imaging capabilities of the ICELL8 cx enable the pairing of sequencing and imaging data in downstream analysis. Our ICELL8 experiments demonstrated high library efficiencies with a large fraction of reads assignable to cells and potential utility to recover low-mRNA-content cells, such as TALL-104 cells, that are more susceptible to stringent read and UMI thresholding. Overall, our data shows that all methods exhibit specific strengths which can be aligned with experimental goals, sample limitations, and budgetary constraints.

Methods

Cell culture

All cell lines were acquired from ATCC. EL4 (ATCC TIB-39) cells were cultured in RPMI-1640 + 2 mM L-glutamine + 10% FBS + 1.7 ul 2-mercaptoethanol per 500 ml media. IVA12 (ATCC HB-145) cells were cultured in DMEM + 10% FBS + 1x P/S. Jurkat (ATCC TIB-152) cells were cultured in RPMI-1640 + 10% FBS + 1x P/S. TALL-104 (ATCC CRL-11386) cells were cultured in IMDM + 15% FBS + 1x L-Glu + 200 U/ml IL-2. Prior to processing cells, cells were washed in 1x PBS and cell concentration and viability were determined using a Countess (Invitrogen). Cells were mixed at a 1:1:1:1 ratio based on viable counts and resuspended in PBS + BSA solution according to manufacturer’s guidelines.

Chromium

Cells were resuspended in PBS with 0.04% BSA at a stock concentration within the recommended range (typically ~1e6 cells/mL) and loaded at a volume to target between 2000 and 6000 cells depending on sample. Libraries were prepared according to manufacturer’s instructions for each chemistry. Libraries were sequenced on a NextSeq500 or NovaSeq (Illumina) according to manufacturer’s guidelines: 3′ v3 - 28x8x0x91, 3′ v2 - 26x8x0x98, and 5′ v1 - 26x8x0x110 [read1xindex1xindex2xread2].

ddSEQ

Cells were resuspended in PBS + 0.1% BSA and loaded at a concentration of either 2000 or 2500 cells/ul. Libraries were prepared according to manufacturer’s instructions. Libraries were sequenced on an Illumina NextSeq500 (68x8x0x150) at a 3pM concentration with provided custom read 1 primers.

Drop-seq

Libraries were prepared following the McCarroll Lab Drop-seq protocol (http://mccarrolllab.org/Drop-seq/) [15], with cells and beads encapsulated using the Dolomite scRNA-Seq system. Oligo beads (ChemGenes) contained the original Drop-seq polyT primer with a VN anchor at the 3′ end (TTTTTTTAAGCAGTGGTATCAACGCAGAGTACJJJJJJJJJJJJNNNNNNNNVTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTVN). Cells were resuspended in PBS + 0.01% BSA and loaded at a concentration of either 150 or 300 cells/ul. For encapsulation, cell and bead solutions were loaded at 30 ul/min and encapsulation oil was loaded at 200 ul/min. Libraries were sequenced on a NextSeq 500 (Illumina) (21x8x0x138) at a 3pM concentration with custom Read1 Drop-seq primers (GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC).

ICELL8 cx

Cells were resuspended in 1X PBS and loaded at a final concentration of 2800 cells/ml. Only wells containing single cells as determined by the Cell-Select software using default settings were processed. Libraries were prepared using the Takara Bio SMARTER ICELL8 cx 3′ DE user manual or an alternate protocol that separates the initial reverse transcription reaction from cDNA amplification. In short, after RT, cDNA was removed from the chip and cleaned and concentrated with a Zymo Clean & Concentrator-5 kit. cDNA was then treated with 20 U of Exonuclease for 30 min at 37 °C and the enzyme was deactivated with 20 min at 80 °C. cDNA was then amplified and tagmented using the Illumina Nextera XT kit for the final sequencing libraries. Libraries were sequenced at 25x8x0x131.

Peripheral blood mononuclear cells (PBMC) chromium libraries

Blood samples were collected into sodium-heparin tubes (Beckton Dickinson) from a healthy adult donor through an internal blood donation program. Consent was kindly provided by the donor, and the sample was de-identified upon receipt of the sample. Blood was diluted 1:1 with PBS without Ca2+ and Mg2+ and centrifuged at 800 g × 30 min x room temperature without braking in Accuspin tubes (Sigma-Aldrich) containing 15 ml of Ficoll-Paque PLUS density gradient media (GE Healthcare). Leukocytes were carefully removed after centrifugation and placed in a new 50 ml tube pre-filled with chilled PBS without Ca2+and Mg2+ and centrifuged again at 800 g × 7 min × 4 degrees C. After centrifugation, the cell pellet was counted and resuspended in PBS with 0.04% BSA at 1000 cells/uL.

10x 5′ v1 and 10x 3′ v3 methods (10x Genomics) were performed in succession on the same day using the same PBMC preparation. Libraries were prepared according to the manufacturer’s protocol for each method to capture 10,000 cells per sample. For the 10x 5′ v1 preparation, 13 cycles were used for cDNA amplification, and 16 cycles were used for sample-index PCR amplification. For the 10x 3′ v3 prep, 11 cycles were used for cDNA amplification, and 14 cycles were used for sample-index PCR amplification. Libraries were sequenced on an Illumina NovaSeq6000 instrument using an S4 flow cell at 28x8x0x91.

Bulk RNA sequencing

RNA was isolated from cells using the RNeasy kit (Qiagen). Libraries were generated using 1μg of total RNA using a modified Illumina TruSeq Stranded mRNA protocol. Reverse transcription was performed with the addition of RNaseOut (Invitrogen) and actinomycin-D (MP Biomedicals). The resulting product was cleaned using AMPure RNAClean beads. Additionally, second-strand synthesis was performed using dUTP instead of dTTP and an additional USER (New England Biolabs) digestion step was incorporated after size-selection. Libraries were sequenced at 101x6x0x101 on a HiSeq (Illumina) to a minimum depth of 30 million reads per sample.

Read alignment and transcript counting

Cell-line data was aligned to a combined human/mouse reference genome obtained from 10x Genomics: reference “cellRanger_1.2.0” composed of hg19 with Ensembl 82 and mm10 with Ensembl 84. PBMC data was aligned to the human reference only. Bulk data was aligned using STAR v2.5.1b and quantified using featureCounts v1.6.3 [34].

For normalizing single-cell libraries, we considered the fact that cell types with low mRNA content are more prone to dropouts and thus, may compromise proper normalization based on the mean read count per CID. Thus, robust library scaling factors were derived by using only cells with sufficiently high mRNA content. For this purpose, we calculated Gaussian kernel density estimates with a smoothing bandwidth determined via the Normal Reference Distribution method as provided by R v4.0.2. Local maxima of the density function were sorted in decreasing order. The first significant mode, di, was considered the sequencing depth of the cell population with the highest mRNA content for sample i (Supplement Fig. 1a). Scaling factors were then derived by si = mini(di)/di. FASTQ files were downsampled by factor si using seqtk v1.3-r106. This resulted in libraries with around 50,000 reads per cell. Downsampling was repeated three times for four representative samples for an assement of the margin of error. Alignment statistics for normalized data were generated using PicardTools CollectRNA-seqMetrics using the aligned (and filtered) BAM files from each pipeline. Cell mixture (EL4, IVA12, Jurkat, and Tall) and PBMC experiments were scaled independently to the lowest local maxima value in each group.

Downsampled FASTQ files were further processed using method-specific pipelines with parameters set as recommended. All pipelines employ STAR [35] for the alignment step, but are tailored to identify method-specific barcodes and count transcripts. Chromium data was processed using Cellranger v3.0.2 (with STAR v2.5.1b); ddSEQ, Drop-seq and ICELL8 data were processed using the Drop-seq_Tools pipeline v2.3.0 (with PicardTools v2.18.14 and STAR v2.4.2a). ddSEQ CIDs and UMIs were extracted using ddSeeker v0.9.0 [36]. ICELL8 read count matrices were generated using mappa v0.9 software (with STAR v2.7.0b).

To assess step-wise quality metrics of each method’s original data we applied a uniform pipeline (Supplement Table 3). First, we demultiplexed and aligned sequenced reads using STAR as recommended by each scRNA-seq method. Here, ddSEQ data was processed using SureCell RNA Single-Cell v1.1.0 (with STAR v2.5.2b). Then, high-quality reads were filtered by MAPQ scores (MAPQ = 255 for 10x, Drop-Seq, and ICELL8; MAPQ = 50 for ddSEQ) using samtools 1.9 [37] and mapped against mouse (Ensembl 84) and human (Ensembl 82) gene annotations using featureCounts from the subreadpackage 2.0.1 with parameters “-t exon -g gene_id -C -p --primary”. We wrote custom Java code (v JDK 11.0.1) to generate read and UMI count matrices. Here, we considered potential sequencing errors in UMIs and corrected these as follows: reads got grouped by <barcode, gene, UMI> tuples. If two groups had the same <barcode, gene> pair, but their UMIs differed by a single base, the UMI of the smaller group was corrected to the UMI of the larger group.

Cell mixture cell type classification.

Cells were assigned to one of four input cell classes by their similarity to cell type signatures from bulk RNA-seq data. First, we selected highly expressed genes with FPKM > 50 in any bulk RNA-seq sample. Next, gene expression was contrasted between bulk RNA-seq samples from the same species (EL4 vs IVA12 and Jurkat vs TALL-104) and we filtered 184 highly variable genes (93 murine, 91 human) with a ln fold difference > 3 between the two cell types. Pearson correlation, r, was calculated between each gene expression vector of each cell xi184 and each gene expression vector of each bulk RNA-seq sample yj184: fi( yj) = r(ln(xi + 1), ln(yj + 1)). A cell type was assigned using the following four classification rules derived from the correlation coefficient distributions:

  1. 1.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)>0.60\&{f}_i\left({y}_{\mathrm{IVA}12}\right)<0.05\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)<0.05\&{f}_i\left({y}_{\mathrm{TALL}-104}\right)<0.20\to \mathrm{EL}4 $$
  2. 2.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)<0.05\&{f}_i\left({y}_{\mathrm{IVA}12}\right)>0.60\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)<0.05\&{f}_i\left({\mathrm{y}}_{\mathrm{TALL}-104}\right)<0.20\to \mathrm{IVA}12 $$
  3. 3.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)<0.05\&{f}_i\left({y}_{\mathrm{IVA}12}\right)<0.05\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)>0.40\&{f}_i\left({y}_{\mathrm{TALL}-104}\right)<0.20\to \mathrm{Jurkat} $$
  4. 4.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)<0.05\&{f}_i\left({y}_{\mathrm{IVA}12}\right)<0.05\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)<0.05\&{f}_i\left({y}_{\mathrm{TALL}-104}\right)>0.40\to \mathrm{TALL}-104 $$

We relaxed these rules for ICELL8 data to account for overall lower CID numbers and method-specific distribution differences:

  1. 5.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)>0.40\&{f}_i\left({y}_{\mathrm{IVA}12}\right)<0.05\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)<0.05\&{f}_i\left({y}_{\mathrm{TALL}-104}\right)<0.20\to \mathrm{EL}4 $$
  2. 6.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)<0.05\&{f}_i\left({y}_{\mathrm{IVA}12}\right)>0.40\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)<0.05\&{f}_i\left({y}_{\mathrm{TALL}-104}\right)<0.20\to \mathrm{IVA}12 $$
  3. 7.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)<0.05\&{f}_i\left({y}_{\mathrm{IVA}12}\right)<0.05\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)>0.35\&{f}_i\left({y}_{\mathrm{TALL}-104}\right)<0.20\to \mathrm{Jurkat} $$
  4. 8.
    $$ {f}_i\left({y}_{\mathrm{EL}4}\right)<0.05\&{f}_i\left({y}_{\mathrm{IVA}12}\right)<0.05\&{f}_i\left({y}_{\mathrm{Jurkat}}\right)<0.05\&{f}_i\left({y}_{\mathrm{TALL}-104}\right)>0.25\to \mathrm{TALL}-104 $$

Cells with two or more assigned cell types, were removed. For each sample, we classified the top n CIDs ranked by total read count with n = 2 × number of expected cells.

To analyze the thresholding method using the transposed log-log empirical cumulative density plot of the total read counts of each CID, we calculated the knee and inflection points as described in Lun et al. [28]. Briefly, the knee and inflection points are interpreted as determinants of the range in which the curve transitions between cell-containing droplets/wells with high mRNA content and empty droplets/wells with ambient RNA. The total count per CID is modeled as a function of decreasing CID rank by fitting cubic smooth splines with 20 degrees of freedom. The knee point is defined as the point on the curve where the signed curvature is minimized, the inflection point is defined as the point where the first derivative of the spline basis functions is minimized. We defined a lower bound for fitting the smooth splines as the minimum number of total reads of all classified CIDs. Calculations were performed using the R package DropletUtils v1.6.1. For the analysis of the original (i.e., not down-sampled) data described in Supplement Fig. 2, CIDs above the inflection point were considered genuine cells.

PBMC cell type classification

PBMC data was projected onto an annotated PBMC CITE-Seq reference dataset [29] using Seurat (v3.9.9.9003) [27]. Each cell received an assignment and prediction score to a cell class in the reference. For analysis, we included CIDs with reads greater than the inflection point calculated as described above using the top 12,000 CIDs by read count and more than 200 genes detected. Data was normalized using SCTransform [38]. Cells with a prediction score > 0.5 and a mitochondrial rate < 20% were retained for further analysis. Marker genes were identified using the FindMarkers() function using the Wilcoxon-rank-sum test comparing cells in each cell population with more than 10 cells to all other cells in the dataset.

Doublet rate estimation

CIDs were classified as multi-species multiplets if the number of total counts from each species exceeded the 10th percentile of the distribution for that species. The total count distributions were calculated using cells assigned in the step described above. Multiplet rates were calculated by taking CIDs above the inflection point and dividing the number of multi-species multiplets by the total number of cells. To obtain the true multiplet rate that considers non-detectable intraspecies multiplets and accounts for differing proportions of human and mouse cells, this fraction was divided by an adjustment factor λ:

  1. 9.
    $$ \lambda =2\frac{n_{\mathrm{human}}{n}_{\mathrm{mouse}}}{{\left({n}_{\mathrm{human}}+{n}_{\mathrm{mouse}}-{n}_{\mathrm{hm}}\right)}^2} $$

where nhuman is the number of human cells, nmouse is the number of mouse cells, and nhm is the number of inter-species multiplets.

Cell recovery rates

Cell recovery rates were calculated as the number of CIDs above the inflection point divided by the total number of cells loaded onto a system. Theoretical capture rates for 10x Genomics and ddSEQ methods are based on expected recovery numbers given in the user manual. Theoretical capture rates for Drop-seq were based on a 5% droplet occupancy of oligo beads. Theoretical capture rates for iCELL8 protocols were calculated based on a Poisson distribution of wells with single cells based on an average cell occupancy of 1 cell per well.

Estimation of ambient-RNA contamination

For each sample, UMI count matrices of filtered cells were analyzed using the decontX() function in the celda package (v1.2.4) [20] using cell classifications as cluster labels. Estimates of contamination were generated by calculating the difference in UMI counts after decontamination relative to the total number of UMIs detected for species-specific genes of a cell.

Pseudo-bulk analyses

Pseudo-bulk analyses analyzing correlation to bulk RNA-seq data and gene detection rates were performed by summing UMI counts across multiple cells. A subsample with at least 100 classified cells or the maximum number of classified cells recovered that meet the mitochondrial rate threshold were selected from each method. Mitochondrial rate thresholds were determined on a per-sample basis based on distribution of rates (Supplement Table 4). Various numbers of cells (1–50) were randomly sampled from this pool and expression values were averaged. The aggregated expression matrix was used for analyzing its correlation to bulk RNA-seq and for quantifying the number of detected genes. Mean values across 50 iterations for these metrics were used for visualization.

Dropout modeling

Dropout rates denote the fraction of missing values in a gene’s expression vector. We estimated the dropout rate for each gene from the species of the cell type considered. Cells included in the analysis were filtered by fraction of mitochondrial counts to remove poor quality cells. Dropout rate of bulk RNA-seq data was modeled by fitting the function f(x) = a exp(−bx) where x is the expression level using nonlinear least squares. GD50 FPKM numbers were calculated as 0.5 = a exp(−bx) using the fitted coefficients a and b. Dropout rates were similarly calculated for single-cell RNA-seq data by binning cells by mRNA detection rates. 10x 3′ v3 cells were placed into six bins based on distribution percentiles resulting equivalent numbers of cells in each bin. Dropout models were calculated for a random subset of 50 cells for each cell type with at least 50 cells in each bin and results were averaged across 50 iterations.

Differentially expressed gene identification

Statistical differences between gene expression of EL4 and IVA12 cells were identified using the hurdle model provided in the MAST R package v1.12.0 [24], a Wilcoxon rank-sum test, or the negative binomial generalized linear model available in the MASS R package v7.3–51.5. Genes that had an FDR-adjusted p-value < 1e-4 were declared differentially expressed. Cells from multiple replicates from each method were pooled in order to maximize sample sizes. Downsampling of cells was performed to the smallest number of observed cells from a single cell type (n = 199); this step was repeated 10 times to assess the error margin. UMI count data was used for 10x, ddSEQ, Drop-seq, and ICELL8 3′ DE-UMI samples, read count data was used for ICELL8 3′ DE samples. Expression count matrices were normalized by library size factors (i.e., total counts per cell), multiplied by 104, and log-transformed by log2(x + 1). Log-normalized count matrices were subjected to MAST, normalized count matrices were used for the Wilcoxon-rank sum test, and raw count data was input to the negative binomial generalized linear model.

Library size estimates were also determined using the computeSumFactors() function in the scran (v1.14.6) package [25] using cell classifications as cluster identities. These estimates were then correlated to total counts per cell.

Recall and precision of each method were calculated based on true-positive (TP), false-positive (FP) and false-negative DE genes (FN) as follows:

$$ \mathrm{Recall}=\frac{TP}{TP+ FN},\mathrm{Precision}=\frac{TP}{TP+ FP}, $$

where TP are significant DE genes as calculated with MAST with a 1.5 fold change in bulk RNA-seq data, FN are not differentially expressed in single-cell data, but had a 1.5 fold change in bulk data, and FP are significant DE genes with less than a 1.5 fold change in bulk data.

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its supplementary information files. The raw data has been deposited in the Gene Expression Omnibus (accession GSE163793) and the European Genome-phenome Archive (accession number available upon request).

Abbreviations

RNA-seq:

RNA sequencing

PBMC:

Peripheral blood mononuclear cell

DE:

Differentially-expressed

UMI:

Unique Molecular Identifier

CID:

Cell identifier

cDNA:

Complementary DNA

mRNA:

Messenger RNA

PCR:

Polymerase chain reaction

UTR:

Untranslated region

GD50 :

Gene detection 50

FPKM:

Fragments per kilobase of transcript per million mapped reads

TCR:

T-cell receptor

BCR:

B-cell receptor

References

  1. 1.

    Bandura DR, Baranov VI, Ornatsky OI, Antonov A, Kinach R, Lou X, Pavlov S, Vorobiev S, Dick JE, Tanner SD. Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal Chem. 2009;81(16):6813–22.

    CAS  Article  Google Scholar 

  2. 2.

    He H, Suryawanshi H, Morozov P, Gay-Mimbrera J, Del Duca E, Kim HJ, Kameyama N, Estrada Y, Der E, Krueger JG, et al. Single-cell transcriptome analysis of human skin identifies novel fibroblast subpopulation and enrichment of immune subsets in atopic dermatitis. J Allergy Clin Immunol. 2020;145(6):1615–28.

  3. 3.

    Sade-Feldman M, Yizhak K, Bjorgaard SL, Ray JP, de Boer CG, Jenkins RW, Lieb DJ, Chen JH, Frederick DT, Barzily-Rokni M, et al. Defining T cell states associated with response to checkpoint immunotherapy in melanoma. Cell. 2018;175(4):998–1013 e1020.

    CAS  Article  Google Scholar 

  4. 4.

    Zhang Y, Zheng L, Zhang L, Hu X, Ren X, Zhang Z. Deep single-cell RNA sequencing data of individual T cells from treatment-naive colorectal cancer patients. Sci Data. 2019;6(1):131.

    CAS  Article  Google Scholar 

  5. 5.

    Zheng C, Zheng L, Yoo JK, Guo H, Zhang Y, Guo X, Kang B, Hu R, Huang JY, Zhang Q, et al. Landscape of infiltrating T cells in liver Cancer revealed by single-cell sequencing. Cell. 2017;169(7):1342–56 e1316.

    CAS  Article  Google Scholar 

  6. 6.

    Zilionis R, Engblom C, Pfirschke C, Savova V, Zemmour D, Saatcioglu HD, Krishnan I, Maroni G, Meyerovitz CV, Kerwin CM, et al. Single-cell Transcriptomics of human and mouse lung cancers reveals conserved myeloid populations across individuals and species. Immunity. 2019;50(5):1317–34 e1310.

    CAS  Article  Google Scholar 

  7. 7.

    Zhang L, Li Z, Skrzypczynska KM, Fang Q, Zhang W, O'Brien SA, He Y, Wang L, Zhang Q, Kim A, et al. Single-cell analyses inform mechanisms of myeloid-targeted therapies in Colon Cancer. Cell. 2020;181(2):442–59 e429.

    CAS  Article  Google Scholar 

  8. 8.

    Ziegenhain C, Vieth B, Parekh S, Reinius B, Guillaumet-Adkins A, Smets M, Leonhardt H, Heyn H, Hellmann I, Enard W. Comparative analysis of single-cell RNA sequencing methods. Mol Cell. 2017;65(4):631–43 e634.

    CAS  Article  Google Scholar 

  9. 9.

    Svensson V, Natarajan KN, Ly LH, Miragaia RJ, Labalette C, Macaulay IC, Cvejic A, Teichmann SA. Power analysis of single-cell RNA-sequencing experiments. Nat Methods. 2017;14(4):381–7.

    CAS  Article  Google Scholar 

  10. 10.

    Mereu E, Lafzi A, Moutinho C, Ziegenhain C, McCarthy DJ, Álvarez-Varela A, Batlle E, Sagar GD, Lau JK, et al. Benchmarking single-cell RNA-sequencing protocols for cell atlas projects. Nat Biotechnol. 2020;38(6):747–55.

  11. 11.

    Tian L, Dong X, Freytag S, Le Cao KA, Su S, JalalAbadi A, Amann-Zalcenstein D, Weber TS, Seidi A, Jabbari JS, et al. Benchmarking single cell RNA-sequencing analysis pipelines using mixture control experiments. Nat Methods. 2019;16(6):479–87.

    CAS  Article  Google Scholar 

  12. 12.

    Zhang X, Li T, Liu F, Chen Y, Yao J, Li Z, Huang Y, Wang J. Comparative analysis of droplet-based ultra-high-throughput single-cell RNA-Seq systems. Mol Cell. 2019;73(1):130–42 e135.

    Article  Google Scholar 

  13. 13.

    Ding J, Adiconis X, Simmons SK, Kowalczyk MS, Hession CC, Marjanovic ND, Hughes TK, Wadsworth MH, Burks T, Nguyen LT, et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol. 2020;38(6):737–46.

  14. 14.

    Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049.

    CAS  Article  Google Scholar 

  15. 15.

    Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, et al. Highly parallel genome-wide expression profiling of individual cells using Nanoliter droplets. Cell. 2015;161(5):1202–14.

    CAS  Article  Google Scholar 

  16. 16.

    Goldstein LD, Chen YJ, Dunne J, Mir A, Hubschle H, Guillory J, Yuan W, Zhang J, Stinson J, Jaiswal B, et al. Massively parallel nanowell-based single-cell gene expression profiling. BMC Genomics. 2017;18(1):519.

    Article  Google Scholar 

  17. 17.

    Picelli S, Bjorklund AK, Faridani OR, Sagasser S, Winberg G, Sandberg R. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods. 2013;10(11):1096–8.

    CAS  Article  Google Scholar 

  18. 18.

    Picelli S, Faridani OR, Bjorklund AK, Winberg G, Sagasser S, Sandberg R. Full-length RNA-seq from single cells using smart-seq2. Nat Protoc. 2014;9(1):171–81.

    CAS  Article  Google Scholar 

  19. 19.

    Islam S, Zeisel A, Joost S, La Manno G, Zajac P, Kasper M, Lonnerberg P, Linnarsson S. Quantitative single-cell RNA-seq with unique molecular identifiers. Nat Methods. 2014;11(2):163–6.

    CAS  Article  Google Scholar 

  20. 20.

    Yang S, Corbett SE, Koga Y, Wang Z, Johnson WE, Yajima M, Campbell JD. Decontamination of ambient RNA in single-cell RNA-seq with DecontX. Genome Biol. 2020;21(1):57.

    Article  Google Scholar 

  21. 21.

    Andrews TS, Hemberg M. M3Drop: dropout-based feature selection for scRNASeq. Bioinformatics. 2019;35(16):2865–7.

    CAS  Article  Google Scholar 

  22. 22.

    Tunnacliffe E, Chubb JR. What is a transcriptional burst? Trends Genet. 2020.

  23. 23.

    Lenstra TL, Rodriguez J, Chen H, Larson DR. Transcription dynamics in living cells. Annu Rev Biophys. 2016;45:25–47.

    CAS  Article  Google Scholar 

  24. 24.

    Finak G, McDavid A, Yajima M, Deng J, Gersuk V, Shalek AK, Slichter CK, Miller HW, McElrath MJ, Prlic M, et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 2015;16:278.

    Article  Google Scholar 

  25. 25.

    Lun AT, Bach K, Marioni JC. Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol. 2016;17:75.

    Article  Google Scholar 

  26. 26.

    Soneson C, Robinson MD. Bias, robustness and scalability in single-cell differential expression analysis. Nat Methods. 2018;15(4):255–61.

    CAS  Article  Google Scholar 

  27. 27.

    Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM 3rd, Hao Y, Stoeckius M, Smibert P, Satija R. Comprehensive integration of single-cell data. Cell. 2019;177(7):1888–902 e1821.

    CAS  Article  Google Scholar 

  28. 28.

    Lun ATL, Riesenfeld S, Andrews T, Dao TP, Gomes T, Participants in the 1st human cell atlas J, Marioni JC. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 2019;20(1):63.

    Article  Google Scholar 

  29. 29.

    Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby C, Zagar M et al: Integrated analysis of multimodal single-cell data. BioRxiv 2020, 335331.

  30. 30.

    Wang YJ, Schug J, Lin J, Wang Z, Kossenkov A, , Kaestner KH: Comparative analysis of commercially available single-cell RNA sequencing platforms for their performance in complex human tissues. BioRxiv 2019, 541433.

  31. 31.

    Lu DR, Wu H, Driver I, Ingersoll S, Sohn S, Wang S, Li CM, Phee H. Dynamic changes in the regulatory T-cell heterogeneity and function by murine IL-2 mutein. Life Sci Alliance. 2020;3(5):e201900520.

  32. 32.

    Hanson WM, Chen Z, Jackson LK, Attaf M, Sewell AK, Heemstra JM, Phillips JD. Reversible oligonucleotide chain blocking enables bead capture and amplification of T-cell receptor alpha and beta chain mRNAs. J Am Chem Soc. 2016;138(35):11073–6.

    CAS  Article  Google Scholar 

  33. 33.

    Saikia M, Burnham P, Keshavjee SH, Wang MFZ, Heyang M, Moral-Lopez P, Hinchman MM, Danko CG, Parker JSL, De Vlaminck I. Simultaneous multiplexed amplicon sequencing and transcriptome profiling in single cells. Nat Methods. 2019;16(1):59–62.

    CAS  Article  Google Scholar 

  34. 34.

    Liao Y, Smyth GK, Shi W. FeatureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30(7):923–30.

    CAS  Article  Google Scholar 

  35. 35.

    Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.

    CAS  Article  Google Scholar 

  36. 36.

    Romagnoli D, Boccalini G, Bonechi M, Biagioni C, Fassan P, Bertorelli R, De Sanctis V, Di Leo A, Migliaccio I, Malorni L, et al. ddSeeker: a tool for processing bio-rad ddSEQ single cell RNA-seq data. BMC Genomics. 2018;19(1):960.

    CAS  Article  Google Scholar 

  37. 37.

    Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. Genome project data processing S: the sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.

    Article  Google Scholar 

  38. 38.

    Hafemeister C, Satija R. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 2019;20(1):296.

    CAS  Article  Google Scholar 

Download references

Acknowledgments

The authors would like to thank Wenny Chou for assistance in data generation and Drs. Ian Driver and Wenjun Ouyang for valuable inputs to experimental design and manuscript preparation.

Funding

This study was fully funded and supported by Amgen Inc.

Author information

Affiliations

Authors

Contributions

TMY, DRL, and CML designed the experiments and drafted the manuscript. TMY, DRL, DCE, and CML analyzed the data. HZ, DB, PM, VA, TMY and DRL prepared and sequenced the NGS libraries and coauthored the relevant method sections. DCE, OKY and OH set up the bioinformatics analysis pipeline and provided input and assistance in preparation of this manuscript. SW and CML initiated the study, made contributions to experiment design, and advised on data analysis, validation, and manuscript preparation. All authors read and approved the final draft of the manuscript. This manuscript also has been reviewed and approved by Amgen Final Publication Review (FPR) process.

Corresponding author

Correspondence to Chi-Ming Li.

Ethics declarations

Ethics approval and consent to participate

PBMC samples were collected anonymously through an internal Amgen Research Blood Donor Program that was monitored by Amgen Occupational Health in EH&S and considered exempt from IRB approval.

Consent for publication

Not applicable.

Competing interests

The authors have read the journal’s policy and have the following conflicts: Tracy M. Yamawaki, Daniel R. Lu, Daniel C. Ellwanger, Dev Bhatt, Paolo Manzanillo, Vanessa, Arias, Hong Zhou, Oliver Homann, Songli Wang, and Chi-Ming Li are employees or contract workers at Amgen Inc. Oh Kyu Yoon was employed by Amgen Inc. while working on the study. All authors owned Amgen shares when the experiments were carried out. However, these do not alter the authors’ adherence to all journal policies on sharing data and materials.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Yamawaki, T.M., Lu, D.R., Ellwanger, D.C. et al. Systematic comparison of high-throughput single-cell RNA-seq methods for immune cell profiling. BMC Genomics 22, 66 (2021). https://doi.org/10.1186/s12864-020-07358-4

Download citation

Keywords

  • Single cell
  • Transcriptomics
  • Single-cell RNA-seq
  • High throughput sequencing
  • Immune-cell profiling