An open protocol for modeling T Cell Clonotype repertoires using TCRβ CDR3 sequences

T cell receptor repertoires can be profiled using next generation sequencing (NGS) to measure and monitor adaptive dynamical changes in response to disease and other perturbations. Genomic DNA-based bulk sequencing is cost-effective but necessitates multiplex target amplification using multiple primer pairs with highly variable amplification efficiencies. Here, we utilize an equimolar primer mixture and propose a single statistical normalization step that efficiently corrects for amplification bias post sequencing. Using samples analyzed by both our open protocol and a commercial solution, we show high concordance between bulk clonality metrics. This approach is an inexpensive and open-source alternative to commercial solutions.


Background
The receptors on the surface of T cells bind to an enormous array of antigens that play a pivotal role in shaping immune response during health and disease. The T cell receptor (TCR) is a heterodimer composed of one alpha and one beta chain which are encoded by the TCRα and TCRβ genes, respectively. To recognize an extremely large antigen space, the TCR genomic loci undergo somatic recombination of variable (V), diversity (D), and joining (J) gene segments, and generate a diverse repertoire of TCRs. The complementarity determining region 3 (CDR3) region present at the D segment of the recombined TCRβ gene is highly diverse in TCR beta chains. Therefore, surveying the recombined TCRβ gene or transcript as a proxy for overall TCR repertoire diversity has emerged as a rational approach to study TCR repertoire dynamics.
Over the last ten years it became possible to obtain comprehensive pro les of TCR through an array of next generation sequencing (NGS) based approaches (1)(2)(3)(4)(5)(6)(7)(8). These vary based on the experiment type (bulk or single-cell), sample type (RNA or DNA), library preparation (multiplex PCR, bead-based enrichment, 5'RACE) and sequencing platform, each choice presenting a different, and fascinatingly interlocked trade-off. Single cell approaches compared to bulk can be very accurate and unbiased for high frequency clones, but have lower resolution for low frequency clones (9,10). They are also substantially more expensive and require intact cells. RNA-based approaches are affected by the variability in TCR RNA expression levels, but may better re ect diversity when the sample size is limited (8). Genomic DNA (gDNA)-based approaches require either multiplex PCR or target enrichment during library preparation which introduces biases (11). A recent comparison of these two approaches con rmed these issues for both RNA-and DNA-based methods but also found the methodological variability to be smaller than the biological variability (12). New innovations are being introduced rapidly for all of these approaches but currently there is no established gold standard.
Multiplexed PCR-based bulk sequencing approaches using gDNA, however, have become the standard approach for translational and even clinical applications due to reasonable sample requirements and moderate costs (13). This is re ected in the fact that all currently available commercial TCR sequencing products offer this as their major DNA option (Adaptive Biotechnologies (14), BGI (15), iRepertoire).
Multiplex PCR refers to the usage of multiple forward primers speci c for the V segments and multiple reverse primers speci c for the J segments in combination during the initial ampli cation for target enrichment. Since each primer pair will have a different e ciency multiplexing will distort the relative abundances of the VDJ segment combinations (16). Correcting for this ampli cation bias is a key challenge for accurate quanti cation. One approach (14) is using spiked-in oligomers (synthetic templates) for each primer pair to measure primer e ciencies and to carefully control the design of the primers and their concentrations accordingly. Assuming that there is no interaction between the e ciency of primer pairs, ampli cation bias is reduced by iteratively calibrating the primer concentrations to nd the optimal primer mix, and then removing any remaining ampli cation bias computationally using spiked-in oligomer counts.
The proprietary setup described in (14) is currently available for human and murine samples exclusively through commercial kits. These are di cult and labor-intensive to adapt if requiring different settings, e.g. a different mammal or application. Furthermore, synthetic templates when added to all of the samples with the kits can substantially increase preparation and sequencing costs, as well as decrease coverage for clonotypes.
Here we demonstrate, that using a negative binomial mean normalization strategy, it is possible to normalize ampli cation bias within a reasonable error margin without balancing primer concentrations using synthetic templates. Furthermore, we report that ampli cation bias parameters are highly preserved between datasets. This observation indicates that one can use synthetic template normalization controls for a small subset of samples, which can then be used to calculate mean scaling factors per batch/experiment, thus substantially reducing the cost and coverage of TCR repertoire analysis.

Ampli cation bias due to multiplex PCR is reproducible
To amplify all possible TCR somatic recombination products, we performed a multiplex PCR with 20 different V-speci c forward primers and 13 different J-speci c reverse primers. Since the differences in e ciency of primer pairs can produce signi cant ampli cation bias in TCR clonotypes, we spiked-in 260 synthetic TCR templates (ST) in equimolar concentration as internal controls to the multiplex PCR reaction in order to measure and control bias (Fig. 1A). We hypothesized that the estimated mean counts of the 260 ST would scale proportionally across samples and experiments. To test this, we measured the distribution and variation of ST counts across 20 ST-only samples in the absence of genomic DNA (Fig. 1B). Within each sample, ST counts were converted to relative frequencies to reduce the effect of random sample-to-sample variation on the comparisons. For a given ST, deviation of the median relative frequency across all samples from 1/260 is a measure of the PCR ampli cation bias for the corresponding primer pair, while the spread of relative frequencies is an indication of the random sampleto-sample variation. We observed that the ST-to-ST variation was much larger than the sample-to-sample variation within an individual ST (~ 20-fold difference in respective median IQRs). These differences indicate that the observed differences in ST counts are primarily caused by ampli cation bias of different primer pairs. The ln (1/260) line indicates the (log) expected ST relative frequency in the absence of ampli cation bias. To demonstrate that the above observations apply in the presence of TCR clonotypes, we obtained ST counts from samples with genomic DNA extracted from P14 and OT-1 TCR transgenic mice where CD8 T cells primarily recognize OVA 257 − 264 when presented by the MHC I molecule. A similar relationship was found between ST relative frequencies in the presence of TCR clonotypes across samples, though the sample-to-sample variation is noticeably larger (Supplementary Fig. 1 At the high end, we expect an approximate linear relationship between log mean and log variance for the ST counts, with a slope of about 2, since log v ≈ log d + 2log m for large m. In Fig. 2, we took d = 0.125, as this was the median value of the dispersion estimates found by tting separate NB distributions to the 260 sets of 20 ST counts. The fact that the ST counts t NB distributions with approximately similar overdispersion parameters reassures us the variation we are seeing is in some sense natural and that the system is in control (17). revealed that the reduction in spread of ST normalized counts was present, although less pronounced ( Supplementary Fig. 2). To validate the normalization procedure further, we assessed the observed ratio of monoclonal counts before and after normalization utilizing the 50:50 mixture of P14 and OT-1 TCR transgenic monoclonal DNA. After normalization, the differences between the transgenic TCR counts were substantially reduced for all samples, as represented by the smaller deviations of the proportion of the dominant clonotype from ½ following normalization (Fig. 4B).

Scaling factors are relatively stable across experiments
Ampli cation bias reduction bene ts from the dependence of primer pairs A question from data presented in Figs. 4A and 4B that arose had to do with determining how great a reduction in the spread of the 260 ST counts would be possible, given the variation, even in the absence of ampli cation bias. A theoretical analysis is presented in Supplementary Appendix 2 under the assumption that counts from equimolar concentrations of the 260 ST are independent NB distributions with the same mean and dispersion parameters gives a lower bound. Seeing that counts from the 20 STonly samples were well approximated by NB distributions with the same dispersion parameters (albeit with quite different means), our conclusion was that the different ST counts were likely not independent.
Further, this result supports the normalization scheme, as the dependence aids reduction of the ampli cation bias below the level that would be expected under independence. Since each primer pair shares one primer with 32 other primer pairs, it is not surprising that the different ST counts are not independent; indeed, patterns of dependence in counts using chi-squared statistics are observed (Fig. 5).
The interactions revealed as patches of red and blue colors demonstrate that groups of V primers exhibit positive or negative dependence together with groups of J primers, that is, they interact to become over or under-represented in groups.
In their experimental setup, Carlson et al., using an ANOVA based approach, concluded that primer pairs can be treated independently, and employed primer iteration experiments to nd the optimal primer mix (14). Based on our normalization approach, however, the non-random, non-zero interaction terms revealed by chi-squared statistics substantially complicate preparation of an optimal primer mix through primer iteration experiments, and indicate a limit on the extent to which ampli cation bias can be addressed experimentally. We note that Carlson et al. also employed a second, computational normalization step, which we believe is primarily due to this limit.

A negative binomial model supports downstream analyses of T cell repertoire dynamics
With the TCR clonotype counts normalized, we wanted to determine if results from these analyses were concordant with results from a commercially available platform (Adaptive Biotechnologies) described by Carlson et al (14). To achieve this, we utilized gDNA generated from pancreatic ductal adenocarcinoma specimens described in Byrne et al(18) containing 17 samples previously sequenced by Adaptive Biotechnologies' platform. Aliquots of samples were sequenced and processed using the protocol described above. We utilized in-house software (see methods), tcR package, and VDJ tools to compute TCR repertoire metrics such as the diversity, clonality, and clonal distribution and refer to this pipeline as Open TCR Sequencing Protocol (OTSP). The 16 samples with enough DNA were sequenced in parallel and results were evaluated for concordance using Pearson correlation analyses for all combinations of: 1) Adaptive Biotech. platform sequences run by OTSP; 2) OTSP sequences run by OTSP; and 3) Adaptive Biotech. platform sequences run by Adaptive Biotech. Figure 6A and Fig. 6B show concordance between results of OTSP sequences run by OTSP and Adaptive Biotech. sequences run by OTSP for Clonal index (r = 0.8) and Shannon diversity index (r = 0.8). The concordance for other combinations for Clonal Index and Shannon Diversity, as well as concordance between results of OTSP sequences run by OTSP and Adaptive Biotech. platform sequences run by Adaptive Biotech. for the frequency of hyperexpanded clones are shown in Supplementary Fig. 3 (A-E). Overall, results from the OTSP pipeline demonstrates a strong association with results from the Adaptive Biotech. TCR sequencing platform (p < 0.001 for all comparisons).
The descriptions and formulas related to Clonal index, Shannon diversity index and frequency of hyperexpanded clones are included in (19), which deploys the NB mean normalization methodology described here in a biological context.

Discussion
Measuring and monitoring adaptive dynamics in patient TCR repertoires could have a signi cant impact on response and resistance monitoring for patients receiving various forms of immunotherapy in the treatment of cancer or auto-immune diseases. To achieve the goal of capturing the diversity, quantifying the abundance of T-cell clones and performing longitudinal comparisons, TCR sequencing has emerged as an approach to monitor T cell responses to therapy and disease progression.
The OTSP pipeline described herein provides a transparent protocol enabling clonality metrics, including evaluating ampli cation bias speci c to each primer pair, and is reproducible across samples. Count variability approximates the negative binomial distribution that can be exploited using estimated NB means. The NB distribution tells us the variation anticipated in ST counts is indicative of a controlled process.
OTSP pipeline results were compared to data generated from the commercial platform by xing 260 STspeci c scaling factors derived from ST-only measurements (without the presence of genomic DNA). We observed a high concordance between bulk clonality metrics across the two platforms. This observation indicates that the OTSP pipeline can be integrated across batches, samples and platforms, further improving utility of TCR clonality measurements, something not generally possible when using commercial platforms as ST-counts are typically not provided.
OTSP is open and freely available; we anticipate this will allow scaling up the number of measurements substantially. Since we only use computational normalization, rather than addressing differing primer e ciencies at the bench level, the OTSP methodology is also less labor-intensive than previous methods. Notably, OTSP avoids the primer iteration experiments needed to address ampli cation bias problems.
Since PCR ampli cation bias is repeatable across samples we have demonstrated the possibility of conducting analyses without the addition of ST beyond the initial calibration. This is achieved by using ST-speci c normalization scale factors obtained from independent, ST-only measurements. The idea of using 260 universal ST-speci c scaling factors to address ampli cation bias in mouse models could be explored further, for if deployed this would substantially decrease cost of this methodology as ST are costly. An additional advantage stems from the fact that ST reduce sequencing depth due to competition between genomic DNA and the ST (Supplementary Fig. 4). We observed that when gDNA amount was kept constant at 600 ng, increasing concentrations of ST led to decreasing detectability of clonotypes, indicating a competition between the ST and clonotypes during the process. This is especially important as indicated by our results revealing that drop-outs can be frequent, even for the most abundant clonotypes ( Supplementary Fig. 5). For example, when the most frequent 0.3% clonotypes from Wild Type spleen tissue were used, we observed that only 119 distinct clonotypes were detected in the 5 replicates, with 63 clonotypes detected in 4 samples, 69 in 3, and 90 in 2, respectively. These drop-outs stem from the stochastic nature of the sampling and could be reduced by increasing the read coverage by not using ST.

Conclusions
Measuring and tracing changes in the abundance of clones is important for observing the immune response to different therapies such as cancer immunotherapy, leukemia monitoring and predicting patient outcomes.
A high concordance between bulk clonality metrics across Adaptive Biotech., one of the leading commercial companies in the eld, and OTSP is observed. Our approach is a less laborious (as OTSP avoids the primer iteration experiments needed to address ampli cation bias problems), reproducible and an inexpensive alternative for understanding relative abundances of clonotypes. Utilization of STs for every experiment in a batch beyond the initial ST calibration may not be necessary and their utilization in every batch decreases the sequencing depth.
We propose the OTSP as an open-source, transparent protocol that e ciently corrects for ampli cation bias post sequencing for accurate, reproducible measurement of clonality metrics.

Material And Methods
The aim of the methodology is modeling T Cell Clonotype repertoires using TCRβ CDR3 sequences utilizing NGS, multiplex PCR and an NB mean normalization strategy. Some of the experiments were performed in the context of genomic DNA derived from mice.

Mouse handling
To generate the Byrne et al data set, wild-type C57BL/6 mice were purchased from The Jackson Laboratory and housed at the University of Pennsylvania. Animal protocols were reviewed and approved by the Institute of Animal Care and Use Committee at the University of Pennsylvania. Mice were euthanized in a CO2 chamber using a ow meter to ensure CO2 was displaced at a rate of 30-70% of the chamber volume per minute and maintained for at least 1 minute after the loss of righting re ex is observed. Euthanasia was con rmed by bilateral thoracotomy. Animal handling information regarding tumor injections, drug prep and injection are included in Byrne et al (18).
To generate transgenic TCR data sets, spleen tissue from P14 and OT1 TCR transgenic mice (Nolz lab) were obtained. The spleen tissue was homogenized, treated with RBC lysis buffer and the resulting singlecell suspension was pelleted. Genomic DNA was extracted from the cell pellet using DNeasy blood and tissue kit (Qiagen).
To generate all the other data using animals, wild-type C57BL/6 mice were purchased from The Jackson Laboratory and maintained within the UCSF or OHSU laboratory for animal care barrier facility according to IACUC procedures. To generate the ST dilution series data set with mammary tumor and the Mesothelioma data set, we used mammary tumors from mouse mammary tumor virus (MMTV)-Polyomavirus middle T (PyMT) transgenic FVB/N mice and 40L orthotopic mesothelioma tumors respectively. Mammary tumors were resected from day 95 mice. For 40L orthotopic mesothelioma tumors, 2 x 10 6 cancer cells were injected i.p. into wild-type male C57BL/6 that were 6-12 weeks of age.
For both tumor models, all mice were euthanized at a pre-de ned end-stage for tissue harvest by cardiac puncture followed by cervical dislocation. These mice were cardiac perfused, under anesthesia using 1-5% iso urane, with 20 mL solution of heparin in PBS to clear tissues of residual blood followed by tissue harvest for further analysis including TCR sequencing. Tumor tissue was excised and ash frozen in liquid nitrogen and stored at -80 degree Celsius until further use for extracting genomic DNA for TCR sequencing. Murine SCC tumors were obtained from a previously published study, Medler et al (19).

TCR data analysis Pipeline
Fastq les were assessed for initial read quality using the FASTQC public tool (22), including the per-base quality scores. Quality paired-end sequences were combined using the PEAR (Paired-End reAd mergeR) algorithm (23). Merged sequences were then separated into ST and non-ST sequences. ST sequences were identi ed by searching for the common anking 9-bp internal barcodes allowing a one-nucleotide mismatch or indel. Sequences agged as ST via this search were removed from downstream clonotype analyses. The individual ST sequences were distinguished and quanti ed by searching for the speci c 16-bp barcode sequences unique to each ST, again allowing a one-nucleotide mismatch or indel ( Supplementary Fig. 6). Clonotypes were identi ed from puri ed (ST-removed) sequences utilizing the MiXCR pipeline (24), which is a two-step alignment and assembly process. First, reads were aligned to reference V, D, and J sequences, using the align module. Next, the assemble module grouped alignments into distinct clonotypes using a hierarchical clustering method based on sequence similarity and relative abundance. Finally, the export module exported alignments as well as assembled clones in tabular format. Raw clonotype counts were normalized using the NB mean normalization strategy described below. Normalized clonotype counts were exported in tabular format for use in downstream analysis. A number of TCR repertoire metrics, including clonality, maximum clonal frequency, and the Shannon diversity index were calculated. Quality control data was recorded in an overall summary  Fig. 7).
ST dilution series data set: A dilution series was created from ST at six levels as below WT spleen data set: Wild-type mouse spleen genomic DNA were sequenced for a total of ve technical replicates. A single mouse was used to create this data set.
Byrne et al. data set (18): gDNA from 17 murine pancreatic ductal adenocarcinoma specimens were previously sequenced by a commercially available TCR beta platform. Aliquots of these gDNA samples were obtained and sequenced along with an equimolar mixture of ST using the protocol described above. In total, 17 murine samples were used to create this previously published data set.
Mesothelioma data set: Peritoneal mesothelioma tumors derived from the 40L cell line (20)derived genomic DNA samples derived from 12 syngeneic mice were sequenced along with equimolar mixture of ST. In total, 12 mice were used to create this data set.
In total, 60 mice were used to create above datasets to assess methodological approaches to normalizing TCR repertoires. No groups of animals were compared to each other as the purpose of the study has no biological study endpoint. Therefore, no a priori sample size calculations performed.

Batch mean scaling factors
This method creates a scaling factor for each ST (and therefore for each primer pair) based on that ST's counts among all samples, relative to all ST counts within a batch. Given a matrix of ST counts from a batch, where i = 1,…,260 labels ST and j = 1,…,n labels samples in the batch, we denote the batch mean of the counts for ST i by and the batch mean of all ST means by . The scaling factor (SF) for ST i in that batch is .

Negative Binomial means
The above idea of a scale factor is distribution free, but for its use in normalizing counts, would require a full set of ST in every sample. We explored the use of a ST-speci c negative binomial model to dispense with the use of synthetic templates. Consider the set (C 1 , …., C n ) of counts for single, xed ST across a batch of n replicate ST-only samples. A plausible model for these counts is the negative binomial (NB) distribution. We write for this distribution, where m > 0 is the mean parameter and d≥0 is the (over-) dispersion parameter, and refer to (25) for an explicit formula for the NB probability mass function. For present purposes, S has expected value and variance . When d=0, the negative binomial reduces to the Poisson distribution, for which , and thus the use of the term over-dispersion here is relative to the Poisson. Using the methods of generalized linear models (25), we can obtain the maximum likelihood estimates (MLE) of m and d from a replicate set of ST such as (C 1 , …., C n ). In the notation of the previous paragraph, the MLE , that is, the MLE of the ith mean parameter of an NB tted to (C ij ) is the arithmetic mean of the ith set of observed counts (assumed independent and identically distributed across j=1,…,n with common mean and common dispersion parameter. Where no confusion will result in what follows, we will not distinguish the parameters (m i ) from their (maximum likelihood) estimates . . Similarly, we normalize the (C i ) using the (estimated) NB mean scaling factors by dividing: . After these primer-pair totals were normalized, the counts for distinct clonotypes sharing the same primer-pair were normalized: if one such clonotype accounts for a proportion p of the total count C corresponding to its primer pair, then it will be assigned a normalized value equal to the same proportion p of the normalized primer-pair total C' or C". .
The same normalization could be used for ST counts if available. That is, divide the observed count of the fragments arising from primer pair i by the SF i or for primer pair i for both ST and clonotype counts alike. Since the mean count of ST i will be proportional to , normalization should preserve the total count of ST, exactly for batch mean normalization, on average for NB normalization. As long as the observed clonotype counts exhibit the same relative over-and under-representation after ampli cation as that exhibited by the ST, any bias will be reduced by this normalization.       Normalization reduces ampli cation bias.
(A) NB normalization reduces ampli cation bias. Variation in ST counts within samples is apparent before normalization (red), however, the ST-to-ST differences within each sample are reduced to less than two-fold after normalization (cyan). Data derived from twenty ST-only samples described in the ST-only data sets.
(B) Ampli cation bias of spleen genomic DNA of P14 and OT-1 TCR transgenic mice. To further evaluate normalization parameters, a 50:50 mixture of P14 and OT1 TCR transgenic monoclonal DNA was utilized to examine differences between transgenic TCR counts (red) that were reduced for all samples following normalization (cyan), as observed by the smaller deviations from ½ of the proportions of the dominant clonotype. Red color indicates values from clone counts before normalization, green color indicates values from normalized clone counts. Data derived from 12 50:50 mixture ofP14 and OT-1 transgenic micesamples described in the Transgenic TCR data sets. Cluster analysis revealing dependence between forward and reverse primers. positive and negative deviation from independence between forward and reverse primers respectively, and white indicates no deviation. Data derived from twenty ST-only samples described in the ST-only data sets.

Figure 6
Concordance analysis of T cell repertoire metrics.
Concordance analysis comparing commercial and in-house pipelines where samples were evaluated based on the results of OTSP sequences run by OTSP and the Adaptive Biotech platform sequences run by OTSP for Clonal Index (A) and Shannon Diversity Index (B