**A Bayesian model for detecting antibody enrichment.** A succinct summary of the model notation is provided in Supplementary Table S5. On a 96-well plate suppose we observe \(Y_{ij}\) read counts for peptide \(i \in \{1, 2, \ldots , P\}\) in sample \(j \in \{1, 2, \ldots , 96\}\). Let \(n_{j} = \sum _{i} Y_{ij}\) denote the total read count (library size) for sample *j*. Without loss of generality, assume samples \(\{1, 2, \ldots , N\}\) are mock IP (beads-only) samples. To infer reactivity, we compare one sample to all beads-only samples on the same plate. Our hierarchical model to infer peptide reactivity in a sample \(j \in \{N+1, \ldots , 96\}\) is described as follows.

$$\begin{aligned} Y_{ij}|\theta _{ij}&\sim \text {Binomial}(n_{j}, \theta _{ij}) \\ \theta _{ij}|a_{i0}, b_{i0}, c_{j}, \phi _{ij}&\sim \text {Beta}(f_{a}(c_{j} \phi _{ij} \mu _{i0}, \sigma ^{2}_{i0}), f_{b}(c_{j} \phi _{ij} \mu _{i0}, \sigma ^{2}_{i0})) \\ c_{j}&\sim \text {Beta}(a_{c}, b_{c}) \\ \phi _{ij}|Z_{ij}&\sim (1 - Z_{ij}) \cdot 1 + Z_{ij}(\phi _{min} + \text {Gamma}(a_{\phi }, b_{\phi })) \\ Z_{ij}|\pi _{j}&\sim \text {Bernoulli}(\pi _{j})\\ \pi _{j}&\sim \text {Beta}(a_{\pi }, b_{\pi }) \end{aligned}$$

The main parameter of interest \(Z_{ij}\) is a binary indicator denoting whether peptide *i* elicits an enriched antibody response in sample *j* (a 0 indicates no, a 1 indicates yes). The prior is a Bernoulli with success probability \(\pi _j\). For all mock IP samples, this success probability is zero. For sample *j*, \(\pi _{j}\) is modeled as a beta distribution. The shape parameters \(a_{\pi }\) and \(b_{\pi }\) of the Beta prior distributions are chosen as 2 and 300 in our applications, to reflect peptide enrichment seen in previous studies, but also make it sufficiently diffuse to support a range of proportions (Supplementary Fig. S17). The parameter \(\phi _{ij}\) is the fold change observed for peptide *i* in sample *j*. It is equal to 1 if \(Z_{ij} = 0\), i.e., peptide *i* does not elicit an enriched antibody response in sample *j*. Enriched peptides are expected to pull a larger proportion of reads, so only fold changes larger than 1 are considered. Here, we model the fold-change as a shifted gamma distribution (with shape parameters \(a_{\phi }\) = 1.25 and \(b_{\phi }\) = 0.1, Supplementary Fig. S17), with the magnitude of the shift \(\phi _{min}\) being the minimum fold-change assumed for an enriched peptide (chosen as 1 in our applications). In the presence of reactive peptides pulling reads, non-reactive peptides in sample *j* will have less reads than expected from the beads-only samples where no reactive peptides exist by definition. We denote this attenuation constant for sample *j*, which is similar to the trimmed mean of M-values (TMM) scale factor used in edgeR [24], with \(c_{j}\). Typically, only a minority of peptides in a sample show reactivity and the attenuation constant usually is between 0.5 and 1 (being equal to 1 in mock IP samples). In this application, we chose a Beta prior with scaling constants \(a_{c} = 80\) and \(b_{c} = 20\) (Supplementary Fig. S17; the attenuation constant is equal to 1 in the mock IP samples). The observed read counts \(Y_{ij}\) are modeled using a Binomial\((n_{j}, \theta _{ij})\) distribution, where \(\theta _{ij}\) denotes the probability that peptide *i* pulls a read in sample *j*, and \(n_{j}\) denotes the total library size in sample *j*. This Binomial probability is modeled using a Beta prior distribution, and the shape parameters depend on the expected peptide read counts observed in the mock IP samples (estimation procedures described below), the fold change \(\phi _{ij}\), and the attenuation constant based on the reads pulled by all reactive peptides in the sample.

**Shape parameter estimation.** We define two functions, \(f_{a}\) and \(f_{b}\), used for the description of the Beta shape parameters *a* and *b* given a mean \(\mu\) and a variance \(\sigma ^{2}\).

$$\begin{aligned} f_{a}(\mu , \sigma ^{2}) = \frac{\mu ^{2}(1 - \mu )}{ \sigma ^{2}} - \mu \quad \text {and} \quad f_{b}(\mu , \sigma ^{2}) = f_{a}(\mu , \sigma ^{2}) \left( \frac{1}{\mu } - 1\right) . \end{aligned}$$

The parameterization for the Beta shape parameters above is the same as used in the methods of moments estimation, and the mean and variance for peptide *i* in a beads-only sample (e.g., for a Beta distribution with shape parameters \(a_{i0}\) and \(b_{i0}\)) are given by

$$\begin{aligned} \mu _{i0} = \frac{a_{i0}}{a_{i0} + b_{i0}} \quad \text {and} \quad \sigma ^{2}_{i0} = \frac{a_{i0}b_{i0}}{(a_{i0} + b_{i0})^{2}(a_{i0} + b_{i0} + 1)}. \end{aligned}$$

Since each sample generally contains more than a million of reads, estimates of the Binomial probabilities \(\hat{\theta }_{ij} = \frac{Y_{ij}}{n_{j}}\) in the mock IP samples \(j \in \{1, 2, \ldots , N\}\) are very precise. Method of moments (MOM) estimates for the peptide *i* shape parameters \(a_{i0}\) and \(b_{i0}\) can be derived by equating the mean and variance of the above Binomial estimates across all beads-only samples to the mean and variance of the Beta\((a_{i0}, b_{i0})\) distribution.

$$\begin{aligned} \hat{\theta }_{i0}= & {} \frac{1}{N} \sum\limits_{j = 1}^{N} \hat{\theta }_{ij} \\ \hat{\sigma }^{2}_{i0}= & {} \frac{1}{N - 1} \sum\limits_{j = 1}^{N} (\hat{\theta }_{ij} - \hat{\theta }_{i0})^{2}. \end{aligned}$$

The MOM estimates for \(a_{i0}\) and \(b_{i0}\) are then given by

$$\begin{aligned} \hat{a}_{i0}^{\text {MOM}}= & {} f_{a}(\hat{\theta }_{i0}, \hat{\sigma }^{2}_{i0})\\ \hat{b}_{i0}^{\text {MOM}}= & {} f_{b}(\hat{\theta }_{i0}, \hat{\sigma }^{2}_{i0}). \end{aligned}$$

Maximum likelihood estimates (MLEs) for \(a_{i0}\) and \(b_{i0}\) were derived using the Broyden, Fletcher, Goldfarb and Shanno quasi-Newton optimiziation algorithm with box constraints [27], as implemented in the R optim() function.

Numerous papers have demonstrated the benefits of shrinkage or variance stabilization in high throughput genomics experiments, borrowing strength across units such as genes and proteins [20, 22, 28, 29]. This can be particularly important when the sample sizes are small, such as the number of mock IP experiments on each plate in our application, but neither the MLEs nor the MOM estimates described above make use of this. In contrast for example, edgeR uses an emprical Bayes approach [30] to approximate the larger than binomial variability observed in the RNA-Seq read counts, and to stabilize these variance estimates, which are characterized by the tagwise dispersion parameter (the squared coefficient of variation of \(\hat{\theta }_{ij}\), denoted here as \(\tau ^{\text {edgeR}}_{ij}\)) [20]. We note that we can use the estimates of these tagwise dispersion parameters to derive new estimates of the variances for our Binomial probabilities \(\theta _{ij}\). Specifically, for peptide *i* we have \(\left( \hat{\sigma }^{\text {edgeR}}_{i0}\right) ^{2} = \tau ^{\text {edgeR}}_{ij}*\hat{\theta }_{i0}^{2}\), and thus

$$\begin{aligned} \hat{a}_{i0}^{\text {edgeR}}= & {} f_{a}\left( \hat{\theta }_{i0}, \left( \hat{\sigma }^{\text {edgeR}}_{i0}\right) ^{2}\right) \\ \hat{b}_{i0}^{\text {edgeR}}= & {} f_{b}\left( \hat{\theta }_{i0}, \left( \hat{\sigma }^{\text {edgeR}}_{i0}\right) ^{2}\right) \end{aligned}$$

The Beta parameters *a* and *b* can be thought of as the number of successes and the number of failures, respectively, in \(a + b\) trials. The Markov Chain Monte Carlo (MCMC) sampler Just Another Gibbs Sampler (JAGS) can encounter numerical issues when either of those is less than 1. In PhIP-Seq experiments *a* is much smaller than *b* as a peptide only pulls a fraction of the total number of reads even when reactive. Thus, to avoid these numerical problems, we set *a* to be the larger number of the estimated value and 1, and then estimate *b*.

**Markov chain Monte Carlo.** The model was implemented in JAGS (4.3.0) and run using the R interface for JAGS, rjags [31,32,33]. JAGS is a Gibbs sampler based on slice sampling as decribed in Neal (2003) [34]. We use maximum likelihood estimates to select starting values of \(\theta _{ij}, Z_{ij}, \phi _{ij}, c_{j}\), and \(\pi _{j}\) to initialize the Markov chain for non beads-only sample *j*. As described above, \(\hat{\theta }_{ij}^{\text {init}} = \hat{\theta }_{ij}\) is the MLE of the binomial probability calculated from the read counts. Since \(Z_{ij}\) is needed to update \(c_{j}, \pi _{j}, \phi _{ij}\), we set \(\hat{Z}_{ij}^{\text {init}}\) = 1 if its observed read count is at least twice as large as the expected read count in a beads-only sample. That is, \(\hat{Z}_{ij}^{\text {init}}\) = 1 if \(Y_{ij} \ge 2 n_{j} \hat{\theta }_{i0}\), and 0 otherwise. The initial value for the attenuation constant is derived by regressing the observed read counts on the expected reads count for all non-enriched peptides in that sample, with \(\hat{c}_{j}^{\text {init}}\) being the slope estimate

$$\begin{aligned} \hat{c}_{j}^{\text {init}} = \frac{\sum _{i=1}^{P} (1 - \hat{Z}^{\text {init}}_{ij}) \times Y_{ij} \times n_{j} \times \hat{\theta }_{i0}}{\sum _{j=1}^{P} (1 - \hat{Z}^{\text {init}}_{ij}) \times (n_{j} \hat{\theta }_{i0})^{2}}. \end{aligned}$$

The initial value for the proportion of enriched peptides is the average of all enrichment indicators

$$\begin{aligned} \hat{\pi }_{j}^{\text {init}} = \frac{1}{P}\sum\limits_{i=1}^{P} \hat{Z}_{ij}^{\text {init}}, \end{aligned}$$

and the respective peptide fold changes are initialized as

$$\begin{aligned} \hat{\phi }_{ij}^{\text {init}} = (1 - \hat{Z}_{ij}^{\text {init}}) + \hat{Z}_{ij}^{\text {init}} \frac{Y_{ij}}{n_{j} \times \hat{c}_{j}^{\text {init}} \times \hat{\theta }_{i0}}. \end{aligned}$$

Since \(c_{j}\) and \(\pi _{j}\) are modeled using Beta distributions with no support at values 0 and 1, we use a small offset in the event that \(c_{j}^{\text {init}} = 1\) and \(\pi _{j}^{\text {init}} = 0\).

In PhIP-Seq experiments we commonly observe very reactive peptides [17, 35]. Allowing for extreme fold changes in the Bayesian model for a few peptides can affect the inference for other less reactive peptides, and can have negative consequences for numerical stability and scalability. In our applications, clearly enriched peptides defined as \(\hat{\phi }^{\text {init}}_{ij}>\) 15 were filtered out before starting the Markov chain. Reads from such peptides in the mock IP and actual samples were removed, and the library sizes were recalculated.

**Peptide reactivity detection with edgeR.** To identify reactive peptides, each serum sample is compared to all beads-only samples from the same plate. Differential expression in edgeR is assessed for each unit (here, each peptide) using an exact test analogous to Fisher’s comparing means between two groups of negative binomial random variables, but adapted for overdispersed data [36]. Two-sided *p*-values were subsequently converted to one-sided *p*-values as the alternative to the null of no reactivity (fold change = 1) is reactivity, leading to read count enrichment and thus, fold-changes larger than 1. Multiple comparisons corrections were based on the Benjamini-Hochberg procedure, using false discovery rates to delineate enrichment across all peptides.

**Simulation study.** We simulated ten data sets based on the read counts observed in the HIV elite controller data described below. Each of these data sets had eight beads-only samples and twelve simulated serum samples. The twelve samples contain one beads-only sample run as an actual sample and two technical replicates (samples generated from the same parameters). For each simulated serum sample, we randomly selected 50 peptides as reactive. Among those, 10 peptides each had fold changes between 1 and 2, between 2 and 4, between 4 and 8, between 8 and 16, and between 16 and 32. Each data set was analyzed using the first two, four, and all eight beads-only samples to assess the sensitivity of the results to the number of beads-only used for analysis. For each data set and number of beads-only sample combination, we ran BEER with the true beads-only Beta \(a_{0}, b_{0}\) prior parameters, estimated beads-only parameters using maximum likelihood, method of moments, and edgeR derived estimates.

Performance was primarily assessed using ROC and PR curves on the full data and fold-change subsets of the data. For each fold-change bin, curves were generated using all non-enriched peptides and enriched peptides within the specified fold-change group from the simulated serum samples (no peptides from beads-only samples were included). To ensure that all curves had the same support points, we used linear interpolation to approximate the sensitivity or positive predictive value respectively at each support point for each simulation. ROC curves started at 0 sensitivity and 0 false-positive rate, while PRC curves started at 0 sensitivity and perfect positive-predictive value. The interpolated curves were averaged point-wise to generate an average curve for each condition. The area under each ROC curve was calculated using trapezoidal approximation from the interpolated data points. We also used logistic regression to model the probability of identifying and enriched peptide by fold-change in each data set. Multiple comparisons for edgeR *p*-values were addressed using the Benjamini-Hochberg procedure to ensure a 5% FDR. Cut-offs for the posterior probabilities were selected in each data set to achieve 5% false positive calls.

**Examples.** Antibody reactivity counts for eight plates of data were generated using the PhIP-Seq assay and the VirScan library on serum samples from HIV elite controllers with HIV subtype A and B infections, and analyzed by Kammers et al. [14] to assess antibody profiles in HIV controllers and persons with treatment-induced viral suppression. We used count data for the 3,395 phage-displayed peptides spanning the HIV proteome in the VirScan library for ten samples and six-beads-only samples from one plate of data. Two of the ten samples are identical, run in duplicate on the same plate. To quantify the false-positive rate of each algorithm, we also ran each beads-only sample against the remaining five-beads-only samples in a round-robin.

The CoronaScan data consists of counts for 6,932 peptides for 10 serum samples and 8 beads-only samples from one plate of data [6]. Among the ten samples, six were pre-pandemic samples and four samples were from one individual infected with SARS-CoV-2. Samples from this individual were collected on days 10 through 13 since symptom onset. By design, each peptide is present in duplicate in the CoronaScan library, enabling us to assess the concordance of the fold-change estimates and the enrichment status within samples. We again ran each beads-only sample against the remaining 7 beads-only samples to assess false positive rates.

The example in the Discussion to highlight the strong peptide-dependent background binding to the beads was from a previous study to evaluates HIV antibody responses and their evolution during the course of HIV infection [17] and to generate a classifier for recent HIV infections [35].