 Research article
 Open Access
Comparison of preprocessing methods for multiplex bead-based immunoassays
BMC Genomics volume 17, Article number: 601 (2016)
Abstract
Background
High-throughput protein expression studies can be performed using bead-based protein immunoassays, such as the Luminex® xMAP® technology. Technical variability is inherent to these experiments and may lead to systematic bias and reduced power. To reduce technical variability, data preprocessing is performed. However, no recommendations exist for the preprocessing of Luminex® xMAP® data.
Results
We compared 37 different data preprocessing combinations of transformation and normalization methods in 42 samples on 384 analytes obtained from a multiplex immunoassay based on the Luminex® xMAP® technology. We evaluated the performance of each preprocessing approach with 6 different performance criteria. Three performance criteria were plots. All plots were evaluated by 15 independent and blinded readers. Four different combinations of transformation and normalization methods performed well as preprocessing procedures for this bead-based protein immunoassay.
Conclusions
The following combinations of transformation and normalization were suitable for preprocessing Luminex® xMAP® data in this study: weighted Box-Cox followed by quantile or robust spline normalization (rsn), asinh transformation followed by loess normalization and Box-Cox followed by rsn.
Background
Bead-based protein immunoassays using the Luminex® xMAP® technology are subject to variability caused by both biological and technical effects. While systematic effects resulting from differences in biological conditions are of interest, technical variability should be reduced to a minimum. The highest proportion of technical variability is systematic and potentially introduced during different protein processing steps [1, 2].
In the ideal experimental setting, all samples would be processed in a single run, and, depending on the aim of the study, all analytes would be measured simultaneously or each analyte separately. However, technical limitations do not permit such an approach.
Bead-based immunoassays are a technological derivative of conventional immunoassays such as ELISAs, where antigen/antibody reactions are measured. The solid phase of the ELISA plate is reduced to multiple small, fluorescent, color-coded bead particles, which allows multiplex experiments to be conducted by simultaneous incubation of different bead species with samples. The analytical readouts are fluorescence signals reading each bead color (attribute channel) together with the signal from fluorescence-labeled antibodies or proteins (quantitative measure).
Currently, 500 different bead colors can be differentiated, allowing for the simultaneous analysis of 500 analytes with the Luminex® xMAP® technology. Furthermore, well-plate layouts and robotic automation requirements typically restrict the number of samples per batch to 96 or 384. As a consequence, any large-scale analysis needs to be run in batches, which can introduce technical variability on the sample level and the analyte level [3].
The presence of technical variability generally affects downstream statistical analysis. For example, the power for detecting biological effects may be reduced or effect estimates may be biased. As a result, the reduction of technical effects is mandatory for reliable protein expression analysis, and a suitable preprocessing strategy is required for minimizing technical variability.
The most important steps for data preprocessing are transformation and normalization of raw data after initial quality control [4]. The optimal preprocessing approach should be carefully selected prior to data analysis based on both the employed technology and the actual data [5] because the preprocessing method may greatly influence downstream analysis. As a result, microarray gene expression data are preprocessed differently [6] than RT-PCR data [7] or data from genotype microarrays [8, 9]. Several authors compared methods to find optimal techniques for data preprocessing of different types of omics data [5, 10–14]. However, preprocessing methods have not been compared for the Luminex® xMAP® technology.
The aim of this paper is therefore to identify an appropriate approach for the preprocessing of multiplex data generated with the Luminex® xMAP® technology. The analytical setting investigated here is based on the reaction of human autoantibodies present in patient serum with candidate antigens, which identifies their binding partners. For this purpose, we coupled recombinantly produced human proteins to different color-coded beads and let them react simultaneously with individual serum samples. We used control sera and sera of patients with the autoimmune diseases multiple sclerosis and neuromyelitis optica for demonstration purposes. In summary, the assay is a multiplexed direct immunoassay with autoantibodies (IgG) as target analytes.
To this end, we compared 37 different combinations of transformation and normalization for a real data set of 384 analytes (i.e. antibody – antigen reactions) using 42 serum samples.
Methods
Biological experiment
Subjects
The data considered in this study consisted of 384 potential autoantigens measured for 42 serum samples. The ethics committee of the Heinrich-Heine-Universität of Düsseldorf approved this study (vote number 2850, January 22, 2007). All participants gave written informed consent. The sample data were obtained from 12 measurements of a pooled reference serum, 12 control samples and 30 affected subjects (18 patients with multiple sclerosis, 12 patients with neuromyelitis optica). The 42 patient samples were measured on four plates. A reference serum was measured 12 times for each analyte. Additionally, a pooled serum sample was measured three times on each of four plates, giving 12 measurements. These were used for estimating the repeatability of measurements. The amount of antibodies in the sera of cases and controls was measured as signal intensities using the Luminex® xMAP® technology in combination with the FLEXMAP 3D® instrument.
Wet lab procedures
All 384 protein antigens were recombinantly produced in-house, using E. coli and a SCS1 carrying plasmid pSE111, containing an N-terminally located hexahistidine tag [15, 16]. Each purified antigen was coupled to magnetic carboxylated color-coded beads (MagPlex™ microspheres, Luminex Corporation, Austin, Texas). The manufacturer’s protocols were adapted to enable multiplexing using semi-automated procedures. All liquid handling steps were carried out by either an eight-channel pipetting system (Starlet, Hamilton Robotics, Bonaduz, Switzerland) or a 96-channel pipetting system (Evo Freedom 150, Tecan, Männedorf, Switzerland). For each coupling reaction up to 12.5 μg antigen and 8.8 × 10⁵ MagPlex™ beads per color were used. Finally, beads were combined and stored at 4–8 °C until use.
Autoantibody profiling
Serum samples were diluted 1:100 in assay buffer (PBS, 0.5 % BSA, 50 % LowCross buffer (Candor Biosciences, Wangen, Germany)), added to the bead mix of 384 proteins and incubated for 20 h at 4–8 °C. After washing with PBS/0.05 % Tween-20, the beads were incubated with a fluorescence-labeled (R-phycoerythrin) detection antibody (5 μg/ml, goat anti-human or goat anti-mouse IgG, Dianova, Hamburg, Germany) for 45 min at RT to detect the target analyte, antigen-specific human IgG species from human serum.
The beads were washed and then analyzed in a FlexMap3D instrument (Luminex Corporation, Austin, Texas). The instrument aspirates the beads, which carry patient antibodies bound to the respective protein antigens and, in turn, the detection antibody, and analyzes each individual particle using flow cytometric technology. The analytical measure is the median fluorescence intensity (MFI) for the particles partitioned according to their respective identification color. According to the manufacturer's recommendations, MFI readouts fulfilling a minimum bead count criterion (>35 beads measured per bead ID) were exported for data analysis.
Preprocessing procedure
The following steps were used for data preprocessing: First, raw data were quality controlled. In brief, antigens with a proportion of null values exceeding 19 % and samples with a proportion of null values exceeding 20 % were excluded. Signal intensities ≤ 0 were set to missing values. Second, we applied a transformation to the quality-controlled data. Next, we imputed missing data in the transformed and quality-controlled data by median imputation [17]. Finally, we applied a normalization method to the data.
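The first three steps can be illustrated with a minimal Python sketch. It is not the code used in this study (which was written in R); the function names, the representation of null values as `None`, the order of the two exclusion rules and the use of the per-antigen median for imputation are assumptions made for illustration. The normalization step is left out here.

```python
import math
from statistics import median


def quality_control(matrix, max_null_antigen=0.19, max_null_sample=0.20):
    """Rows are samples, columns are antigens; None marks a null value."""
    n_samples = len(matrix)
    # Drop antigens (columns) whose proportion of null values exceeds 19 %.
    keep = [j for j in range(len(matrix[0]))
            if sum(row[j] is None for row in matrix) / n_samples <= max_null_antigen]
    reduced = [[row[j] for j in keep] for row in matrix]
    # Drop samples (rows) whose proportion of null values exceeds 20 %
    # (assumption: evaluated after the antigen filter).
    reduced = [row for row in reduced
               if sum(v is None for v in row) / len(row) <= max_null_sample]
    # Signal intensities <= 0 are set to missing.
    return [[None if v is None or v <= 0 else v for v in row] for row in reduced]


def transform(matrix, func=math.asinh):
    """Apply a transformation to every observed value; missing values stay missing."""
    return [[None if v is None else func(v) for v in row] for row in matrix]


def impute_median(matrix):
    """Replace missing values by the median of the observed values of the same antigen."""
    medians = [median(v for v in column if v is not None) for column in zip(*matrix)]
    return [[medians[j] if v is None else v for j, v in enumerate(row)]
            for row in matrix]


# Toy data: 4 samples x 3 antigens (values are invented).
raw = [
    [100.0, 250.0, None],
    [120.0, -5.0, None],
    [90.0, 300.0, 40.0],
    [110.0, 280.0, None],
]
processed = impute_median(transform(quality_control(raw)))
```

In this toy example the third antigen is excluded because 75 % of its values are null, and the negative intensity is set to missing and later replaced by the median of the transformed values of the same antigen.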
Transformation and normalization methods
We used the notation transformation_normalization to label the combinations of transformation and normalization methods that we applied to the data during the preprocessing procedure. Here, transformation is one of the following transformations: no transformation (no), log₂ transformation (log2), asinh transformation (asinh), Box-Cox transformation (boxcox) [18], Box-Cox transformation with weights (boxcoxweights) [18] and variance stabilizing transformation (vst) [19]. boxcox is the original Box-Cox transformation, where the transformed value \( y_t \) is obtained as \( y_t=\frac{y^{\lambda}-1}{\lambda} \) if λ ≠ 0 and \( y_t=\log y \) if λ = 0. The transformation boxcoxweights uses the geometric mean \( \dot{y} \) as a weight so that \( y_t=\frac{y^{\lambda}-1}{\lambda\,\dot{y}^{\lambda-1}} \) if λ ≠ 0 and \( y_t=\log(y)\cdot\dot{y} \) if λ = 0.
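The two Box-Cox variants can be written down directly from these formulas. The following Python sketch is illustrative only; the function names are hypothetical, λ is assumed to be supplied by the caller (its estimation is not shown), and all intensities are assumed to be strictly positive.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def boxcox(y, lam):
    """Original Box-Cox transformation of a single positive value y."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def boxcox_weighted(y, lam, gm):
    """Box-Cox transformation weighted by the geometric mean gm of the data."""
    if lam == 0:
        return math.log(y) * gm
    return (y ** lam - 1.0) / (lam * gm ** (lam - 1.0))


# Toy intensities (invented) and an arbitrary lambda.
intensities = [1.0, 4.0]
gm = geometric_mean(intensities)
transformed = [boxcox_weighted(y, 0.5, gm) for y in intensities]
```

For λ = 1 the weight equals one, so both variants reduce to y − 1.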
The normalization method was one of the following: loess normalization (loess) [20], global median normalization (global) [21], quantile normalization (quantile) [22], an improved quantile normalization (quanimpr), robust spline normalization (rsn) [23], z-score normalization (zscore) or variance stabilizing normalization (vsn) [24]. vsn has a built-in transformation, and it was therefore applied directly to the quality-controlled and imputed data. The improved quantile normalization is a modification of the common quantile normalization, which we have developed to deal with very few large signal intensities. Specifically, borrowing from the technique of dithering in digital video and audio signal processing [25], noise was added to the original dataset to reduce the influence of the few strong signals on the normalization.
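As an illustration of one of these methods, common quantile normalization [22] can be sketched in a few lines of Python. This is a simplified version, not the implementation used in this study: ties are broken by position rather than averaged, the data are assumed to be complete, and the improved variant with added noise is not reproduced because its details are not given here.

```python
def quantile_normalize(matrix):
    """Rows are samples, columns are analytes.

    Every sample is mapped onto the same reference distribution, which is the
    mean of the k-th smallest values across all samples.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    sorted_rows = [sorted(row) for row in matrix]
    reference = [sum(r[k] for r in sorted_rows) / n_rows for k in range(n_cols)]
    normalized = []
    for row in matrix:
        # Column indices ordered by increasing intensity within this sample.
        order = sorted(range(n_cols), key=lambda j: row[j])
        new_row = [0.0] * n_cols
        for rank, j in enumerate(order):
            new_row[j] = reference[rank]
        normalized.append(new_row)
    return normalized


# Toy data: 2 samples x 3 analytes (values are invented).
example = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization all samples share the same set of values, and only the assignment of these values to analytes differs between samples.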
Evaluation criteria
We used 6 different criteria to evaluate the effects of the preprocessing methods. Two of the six criteria were based on empirical thresholds for statistical characteristics describing the distribution of the data, one measured the variation of the signal intensities, and the remaining three criteria were based on visual inspection of plots. All evaluation criteria were graded as poor, fair or good and scored with 0, 1 and 2, respectively, for all preprocessing methods. Fifteen blinded readers rated the plots independently. The readers rated the plots twice, i.e., at two different time points, where plots of the preprocessing methods were shuffled for the second run to test intra-rater reliability. Plots could reach a score between 0 and 30 and were classified as good (2) for a score between 21 and 30, as fair (1) for 11 to 20 and as poor (0) otherwise. As the total score, we added the scores of the 6 criteria, so that a preprocessing method could reach a total score between 0 and 12. The best preprocessing method was the one with the highest total score. The evaluation criteria used are described in detail in the following sections.
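The scoring rule for the plots and the total score can be expressed as a short Python sketch; the function names are hypothetical and serve only to make the arithmetic explicit.

```python
def classify_plot(reader_ratings):
    """Map the 15 reader ratings (each 0, 1 or 2) of one plot to a criterion score."""
    if len(reader_ratings) != 15 or any(r not in (0, 1, 2) for r in reader_ratings):
        raise ValueError("expected 15 ratings, each 0, 1 or 2")
    plot_score = sum(reader_ratings)  # between 0 and 30
    if plot_score >= 21:
        return 2  # good
    if plot_score >= 11:
        return 1  # fair
    return 0      # poor


def total_score(criterion_scores):
    """Sum of the 6 criterion scores (each 0, 1 or 2), i.e. between 0 and 12."""
    if len(criterion_scores) != 6:
        raise ValueError("expected 6 criterion scores")
    return sum(criterion_scores)


# Ten readers rate a plot as good, one as fair, four as poor: 21 points, i.e. good.
example_score = classify_plot([2] * 10 + [1] + [0] * 4)
```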
Mean-SD (standard deviation) plot
In Mean-SD plots, the standard deviation is plotted against the ranked means. If the variability, i.e., the standard deviation of intensities, depends on the magnitude of the measured intensities, data are not homoscedastic. This, in turn, invalidates the use of many statistical methods, such as the analysis of variance (ANOVA) [26]. The variation should therefore be independent of the measured signal intensities, and thus independent of the mean, in an optimally preprocessed data set. We estimated the mean and the standard deviation from the reference pool serum for each preprocessing combination.
The rating instructions for the raters of this plot were the following: If the scatterplot parallels the x-axis with low variation and the standard deviation is stable over the mean of signal intensities, the preprocessing method has to be judged as good (2). The loess curve (orange) in the plot should help to identify a potential trend; a trend should be judged as poor (0). Plots with no trend to a larger standard deviation for larger means but a variation around the loess curve have to be judged as fair (1). Figure 1 shows the example plots, which were given to the raters to help them with their decision.
Bland-Altman plot
The Bland-Altman plot [27] is generally used to plot the difference of two measurements against their mean, where one measurement comes from a new method, to find out how much the new method differs from the old one. Here, we plotted all pairs of the 12 measurements in the reference pool serum for each preprocessing method in one Bland-Altman plot. The following rating instruction was given to the blinded readers: The mean of the differences should be close to zero with a small and constant variation around this mean. If these criteria are fulfilled, the plot should be judged as good (2). A plot with a visible trend or funnel has to be judged as poor (0). If neither a trend nor a funnel is present, but the mean difference deviates widely from zero or the scatter is increased, the plot has to be judged as fair (1).
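The coordinates of such a plot can be computed as in the following Python sketch for the replicate measurements of a single analyte. The function names are hypothetical, and the sign convention of the difference (first minus second measurement) is an assumption.

```python
from itertools import combinations
from statistics import mean, stdev


def bland_altman_points(replicates):
    """Return (mean, difference) for every pair of replicate measurements."""
    return [((a + b) / 2.0, a - b) for a, b in combinations(replicates, 2)]


def summarize(points):
    """Mean and standard deviation of the pairwise differences."""
    diffs = [d for _, d in points]
    return mean(diffs), stdev(diffs)


# Twelve replicates yield 12 * 11 / 2 = 66 points per analyte (values are invented).
points = bland_altman_points([float(i) for i in range(12)])
bias, spread = summarize(bland_altman_points([1.0, 2.0, 4.0]))
```

A mean difference near zero with a small, constant spread over the range of means corresponds to the rating "good" in the instructions above.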
Volcano plot
In general, −log₁₀ transformed p-values are plotted against log₂ fold changes in volcano plots [28], with the p-values typically taken from the t-test. Here, p-values were estimated for cases versus controls using the non-parametric Wilcoxon rank sum test because antigen intensities might not be normally distributed. Hence, rank-based relative effects [29] were used as a non-parametric effect measure instead of fold changes. The shape of the plot is therefore a funnel and not the typical volcano shape, as relative effects and p-values for the Wilcoxon test are based on the same rank sums.
The evaluation instructions for volcano plots were the following: For plots where a funnel is visible and both sides are similar in length, the plot has to be judged as good (2). If one side of the funnel is considerably shorter than the other, the plot has to be judged as fair (1). If the plot has no funnel shape at all, the plot has to be judged as poor (0).
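Why relative effects and Wilcoxon p-values produce a funnel can be seen from a small Python sketch in which both quantities are derived from the same pairwise comparisons. The function names are hypothetical, and the p-value uses the normal approximation without tie or continuity correction, which is a simplification and not necessarily the variant used in this study.

```python
import math


def relative_effect(x, y):
    """Estimate P(X < Y) + 0.5 * P(X = Y) by comparing all pairs."""
    wins = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in x for b in y)
    return wins / (len(x) * len(y))


def wilcoxon_p_value(x, y):
    """Two-sided p-value of the rank sum test (normal approximation)."""
    n1, n2 = len(x), len(y)
    u = relative_effect(x, y) * n1 * n2  # Mann-Whitney U statistic
    z = (u - n1 * n2 / 2.0) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return math.erfc(abs(z) / math.sqrt(2.0))


def volcano_point(x, y):
    """Coordinates of one antigen: (relative effect, -log10 p-value)."""
    return relative_effect(x, y), -math.log10(wilcoxon_p_value(x, y))


# Toy intensities for controls and cases (values are invented).
effect, neg_log_p = volcano_point([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Because the test statistic is a function of the relative effect for fixed group sizes, each relative effect maps to exactly one p-value, and the points fall on two symmetric arms around a relative effect of 0.5.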
Skewness and tail length
Skewness and tail length were determined to assess similarities to the normal distribution for the distribution of the signal intensities of all data. Both statistics are computed through quantile estimators.
Skewness was estimated by \( \log S=\log\frac{\tilde{x}_{0.975}-\tilde{x}_{0.5}}{\tilde{x}_{0.5}-\tilde{x}_{0.025}} \) [30], where \( \tilde{x}_q \) denotes the q quantile. For symmetric distributions log S equals zero, and it is negative or positive for left-skewed and right-skewed distributions, respectively. Tail length was estimated by \( T=\frac{\tilde{x}_{0.975}-\tilde{x}_{0.025}}{\tilde{x}_{0.875}-\tilde{x}_{0.125}} \) [30], which can take values between 1 and infinity [30]. The larger T, the longer the tail of a distribution. The normal distribution has a tail length of T = 1.704.
Thresholds were taken from the literature for scoring skewness and tail length [31]. A distribution was considered almost symmetric if −0.5 < log S < 0.5 and scored with 2. If log S deviated more than 0.75 from 0, it received a score of 0; otherwise it received a score of 1.
Similarly, the tail length of the distribution was judged to be good (2), i.e., close to the normal distribution, if 1.625 < T < 2. The score for tail length was 0 if T ≤ 1.525 or T ≥ 2.1, and otherwise it received a score of 1.
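Both statistics and their scores can be computed as in the following Python sketch. The quantile estimator (linear interpolation between order statistics) and the use of the natural logarithm are assumptions, since neither is specified here; the function names are hypothetical.

```python
import math


def quantile(sorted_values, q):
    """Empirical q quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def log_skewness(values):
    s = sorted(values)
    upper = quantile(s, 0.975) - quantile(s, 0.5)
    lower = quantile(s, 0.5) - quantile(s, 0.025)
    return math.log(upper / lower)


def tail_length(values):
    s = sorted(values)
    outer = quantile(s, 0.975) - quantile(s, 0.025)
    inner = quantile(s, 0.875) - quantile(s, 0.125)
    return outer / inner


def score_skewness(log_s):
    if abs(log_s) < 0.5:
        return 2
    return 0 if abs(log_s) > 0.75 else 1


def score_tail_length(t):
    if 1.625 < t < 2.0:
        return 2
    return 0 if (t <= 1.525 or t >= 2.1) else 1


# Evenly spaced toy data: symmetric, but with shorter tails than the normal distribution.
toy = [float(i) for i in range(1001)]
toy_scores = (score_skewness(log_skewness(toy)), score_tail_length(tail_length(toy)))
```

The evenly spaced toy data are symmetric and therefore score 2 for skewness, but their tail length of about 1.27 lies below 1.525 and scores 0.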
Coefficient of variation
We used the coefficient of variation (CV) to judge repeatability by considering the 12 measurements from the reference pool serum. A pooled measure was computed for each preprocessing method in the following steps:

1. Get CVs for each antigen in each preprocessed data set separately.
2. Rank CVs for one antigen over all preprocessed data sets; start with the smallest.
3. Sum these ranks (CV_s) across all antigens for each preprocessing method separately.
A small value for CV_s indicates that this preprocessing method has small CVs for the majority of antigens; the smallest possible CV_s equals the number of antigens, the highest is the product of the number of antigens and the number of preprocessing methods.
Before scoring, CV_s was transformed to percentages CV_{s,p} and scored with 2 if CV_{s,p} ≤ 50 %, with 1 if 50 % < CV_{s,p} ≤ 80 %, and with 0 otherwise.
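The three steps and the final scoring can be sketched in Python as follows. The function names and the data layout are hypothetical, ties between methods are broken by their order (an assumption), and the conversion of the rank sum to a percentage is not shown because its exact form is not specified here; the scoring function therefore takes the percentage as input.

```python
from statistics import mean, stdev


def cv(values):
    """Coefficient of variation of replicate measurements (positive mean assumed)."""
    return stdev(values) / mean(values)


def cv_rank_sums(data):
    """data maps each preprocessing method to a list of replicate lists, one per antigen."""
    methods = list(data)
    n_antigens = len(data[methods[0]])
    rank_sums = {m: 0 for m in methods}
    for a in range(n_antigens):
        # Step 1: CV of this antigen in each preprocessed data set.
        cvs = {m: cv(data[m][a]) for m in methods}
        # Step 2: rank the methods for this antigen, smallest CV first.
        ordered = sorted(methods, key=lambda m: cvs[m])
        # Step 3: accumulate the ranks per method.
        for rank, m in enumerate(ordered, start=1):
            rank_sums[m] += rank
    return rank_sums


def score_cv_percentage(cv_sp):
    """Score a rank sum that has already been expressed as a percentage."""
    if cv_sp <= 50:
        return 2
    return 1 if cv_sp <= 80 else 0


# Two hypothetical methods, two antigens, three replicates each (values are invented).
toy_data = {
    "method_a": [[10.0, 10.1, 9.9], [5.0, 5.1, 4.9]],
    "method_b": [[10.0, 12.0, 8.0], [5.0, 7.0, 3.0]],
}
toy_rank_sums = cv_rank_sums(toy_data)
```

In the toy example the first method has the smaller CV for both antigens and reaches the smallest possible rank sum, which equals the number of antigens.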
QQ-plot
To illustrate the effects of the preprocessing, we randomly drew equally sized groups from one case group and performed Mann–Whitney U tests. We repeated this 25 times and plotted the results in a QQ-plot. If the preprocessing reduces variability between subjects, the lines in the QQ-plot should scatter narrowly around the diagonal line.
Software used
R version 3.1.1 was used together with Bioconductor version 3.0 for all computations and visualizations [32]. For both Box-Cox transformations we employed the R function boxcox (package MASS (7.3-40)) for estimating λ. Unweighted Box-Cox transformed data were obtained from the R function BoxCox (package forecast (5.9)). The R function bct (package TeachingDemos (2.9)) transforms data with the weighted Box-Cox transformation but cannot handle missing data. We therefore implemented this transformation as an R function. We used the function lumiN from the Bioconductor package lumi (2.18.0) to perform quantile normalization, loess normalization, vsn and rsn. All plots were generated using the R package ggplot2 (1.0.1).
Results
A systematic literature search was used to identify methods for transformation and normalization (Table 1). Search criteria were combinations of “transformation”, “normalization”, “preprocess”, “comparison”, “microarray” and modifications of them. We only included methods which were already implemented in the statistical software R [33] or simple to implement. Furthermore, we aimed at investigating the effects of no transformation. In total, we applied 37 different combinations of 6 transformation methods and 7 normalization methods to the data. We excluded 4 of 384 antigens during the quality control for further studies because values were missing for at least 8 of 42 patients (19.05 %).
Scores for the visual ratings and the statistical characteristics are provided in Fig. 2 for the 37 different combinations of transformation and normalization methods and the 6 different evaluation criteria. Each criterion was assessed as poor, fair or good, corresponding to the letter sizes small, medium and tall in Fig. 2. For the sum of scores, we considered the first of the two runs of the plot evaluation. The following four preprocessing methods obtained the maximum total score of 12:

1. Asinh transformation with loess normalization,
2. Box-Cox transformation with robust spline normalization,
3. Weighted Box-Cox transformation with quantile normalization and
4. Weighted Box-Cox transformation with robust spline normalization.
In general, preprocessing methods without a transformation and methods with a variance stabilizing transformation (VST) reached small total scores. The improved quantile normalization reached a higher total score only in combination with the log₂ transformation. Global median and z-score normalization had a highest total score of 9 in combination with either the log₂ transformation or the asinh transformation but failed otherwise. Figure 3 shows the QQ-plot of the raw data, and Fig. 4 the QQ-plots of the four best preprocessing methods. Additional file 1: Figure S1 shows a selection of QQ-plots for preprocessing combinations with smaller total quality scores. Finally, Additional file 1: Figures S2–S38 show the QQ-plots of all 37 combinations of the investigated preprocessing methods.
Discussion
The best four approaches for preprocessing the Luminex® xMAP® data identified in this work were a weighted Box-Cox transformation followed by a quantile or a robust spline normalization (rsn), an asinh transformation followed by a loess normalization and a Box-Cox transformation followed by an rsn. Our findings demonstrate that data transformation is necessary prior to downstream analysis, as all combinations without prior transformation reached considerably poorer evaluation scores. Unexpectedly, the VST was rated poorly in this study although this approach performed well in gene expression studies [5, 6]. In the future, it would be helpful if other groups replicated our findings using independent data. The QQ-plots show how the results in one case group behave after preprocessing. The scattering of the test statistics around the line in the QQ-plot of the raw data (Fig. 3) is much larger than in the QQ-plots of the four best methods (Fig. 4). In comparison, the QQ-plots of log2_rsn and vst_loess show a larger scattering, and the QQ-plots of boxcox_global and boxcoxweights_zscore scatter largely and are inflated (Additional file 1: Figure S1).
To ensure that all important information is stored for proteomics experiments for further data handling, a standard reporting guideline for minimum information about a proteomics experiment (MIAPE) has been developed for methods such as gel electrophoresis and mass spectrometry [34]. However, MIAPE standards are lacking for the Luminex® xMAP® technology. Such a development would be important for future reports of experiments based on the Luminex® xMAP® technology. At this stage, our aim was to provide data handling recommendations to allow for later in-depth analysis of the different steps in laboratory work, including Luminex-based data generation. To allow for this, we produced first data sets following recommendations from Luminex both for multiplex assay setup and raw data collection.
A limitation of the transformation methods in our study is the usage of the same method for all antigens within the transformation step, except for both Box-Cox transformations. If each antigen is transformed separately, results might be different. This should, however, be investigated in future studies. Another limitation of this study is the small sample size (42 samples in total). As a result, the power of group comparisons is limited. However, this sample size has been used in very early stages of several biomarker studies.
As demonstrated by Ziegler et al. [35], the coefficient of variation varies with the strength of gene expression and decreases with increasing expression levels. For that reason removal of transcripts with low intensity values from expression data with a detection call algorithm [36] is often used. In this study, we followed standard manufacturer recommendations and used data only if there were at least 35 beads. The dependency of the coefficient of variation on the number of beads warrants further investigation.
In summary, our investigation of appropriate data transformation and normalization methods for the Luminex® xMAP® technology has shown that any one of the following four data preprocessing approaches is appropriate: a weighted Box-Cox transformation followed by a robust spline normalization, an asinh transformation followed by a loess normalization, a Box-Cox transformation followed by an rsn and a weighted Box-Cox transformation followed by a quantile normalization.
Conclusions
We identified four adequate preprocessing approaches for antigen intensities obtained by the Luminex® xMAP® technology using simple graphical and statistical characteristics. The suitable approaches are a weighted Box-Cox transformation followed by a quantile or robust spline normalization (rsn), an asinh transformation followed by a loess normalization or a Box-Cox transformation followed by an rsn.
References
1. Molloy MP, Brzezinski EE, Hang J, McDowell MT, VanBogelen RA. Overcoming technical variation and biological variation in quantitative proteomics. Proteomics. 2003;3(10):1912–9.
2. Russell MR, Lilley KS. Pipeline to assess the greatest source of technical variance in quantitative proteomics using metabolic labelling. J Proteomics. 2012;77:441–54.
3. Dunbar SA, Hoffmeyer MR. Microsphere-based multiplex immunoassays: development and applications using Luminex® xMAP® technology. In: Wild D, editor. The Immunoassay Handbook. 4th ed. Amsterdam: Elsevier; 2013. p. 157–74.
4. Quackenbush J. Microarray data normalization and transformation. Nat Genet. 2002;32(Suppl):496–501.
5. Schmid R, Baum P, Ittrich C, Fundel-Clemens K, Huber W, Brors B, et al. Comparison of normalization methods for Illumina BeadChip HumanHT-12 v3. BMC Genomics. 2010;11:349.
6. Schurmann C, Heim K, Schillert A, Blankenberg S, Carstensen M, Dörr M, et al. Analyzing Illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the MetaXpress Consortium. PLoS ONE. 2012;7(12):e50938.
7. Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, et al. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 2002;3(7):research0034.1–11.
8. Weale ME. Quality control for genome-wide association studies. Methods Mol Biol. 2010;628:341–72.
9. Ziegler A. Genome-wide association studies: quality control and population-based measures. Genet Epidemiol. 2009;33 Suppl 1:S45–50.
10. Boes T, Neuhäuser M. Normalization for Affymetrix GeneChips. Methods Inf Med. 2005;44(3):414–7.
11. Thygesen HH, Zwinderman AH. Comparing transformation methods for DNA microarray data. BMC Bioinformatics. 2004;5:77.
12. Cui X, Kerr MK, Churchill GA. Transformations for cDNA microarray data. Stat Appl Genet Mol Biol. 2003;2(1):Article4.
13. Adriaens ME, Jaillard M, Eijssen LM, Mayer CD, Evelo CTA. An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies. BMC Genomics. 2012;13:42.
14. Rocke DM, Durbin B. Approximate variance-stabilizing transformations for gene-expression microarray data. Bioinformatics. 2003;19(8):966–72.
15. Büssow K, Cahill D, Nietfeld W, Bancroft D, Scherzinger E, Lehrach H, et al. A method for global protein expression and antibody screening on high-density filters of an arrayed cDNA library. Nucleic Acids Res. 1998;26(21):5007–8.
16. Brinkmann U, Mattes RE, Buckel P. High-level expression of recombinant genes in Escherichia coli is dependent on the availability of the dnaY gene product. Gene. 1989;85(1):109–14.
17. Simon RM, Korn EL, McShane LM, Radmacher MD, Wright GW, Zhao Y. Design and analysis of DNA microarray investigations. 1st ed. New York: Springer; 2003.
18. Box GEP, Cox DR. An analysis of transformations. J Roy Stat Soc B Met. 1964;26(2):211–52.
19. Lin SM, Du P, Huber W, Kibbe WA. Model-based variance-stabilizing transformation for Illumina microarray data. Nucleic Acids Res. 2008;36(2):e11.
20. Cleveland WS. Robust locally weighted regression and smoothing scatterplots. J Am Stat Assoc. 1979;74(368):829–36.
21. Wu W, Xing EP, Myers C, Mian IS, Bissell MJ. Evaluation of normalization methods for cDNA microarray data by kNN classification. BMC Bioinformatics. 2005;6:191.
22. Bolstad BM, Irizarry RA, Åstrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003;19(2):185–93.
23. Du P, Kibbe WA, Lin SM. lumi: a pipeline for processing Illumina microarray. Bioinformatics. 2008;24(13):1547–8.
24. Huber W, von Heydebreck A, Sültmann H, Poustka A, Vingron M. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics. 2002;18 Suppl 1:S96–104.
25. Gray RM, Neuhoff DL. Quantization. IEEE T Inform Theory. 1998;44(6):2325–83.
26. Durbin BP, Hardin JS, Hawkins DM, Rocke DM. A variance-stabilizing transformation for gene-expression microarray data. Bioinformatics. 2002;18 Suppl 1:S105–10.
27. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–10.
28. Cui X, Churchill GA. Statistical tests for differential expression in cDNA microarray experiments. Genome Biol. 2003;4(4):210.
29. Brunner E, Munzel U. Nichtparametrische Datenanalyse: Unverbundene Stichproben. 2nd ed. Berlin: Springer; 2013.
30. Büning H. Robustness and power of parametric, nonparametric, robustified and adaptive tests - the multi-sample location problem. Stat Pap. 2000;41(4):381–407.
31. Szymczak S, Scheinhardt MO, Zeller T, Wild PS, Blankenberg S, Ziegler A. Adaptive linear rank tests for eQTL studies. Stat Med. 2013;32(3):524–37.
32. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5(10):R80.
33. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014.
34. Taylor CF, Paton NW, Lilley KS, Binz PA, Julian Jr RK, Jones AR, et al. The minimum information about a proteomics experiment (MIAPE). Nat Biotechnol. 2007;25(8):887–93.
35. Ziegler A, König IR, Schulz-Knappe P. Challenges in planning and conducting diagnostic studies with molecular biomarkers. Dtsch Med Wochenschr. 2013;138(19):e14–24.
36. Archer KJ, Reese SE. Detection call algorithms for high-throughput gene expression microarray data. Brief Bioinform. 2010;11(2):244–52.
37. Kreil DP, Russell RR. Tutorial section: There is no silver bullet - a guide to low-level data transforms and normalisation methods for microarray data. Brief Bioinform. 2005;6(1):86–97.
38. Shi W, Oshlack A, Smyth GK. Optimizing the noise versus bias trade-off for Illumina whole genome expression BeadChips. Nucleic Acids Res. 2010;38(22):e204.
39. Schmidt MT, Handschuh L, Zyprych J, Szabelska A, Olejnik-Schmidt AK, Siatkowski I, et al. Impact of DNA microarray data transformation on gene expression analysis - comparison of two normalization methods. Acta Biochim Pol. 2011;58(4):573–80.
40. Durinck S. Pre-processing of microarray data and analysis of differential expression. Methods Mol Biol. 2008;452:89–110.
41. Autio R, Kilpinen S, Saarela M, Kallioniemi O, Hautaniemi S, Astola J. Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations. BMC Bioinformatics. 2009;10 Suppl 1:S24.
Acknowledgments
We thank two anonymous reviewers for their constructive comments and suggestions to improve the manuscript.
Funding
The European Union (BiomarCaRE, grant number HEALTHTHF2–2011–278913) supported this study.
Availability of data and materials
The datasets supporting the conclusions of this article are included in Additional file 3.
Authors’ contributions
AS, AZ, HDZ and PSK designed the study. AL performed the experiments. AS and TKR analyzed the data. TKR, AS, AZ and PSK drafted the paper. All authors critically reviewed the paper. All authors have read and approved the final version of the manuscript.
Authors’ information
TKR and AS are employees of Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Germany. AZ is an employee of Universität zu Lübeck, Germany. HDZ, AL and PSK are employees of Protagen AG, Dortmund, Germany.
Competing interests
The University of Lübeck (project leader AZ) received a grant for BiomarCaRE (grant number HEALTHTHF2–2011–278913) from the European Union. AZ is statistical advisor for Protagen AG, Dortmund, Germany. The authors declare no other financial conflicts relevant to the manuscript.
Consent for publication
Not applicable.
Ethics approval and consent to participate
The ethics committee of the Heinrich-Heine-Universität of Düsseldorf approved this study (vote number 2850, January 22, 2007). All participants gave written informed consent.
Additional files
Additional file 1:
Supplementary figures. All results for the QQ, Mean-SD, Bland-Altman and Volcano plots. (PDF 5133 kb)
Additional file 2:
Supplementary table. Detailed results for the evaluation criteria skewness, tail length and coefficient of variation. (PDF 54 kb)
Additional file 3:
Supplementary material. Datasets, which support the conclusions of this article. (ZIP 58 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Rausch, T.K., Schillert, A., Ziegler, A. et al. Comparison of preprocessing methods for multiplex bead-based immunoassays. BMC Genomics 17, 601 (2016). https://doi.org/10.1186/s1286401628887
DOI: https://doi.org/10.1186/s1286401628887
Keywords
 Autoantibody
 Bead-based
 Immunoassay
 Luminex
 Multiplex
 Omics
 Preprocessing
 Protein