| Class | Method | Unit | Goal | Advantages | Disadvantages |
|---|---|---|---|---|---|
| I | Centering | O | Focus on the differences and not the similarities in the data | Remove the offset from the data | When data is heteroscedastic, the effect of this pretreatment method is not always sufficient |
| II | Autoscaling | (-) | Compare metabolites based on correlations | All metabolites become equally important | Inflation of the measurement errors |
| II | Range scaling | (-) | Compare metabolites relative to the biological response range | All metabolites become equally important. Scaling is related to biology | Inflation of the measurement errors and sensitive to outliers |
| II | Pareto scaling | O | Reduce the relative importance of large values, but keep data structure partially intact | Stays closer to the original measurement than autoscaling | Sensitive to large fold changes |
| II | Vast scaling | (-) | Focus on the metabolites that show small fluctuations | Aims for robustness, can use prior group knowledge | Not suited for large induced variation without group structure |
| II | Level scaling | (-) | Focus on relative response | Suited for identification of e.g. biomarkers | Inflation of the measurement errors |
| III | Log transformation | Log O | Correct for heteroscedasticity, pseudo scaling. Make multiplicative models additive | Reduce heteroscedasticity, multiplicative effects become additive | Difficulties with values with large relative standard deviation and zeros |
| III | Power transformation | √O | Correct for heteroscedasticity, pseudo scaling | Reduce heteroscedasticity, no problems with small values | Choice for square root is arbitrary |
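
The pretreatment methods listed above can be illustrated with a minimal sketch. The formulas did not survive in the table as reproduced here, so the definitions below follow the commonly used forms and should be read as assumptions: each function acts on the measurements of a single metabolite across samples, the sample standard deviation (n - 1 denominator) serves as the scaling factor, and the log and power transformations are followed by centering. All function names are hypothetical and chosen for illustration only.

```python
import math
from statistics import mean, stdev


def center(x):
    """Class I: subtract the metabolite mean (unit stays O)."""
    m = mean(x)
    return [v - m for v in x]


def autoscale(x):
    """Class II: centre, then divide by the standard deviation."""
    s = stdev(x)
    return [v / s for v in center(x)]


def range_scale(x):
    """Class II: centre, then divide by the observed range (max - min)."""
    r = max(x) - min(x)
    return [v / r for v in center(x)]


def pareto_scale(x):
    """Class II: centre, then divide by the square root of the standard deviation."""
    root_s = math.sqrt(stdev(x))
    return [v / root_s for v in center(x)]


def vast_scale(x):
    """Class II: autoscale, then weight by mean / standard deviation."""
    m, s = mean(x), stdev(x)
    return [(v / s) * (m / s) for v in center(x)]


def level_scale(x):
    """Class II: centre, then divide by the metabolite mean."""
    m = mean(x)
    return [v / m for v in center(x)]


def log_transform(x):
    """Class III: log10 of each value, then centre (assumed). Fails on zeros."""
    return center([math.log10(v) for v in x])


def power_transform(x):
    """Class III: square root of each value, then centre (assumed)."""
    return center([math.sqrt(v) for v in x])


# Illustrative concentrations of one metabolite in five samples (made-up values).
values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(center(values))       # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(range_scale(values))  # [-0.5, -0.25, 0.0, 0.25, 0.5]
```

Note that `log_transform` raises an error for zero values, which reflects the disadvantage listed for the log transformation, whereas `power_transform` accepts zeros without difficulty.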