Table 1 Data-driven methods for normalization of shotgun metagenomic data included in this study

From: Comparison of normalization methods for the analysis of metagenomic gene abundance data

| Method | Description | Availability |
| --- | --- | --- |
| Total counts | Calculates scaling factors based on the total gene abundances | - |
| Median | Calculates scaling factors based on the median gene abundance | edgeR package in Bioconductor |
| Upper quartile [19] | Calculates scaling factors based on the upper quartile of the gene abundances | edgeR package in Bioconductor |
| Trimmed mean of M-values (TMM) [21] | Calculates scaling factors based on robust analysis of the difference in relative abundance between samples | edgeR package in Bioconductor |
| Relative Log Expression (RLE) [30] | Calculates scaling factors using the ratio between gene abundances and their geometric mean | DESeq package in Bioconductor |
| Cumulative sum scaling (CSS) [20] | Calculates scaling factors as the cumulative sum of gene abundances up to a data-derived threshold | metagenomeSeq package in Bioconductor |
| Reversed cumulative sum scaling (RCSS) | Calculates scaling factors as the cumulative sum of highly abundant genes | - |
| Quantile-quantile [19] | Transforms each sample to follow a data-derived reference distribution | - |
| Rarefying [55] | Randomly removes gene fragments until the sequencing depth is equal in all samples | phyloseq package in Bioconductor |
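
To make some of the descriptions in the table concrete, the following is a minimal NumPy sketch of four of the listed ideas: total counts, median and upper quartile scaling, RLE-style scaling, and rarefying. It is an illustration under stated assumptions, not the implementation used in the study or in edgeR, DESeq, or phyloseq. The function names are hypothetical, the count matrix is assumed to have genes in rows and samples in columns, and details such as excluding zero counts before taking a quantile are choices made for this sketch only.

```python
import numpy as np


def total_count_factors(counts):
    """One factor per sample: the sum of all gene counts in that sample."""
    return counts.sum(axis=0).astype(float)


def quantile_factors(counts, q):
    """One factor per sample: the q-th quantile of its nonzero gene counts.

    q=0.5 gives a median-style factor and q=0.75 an upper-quartile-style
    factor. Dropping zeros first is an assumption of this sketch.
    """
    return np.array([np.quantile(col[col > 0], q) for col in counts.T], dtype=float)


def rle_factors(counts):
    """One factor per sample: the median ratio of each gene's count to that
    gene's geometric mean across samples.

    Only genes observed in every sample are used (an assumption of this
    sketch), so that the logarithm is always defined.
    """
    observed_everywhere = (counts > 0).all(axis=1)
    log_counts = np.log(counts[observed_everywhere].astype(float))
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))


def rarefy(counts, seed=0):
    """Subsample every sample, without replacement, down to the smallest
    sequencing depth found in the matrix."""
    rng = np.random.default_rng(seed)
    depth = int(counts.sum(axis=0).min())
    cols = [rng.multivariate_hypergeometric(col, depth) for col in counts.T]
    return np.stack(cols, axis=1)


# Toy data (made up for illustration): sample 2 is sample 1 sequenced twice as deep.
counts = np.array([[10, 20], [20, 40], [30, 60], [40, 80]])

for name, factors in [
    ("total counts", total_count_factors(counts)),
    ("median", quantile_factors(counts, 0.5)),
    ("upper quartile", quantile_factors(counts, 0.75)),
    ("RLE", rle_factors(counts)),
]:
    # Dividing each column by its factor puts the samples on a common scale.
    print(name, factors, (counts / factors).round(3).tolist())

print("rarefied depths", rarefy(counts).sum(axis=0))
```

In this toy matrix the two samples differ only in sequencing depth, so each scaling method should yield identical normalized columns. The scaling approaches keep all counts and divide by a per-sample factor, whereas rarefying discards fragments to equalize depth.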