A genome-wide systems analysis reveals strong link between colorectal cancer and trimethylamine N-oxide (TMAO), a gut microbial metabolite of dietary meat and fat
© Xu et al.; licensee BioMed Central Ltd. 2015
- Published: 11 June 2015
Dietary intakes of red meat and fat are established risk factors for both colorectal cancer (CRC) and cardiovascular diseases (CVDs). Recent studies have shown a mechanistic link between TMAO, an intestinal microbial metabolite of red meat and fat, and risk of CVDs. Data linking TMAO directly to CRC are, however, lacking. Here, we present an unbiased data-driven network-based systems approach to uncover a potential genetic relationship between TMAO and CRC.
Materials and methods
We constructed two different epigenetic interaction networks (EINs) using chemical-gene, disease-gene and protein-protein interaction data from multiple large-scale data resources. We developed a network-based ranking algorithm to ascertain TMAO-related diseases from EINs. We systematically analyzed disease categories among TMAO-related diseases at different ranking cutoffs. We then determined which genetic pathways were associated with both TMAO and CRC.
We show that CVDs and their major risk factors were ranked highly among TMAO-related diseases, confirming the newly discovered mechanistic link between CVDs and TMAO and thus validating our algorithms. CRC was ranked highly among TMAO-related diseases retrieved from both EINs (top 0.02%, #1 out of 4,732 diseases retrieved based on Mendelian genetics, and top 10.9% among 882 diseases based on genome-wide association genetics), providing strong supporting evidence for our hypothesis that TMAO is genetically related to CRC. We also identified putative genetic pathways that may link TMAO to CRC, which warrant further investigation. Through systematic disease enrichment analysis, we also demonstrated that TMAO is related to metabolic syndromes and cancers in general.
Our genome-wide analysis demonstrates that systems approaches to studying the epigenetic interactions among diet, microbiome metabolism, and disease genetics hold promise for understanding disease pathogenesis. Our results show that TMAO is genetically associated with CRC. This study suggests that TMAO may be an important intermediate marker linking dietary meat and fat and gut microbiota metabolism to risk of CRC, underscoring opportunities for the development of new gut microbiome-dependent diagnostic tests and therapeutics for CRC.
- systems biology
- network medicine
- colorectal cancer
- trimethylamine N-oxide (TMAO)
- human gut microbiome
- dietary meat and fat
Colorectal cancer (CRC) is the second most common cancer in women (9.2%) and the third most common in men (10.0%). Diet clearly plays an important role in colon carcinogenesis. The Western diet, characterized by high fat and meat consumption, has been associated with increased risk of colorectal cancer in a large number of epidemiological studies [1–3]. The risk association is particularly strong for red meat intake. Indeed, an extensive review of the existing evidence by an international panel of experts concluded that a high intake of red meat is a convincing and probable cause of colorectal cancer [4].
The complex gut microbiota harbored by individuals have long been proposed to play an important role in colon carcinogenesis [5–7]. Recent studies comparing patients with colorectal neoplasia and healthy controls have found differences either in the relative abundance of certain microbial species or in the taxonomic composition of the microbiome. In particular, three studies using high throughput sequencing to characterize the composition of microbiota have discovered enrichment of Fusobacterium species in human colorectal tumors or adenomas as compared to matched normal control tissues, providing direct evidence for a link of gut microbiome to colorectal cancer [8–10]. The exact mechanisms by which gut microflora may modulate colorectal cancer risk, however, remain largely unexplored.
Recent studies have discovered that trimethylamine N-oxide (TMAO) generated by gut microbiota metabolism of dietary L-carnitine, a trimethylamine abundant in red meat, and dietary phosphatidylcholine is mechanistically linked to risk of cardiovascular diseases (CVDs) [11–14]. It is further shown that human gut microbiota are required to form TMAO from dietary red meat and fat, and specific bacterial taxa are associated with both plasma levels of TMAO and dietary meat and fat intakes. These studies suggest a novel mechanism involving a complex interplay of human gut microbial community and diet for the observed relationship between dietary red meat and fat consumption and cardiovascular disease.
Whether TMAO plays a similar role in colon carcinogenesis has not been explored. Given the striking similarity of colorectal cancer and cardiovascular diseases in risk association with dietary red meat/fat intakes, we hypothesize that TMAO is an intermediate marker linking dietary red meat and fat and gut microbial metabolism to colorectal cancer. Here, we present a genome-wide systems approach to the discovery of the genetic links between CRC and TMAO by reasoning over vast amounts of disease-gene association, protein-protein interaction and chemical-gene association data from multiple databases using advanced network-based ranking algorithms.
The experimental framework consists of the following steps: (1) we constructed two different genetic disease networks (GDNs) using disease-gene and protein-protein interaction data from multiple large-scale data resources; (2) we modeled the epigenetic interactions between TMAO and diseases by transforming the GDNs into epigenetic interaction networks (EINs); (3) we developed a network-based ranking algorithm to find TMAO-related diseases, that is, diseases that share a high degree of genetic similarity with TMAO, from the EINs; (4) we validated recent findings that TMAO is associated with cardiovascular diseases; (5) we tested our hypothesis that TMAO might be genetically linked to CRC; (6) we systematically analyzed disease categories among TMAO-related diseases at different ranking cutoffs; and (7) we determined which genetic pathways were associated with both TMAO and CRC.
Construct genetic disease networks (GDNs)
Construct GDN based on OMIM genetics (GDN_OMIM)
We constructed two separate GDNs using disease-gene association data from two complementary data resources. The first is the Online Mendelian Inheritance in Man (OMIM), a comprehensive database of human genes and genetic phenotypes, mainly for rare Mendelian genetic disorders [15]. We downloaded the OMIM database and mapped gene names to their corresponding approved human gene symbols as defined by the HUGO Gene Nomenclature Committee (HGNC) [16]. We extracted a total of 15,462 disease-gene pairs from the OMIM database, representing 5,983 diseases and 8,831 genes. In GDN_OMIM, two diseases were connected if their associated genes (proteins) interact. The weight of the edge between two diseases $D_i$ and $D_j$ was the number of protein-protein interaction (PPI) pairs between them, defined as $w(D_i, D_j) = |\{(G_{ik}, G_{jl})\}|$, where $G_{ik}$ is a gene associated with $D_i$, $G_{jl}$ is a gene associated with $D_j$, and $G_{ik}$ is the same as or interacts with $G_{jl}$ according to known PPI data. The PPI data were obtained from the STRING database, a database of known and predicted protein interactions [17]. Currently, STRING contains 5,214,234 proteins from 1,133 organisms. From the STRING database, we obtained a total of 4,137,054 human PPI pairs representing 17,756 human proteins.
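The edge-weight definition above can be illustrated with a minimal sketch. It assumes that unordered PPI pairs are stored as frozensets and that each gene pair is counted once per pair of diseases; the gene and disease names are made up for illustration and do not come from OMIM or STRING, and the function names `edge_weight` and `build_gdn` are hypothetical.

```python
from itertools import combinations, product


def edge_weight(genes_i, genes_j, ppi):
    """Count gene pairs (one gene per disease) that are identical or interact."""
    return sum(
        1
        for g, h in product(genes_i, genes_j)
        if g == h or frozenset((g, h)) in ppi
    )


def build_gdn(disease_genes, ppi):
    """Return {(disease_a, disease_b): weight} for every pair with weight > 0."""
    edges = {}
    for d_i, d_j in combinations(sorted(disease_genes), 2):
        w = edge_weight(disease_genes[d_i], disease_genes[d_j], ppi)
        if w > 0:
            edges[(d_i, d_j)] = w
    return edges


# Toy input (hypothetical names): three diseases and two interactions.
disease_genes = {"D1": {"A", "B"}, "D2": {"B", "C"}, "D3": {"E"}}
ppi = {frozenset(("A", "C")), frozenset(("B", "C"))}

gdn = build_gdn(disease_genes, ppi)
print(gdn)
```

In this toy input, D1 and D2 are linked through the pairs (A, C), (B, B) and (B, C), while D3 shares no gene or interaction with the others and stays unconnected.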
Construct GDN based on GWAS genetics (GDN_GWAS)
The second source of disease genetics we utilized in constructing GDNs was the Catalog of Published Genome-Wide Association Studies from the US National Human Genome Research Institute (NHGRI), an exhaustive source containing descriptions of disease- and trait-associated single nucleotide polymorphisms (SNPs) from published GWAS data [18]. Unlike diseases in the OMIM database, diseases in the GWAS catalog are mainly common complex diseases. We first mapped SNPs to their strongest associated genes, which were subsequently mapped to their corresponding approved human gene symbols as defined by the HGNC. In total, we obtained 22,470 disease/trait-gene pairs, representing 881 diseases/traits and 8,689 genes. In GDN_GWAS, two diseases were connected if their associated genes (proteins) interact, and the edge weights were the numbers of PPI pairs between two diseases as described above. In summary, the disease network GDN_OMIM consisted of 4,848 nodes and 882,751 edges; GDN_GWAS consisted of 882 nodes and 200,758 edges. Compared to GDN_GWAS, GDN_OMIM contained substantially more diseases but was far less densely connected.
Model the epigenetic interactions between TMAO and diseases by transforming disease networks into epigenetic interaction networks (EINs)
Ten TMAO-associated human genes.
Methyl-CpG binding domain protein 2
Flavin containing monooxygenase 3
RAR-related orphan receptor C
Sarcoglycan, gamma (35kDa dystrophin-associated glycoprotein)
Paroxysmal nonkinesigenic dyskinesia
Ribonuclease, RNase A family, 1 (pancreatic)
NFKB repressing factor
Molybdenum cofactor synthesis 1
Transformer 2 alpha homolog
Develop network-based ranking algorithm to find diseases that share high genetic similarities with TMAO
We then developed a network-based ranking algorithm to prioritize diseases in the EINs based on their genetic commonalities with TMAO. We retargeted the Topic-Sensitive PageRank (TSPR) algorithm to rank diseases similar to a given input (TMAO in our study). TSPR is a context-sensitive ranking algorithm for web searches developed by Taher Haveliwala [20]. Versions of this approach have been used to prioritize disease genes in networks consisting of a single node type (i.e., diseases or genes) [21, 22]. In this study, we applied the same algorithm to a heterogeneous network consisting of disease nodes and chemical nodes (TMAO in this study) in order to find TMAO-related diseases. The iterative network-based ranking algorithm is defined as $p_{t+1} = (1 - \gamma) M p_t + \gamma p_0$, where $M$ is the column-normalized adjacency matrix of the EIN, $\gamma$ is a preset probability of restarting from the initial seed node ($\gamma = 0.1$ in this study), and $p_t$ is a vector whose $i$-th element holds the normalized ranking score of disease $i$ at the $t$-th iteration. The initial probability vector $p_0$ contains the normalized probability for the input; in our study, $p_0$ assigns TMAO a probability of 1.0. Diseases are then ranked by their values in the steady-state probability vector, which is obtained by iterating until the change between $p_{t+1}$ and $p_t$ is less than $10^{-6}$.
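The iteration described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the four-node toy network, the use of the L1 norm to measure the change between iterations, the reading of the stopping threshold as 1e-6, and the function names `random_walk_with_restart` and `rank_diseases` are all assumptions made for the example.

```python
import numpy as np


def random_walk_with_restart(adj, seed_index, gamma=0.1, tol=1e-6, max_iter=10_000):
    """Iterate p <- (1 - gamma) * M @ p + gamma * p0 until the L1 change < tol."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # leave isolated nodes as all-zero columns
    M = adj / col_sums             # column-normalized adjacency matrix
    p0 = np.zeros(adj.shape[0])
    p0[seed_index] = 1.0           # all initial probability on the seed (TMAO)
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - gamma) * M @ p + gamma * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def rank_diseases(scores, names, seed_index):
    """Return node names sorted by descending score, excluding the seed."""
    order = np.argsort(-scores)
    return [names[i] for i in order if i != seed_index]


# Toy heterogeneous network (made-up weights): node 0 is the chemical seed.
names = ["TMAO", "disease_A", "disease_B", "disease_C"]
adj = [
    [0, 3, 1, 0],
    [3, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

scores = random_walk_with_restart(adj, seed_index=0)
ranked = rank_diseases(scores, names, seed_index=0)
print(ranked)
```

Here disease_A, which has the heaviest edge to the seed, ranks first, and disease_C, reachable only through disease_B, ranks last.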
Validate recent findings that TMAO is associated with cardiovascular diseases
Recent studies indicate that high levels of TMAO in the blood are associated with an increased risk of cardiovascular diseases [11–14]. We examined the rankings of cardiovascular diseases and their major risk factors, including high blood cholesterol and triglyceride, high blood pressure, diabetes, and obesity, among diseases retrieved from EINs using TMAO as seed. As positive controls, these diseases are expected to rank highly among TMAO-related diseases.
Test our hypothesis that TMAO may be genetically associated with CRC
To provide evidence supporting our hypothesis that TMAO may be involved in CRC pathogenesis, we tested whether CRC would rank highly among TMAO-related diseases retrieved from both EINs. A high ranking of CRC would imply that TMAO and CRC share a high degree of genetic similarity and that TMAO might be associated with colorectal carcinogenesis.
Analyze diseases enriched among top-ranked TMAO-related diseases
Sixteen disease chapters (classes) and numbers of diseases in each chapter.
Certain infectious and parasitic diseases
Diseases of the circulatory system
Diseases of the respiratory system
Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism
Diseases of the digestive system
Endocrine, nutritional and metabolic diseases
Diseases of the skin and subcutaneous tissue
Mental and behavioural disorders
Diseases of the musculoskeletal system and connective tissue
Diseases of the nervous system
Diseases of the genitourinary system
Diseases of the eye and adnexa
Congenital malformations, deformations and chromosomal abnormalities
Diseases of the ear and mastoid process
Certain conditions originating in the perinatal period
Since EIN_OMIM contains 4,848 disease nodes and EIN_GWAS contains only 882, we performed disease class enrichment analysis on TMAO-related diseases retrieved from EIN_OMIM only. For the diseases ranked within each of 10 ranking cutoffs (top 10%, 20%, ..., 100%), we calculated the percentage belonging to each of the sixteen ICD-10 disease classes.
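The cutoff analysis can be sketched as follows, assuming each ranked disease has already been assigned one disease class. The class labels and the ten-item ranking are made up for illustration, and the function name `class_percentages` and the choice to round the cutoff size up are hypothetical.

```python
import math
from collections import Counter


def class_percentages(ranked_classes, n_cutoffs=10):
    """Map each cutoff (10, 20, ..., 100) to {class: percent of the top slice}.

    ranked_classes holds one class label per disease, best rank first.
    """
    total = len(ranked_classes)
    result = {}
    for k in range(1, n_cutoffs + 1):
        top_n = math.ceil(total * k / n_cutoffs)  # size of the top-k*10% slice
        counts = Counter(ranked_classes[:top_n])
        result[k * 100 // n_cutoffs] = {
            cls: 100.0 * n / top_n for cls, n in counts.items()
        }
    return result


# Toy ranking (hypothetical labels): cancers concentrated near the top.
ranked_classes = ["neoplasm", "neoplasm", "circulatory", "neoplasm"] + ["nervous"] * 6

percentages = class_percentages(ranked_classes)
print(percentages[10], percentages[50], percentages[100])
```

A class that is over-represented near the top of the ranking shows a percentage that falls as the cutoff widens, as "neoplasm" does in this toy input.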
Identify genetic pathways linking TMAO to CRC
Cardiovascular diseases (CVDs) are genetically related to TMAO
Top 10 ranked cardiovascular diseases and their related risk factors.
Diseases/traits Based on GWAS genetics (878)
Diseases Based on OMIM genetics (4732)
Myocardial infarction, susceptibility to
Coronary heart disease
Diabetes mellitus, noninsulin-dependent
Type 2 diabetes
Coronary artery disease, susceptibility to
LDL cholesterol level qt
Microvascular complications of diabetes
Lipid metabolism phenotypes
Atherosclerosis, susceptibility to
Obesity, susceptibility to
Cardiovascular disease risk factors
Diabetes mellitus, type 2, susceptibility
We retrieved a total of 4,732 diseases from EIN_OMIM using TMAO as input. Similar to results based on EIN_GWAS, CVDs and their major risk factors, including myocardial infarction (top 0.23%), ventricular tachycardia (top 0.25%), diabetes mellitus, noninsulin-dependent (top 0.32%), and coronary artery disease, susceptibility to (top 0.51%), were ranked highly. Even though the diseases from EIN_GWAS (mainly common complex diseases) and from EIN_OMIM (mainly rare Mendelian disorders) are largely complementary, the high rankings of CVDs and their major risk factors among TMAO-related diseases retrieved from both networks confirmed recent studies and validated our network-based approach to finding TMAO-related diseases.
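The "top x%" figures appear to express a disease's rank as a percentage of all retrieved diseases. A minimal sketch under that assumption, with a hypothetical helper name `top_percent`:

```python
def top_percent(rank, total):
    """Express a 1-based rank as a percentage of the number of ranked items."""
    return 100.0 * rank / total


# Rank 1 among the 4,732 diseases retrieved from EIN_OMIM.
first_of_omim = round(top_percent(1, 4732), 2)
print(first_of_omim)
```

Under this reading, the first-ranked disease among 4,732 falls in the top 0.02%.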
Colorectal cancer is highly related to TMAO
Top ten TMAO-related diseases/traits retrieved from EIN_OMIM and from EIN_GWAS.
Diseases/traits from EIN_GWAS
Diseases from EIN_OMIM
Colorectal cancer, somatic
Breast cancer, somatic
Gastric cancer, somatic
Ovarian cancer, somatic
Inflammatory bowel disease
Schizophrenia, susceptibility to
Asthma, susceptibility to
Coronary heart disease
Leukemia, acute myeloid
Bladder cancer, somatic
Malaria, cerebral, susceptibility to
Thyroid carcinoma, follicular, somatic
Strikingly, among the top ten TMAO-related diseases retrieved from EIN_OMIM, seven are cancers, including CRC, breast cancer, gastric cancer and leukemia. Because of the strong (causal) disease-gene associations in the large OMIM database, the observed strong relationship between TMAO and cancers implies that TMAO might be genetically involved not only in CRC but also in cancers in general, which we further confirm in the next section.
Cancers and metabolic syndromes are highly related to TMAO in general
We examined the distributions of sixteen disease classes among 4,732 TMAO-related diseases retrieved from EIN_OMIM at 10 different ranking cutoffs (top 10%, 20%, ..., 100%).
Putative genetic pathways linking CRC to TMAO
Numbers of shared genes and pathways between TMAO and CRC.
CRC (OMIM) ∩ CRC (GWAS)
TMAO ∩ CRC (OMIM)
TMAO ∩ CRC (GWAS)
TMAO ∩ CRC (OMIM) ∩ CRC (GWAS)
Top ten ranked genetic pathways shared between TMAO and CRC.
TMAO ∩ CRC (OMIM)
TMAO ∩ CRC (GWAS)
TMAO ∩ CRC (OMIM) ∩ CRC (GWAS)
Pathways in cancer
Neurotrophin signaling pathway Cell cycle
Adaptive immune system
MAPK signaling pathway Hemostasis
Pathways in cancer
Metabolism of proteins
TCA cycle and respiratory electron transport
Adaptive immune system
MAPK signaling pathway Metabolism of lipids and lipoproteins
Pathways in cancer
Adaptive immune system
MAPK signaling pathway Chromosome maintenance Telomere maintenance
Metabolism of lipids and lipoproteins
Recent studies have shown a mechanistic link between TMAO, gut microbial metabolism of dietary meat and fat, and risk of cardiovascular diseases (CVDs), and established an obligatory role of gut microbiota in the generation of the proatherosclerotic TMAO from dietary L-carnitine and phosphatidylcholine, abundant in red meat and dietary fat respectively [11–14]. Employing a genome-wide systems analysis approach, we confirmed the association of TMAO with CVDs and other related metabolic disorders such as dyslipidemia. Indeed, inhibition of reverse cholesterol transport has been identified as an important mechanism by which TMAO promotes atherosclerosis [11, 13]. Although in vitro and in vivo data linking TMAO to CRC are still lacking, our present study revealed a strikingly strong association between TMAO and CRC, and TMAO appears to be involved in many genetic pathways clearly implicated in cancer in general and colon carcinogenesis in particular.
High red meat and animal fat intakes have been well established as risk factors for both CVDs and colorectal cancer. The discovery of the TMAO-CVDs connection mediated by gut microbial metabolism provides evidence for a novel mechanism by which human gut microbiota may influence health and disease. Gut microbiota have long been postulated to modulate risk of CRC. Although increasing evidence shows gut microbial community differences between patients with and without colorectal neoplasia [8–10], the exact mechanisms by which gut microbiota may affect colon carcinogenesis are unknown. Our current study, motivated by the similarity of CVDs and CRC in risk association with dietary red meat and fat consumption, suggests that TMAO may also be an important and unappreciated intermediate linking red meat and fat intakes and gut microbiota metabolism to the development of CRC. In vitro and in vivo data directly linking TMAO and gut microbial metabolism of meat and fat to CRC are still lacking. Results from our present study should therefore be considered hypothesis-generating only and warrant further investigation.
In this study, we present an unbiased data-driven network-based approach to uncover genetic links between TMAO and CRC by integrating and reasoning over vast amounts of data on disease genetics, protein interactions, and chemical-protein interactions. Our approach is generic and can be readily retargeted to discover novel genetic links between any diseases and chemicals. Our genome-wide analysis demonstrates that systems approaches hold promise for discovering the genetic basis of disease. Our results show that TMAO is genetically associated with CRC. This study suggests that TMAO may be an important intermediate marker linking dietary meat and fat and gut microbiota metabolism to risk of CRC, underscoring opportunities for the development of new gut microbiome-dependent diagnostic tests and therapeutics for CRC.
We would like to thank the funding resources that have made this work possible. RX is funded by Case Western Reserve University/Cleveland Clinic CTSA Grant (UL1 RR024989), the Eunice Kennedy Shriver National Institute Of Child Health & Human Development of the National Institutes of Health under Award Number DP2HD084068, the Training grant in Computational Genomic Epidemiology of Cancer (CoGE) (R25 CA094186-06), and Grant #IRG-91-022-18 to the Case Comprehensive Cancer Center from the American Cancer Society. QW is partly funded by ThinTek LLC. LL is funded by National Cancer Institute U01CA181770 and R01CA136726.
Publication charges for this article have been funded by the Training grant in Computational Genomic Epidemiology of Cancer (CoGE) (R25 CA094186-06).
This article has been published as part of BMC Genomics Volume 16 Supplement 7, 2015: Selected articles from The International Conference on Intelligent Biology and Medicine (ICIBM) 2014: Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/16/S7.
- Sandhu MS, White IR, McPherson K: Systematic review of the prospective cohort studies on meat consumption and colorectal cancer risk: a meta-analytical approach. Cancer Epidemiol Biomarkers Prev. 2001, 10 (5): 439-446.
- Norat T, Lukanova A, Ferrari P, Riboli E: Meat consumption and colorectal cancer risk: dose-response meta-analysis of epidemiological studies. Int J Cancer. 2002, 98 (2): 241-256. 10.1002/ijc.10126.
- Larsson SC, Wolk A: Meat consumption and risk of colorectal cancer: a meta-analysis of prospective studies. Int J Cancer. 2006, 119 (11): 2657-2664. 10.1002/ijc.22170.
- World Cancer Research Fund/American Institute of Cancer Research: Food, Nutrition, Physical Activity, and the Prevention of Cancer: a Global Perspective. Washington DC: AICR. 2007.
- Hope ME, Hold GL, Kain R, Omar EM: Sporadic colorectal cancer-role of the commensal microbiota. FEMS Microbiology Letters. 2005, 244 (1): 1-7. 10.1016/j.femsle.2005.01.029.
- Rowland IR: The role of the gastrointestinal microbiota in colorectal cancer. Current Pharmaceutical Design. 2009, 15 (13): 1524-1527. 10.2174/138161209788168191.
- Yang L, Pei Z: Bacteria, inflammation, and colon cancer. World Journal of Gastroenterology. 2006, 12 (42): 6741-6746.
- Kostic AD, Gevers D, Pedamallu CS, Michaud M, Duke F, Earl AM, Meyerson M: Genomic analysis identifies association of Fusobacterium with colorectal carcinoma. Genome Research. 2012, 22 (2): 292-298. 10.1101/gr.126573.111.
- Castellarin M, Warren RL, Freeman JD, Dreolini L, Krzywinski M, Strauss J, Holt RA: Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma. Genome Research. 2012, 22 (2): 299-306. 10.1101/gr.126516.111.
- McCoy AN, Araujo-Perez F, Azcarate-Peril A, Yeh JJ, Sandler RS, Keku TO: Fusobacterium is associated with colorectal adenomas. PLoS ONE. 2013, 8 (1): 53653. 10.1371/journal.pone.0053653.
- Wang Z, Klipfell E, Bennett BJ, Koeth R, Levison BS, DuGar B, Hazen SL: Gut flora metabolism of phosphatidylcholine promotes cardiovascular disease. Nature. 2011, 472 (7341): 57-63. 10.1038/nature09922.
- Bennett BJ, Vallim TQ, Wang Z, Shih DM, Meng Y, Gregory J, Lusis AJ: Trimethylamine-N-oxide, a metabolite associated with atherosclerosis, exhibits complex genetic and dietary regulation. Cell Metabolism. 2013, 17 (1): 49-60. 10.1016/j.cmet.2012.12.011.
- Koeth RA, Wang Z, Levison BS, Buffa JA, Sheehy BT, Britt EB, Hazen SL: Intestinal microbiota metabolism of L-carnitine, a nutrient in red meat, promotes atherosclerosis. Nature Medicine. 2013, 19 (5): 576-585. 10.1038/nm.3145.
- Tang WW, Wang Z, Levison BS, Koeth RA, Britt EB, Fu X, Hazen SL: Intestinal microbial metabolism of phosphatidylcholine and cardiovascular risk. New England Journal of Medicine. 2013, 1575-1584.
- Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA: Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Research. 2005, 33 (suppl 1): 514-517.
- Povey S, Lovering R, Bruford E, Wright M, Lush M: The HUGO Gene Nomenclature Committee (HGNC). Human Genetics. 2001, 109 (6): 678-680. 10.1007/s00439-001-0615-0.
- Franceschini A, Szklarczyk D, Frankild S, Kuhn M, Simonovic M: STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Research. 2013, 41 (D1): 808-815. 10.1093/nar/gks1094.
- Welter D, MacArthur J, Morales J, Burdett T, Hall P: The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Research. 2014, 42 (Database): 1001-1006.
- Kuhn M, Szklarczyk D, Pletscher-Frankild S, Blicher TH, von Mering C, Jensen LJ, Bork P: STITCH 4: integration of protein-chemical interactions with user data. Nucleic Acids Research. 2013, gkt1207.
- Haveliwala TH: Topic-sensitive PageRank. Proceedings of the 11th International Conference on World Wide Web. 2002, ACM, 517-526.
- Kohler S, Bauer S, Horn D, Robinson PN: Walking the interactome for prioritization of candidate disease genes. The American Journal of Human Genetics. 2008, 82 (4): 949-958. 10.1016/j.ajhg.2008.02.013.
- Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R: Associating genes and protein complexes with disease via network propagation. PLoS Computational Biology. 2010, 6 (1): e1000641. 10.1371/journal.pcbi.1000641.
- World Health Organization: International Statistical Classification of Diseases and Related Health Problems. 2004, World Health Organization, 1.
- Bodenreider O: The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Research. 2004, 32 (suppl 1): 267-270.
- Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P, Mesirov JP: Molecular Signatures Database (MSigDB) 3.0. Bioinformatics. 2011, 27 (12): 1739-1740. 10.1093/bioinformatics/btr260.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.