Analysis of Microbiome Data in the Presence of Excess Zeros

Bibliographic Details
Published in	Frontiers in Microbiology, Vol. 8, p. 2114
Main Authors	Kaul, Abhishek; Mandal, Siddhartha; Davidov, Ori; Peddada, Shyamal D.
Format	Journal Article
Language	English
Published	Switzerland: Frontiers Media S.A., 7 November 2017
Subjects	Aitchison's log-ratio; bootstrap; covariates; cross-sectional data; false discovery rate (FDR); Microbiology; Microbiome data
ISSN	1664-302X
DOI	10.3389/fmicb.2017.02114

Abstract	An important feature of microbiome count data is the presence of a large number of zeros. A common strategy to handle these excess zeros is to add a small number called a pseudo-count (e.g., 1). Other strategies use various probability models to model the excess zero counts. Although adding a pseudo-count is simple and widely used, it is not ideal, as demonstrated in this paper. On the other hand, methods that model excess zeros using a probability model often make an implicit assumption that all zeros can be explained by a common probability model. As described in this article, this is not always recommended, as there are potentially three types/sources of zeros in microbiome data. The purpose of this paper is to develop a simple methodology to identify and accommodate three different types of zeros and to test hypotheses regarding the relative abundance of taxa in two or more experimental groups. Another major contribution of this paper is to perform constrained (directional or ordered) inference when there are more than two ordered experimental groups (e.g., subjects ordered by diet, age group, or environmental exposure group). As far as we know, this is the first paper that addresses such problems in the analysis of microbiome data. Using extensive simulation studies, we demonstrate that the proposed methodology not only controls the false discovery rate (FDR) at a desired level of significance but also competes well in terms of power with DESeq2, a popular procedure derived from the RNA-Seq literature. As expected, the method using pseudo-counts tends to be very conservative, and the classical t-test that ignores the underlying simplex structure in the data has an inflated FDR.
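The pseudo-count strategy that the abstract describes (and argues against) can be illustrated with a minimal Python sketch. This is not the methodology proposed in the paper; the function names and the sample counts are hypothetical, and the centered log-ratio is used here only as one common Aitchison-style transform that cannot be computed when a count is zero.

```python
import math


def add_pseudo_count(counts, pseudo=1.0):
    """Add a constant to every count so that no entry is zero."""
    return [c + pseudo for c in counts]


def centered_log_ratio(counts):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(c) for c in counts]  # raises ValueError if any count is 0
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


# Hypothetical sample: read counts for four taxa, two of them unobserved.
sample = [120, 0, 30, 0]

# Every zero is shifted by the same constant, whatever its source,
# so both unobserved taxa end up with identical transformed values.
clr = centered_log_ratio(add_pseudo_count(sample))
```

Because the same constant is added everywhere, the sketch treats all zeros alike, which is the simplification the abstract cautions about when it distinguishes three types of zeros.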
AuthorAffiliation	1 Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences (NIH), Durham, NC, United States; 2 Public Health Foundation of India, Gurgaon, India; 3 Department of Statistics, University of Haifa, Haifa, Israel
Copyright	Copyright © 2017 Kaul, Mandal, Davidov and Peddada.
Discipline	Biology
GrantInformation	Intramural NIH HHS (grant Z01 ES101744); Israeli Science Foundation (grant 1256/13); National Institute of Environmental Health Sciences (grant Z01 ES101744-04)
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
License	This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Notes	This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology. Edited by: George Tsiamis, University of Patras, Greece. Present Address: Abhishek Kaul, Department of Mathematics and Statistics, Washington State University, Pullman, WA, United States. Reviewed by: Magnus Øverlie Arntzen, Norwegian University of Life Sciences, Norway; Bradley Stevenson, University of Oklahoma, United States. Shyamal D. Peddada, Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, United States.
PMID	29163406