Analysis of Microbiome Data in the Presence of Excess Zeros

Bibliographic Details
Published in Frontiers in Microbiology Vol. 8; p. 2114
Main Authors Kaul, Abhishek, Mandal, Siddhartha, Davidov, Ori, Peddada, Shyamal D.
Format Journal Article
Language English
Published Switzerland Frontiers Media S.A 07.11.2017
ISSN 1664-302X
DOI 10.3389/fmicb.2017.02114

Abstract An important feature of microbiome count data is the presence of a large number of zeros. A common strategy to handle these excess zeros is to add a small number called a pseudo-count (e.g., 1). Other strategies use various probability models to model the excess zero counts. Although adding a pseudo-count is simple and widely used, as demonstrated in this paper, it is not ideal. On the other hand, methods that model excess zeros using a probability model often make an implicit assumption that all zeros can be explained by a common probability model. As described in this article, this is not always recommended, as there are potentially three types/sources of zeros in microbiome data. The purpose of this paper is to develop a simple methodology to identify and accommodate three different types of zeros and to test hypotheses regarding the relative abundance of taxa in two or more experimental groups. Another major contribution of this paper is to perform constrained (directional or ordered) inference when there are more than two ordered experimental groups (e.g., subjects ordered by diet, age group, or environmental exposure group). As far as we know, this is the first paper that addresses such problems in the analysis of microbiome data. Using extensive simulation studies, we demonstrate that the proposed methodology controls the false discovery rate (FDR) at a desired level of significance while competing well in terms of power with DESeq2, a popular procedure derived from the RNASeq literature. As expected, the method using pseudo-counts tends to be very conservative, and the classical t-test that ignores the underlying simplex structure in the data has an inflated FDR.
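As a rough illustration of the pseudo-count strategy mentioned in the abstract (and not the methodology proposed by the authors), the Python sketch below adds a pseudo-count to every taxon in one sample and then takes log-ratios against a reference taxon. The function name `additive_log_ratios`, the sample counts, and the choice of the last taxon as the reference are all assumptions made for illustration only.

```python
import math

def additive_log_ratios(counts, pseudo_count):
    """Add a pseudo-count to every taxon, then take the log-ratio of each
    taxon against the last taxon (an arbitrary reference for this sketch)."""
    adjusted = [c + pseudo_count for c in counts]
    reference = adjusted[-1]
    return [math.log(a / reference) for a in adjusted[:-1]]

# Hypothetical counts for three taxa in one sample; the first taxon is a zero.
sample = [0, 30, 70]

lr_one = additive_log_ratios(sample, 1.0)   # pseudo-count of 1
lr_half = additive_log_ratios(sample, 0.5)  # pseudo-count of 0.5

# The log-ratio of the zero-count taxon shifts noticeably with the
# pseudo-count, while the non-zero taxon is barely affected.
print("zero taxon:    ", lr_one[0], lr_half[0])
print("non-zero taxon:", lr_one[1], lr_half[1])
```

In this toy case the value assigned to the zero-count taxon depends heavily on an arbitrary constant, which gives some intuition for why the abstract describes the pseudo-count approach as simple but not ideal.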
Author Peddada, Shyamal D.
Mandal, Siddhartha
Kaul, Abhishek
Davidov, Ori
AuthorAffiliation 1 Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences (NIH), Durham, NC, United States
2 Public Health Foundation of India, Gurgaon, India
3 Department of Statistics, University of Haifa, Haifa, Israel
ContentType Journal Article
Copyright © 2017 Kaul, Mandal, Davidov and Peddada.
Discipline Biology
EISSN 1664-302X
Genre Journal Article
GrantInformation Intramural NIH HHS, grant Z01 ES101744
Israeli Science Foundation, grant 1256/13
National Institute of Environmental Health Sciences, grant Z01 ES101744-04
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords covariates
bootstrap
false discovery rate (FDR)
Aitchison's log-ratio
cross-sectional data
microbiome data
License This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Notes This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology
Edited by: George Tsiamis, University of Patras, Greece
Reviewed by: Magnus Øverlie Arntzen, Norwegian University of Life Sciences, Norway; Bradley Stevenson, University of Oklahoma, United States
Present Address: Abhishek Kaul, Department of Mathematics and Statistics, Washington State University, Pullman, WA, United States
Shyamal D. Peddada, Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, United States
PMID 29163406
PublicationDate 2017-11-07