A new paradigm for high‐dimensional data: Distance‐based semiparametric feature aggregation framework via between‐subject attributes
Published in | Scandinavian journal of statistics Vol. 51; no. 2; pp. 672 - 696 |
---|---|
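Main Authors | Liu, Jinyuan; Zhang, Xinlian; Lin, Tuo; Chen, Ruohui; Zhong, Yuan; Chen, Tian; Wu, Tsungchin; Liu, Chenyu; Huang, Anna; Nguyen, Tanya T.; Lee, Ellen E.; Jeste, Dilip V.; Tu, Xin M. |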
Format | Journal Article |
Language | English |
Published | England: Blackwell Publishing Ltd, 01.06.2024 |
ISSN | 0303-6898 (print), 1467-9469 (electronic) |
DOI | 10.1111/sjos.12695 |
Abstract | This article proposes a distance‐based framework incentivized by the paradigm shift toward feature aggregation for high‐dimensional data, which does not rely on the sparse‐feature assumption or the permutation‐based inference. Focusing on distance‐based outcomes that preserve information without truncating any features, a class of semiparametric regression has been developed, which encapsulates multiple sources of high‐dimensional variables using pairwise outcomes of between‐subject attributes. Further, we propose a strategy to address the interlocking correlations among pairs via the U‐statistics‐based estimating equations (UGEE), which correspond to their unique efficient influence function (EIF). Hence, the resulting semiparametric estimators are robust to distributional misspecification while enjoying root‐n consistency and asymptotic optimality to facilitate inference. In essence, the proposed approach not only circumvents information loss due to feature selection but also improves the model's interpretability and computational feasibility. Simulation studies and applications to the human microbiome and wearables data are provided, where the feature dimensions are tens of thousands. |
---|---|
Author | Tu, Xin M.; Chen, Ruohui; Liu, Jinyuan; Zhang, Xinlian; Huang, Anna; Chen, Tian; Lin, Tuo; Liu, Chenyu; Jeste, Dilip V.; Lee, Ellen E.; Zhong, Yuan; Nguyen, Tanya T.; Wu, Tsungchin |
AuthorAffiliation | 1 Department of Biostatistics, Vanderbilt University, Nashville, Tennessee, U.S.A.; 2 Department of Family Medicine and Public Health, UC San Diego, San Diego, California, U.S.A.; 3 Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A.; 4 Takeda Pharmaceuticals, Cambridge, Massachusetts, U.S.A.; 5 Department of Psychiatry, Vanderbilt University, Nashville, Tennessee, U.S.A.; 6 Veterans Affairs San Diego Healthcare System, La Jolla, California, U.S.A.; 7 Center for Microbiome Innovation, UC San Diego, San Diego, California, U.S.A.; 8 Department of Psychiatry, UC San Diego, San Diego, California, U.S.A.; 9 Stein Institute for Research on Aging, UC San Diego, San Diego, California, U.S.A. |
Author_xml | 1. Liu, Jinyuan (Vanderbilt University; ORCID 0000-0001-6689-8245); 2. Zhang, Xinlian (UC San Diego; email xizhang@health.ucsd.edu); 3. Lin, Tuo (UC San Diego); 4. Chen, Ruohui (UC San Diego); 5. Zhong, Yuan (University of Michigan); 6. Chen, Tian (Takeda Pharmaceuticals); 7. Wu, Tsungchin (UC San Diego); 8. Liu, Chenyu (UC San Diego); 9. Huang, Anna (Vanderbilt University); 10. Nguyen, Tanya T. (UC San Diego); 11. Lee, Ellen E. (UC San Diego); 12. Jeste, Dilip V. (UC San Diego); 13. Tu, Xin M. (UC San Diego) |
ContentType | Journal Article |
Copyright | 2023 Board of the Foundation of the Scandinavian Journal of Statistics. 2024 Board of the Foundation of the Scandinavian Journal of Statistics |
Discipline | Statistics; Mathematics |
GrantInformation_xml | UC San Diego Center for Healthy Aging; National Institute of Mental Health (K23/MH118435; K23/MH119375‐01; R01/MH094151); NIMH NIH HHS (R01 MH135147; K23 MH118435; K23 MH119375; R01 MH094151) |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
Keywords | Pairwise distance; Semiparametric efficient influence function (EIF); Dimension reduction; Robust inference; Multivariable regression |
Language | English |
Notes | AUTHOR CONTRIBUTIONS. JL: methodology, formal analysis, and writing – original draft. XZ: methodology, supervision, and writing – review and editing. TL, RC: conceptualization and writing – review and editing. YZ, TC, TW, CL: software and visualization. AH, TN, EL, DJ: data curation, resources, and writing – review and editing. XT: methodology, resources, supervision, and writing – review and editing. All authors reviewed and approved the final manuscript. |
ORCID | 0000-0001-6689-8245 |
OpenAccessLink | https://www.ncbi.nlm.nih.gov/pmc/articles/11296665 |
PMID | 39101047 |
PublicationTitleAlternate | Scand Stat Theory Appl |
SubjectTerms | dimension reduction; Feasibility studies; Inference; Influence functions; multivariable regression; pairwise distance; Permutations; robust inference; semiparametric efficient influence function (EIF) |
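
The abstract describes regressing pairwise between-subject distances on covariates and estimating the coefficients from equations summed over subject pairs. The Python sketch below is a rough illustration of that idea only, and it is not the authors' UGEE estimator. The function names, the Euclidean distance, the identity link with equal weights, and the simulated data are all assumptions made for this example. It also omits the variance estimation that would be needed for inference, which must account for the correlation among pairs that share a subject.

```python
import itertools
import math
import random


def pairwise_outcomes(features, covariate):
    """Build one (distance, |covariate difference|) record per subject pair.

    features:  list of equal-length feature vectors, one per subject
    covariate: list of scalar covariates, one per subject
    Both names are hypothetical; they do not come from the article.
    """
    pairs = []
    for i, j in itertools.combinations(range(len(features)), 2):
        # Between-subject attribute: Euclidean distance over ALL features,
        # so no feature is selected or truncated (assumed distance choice).
        d_ij = math.dist(features[i], features[j])
        z_ij = abs(covariate[i] - covariate[j])
        pairs.append((d_ij, z_ij))
    return pairs


def fit_pairwise_linear(pairs):
    """Solve sum over pairs of (1, z)^T * (d - b0 - b1 * z) = 0 in closed form.

    This is ordinary least squares on the pairwise records, i.e. an
    equally weighted estimating equation with an identity link (assumed).
    """
    m = len(pairs)
    mean_d = sum(d for d, _ in pairs) / m
    mean_z = sum(z for _, z in pairs) / m
    s_zd = sum((z - mean_z) * (d - mean_d) for d, z in pairs)
    s_zz = sum((z - mean_z) ** 2 for _, z in pairs)
    slope = s_zd / s_zz
    intercept = mean_d - slope * mean_z
    return intercept, slope


def simulate(n_subjects=40, n_features=500, seed=0):
    """Simulated data (illustrative only): every feature shifts with the covariate."""
    rng = random.Random(seed)
    covariate = [rng.uniform(0.0, 3.0) for _ in range(n_subjects)]
    features = [
        [x + rng.gauss(0.0, 1.0) for _ in range(n_features)] for x in covariate
    ]
    return features, covariate


if __name__ == "__main__":
    features, covariate = simulate()
    intercept, slope = fit_pairwise_linear(pairwise_outcomes(features, covariate))
    print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

Because the outcome is a single distance per pair, the number of regression parameters stays fixed however many features enter the distance, which is the sense in which features are aggregated rather than selected in this sketch.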