A new paradigm for high‐dimensional data: Distance‐based semiparametric feature aggregation framework via between‐subject attributes

This article proposes a distance‐based framework incentivized by the paradigm shift toward feature aggregation for high‐dimensional data, which does not rely on the sparse‐feature assumption or the permutation‐based inference. Focusing on distance‐based outcomes that preserve information without tru...

Bibliographic Details
Published in Scandinavian journal of statistics Vol. 51; no. 2; pp. 672-696
Main Authors Liu, Jinyuan, Zhang, Xinlian, Lin, Tuo, Chen, Ruohui, Zhong, Yuan, Chen, Tian, Wu, Tsungchin, Liu, Chenyu, Huang, Anna, Nguyen, Tanya T., Lee, Ellen E., Jeste, Dilip V., Tu, Xin M.
Format Journal Article
Language English
Published England Blackwell Publishing Ltd 01.06.2024
Subjects
ISSN 0303-6898
1467-9469
DOI 10.1111/sjos.12695

Abstract This article proposes a distance‐based framework incentivized by the paradigm shift toward feature aggregation for high‐dimensional data, which does not rely on the sparse‐feature assumption or the permutation‐based inference. Focusing on distance‐based outcomes that preserve information without truncating any features, a class of semiparametric regression has been developed, which encapsulates multiple sources of high‐dimensional variables using pairwise outcomes of between‐subject attributes. Further, we propose a strategy to address the interlocking correlations among pairs via the U‐statistics‐based estimating equations (UGEE), which correspond to their unique efficient influence function (EIF). Hence, the resulting semiparametric estimators are robust to distributional misspecification while enjoying root‐n consistency and asymptotic optimality to facilitate inference. In essence, the proposed approach not only circumvents information loss due to feature selection but also improves the model's interpretability and computational feasibility. Simulation studies and applications to the human microbiome and wearables data are provided, where the feature dimensions are tens of thousands.
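The pairwise, distance-based outcomes described in the abstract can be illustrated with a minimal sketch. This is not the authors' UGEE estimator: it assumes simulated data, Euclidean distance as the between-subject attribute, and a plain least-squares fit on subject pairs. The function names `pairwise_outcomes` and `fit_pairwise_ls`, the sample sizes, and the data-generating model are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_outcomes(features):
    """Return indices (i, j) with i < j and the Euclidean distance for each subject pair."""
    n = features.shape[0]
    i_idx, j_idx = np.triu_indices(n, k=1)  # all unordered pairs
    diffs = features[i_idx] - features[j_idx]
    return i_idx, j_idx, np.sqrt((diffs ** 2).sum(axis=1))


def fit_pairwise_ls(dist, cov_diff):
    """Least-squares fit of pairwise distance on a pairwise covariate difference.

    Illustrative only: this ignores the correlation among pairs that share a subject.
    """
    design = np.column_stack([np.ones_like(cov_diff), cov_diff])
    coef, *_ = np.linalg.lstsq(design, dist, rcond=None)
    return coef  # [intercept, slope]


# Simulated example (assumed setup): n subjects, p features, no feature selection.
rng = np.random.default_rng(0)
n, p = 40, 2000
x = rng.normal(size=n)                                  # subject-level covariate
features = rng.normal(size=(n, p)) + 0.5 * x[:, None]   # covariate shifts every feature

i_idx, j_idx, dist = pairwise_outcomes(features)
cov_diff = np.abs(x[i_idx] - x[j_idx])                  # between-subject covariate attribute
coef = fit_pairwise_ls(dist, cov_diff)
print("pairs:", dist.size, "intercept:", coef[0], "slope:", coef[1])
```

Each subject appears in n - 1 pairs, so the pairwise outcomes are correlated and the usual least-squares standard errors would not be valid here. Handling that dependence is the role the abstract assigns to the U-statistics-based estimating equations.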
AuthorAffiliation 1 Department of Biostatistics, Vanderbilt University, Nashville, Tennessee, U.S.A.
2 Department of Family Medicine and Public Health, UC San Diego, San Diego, California, U.S.A.
3 Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A.
4 Takeda Pharmaceuticals, Cambridge, Massachusetts, U.S.A.
5 Department of Psychiatry, Vanderbilt University, Nashville, Tennessee, U.S.A.
6 Veterans Affairs San Diego Healthcare System, La Jolla, California, U.S.A.
7 Center for Microbiome Innovation, UC San Diego, San Diego, California, U.S.A.
8 Department of Psychiatry, UC San Diego, San Diego, California, U.S.A.
9 Stein Institute for Research on Aging, UC San Diego, San Diego, California, U.S.A.
Author details (sequence, name, organization)
1. Liu, Jinyuan (Vanderbilt University; ORCID 0000-0001-6689-8245)
2. Zhang, Xinlian (UC San Diego; xizhang@health.ucsd.edu)
3. Lin, Tuo (UC San Diego)
4. Chen, Ruohui (UC San Diego)
5. Zhong, Yuan (University of Michigan)
6. Chen, Tian (Takeda Pharmaceuticals)
7. Wu, Tsungchin (UC San Diego)
8. Liu, Chenyu (UC San Diego)
9. Huang, Anna (Vanderbilt University)
10. Nguyen, Tanya T. (UC San Diego)
11. Lee, Ellen E. (UC San Diego)
12. Jeste, Dilip V. (UC San Diego)
13. Tu, Xin M. (UC San Diego)
BackLink https://www.ncbi.nlm.nih.gov/pubmed/39101047 (record in MEDLINE/PubMed)
ContentType Journal Article
Copyright 2023 Board of the Foundation of the Scandinavian Journal of Statistics.
2024 Board of the Foundation of the Scandinavian Journal of Statistics
Discipline Statistics
Mathematics
EISSN 1467-9469
EndPage 696
ExternalDocumentID PMC11296665
39101047
10_1111_sjos_12695
SJOS12695
Genre article
Journal Article
GrantInformation UC San Diego Center for Healthy Aging
National Institute of Mental Health (NIMH, NIH, HHS): grants K23 MH118435, K23 MH119375, R01 MH094151, R01 MH135147
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Keywords Pairwise distance
Semiparametric efficient influence function (EIF)
Dimension reduction
Robust inference
Multivariable regression
Language English
AUTHOR CONTRIBUTIONS
JL: methodology, formal analysis, and writing (original draft). XZ: methodology, supervision, and writing (review and editing). TL, RC: conceptualization and writing (review and editing). YZ, TC, TW, CL: software and visualization. AH, TN, EL, DJ: data curation, resources, and writing (review and editing). XT: methodology, resources, supervision, and writing (review and editing). All authors reviewed and approved the final manuscript.
ORCID 0000-0001-6689-8245
OpenAccessLink https://www.ncbi.nlm.nih.gov/pmc/articles/11296665
PMID 39101047
PublicationDate June 2024
PublicationDateYYYYMMDD 2024-06-01
PublicationDate_xml – month: 06
  year: 2024
  text: June 2024
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
– name: Oxford
PublicationTitle Scandinavian journal of statistics
PublicationTitleAlternate Scand Stat Theory Appl
PublicationYear 2024
Publisher Blackwell Publishing Ltd
Publisher_xml – name: Blackwell Publishing Ltd
StartPage 672
SubjectTerms dimension reduction
Feasibility studies
Inference
Influence functions
multivariable regression
pairwise distance
Permutations
robust inference
semiparametric efficient influence function (EIF)
Title A new paradigm for high‐dimensional data: Distance‐based semiparametric feature aggregation framework via between‐subject attributes
URI https://onlinelibrary.wiley.com/doi/abs/10.1111%2Fsjos.12695
https://www.ncbi.nlm.nih.gov/pubmed/39101047
https://www.proquest.com/docview/3052209455
https://www.proquest.com/docview/3088560361
https://pubmed.ncbi.nlm.nih.gov/PMC11296665
Volume 51