Centrality nearest-neighbor projected-distance regression

Bibliographic Details
Published in PloS one Vol. 20; no. 3; p. e0319346
Main Authors Kresock, Elizabeth; Dawkins, Bryan; Luttbeg, Henry; Li, Yijie (Jamie); Kuplicki, Rayus; McKinney, B. A
Format Journal Article
Language English
Published Public Library of Science 06.03.2025
Abstract Nearest-neighbor projected-distance regression (NPDR) is a metric-based machine learning feature selection algorithm that uses distances between samples and projected differences between variables to identify variables or features that may interact to affect the prediction of complex outcomes. Typical tabular bioinformatics data consist of separate variables of interest, such as genes or proteins. In contrast, resting-state functional MRI (rs-fMRI) data are composed of time-series for brain regions of interest (ROIs) for each subject, and these within-brain time-series are typically transformed into correlations between pairs of ROIs. These pairs of variables of interest can then be used as inputs for feature selection or other machine learning methods. Straightforward feature selection would return the most significant pairs of ROIs; however, it would also be beneficial to know the importance of individual ROIs. We extend NPDR to compute the importance of individual ROIs from correlation-based features. We introduce correlation-difference and centrality-based versions of NPDR. Centrality-based NPDR can be coupled with any centrality method and can be coupled with importance scores other than NPDR, such as random forest importance scores. We develop a new simulation method using random network theory to generate artificial correlation data predictors with variations in correlations that affect class prediction. We compared feature selection methods based on detection of functional simulated ROIs, and we applied the new centrality NPDR approach to a resting-state fMRI study of major depressive disorder (MDD) participants and healthy controls. We determined that the areas of the brain that have the strongest network effect on MDD include the middle temporal gyrus, the inferior temporal gyrus, and the dorsal entorhinal cortex. The resulting feature selection and simulation approaches can be applied to other domains that use correlation-based features.
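The pipeline the abstract outlines (per-subject ROI time-series, then pairwise correlation features, then ROI-level importance via centrality) can be illustrated with a small sketch. This is not the authors' implementation: weighted degree (strength) is used here only as one simple centrality choice, the pair-level importance values are invented placeholders standing in for NPDR or random forest scores, and all function names (`pearson`, `roi_pair_features`, `strength_centrality`) are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def roi_pair_features(time_series):
    """Map one subject's {roi: time series} to {(roi_a, roi_b): correlation}."""
    rois = sorted(time_series)
    return {(a, b): pearson(time_series[a], time_series[b])
            for a, b in combinations(rois, 2)}


def strength_centrality(pair_importance):
    """Sum the importance of every pair an ROI takes part in (weighted degree).

    Assumption: pair-level scores are treated as edge weights of a graph
    whose nodes are ROIs; other centrality measures could be substituted.
    """
    scores = {}
    for (a, b), weight in pair_importance.items():
        scores[a] = scores.get(a, 0.0) + weight
        scores[b] = scores.get(b, 0.0) + weight
    return scores


# Toy subject: three ROIs with 50 time points each (random placeholder data).
rng = random.Random(0)
subject = {roi: [rng.gauss(0, 1) for _ in range(50)] for roi in ("A", "B", "C")}
features = roi_pair_features(subject)  # keys: ("A","B"), ("A","C"), ("B","C")

# Hypothetical pair-level importance scores (made up for illustration),
# standing in for the output of NPDR or a random forest.
pair_importance = {("A", "B"): 0.9, ("A", "C"): 0.7, ("B", "C"): 0.1}
roi_importance = strength_centrality(pair_importance)
ranking = sorted(roi_importance, key=roi_importance.get, reverse=True)
```

In this toy case ROI "A" ranks first because it takes part in both high-importance pairs, which is the kind of individual-ROI information that a ranking of pairs alone does not give directly.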
Audience Academic
ContentType Journal Article
Copyright COPYRIGHT 2025 Public Library of Science
DOI 10.1371/journal.pone.0319346
DatabaseName Gale In Context: Opposing Viewpoints
Gale In Context: Science
Discipline Sciences (General)
EISSN 1932-6203
ExternalDocumentID A829855631
GeographicLocations United States
ISSN 1932-6203
IsPeerReviewed true
IsScholarly true
Issue 3
SubjectTerms Diagnosis
Health aspects
Machine learning
Magnetic resonance imaging
Major depressive disorder
Methods