Appropriateness of distances in nearest neighbour classification: a monometric perspective
Published in | Pattern analysis and applications : PAA Vol. 28; no. 1 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | London: Springer London, 01.03.2025; Springer Nature B.V |
Subjects | |
ISSN | 1433-7541 1433-755X |
DOI | 10.1007/s10044-024-01373-x |
Summary: | Among the non-parametric classification methods, the nearest neighbour classifier (NNC) holds a pre-eminent position. Given a training or sample set $S$, the choice one needs to make is on the value of $k$ and the distance function $d$ to be employed. Towards improving the efficacy of an NNC, there are many works, both theoretical and empirical, that help in choosing a suitable value of $k$. However, works that deal with the appropriateness of a distance $d$ for a given $S$ are largely empirical. In this work, we address the following two posers for a given $S$: (1) How to identify a potentially appropriate distance $d$? (2) What qualities should an appropriate $d$ possess? Our investigations show that every distance function $d$ determines a landscape on the underlying data space and only if the class boundaries align with this landscape can this $d$ be appropriate. In view of this, we construct a relational graph $G_{S,d}$, in fact, a poset, on the given $S$ using $d$. With the help of $G_{S,d}$, we choose a $T \subset S$ to be used in a condensed-NN algorithm. Terming it the NEN algorithm, firstly, we show empirically that the training error of this NEN algorithm is reflective of the appropriateness of $d$. Towards providing a theoretical justification to our claims based on empiricism, we investigate the problem of classification in the setting of monometric spaces, wherein it emerges that the suitability of $d$ is essentially related to the embeddability of $G_{S,d}$ in the monometric space $(X, \preceq_d, d)$. |
---|---|
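
Illustrative note (not part of the catalogue record or of the article): the summary says an NNC is fixed by a sample set S, a value of k and a distance d. The minimal Python sketch below shows how the prediction for the same S and k can change with d. It is a plain k-NN with a pluggable distance and a leave-one-out error, offered only as a rough stand-in. It is not the NEN algorithm, and it does not build the relational graph described in the summary. All names (`euclidean`, `chebyshev`, `knn_predict`, `leave_one_out_error`) and the toy data are assumptions made for this illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a, b):
    """Chebyshev (L-infinity) distance between two equal-length points."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(sample, query, k, dist):
    """Majority label among the k points of `sample` closest to `query`.

    `sample` is a list of (point, label) pairs; `dist` is any distance function.
    """
    neighbours = sorted(sample, key=lambda pl: dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_error(sample, k, dist):
    """Fraction of sample points misclassified when each is held out in turn."""
    errors = 0
    for i, (point, label) in enumerate(sample):
        rest = sample[:i] + sample[i + 1:]
        if knn_predict(rest, point, k, dist) != label:
            errors += 1
    return errors / len(sample)


# Hypothetical toy sample: two labelled points in the plane.
sample = [((3.0, 3.0), "a"), ((4.0, 0.0), "b")]
query = (0.0, 0.0)

# Same S and k, different d: the nearest neighbour of `query` changes.
print(knn_predict(sample, query, 1, euclidean))  # "b": distance 4.0 beats sqrt(18)
print(knn_predict(sample, query, 1, chebyshev))  # "a": distance 3.0 beats 4.0
```

Comparing `leave_one_out_error(sample, k, dist)` across candidate distances on a realistic sample is one simple empirical check of how well a distance suits the data. The article's own criterion is based on the training error of its NEN algorithm, which is not reproduced here.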