Appropriateness of distances in nearest neighbour classification: a monometric perspective

Bibliographic Details
Published in Pattern analysis and applications : PAA, Vol. 28, no. 1
Main Authors Gupta, Megha; Jayaram, Balasubramaniam
Format Journal Article
Language English
Published London: Springer London, 01.03.2025
Springer Nature B.V.
ISSN 1433-7541
1433-755X
DOI 10.1007/s10044-024-01373-x

Summary: Among the non-parametric classification methods, the nearest neighbour classifier (NNC) holds a pre-eminent position. Given a training or sample set S, the choice one needs to make is on the value of k and the distance function d to be employed. Towards improving the efficacy of an NNC, there are many works, both theoretical and empirical, that help in choosing a suitable value of k. However, works that deal with the appropriateness of a distance d for a given S are largely empirical. In this work, we address the following two posers for a given S: (1) How to identify a potentially appropriate distance d? (2) What qualities should an appropriate d possess? Our investigations show that every distance function d determines a landscape on the underlying data space and only if the class boundaries align with this landscape can this d be appropriate. In view of this, we construct a relational graph G_{S,d}, in fact, a poset, on the given S using d. With the help of G_{S,d}, we choose a T ⊂ S to be used in a condensed-NN algorithm. Terming it the NEN algorithm, firstly, we show empirically that the training error of this NEN algorithm is reflective of the appropriateness of d. Towards providing a theoretical justification to our claims based on empiricism, we investigate the problem of classification in the setting of monometric spaces, wherein it emerges that the suitability of d is essentially related to the embeddability of G_{S,d} in the monometric space (X, ⪯_d, d).
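The point that a distance d is only appropriate when the class boundaries align with the landscape it induces can be illustrated with a small sketch. This is not the NEN algorithm from the article, whose construction of G_{S,d} and choice of T are not given in this record. It is a plain k-NN classifier with a pluggable distance, scored by leave-one-out error as a stand-in for the training error mentioned in the summary. The toy sample set, the function names, and the weights in `scaled_euclidean` are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Standard Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scaled_euclidean(a, b, weights=(1.0, 0.01)):
    """Euclidean distance after rescaling each coordinate (hypothetical weights)."""
    return math.sqrt(sum((w * (x - y)) ** 2 for w, x, y in zip(weights, a, b)))


def knn_predict(train, query, k, dist):
    """Majority vote among the k training points closest to `query` under `dist`."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_error(samples, k, dist):
    """Fraction of samples misclassified when each is predicted from the rest."""
    errors = 0
    for i, (point, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, point, k, dist) != label:
            errors += 1
    return errors / len(samples)


# Toy sample set S (an assumption): the class depends only on the first
# coordinate, while the second coordinate varies on a much larger scale.
S = [
    ((0.0, 0.0), "a"),
    ((0.1, 10.0), "a"),
    ((1.0, 0.1), "b"),
    ((1.1, 10.1), "b"),
]

err_euclidean = leave_one_out_error(S, k=1, dist=euclidean)
err_scaled = leave_one_out_error(S, k=1, dist=scaled_euclidean)
print(err_euclidean, err_scaled)
```

On this toy set the unscaled Euclidean distance pairs each point with a neighbour from the other class, while the rescaled distance pairs points within a class, so the error rate separates the two choices of d for the same S and k.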