Comparative analysis of proficiencies of various textures and geometric features in breast mass classification using k-nearest neighbor

This paper introduces a comparative analysis of the proficiencies of various textures and geometric features in the diagnosis of breast masses on mammograms. An improved machine learning-based framework was developed for this study. The proposed system was tested using 106 full field digital mammogr...

Full description

Saved in:
Bibliographic Details
Published inVisual computing for industry, biomedicine and art Vol. 5; no. 1; pp. 3 - 19
Main Authors Singh, Harmandeep, Sharma, Vipul, Singh, Damanpreet
Format Journal Article
LanguageEnglish
Published Singapore Springer Singapore 12.01.2022
Springer Nature B.V
SpringerOpen
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:This paper introduces a comparative analysis of the proficiencies of various textures and geometric features in the diagnosis of breast masses on mammograms. An improved machine learning-based framework was developed for this study. The proposed system was tested using 106 full field digital mammography images from the INbreast dataset, containing a total of 115 breast mass lesions. The proficiencies of individual and various combinations of computed textures and geometric features were investigated by evaluating their contributions towards attaining higher classification accuracies. Four state-of-the-art filter-based feature selection algorithms (Relief-F, Pearson correlation coefficient, neighborhood component analysis, and term variance) were employed to select the top 20 most discriminative features. The Relief-F algorithm outperformed other feature selection algorithms in terms of classification results by reporting 85.2% accuracy, 82.0% sensitivity, and 88.0% specificity. A set of nine most discriminative features were then selected, out of the earlier mentioned 20 features obtained using Relief-F, as a result of further simulations. The classification performances of six state-of-the-art machine learning classifiers, namely k-nearest neighbor (k-NN), support vector machine, decision tree, Naive Bayes, random forest, and ensemble tree, were investigated, and the obtained results revealed that the best classification results (accuracy = 90.4%, sensitivity = 92.0%, specificity = 88.0%) were obtained for the k-NN classifier with the number of neighbors having k = 5 and squared inverse distance weight. The key findings include the identification of the nine most discriminative features, that is, FD26 (Fourier Descriptor), Euler number, solidity, mean, FD14, FD13, periodicity, skewness, and contrast out of a pool of 125 texture and geometric features. The proposed results revealed that the selected nine features can be used for the classification of breast masses in mammograms.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:2524-4442
2096-496X
2524-4442
DOI:10.1186/s42492-021-00100-1