A Cluster Based Classification for Imbalanced Data Using SMOTE

Bibliographic Details
Published in IOP conference series. Materials Science and Engineering Vol. 1099; no. 1; p. 12080
Main Authors Tripathi, Rajesh Kumar, Raja, Linesh, Kumar, Ankit, Dadheech, Pankaj, Kumar, Abhishek, Nachappa, M N
Format Journal Article
Language English
Published Bristol IOP Publishing 01.03.2021
Summary: Data repositories are growing rapidly because organizations such as governments, corporations, and healthcare providers generate data in large amounts. Large volumes of data are produced, processed, collected, and analysed online, so this data must be transformed into valuable information. The process of extracting knowledge from large amounts of data is referred to as data mining. The proposed hybrid approach can be tested with different classifiers such as Naïve Bayes and random forest. In the proposed methodology, we find that the SMOTE algorithm, which uses the K-nearest neighbour algorithm, draws on only some minority class instances when creating synthetic samples, which sometimes leads to overfitting, so a more effective oversampling approach can be developed.
ISSN:1757-8981
1757-899X
DOI:10.1088/1757-899X/1099/1/012080
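
The summary above describes SMOTE as creating synthetic minority samples from K-nearest neighbours. A minimal sketch of that baseline idea follows, using only the Python standard library. It illustrates plain SMOTE interpolation, not the hybrid cluster-based method the article proposes, whose details the summary does not give. The function names `k_nearest` and `smote`, the parameter defaults, and the sample points are all assumptions made for illustration.

```python
import math
import random


def k_nearest(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda q: math.dist(point, q))[:k]


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours.

    Hypothetical helper: a simplified version of the SMOTE idea, not the
    article's method and not the API of any particular library.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        # Pick one of its k nearest minority neighbours.
        neighbour = rng.choice(k_nearest(base, others, k))
        # Place the new point at a random position on the segment between them.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Made-up two-dimensional minority class samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Every synthetic point lies on a line segment between two existing minority samples, so the new data never leaves the region those samples already occupy. This is one way to read the summary's remark that the approach can lead to overfitting.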