Comparative Study of Different Machine Learning Models for Customer Churn Analysis Using SMOTE and Feature Variation Along With Customer Segmentation
Published in | 2023 International Conference on Modeling, Simulation & Intelligent Computing (MoSICom) pp. 637 - 642 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 07.12.2023 |
Summary: | Customer churn is a major problem for companies in both online and offline markets, and it adversely affects profit and revenue. Machine learning (ML) has recently been used to analyze and predict customer churn. This research studies churn prediction with a special focus on feature selection and unbalanced datasets. Churn analysis has so far dealt mainly with prediction rather than with methods to retain customers. Our study uses a customer dataset from a US telecom company. We compare several classifiers for churn prediction, including logistic regression, decision trees, SVM, random forest, k-NN and XGBoost, and we also discuss methods to retain customers. The paper highlights the importance of feature selection and presents a detailed experimental study of model performance on balanced and unbalanced datasets. Comparing F1 scores, AUC scores and precision-recall curves shows that XGBoost outperformed all the other algorithms. Retaining customers, in turn, requires careful study of their behavioral patterns. Customer segmentation is an effective way for marketing teams to identify different groups of customers. In this paper, k-means, agglomerative clustering, Gaussian mixture (GM) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) are used to cluster the customers into segments. We evaluate the clustering results using silhouette analysis. |
---|---|
DOI: | 10.1109/MoSICom59118.2023.10458848 |
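
The title and summary refer to SMOTE for handling the unbalanced churn classes, but the record does not show how it works. The following is a minimal, simplified sketch of SMOTE-style oversampling in plain Python, not the authors' implementation: each synthetic minority point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, the toy feature values and the class sizes are all hypothetical; in practice a library such as imbalanced-learn would normally be used.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data: (monthly_charge, support_calls) for churned customers.
churned = [(70.0, 4.0), (80.0, 5.0), (75.0, 6.0), (90.0, 3.0)]
retained_count = 10  # assumed size of the majority (non-churn) class

# Generate enough synthetic churners to match the majority class size.
new_points = smote_oversample(churned, n_new=retained_count - len(churned))
balanced_minority = churned + new_points
print(len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. Oversampling of this kind is usually applied to the training split only, so that evaluation metrics such as F1 and AUC are computed on untouched data.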