Introversion-Extraversion Prediction using Machine Learning

Bibliographic Details
Published in JOIV : international journal on informatics visualization Online Vol. 7; no. 4; p. 2154
Main Authors Fieri, Brillian; La'la, Joshua; Suhartono, Derwin
Format Journal Article
Language English
Published 31.12.2023
Summary: Introversion and extraversion are personality traits that describe how people interact with others. Each trait has its advantages and disadvantages, and people who know their personality can use them to their benefit. This study compares and evaluates several machine learning models and dataset balancing methods for predicting introversion-extraversion personality from the results of a survey conducted by the Open-Source Psychometrics Project. The dataset was balanced using three balancing methods, and fifteen questions were chosen as features based on their correlations with the personality self-identification result. The dataset was then used to train several supervised machine learning models. For the datasets balanced with the Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), and SMOTE combined with Edited Nearest Neighbors (SMOTE-ENN), the best model was Random Forest, with 10-fold cross-validation accuracies of 95.5%, 95.3%, and 71.0%, respectively. On the original dataset, the best model was the Support Vector Machine (SVM), with a 10-fold cross-validation accuracy of 73.5%. Based on these results, the oversampling methods were the most effective at increasing model performance. Conversely, the hybrid oversampling-undersampling method did not significantly increase performance. Furthermore, the tree-based models, Random Forest and Decision Tree, improved substantially from data balancing. In contrast, the other models, excluding the SVM, did not show a significant rise in performance. These findings suggest that further study of hybrid balancing methods and other classification models is needed to improve personality classification performance.
ISSN: 2549-9610, 2549-9904
DOI: 10.30630/joiv.7.4.01019
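
The oversampling step named in the summary can be illustrated with a minimal, standard-library-only sketch of SMOTE-style interpolation. This is not the authors' code: the function names `smote_oversample` and `balance_with_smote`, the neighbour count `k`, and the toy data in the usage note are all assumptions made for illustration. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be preferred.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority rows.

    minority: list of equal-length numeric feature vectors from one class.
    Hypothetical helper for illustration only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base row.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position on the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance_with_smote(X, y, k=5, seed=0):
    """Oversample the smaller class of a binary dataset to equal size."""
    labels = sorted(set(y))
    if len(labels) != 2:
        raise ValueError("this sketch only handles two classes")
    groups = {lab: [x for x, t in zip(X, y) if t == lab] for lab in labels}
    minority_label = min(labels, key=lambda lab: len(groups[lab]))
    majority_label = max(labels, key=lambda lab: len(groups[lab]))
    deficit = len(groups[majority_label]) - len(groups[minority_label])
    if deficit == 0:
        return list(X), list(y)  # already balanced
    new_rows = smote_oversample(groups[minority_label], deficit, k, seed)
    return list(X) + new_rows, list(y) + [minority_label] * deficit
```

For example, calling `balance_with_smote(X, y, k=2)` on a toy dataset with three rows of one class and six of the other returns twelve rows, six per class. The synthetic rows lie on line segments between existing minority rows, so integer-coded survey answers would become fractional values unless they are rounded afterwards.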