Personality Classification from Online Text using Machine Learning Approach

Bibliographic Details
Published in International journal of advanced computer science & applications Vol. 11; no. 3
Main Authors Khan, Alam Sher; Ahmad, Hussain; Zubair, Muhammad; Khan, Furqan; Arif, Areeba; Ali, Hassan
Format Journal Article
Language English
Published West Yorkshire: Science and Information (SAI) Organization Limited, 2020
Abstract Personality refers to the distinctive set of characteristics of a person that affect their habits, behaviours, attitudes and patterns of thought. Text available on social networking sites provides an opportunity to recognize an individual's personality traits automatically. In this work, a machine learning technique, the XGBoost classifier, is used to predict four personality traits based on the Myers-Briggs Type Indicator (MBTI) model, namely Introversion-Extroversion (I-E), iNtuition-Sensing (N-S), Feeling-Thinking (F-T) and Judging-Perceiving (J-P), from input text. A publicly available benchmark dataset from Kaggle is used in the experiments. The skewness of the dataset is the main issue associated with prior work; it is reduced by applying a re-sampling technique, namely random over-sampling, resulting in better performance. To explore personality from text further, pre-processing techniques including tokenization, word stemming, stop-word elimination and feature selection using TF-IDF are also applied. This work provides the basis for developing a personality identification system that could assist organizations in recruiting and selecting appropriate personnel and in improving their business by knowing the personality and preferences of their customers. The results obtained by all classifiers across all personality traits are good enough; however, the performance of the XGBoost classifier is outstanding, achieving more than 99% precision and accuracy for different traits.
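Two of the steps named in the abstract, random over-sampling and TF-IDF weighting, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy posts and the unsmoothed formula tf × log(N / df) are assumptions made for illustration, and the sketch uses only the Python standard library and stops before the XGBoost classifier.

```python
import math
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement from the existing samples of this class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def tfidf(tokenized_docs):
    """Return one {term: weight} dict per document, with
    weight = (term count / document length) * log(N / document frequency)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({term: (count / total) * math.log(n_docs / doc_freq[term])
                        for term, count in counts.items()})
    return weights


# Hypothetical toy data for the I-E trait: three "I" posts, one "E" post.
posts = ["i enjoy quiet evenings reading", "parties give me energy",
         "i recharge alone", "i prefer small groups"]
labels = ["I", "E", "I", "I"]

balanced_posts, balanced_labels = random_oversample(posts, labels)
vectors = tfidf([post.split() for post in balanced_posts])
```

In practice, over-sampling is usually applied to the training split only, so that duplicated samples do not leak into the evaluation data, and library TF-IDF implementations typically add smoothing and normalization on top of the plain formula used here.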
Copyright 2020. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.14569/IJACSA.2020.0110358
Discipline Computer Science
EISSN 2156-5570
ISSN 2158-107X
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Subjects Classifiers; Datasets; Extroversion; Introversion; Machine learning; Personality; Personality traits; Sampling methods
URI https://www.proquest.com/docview/2655155371