Pearson–Matthews correlation coefficients for binary and multinary classification

Bibliographic Details
Published in: Signal Processing, Vol. 222, p. 109511
Main Authors: Stoica, Petre; Babu, Prabhu
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.09.2024
ISSN: 0165-1684
EISSN: 1872-7557
DOI: 10.1016/j.sigpro.2024.109511

Abstract: The Pearson–Matthews correlation coefficient (usually abbreviated MCC) is considered to be one of the most useful metrics for the performance of a binary classification. For multinary classification tasks (with more than two classes) the existing extension of MCC, commonly called the RK metric, has also been successfully used in many applications. The present paper begins with an introductory discussion on certain aspects of MCC. Then we go on to discuss the topic of multinary classification that is the main focus of this paper and which, despite its practical and theoretical importance, appears to be less developed than the topic of binary classification. Our discussion of the RK is followed by the introduction of two other metrics for multinary classification derived from the multivariate Pearson correlation (MPC) coefficients. We show that both RK and the MPC metrics suffer from the problem of not decisively indicating poor classification results when they should, and introduce three new enhanced metrics that do not suffer from this problem. We also present an additional new metric for multinary classification which can be viewed as a direct extension of MCC.
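
Illustrative note: the two established metrics named in the abstract can be sketched in a few lines. The code below uses the standard textbook formulas for the binary MCC and for the RK metric (Gorodkin's K-category correlation); it is not taken from the article, and it does not implement the MPC-based or enhanced metrics that the paper introduces. The function names, the convention that rows are true classes and columns are predicted classes, and the choice to return 0.0 when the denominator vanishes are all assumptions made for this sketch.

```python
import math


def mcc_binary(tp, tn, fp, fn):
    """Binary MCC from the four confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report 0.0 when a row or column of the matrix is empty.
    return num / den if den > 0 else 0.0


def rk_multiclass(conf):
    """RK for a K x K confusion matrix (rows: true class, columns: predicted)."""
    k = len(conf)
    s = sum(sum(row) for row in conf)              # total number of samples
    c = sum(conf[i][i] for i in range(k))          # correctly classified samples
    t = [sum(conf[i]) for i in range(k)]           # true-class totals (row sums)
    p = [sum(conf[i][j] for i in range(k)) for j in range(k)]  # predicted totals
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = math.sqrt((s * s - sum(pk * pk for pk in p))
                    * (s * s - sum(tk * tk for tk in t)))
    # Assumption: same zero-denominator convention as in mcc_binary.
    return num / den if den > 0 else 0.0
```

For two classes the two functions should agree: with TN = 50, FP = 10, FN = 5, TP = 35, calling `mcc_binary(35, 50, 10, 5)` and `rk_multiclass([[50, 10], [5, 35]])` gives the same value up to rounding.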
Author Details
Stoica, Petre. ORCID: 0000-0002-7957-3711. Email: ps@it.uu.se. Affiliation: Division of Systems and Control, Department of Information Technology, Uppsala University, PO Box 337, Uppsala, 75237, Sweden
Babu, Prabhu. Email: Prabhu.Babu@care.iitd.ac.in. Affiliation: Centre for Applied Research in Electronics, Indian Institute of Technology, Delhi, New Delhi 110016, India
Copyright 2024 Elsevier B.V.
Discipline Engineering
Funding: Swedish Research Council (grant IDs 2017-04610; 2016-06079; 2021-05022)
Peer Reviewed: Yes
Keywords: Multivariate Pearson correlation (MPC); Matthews correlation coefficient (MCC); Multinary classification