K-Means and Related Clustering Methods

Bibliographic Details
Published in Core Concepts in Data Analysis, pp. 221-281
Main Author Mirkin, Boris
Format Book Chapter
Language English
Published United Kingdom, Springer London, Limited (Springer London), 2011
Series Undergraduate Topics in Computer Science
ISBN 0857292862, 9780857292865
ISSN 1863-7310
DOI 10.1007/978-0-85729-287-2_6

Abstract K-Means is arguably the most popular data analysis method. The method outputs a partition of the entity set into clusters, together with centroids representing them. It is very intuitive and usually takes just a few pages to present. This text includes a number of less popular subjects that are important when using K-Means for real-world data analysis: data standardization, especially at mixed scales; innate tools for interpretation of clusters; analysis of examples of K-Means working and its failures; and initialization, that is, the choice of the number of clusters and the location of centroids. Versions of K-Means such as incremental K-Means, nature-inspired K-Means, and entity-centroid “medoid” methods are presented. Three modifications of K-Means onto different cluster structures are given: Fuzzy K-Means for finding fuzzy clusters, Expectation-Maximization (EM) for finding probabilistic clusters, and Kohonen self-organizing maps (SOM) that tie the sought clusters to a visually convenient two-dimensional grid. Equivalent reformulations of the K-Means criterion are described; they can yield different algorithms for K-Means. One of these is explained at length: K-Means extends principal component analysis to the case of binary scoring factors, which yields the so-called Anomalous cluster method, a key to an intelligent version of K-Means with automated choice of the number of clusters and their initialization.
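To make the abstract's description concrete, the following is a minimal sketch of generic batch K-Means: it alternates minimum-distance assignment of entities to centroids with recomputation of each centroid as its cluster mean, and returns a partition (as labels) plus the centroids. It is not code from the chapter. The function name `k_means`, its parameters, the Euclidean distance, and the random choice of starting centroids are all illustrative assumptions; the chapter's own initialization and standardization methods are not reproduced here.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Batch K-Means on a list of equal-length numeric tuples.

    Returns (labels, centroids). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Initialization (assumption): pick k distinct entities as starting centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Cluster update: assign each entity to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # partition unchanged, so the centroids would not move either
        labels = new_labels
        # Centroid update: move each centroid to the mean of its cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster became empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    labels, centroids = k_means(data, k=2)
    print(labels)
    print(centroids)
```

Because the result depends on the starting centroids, a run on real data would normally be repeated from several initializations, which is the issue the abstract's remarks on initialization address.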
Copyright Springer-Verlag London Limited 2011
DEWEY 006.31
Discipline Computer Science
EISBN 9780857292872
0857292870
LCCallNum QA76.9.D343 -- M57 2011eb
OCLC 712604117
1193117555
PageCount 61
PublicationDate 2011-02-09
PublicationSubtitle Summarization, Correlation and Visualization
SubjectTerms Artificial intelligence
Cluster Centroid
Company Data
Data Scatter
Discrete mathematics
Fuzzy Cluster
Gravity Center
Maths for computer scientists