K-Means and Related Clustering Methods

K-Means is arguably the most popular data analysis method. The method outputs a partition of the entity set into clusters and centroids representing them. It is very intuitive and usually requires just a few pages to get presented. This text includes a number of less popular subjects that are import...

Bibliographic Details
Published in	Core Concepts in Data Analysis pp. 221 - 281
Main Author	Mirkin, Boris
Format	Book Chapter
Language	English
Published	United Kingdom: Springer London, Limited, 2011
Series	Undergraduate Topics in Computer Science
Subjects	Artificial intelligence; Cluster Centroid; Company Data; Data Scatter; Discrete mathematics; Fuzzy Cluster; Gravity Center; Maths for computer scientists
ISBN	0857292862 9780857292865
ISSN	1863-7310
DOI	10.1007/978-0-85729-287-2_6

Abstract	K-Means is arguably the most popular data analysis method. The method outputs a partition of the entity set into clusters, together with centroids representing them. It is very intuitive and can usually be presented in just a few pages. This text includes a number of less popular subjects that are important when using K-Means for real-world data analysis: data standardization, especially at mixed scales; innate tools for interpretation of clusters; analysis of examples of K-Means working and of its failures; and initialization, that is, the choice of the number of clusters and the location of centroids. Versions of K-Means such as incremental K-Means, nature-inspired K-Means, and entity-centroid "medoid" methods are presented. Three modifications of K-Means to different cluster structures are given: Fuzzy K-Means for finding fuzzy clusters, Expectation-Maximization (EM) for finding probabilistic clusters, and Kohonen self-organizing maps (SOM), which tie the sought clusters to a visually convenient two-dimensional grid. Equivalent reformulations of the K-Means criterion are described; they can yield different algorithms for K-Means. One of these is explained at length: K-Means extends principal component analysis to the case of binary scoring factors, which yields the so-called Anomalous cluster method, a key to an intelligent version of K-Means with automated choice of the number of clusters and their initialization.
Copyright	Springer-Verlag London Limited 2011
DEWEY	006.31
Discipline	Computer Science
EISBN	9780857292872 0857292870
IsPeerReviewed	false
IsScholarly	false
LCCallNum	QA76.9.D343 -- M57 2011eb
OCLC	712604117 1193117555
PageCount	61
PublicationDateYYYYMMDD	2011-01-01 2011-02-09
PublicationSubtitle	Summarization, Correlation and Visualization