K-Means and Related Clustering Methods
Published in | Core Concepts in Data Analysis, pp. 221-281 |
---|---|
Main Author | Mirkin, Boris |
Format | Book Chapter |
Language | English |
Published | United Kingdom: Springer London, Limited; Springer London, 2011 |
Series | Undergraduate Topics in Computer Science |
Subjects | |
ISBN | 0857292862 9780857292865 |
ISSN | 1863-7310 |
DOI | 10.1007/978-0-85729-287-2_6 |
Abstract | K-Means is arguably the most popular data analysis method. It outputs a partition of the entity set into clusters, together with centroids that represent them. The method is intuitive and can usually be presented in just a few pages. This text also covers a number of less popular subjects that are important when using K-Means for real-world data analysis: data standardization, especially at mixed scales; innate tools for interpretation of clusters; analysis of examples of K-Means working and of its failures; and initialization, that is, the choice of the number of clusters and the location of centroids. Versions of K-Means such as incremental K-Means, nature-inspired K-Means, and entity-centroid “medoid” methods are presented. Three modifications of K-Means onto different cluster structures are given: Fuzzy K-Means for finding fuzzy clusters, Expectation-Maximization (EM) for finding probabilistic clusters, and Kohonen self-organizing maps (SOM), which tie the sought clusters to a visually convenient two-dimensional grid. Equivalent reformulations of the K-Means criterion are described; they can yield different algorithms for K-Means. One of these is explained at length: K-Means extends principal component analysis to the case of binary scoring factors, which yields the so-called Anomalous cluster method, a key to an intelligent version of K-Means with automated choice of the number of clusters and their initialization. |
---|---|
Author | Mirkin, Boris |
ContentType | Book Chapter |
Copyright | Springer-Verlag London Limited 2011 |
DEWEY | 006.31 |
Discipline | Computer Science |
EISBN | 9780857292872 0857292870 |
EndPage | 281 |
IsPeerReviewed | false |
IsScholarly | false |
LCCallNum | QA76.9.D343 -- M57 2011eb |
OCLC | 712604117 1193117555 |
PageCount | 61 |
PublicationCentury | 2000 |
PublicationDate | 2011 20110209 |
PublicationDateYYYYMMDD | 2011-01-01 2011-02-09 |
PublicationDecade | 2010 |
PublicationPlace | United Kingdom |
PublicationSeriesTitle | Undergraduate Topics in Computer Science |
PublicationSeriesTitleAlternate | Undergraduate Topics Computer Sci. |
PublicationSubtitle | Summarization, Correlation and Visualization |
PublicationTitle | Core Concepts in Data Analysis |
PublicationYear | 2011 |
Publisher | Springer London, Limited; Springer London |
StartPage | 221 |
SubjectTerms | Artificial intelligence; Cluster Centroid; Company Data; Data Scatter; Discrete mathematics; Fuzzy Cluster; Gravity Center; Maths for computer scientists |
Title | K-Means and Related Clustering Methods |
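
Illustrative note: the abstract describes K-Means as producing a partition of the entity set together with centroids representing the clusters. The following is a minimal sketch of the standard batch (alternating) K-Means procedure in plain Python, not the chapter's own code. The function names `squared_distance`, `mean_point`, and `k_means`, the random choice of initial centroids, and the `max_iter` and `seed` parameters are assumptions made for this illustration; the chapter discusses more careful initialization, which is not reproduced here.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean (gravity center) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Batch K-Means sketch: returns (labels, centroids).

    Assumption for illustration: initial centroids are k entities drawn
    at random; this is only one of many possible initializations.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Cluster update: assign each entity to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # the partition did not change, so the method has converged
        labels = new_labels
        # Centroid update: move each centroid to the mean of its cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return labels, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = k_means(data, k=2)
    print(labels)
    print(centroids)
```

The returned `labels` list is the partition and `centroids` holds one representative point per cluster. In practice the features would typically be standardized first, since, as the abstract notes, standardization matters when scales are mixed.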