Accelerated sparse Kernel Spectral Clustering for large scale data clustering problems
Published in | arXiv.org |
---|---|
Main Authors | Novak, Mihaly; Langone, Rocco; Alzate, Carlos; Suykens, Johan |
Format | Paper |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 20.10.2023 |
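Subjects | Algorithms; Clustering; Eigenvalues; Image segmentation; Kernels; Sparsity; Support vector machines; Synthetic data |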
Abstract | This brief presents an improved version of the sparse multiway kernel spectral clustering (KSC) algorithm. The original algorithm is derived from weighted kernel principal component analysis (KPCA) formulated within the primal-dual least-squares support vector machine (LS-SVM) framework. Sparsity is then achieved by combining an incomplete Cholesky decomposition (ICD) based low-rank approximation of the kernel matrix with the so-called reduced set method. The original ICD-based sparse KSC algorithm was reported to be computationally far too demanding, especially on the large-scale clustering problems it was designed for, which has so far prevented it from gaining more than theoretical relevance. The modifications reported in this brief change this by drastically improving the computational characteristics. Solving an alternative, symmetrized version of the computationally most demanding core eigenvalue problem eliminates the need to form large matrices and compute their SVD during model construction. As a result, clustering problems that were reported to require hours are now solved within seconds, without altering the results. Sparsity is also improved significantly, leading to a more compact model representation that further increases both computational efficiency and descriptive power. Together, these changes make the original, previously only theoretically relevant ICD-based sparse KSC algorithm applicable to large-scale practical clustering problems. The theoretical results and improvements are demonstrated by computational experiments on carefully selected synthetic data as well as on real-life problems such as image segmentation. |
---|---|
Author | Novak, Mihaly; Suykens, Johan; Langone, Rocco; Alzate, Carlos |
Author_xml | – sequence: 1 givenname: Mihaly surname: Novak fullname: Novak, Mihaly – sequence: 2 givenname: Rocco surname: Langone fullname: Langone, Rocco – sequence: 3 givenname: Carlos surname: Alzate fullname: Alzate, Carlos – sequence: 4 givenname: Johan surname: Suykens fullname: Suykens, Johan |
ContentType | Paper |
Copyright | 2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
Copyright_xml | – notice: 2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DBID | 8FE 8FG ABJCF ABUWG AFKRA AZQEC BENPR BGLVJ CCPQU DWQXO HCIFZ L6V M7S PIMPY PQEST PQQKQ PQUKI PRINS PTHSS |
DatabaseName | ProQuest SciTech Collection ProQuest Technology Collection Materials Science & Engineering Collection ProQuest Central (Alumni) ProQuest Central ProQuest Central Essentials ProQuest Central Technology Collection ProQuest One Community College ProQuest Central Korea SciTech Premium Collection ProQuest Engineering Collection Engineering Database Publicly Available Content Database ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Academic ProQuest One Academic UKI Edition ProQuest Central China Engineering Collection |
DatabaseTitle | Publicly Available Content Database Engineering Database Technology Collection ProQuest Central Essentials ProQuest One Academic Eastern Edition ProQuest Central (Alumni Edition) SciTech Premium Collection ProQuest One Community College ProQuest Technology Collection ProQuest SciTech Collection ProQuest Central China ProQuest Central ProQuest Engineering Collection ProQuest One Academic UKI Edition ProQuest Central Korea Materials Science & Engineering Collection ProQuest One Academic Engineering Collection |
DatabaseTitleList | Publicly Available Content Database |
Database_xml | – sequence: 1 dbid: 8FG name: ProQuest Technology Collection url: https://search.proquest.com/technologycollection1 sourceTypes: Aggregation Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Physics |
EISSN | 2331-8422 |
Genre | Working Paper/Pre-Print |
GroupedDBID | 8FE 8FG ABJCF ABUWG AFKRA ALMA_UNASSIGNED_HOLDINGS AZQEC BENPR BGLVJ CCPQU DWQXO FRJ HCIFZ L6V M7S M~E PIMPY PQEST PQQKQ PQUKI PRINS PTHSS |
ID | FETCH-proquest_journals_28805844483 |
IEDL.DBID | BENPR |
IngestDate | Thu Oct 10 16:01:21 EDT 2024 |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-proquest_journals_28805844483 |
OpenAccessLink | https://www.proquest.com/docview/2880584448?pq-origsite=%requestingapplication% |
PQID | 2880584448 |
PQPubID | 2050157 |
ParticipantIDs | proquest_journals_2880584448 |
PublicationCentury | 2000 |
PublicationDate | 20231020 |
PublicationDateYYYYMMDD | 2023-10-20 |
PublicationDate_xml | – month: 10 year: 2023 text: 20231020 day: 20 |
PublicationDecade | 2020 |
PublicationPlace | Ithaca |
PublicationPlace_xml | – name: Ithaca |
PublicationTitle | arXiv.org |
PublicationYear | 2023 |
Publisher | Cornell University Library, arXiv.org |
Publisher_xml | – name: Cornell University Library, arXiv.org |
SSID | ssj0002672553 |
Score | 3.4967403 |
SecondaryResourceType | preprint |
Snippet | An improved version of the sparse multiway kernel spectral clustering (KSC) is presented in this brief. The original algorithm is derived from weighted kernel... |
SourceID | proquest |
SourceType | Aggregation Database |
SubjectTerms | Algorithms; Clustering; Eigenvalues; Image segmentation; Kernels; Sparsity; Support vector machines; Synthetic data |
Title | Accelerated sparse Kernel Spectral Clustering for large scale data clustering problems |
URI | https://www.proquest.com/docview/2880584448 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | ProQuest |
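
The abstract refers to an incomplete Cholesky decomposition (ICD) based low-rank approximation of the kernel matrix and to a symmetrized eigenvalue problem that avoids forming large matrices. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes an RBF kernel, and the function names (`incomplete_cholesky`, `top_eigenpairs_from_factor`), the stopping rule, and the test data are all hypothetical. It does not reproduce the weighted KSC eigenvalue problem from the paper; it only illustrates that once K ≈ G Gᵀ with a tall, thin G, the leading eigenpairs of the N × N matrix can be obtained from the small symmetric r × r matrix Gᵀ G.

```python
import numpy as np


def rbf_kernel_column(X, i, sigma):
    """Column i of the RBF kernel matrix, computed on demand (K is never stored)."""
    d2 = np.sum((X - X[i]) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def incomplete_cholesky(X, sigma, tol=1e-8, max_rank=None):
    """Pivoted incomplete Cholesky (illustrative): returns G (N x r) and the
    pivot indices such that K ~= G @ G.T for the RBF kernel matrix K."""
    n = X.shape[0]
    max_rank = n if max_rank is None else min(max_rank, n)
    G = np.zeros((n, max_rank))
    diag = np.ones(n)  # residual diagonal; the RBF kernel has unit diagonal
    pivots = []
    for j in range(max_rank):
        p = int(np.argmax(diag))  # greedy pivot: largest residual diagonal entry
        if diag[p] <= tol:  # assumed stopping rule: residual is small everywhere
            G = G[:, :j]
            break
        pivots.append(p)
        col = rbf_kernel_column(X, p, sigma)
        G[:, j] = (col - G[:, :j] @ G[p, :j]) / np.sqrt(diag[p])
        diag = np.maximum(diag - G[:, j] ** 2, 0.0)
    return G, pivots


def top_eigenpairs_from_factor(G, k):
    """Leading k eigenpairs of G @ G.T, obtained from the small symmetric
    r x r matrix G.T @ G instead of the N x N matrix (k must not exceed rank)."""
    w, V = np.linalg.eigh(G.T @ G)  # eigh returns eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]
    w, V = w[idx], V[:, idx]
    U = (G @ V) / np.sqrt(w)  # unit-norm eigenvectors of G @ G.T
    return w, U


# Small synthetic demonstration: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(3.0, 0.3, (100, 2))])
G, pivots = incomplete_cholesky(X, sigma=1.0)
eigvals, eigvecs = top_eigenpairs_from_factor(G, k=2)
print("rank of the approximation:", G.shape[1], "out of", X.shape[0], "points")
```

The pivot indices identify the data points on which the low-rank factor is built, which loosely relates to the reduced set mentioned in the abstract; how the paper actually selects and uses that set is not specified in this record.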