Convex Non-negative Matrix Factorization in the Wild
Published in | 2009 Ninth IEEE International Conference on Data Mining pp. 523 - 532 |
---|---|
Main Authors | Thurau, C.; Kersting, K.; Bauckhage, C. |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.12.2009 |
ISBN | 9781424452422 1424452422 |
ISSN | 1550-4786 |
DOI | 10.1109/ICDM.2009.55 |
Abstract | Non-negative matrix factorization (NMF) has recently received a lot of attention in data mining, information retrieval, and computer vision. It factorizes a non-negative input matrix V into two non-negative matrix factors, V = WH, such that W describes "clusters" of the data set. When analyzing genotypes, social networks, or images, it can be beneficial to ensure that W contains meaningful "cluster centroids", i.e., to restrict W to be convex combinations of data points. But how can we run this convex NMF in the wild, i.e., given millions of data points? Motivated by the simple observation that each data point is a convex combination of vertices of the data convex hull, we propose to restrict W further to be vertices of the convex hull. The benefits of this convex-hull NMF approach are twofold. First, the expected size of the convex hull of, for example, n random Gaussian points in the plane is Θ(√log n), i.e., the candidate set typically grows much more slowly than the data set. Second, distance-preserving low-dimensional embeddings allow one to compute candidate vertices efficiently. Our extensive experimental evaluation shows that convex-hull NMF compares favorably to convex NMF for large data sets in terms of both speed and reconstruction quality. Moreover, we show that our method can easily be applied to large-scale, real-world data sets, in our case consisting of 1.6 million images and 150 million votes on World of Warcraft® guilds. |
---|---|
Author | Thurau, C.; Kersting, K.; Bauckhage, C. |
Author_xml | 1. Thurau, C. (Fraunhofer IAIS, St. Augustin, Germany); 2. Kersting, K. (Fraunhofer IAIS, St. Augustin, Germany); 3. Bauckhage, C. (Fraunhofer IAIS, St. Augustin, Germany) |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/ICDM.2009.55 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISBN | 0769538959 9780769538952 |
EndPage | 532 |
ExternalDocumentID | 5360278 |
Genre | orig-research |
ISBN | 9781424452422 1424452422 |
ISSN | 1550-4786 |
IngestDate | Wed Aug 27 02:47:04 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
PageCount | 10 |
ParticipantIDs | ieee_primary_5360278 |
PublicationCentury | 2000 |
PublicationDate | 2009-Dec. |
PublicationDateYYYYMMDD | 2009-12-01 |
PublicationDate_xml | – month: 12 year: 2009 text: 2009-Dec. |
PublicationDecade | 2000 |
PublicationTitle | 2009 Ninth IEEE International Conference on Data Mining |
PublicationTitleAbbrev | ICDM |
PublicationYear | 2009 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SourceID | ieee |
SourceType | Publisher |
StartPage | 523 |
SubjectTerms | archetypal analysis; Computer vision; Data analysis; data handling; Data mining; Embedded computing; Image analysis; Image reconstruction; Information retrieval; Large-scale systems; matrix decomposition; non negative matrix factorization; social network analysis; Social network services; Voting |
Title | Convex Non-negative Matrix Factorization in the Wild |
URI | https://ieeexplore.ieee.org/document/5360278 |
hasFullText | 1 |
inHoldings | 1 |
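
The abstract's two observations (the convex hull of a Gaussian point cloud has few vertices, and low-dimensional views of the data can yield candidate vertices cheaply) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names below are hypothetical, and plain coordinate-pair projections are an assumption standing in for the distance-preserving embeddings the abstract mentions. The sketch relies on the fact that a data point whose 2-D projection is a vertex of the projected hull is also a vertex of the full hull, provided no other data point shares that projection.

```python
import random
from itertools import combinations


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_indices_2d(points):
    """Indices of the convex-hull vertices of 2-D points (Andrew's monotone chain)."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    if len(order) < 3:
        return order

    def half(seq):
        chain = []
        for i in seq:
            # Drop points that do not make a strict left turn with the new point.
            while len(chain) >= 2 and cross(points[chain[-2]], points[chain[-1]], points[i]) <= 0:
                chain.pop()
            chain.append(i)
        return chain[:-1]  # the last point starts the other half

    return half(order) + half(reversed(order))


def candidate_vertices(data):
    """Union of 2-D hull vertices over all coordinate-pair projections (an assumption,
    used here in place of the embeddings described in the abstract)."""
    dims = len(data[0])
    found = set()
    for a, b in combinations(range(dims), 2):
        projected = [(row[a], row[b]) for row in data]
        found.update(hull_indices_2d(projected))
    return sorted(found)


def gaussian_hull_size(n, seed=0):
    """Number of hull vertices of n standard Gaussian points in the plane."""
    rng = random.Random(seed)
    cloud = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    return len(hull_indices_2d(cloud))


if __name__ == "__main__":
    # The hull size should grow far more slowly than n.
    for n in (100, 10_000):
        print(n, gaussian_hull_size(n))
```

Restricting the basis W to such candidates shrinks the search space from all data points to a much smaller set, which is the source of the speed benefit the abstract claims. The sketch stops at candidate selection and does not perform the factorization itself.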