Document Clustering Using Semantic Features and Fuzzy Relations
Published in | Journal of Information and Communication Convergence Engineering, Vol. 11, no. 3, pp. 179-184 |
---|---|
Main Authors | Kim, Chul-Won; Park, Sun |
Format | Journal Article |
Language | English |
Published | Korea Institute of Information and Communication Engineering (한국정보통신학회), 30.09.2013 |
Subjects | |
ISSN | 2234-8255 2234-8883 |
DOI | 10.6109/jicce.2013.11.3.179 |
Abstract | Traditional clustering methods are usually based on the bag-of-words (BOW) model, which ignores the semantic relationships among terms in the data set. Ontology or matrix factorization approaches are commonly used to address this problem. However, a major drawback of the ontology approach is that it is usually difficult to find a comprehensive ontology that covers all the concepts mentioned in a collection. This paper proposes a new document clustering method that uses semantic features and fuzzy relations to address the problems of the ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because it uses fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can better represent the inherent structure of a document set by using semantic features based on non-negative matrix factorization (NMF), which is used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods. KCI Citation Count: 0 |
---|---|
Author | Kim, Chul-Won Park, Sun |
Author_xml | – sequence: 1 givenname: Chul-Won surname: Kim fullname: Kim, Chul-Won – sequence: 2 givenname: Sun surname: Park fullname: Park, Sun |
BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART001806028$$DAccess content in National Research Foundation of Korea (NRF) |
Cites_doi | 10.1038/44565 |
ContentType | Journal Article |
DBID | AAYXX CITATION ACYCR |
DOI | 10.6109/jicce.2013.11.3.179 |
DatabaseName | CrossRef Korean Citation Index |
DatabaseTitle | CrossRef |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
EISSN | 2234-8883 |
EndPage | 184 |
ExternalDocumentID | oai_kci_go_kr_ARTI_1221977 10_6109_jicce_2013_11_3_179 |
GroupedDBID | .UV AAYXX ALMA_UNASSIGNED_HOLDINGS CITATION ACYCR M~E |
ID | FETCH-LOGICAL-c1239-3d28a502defcec6b1f39c6903a790acbc9ab7e5cc6b1c4c91b28f5605463bc8f3 |
ISSN | 2234-8255 |
IngestDate | Tue Nov 21 21:31:48 EST 2023 Tue Jul 01 00:57:29 EDT 2025 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
Language | English |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-c1239-3d28a502defcec6b1f39c6903a790acbc9ab7e5cc6b1c4c91b28f5605463bc8f3 |
Notes | G704-SER000003196.2013.11.3.001 |
OpenAccessLink | http://koreascience.or.kr:80/article/JAKO201330251815531.pdf |
PageCount | 6 |
ParticipantIDs | nrf_kci_oai_kci_go_kr_ARTI_1221977 crossref_primary_10_6109_jicce_2013_11_3_179 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2013-09-30 |
PublicationDateYYYYMMDD | 2013-09-30 |
PublicationDate_xml | – month: 09 year: 2013 text: 2013-09-30 day: 30 |
PublicationDecade | 2010 |
PublicationTitle | Journal of Information and Communication Convergence Engineering, 11(3) |
PublicationYear | 2013 |
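Publisher | Korea Institute of Information and Communication Engineering (한국정보통신학회) |
Publisher_xml | – name: Korea Institute of Information and Communication Engineering (한국정보통신학회) |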
References | E1ICAW_2013_v11n3_179_016 |
References_xml | – ident: E1ICAW_2013_v11n3_179_016 doi: 10.1038/44565 |
SSID | ssib053376704 ssib044744615 ssib025702295 ssib012146031 |
Score | 1.8505486 |
SourceID | nrf crossref |
SourceType | Open Website Index Database |
StartPage | 179 |
SubjectTerms | Electronics/Information and Communication Engineering |
Title | Document Clustering Using Semantic Features and Fuzzy Relations |
URI | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART001806028 |
Volume | 11 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
ispartofPNX | Journal of Information and Communication Convergence Engineering, 2013, 11(3), , pp.179-184 |
linkProvider | ISSN International Centre |
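
The abstract describes clustering documents with semantic features obtained by non-negative matrix factorization and with fuzzy relation values between those features and terms. The record does not give the paper's formulas, so the following is only a minimal illustrative sketch and not the authors' method. It assumes a small, made-up term-document matrix, uses the standard multiplicative-update rule for NMF, and treats a column-wise max-normalised feature matrix as a stand-in for the fuzzy relation. All function names (nmf, fuzzy_relation, cluster_docs, label_terms) and the toy data are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def nmf(a, rank, iters=300, seed=0, eps=1e-9):
    """Factor a (terms x docs) into w (terms x rank) and h (rank x docs)
    using multiplicative updates; every entry stays non-negative."""
    rng = random.Random(seed)
    n_rows, n_cols = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


def fuzzy_relation(w, eps=1e-9):
    """Assumed stand-in for the fuzzy relation: scale each semantic-feature
    column of w by its maximum so memberships lie in [0, 1]."""
    col_max = [max(col) for col in zip(*w)]
    return [[v / (m + eps) for v, m in zip(row, col_max)] for row in w]


def cluster_docs(h):
    """Assign each document to the semantic feature with the largest weight."""
    return [max(range(len(h)), key=lambda k: h[k][d]) for d in range(len(h[0]))]


def label_terms(mu, terms, top=2):
    """Pick, per feature, the terms with the highest membership values."""
    labels = []
    for k in range(len(mu[0])):
        ranked = sorted(range(len(terms)), key=lambda t: mu[t][k], reverse=True)
        labels.append([terms[t] for t in ranked[:top]])
    return labels


# Hypothetical toy data: rows are terms, columns are four documents.
terms = ["fuzzy", "cluster", "goal", "match"]
a = [[2, 3, 0, 0],
     [3, 2, 0, 0],
     [0, 0, 2, 3],
     [0, 0, 3, 2]]

w, h = nmf(a, rank=2)
mu = fuzzy_relation(w)
clusters = cluster_docs(h)
labels = label_terms(mu, terms)
print(clusters, labels)
```

In this sketch the columns of w play the role of semantic features, and the normalised memberships in mu are used only to choose label terms. The paper may define the fuzzy relation and the cluster assignment differently.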