Document Clustering Using Semantic Features and Fuzzy Relations

Bibliographic Details
Published in: Journal of Information and Communication Convergence Engineering, Vol. 11, no. 3, pp. 179-184
Main Authors: Kim, Chul-Won; Park, Sun
Format: Journal Article
Language: English
Published: 한국정보통신학회 (Korea Institute of Information and Communication Engineering), 30 September 2013
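Subjects: Electronics / Information and Communication Engineering
Online Access: http://koreascience.or.kr:80/article/JAKO201330251815531.pdf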
ISSN: 2234-8255
EISSN: 2234-8883
DOI: 10.6109/jicce.2013.11.3.179

Abstract

Traditional clustering methods are usually based on the bag-of-words (BOW) model. A disadvantage of the BOW model is that it ignores the semantic relationships among terms in the data set. Ontology or matrix factorization approaches are usually used to resolve this problem. However, a major problem of the ontology approach is that it is usually difficult to find a comprehensive ontology that covers all the concepts mentioned in a collection. This paper proposes a new document clustering method that uses semantic features and fuzzy relations to address the problems of the ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because it uses fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can better represent the inherent structure of a document set by using semantic features based on non-negative matrix factorization (NMF), which is used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

KCI Citation Count: 0
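The abstract names non-negative matrix factorization as the source of the semantic features but does not give the paper's update rules or its fuzzy-relation formula. The following is therefore only a minimal sketch of the NMF step, assuming the standard multiplicative updates for the squared-error objective. The toy term-document matrix, the term list, and all function names are hypothetical, and the fuzzy-relation step is deliberately left out because the record does not specify it.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(a, w, h):
    """Squared Frobenius norm of A - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for ra, rb in zip(a, wh) for x, y in zip(ra, rb))


def nmf(a, rank, iters=300, seed=0, eps=1e-9):
    """Factor a non-negative matrix A (terms x docs) as W (terms x rank) @ H (rank x docs).

    Columns of W play the role of semantic features; columns of H weight
    those features for each document. This is an assumed, generic NMF,
    not necessarily the variant used in the paper.
    """
    rng = random.Random(seed)
    n_terms, n_docs = len(a), len(a[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_terms)]
    h = [[rng.random() + eps for _ in range(n_docs)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_docs)]
             for i in range(rank)]
        # W <- W * (A H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_terms)]
    return w, h


def assign_clusters(h):
    """Label each document with the index of its largest semantic-feature weight."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]


# Hypothetical toy data: rows are terms, columns are four documents.
TERMS = ["cluster", "fuzzy", "protein", "gene"]
A = [
    [1.0, 1.0, 0.0, 0.0],  # "cluster" occurs in documents 0 and 1
    [1.0, 1.0, 0.0, 0.0],  # "fuzzy" occurs in documents 0 and 1
    [0.0, 0.0, 1.0, 1.0],  # "protein" occurs in documents 2 and 3
    [0.0, 0.0, 1.0, 1.0],  # "gene" occurs in documents 2 and 3
]

W, H = nmf(A, rank=2)
labels = assign_clusters(H)
```

Assigning each document to its dominant semantic feature is one common way to read clusters off an NMF result. Candidate label terms for a cluster could then be taken from the largest entries in the corresponding column of `W`, though how the paper actually selects them is not stated here.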
Open Access: Yes
Peer Reviewed: Yes
Scholarly: Yes