Dependence among terms in vector space model

Bibliographic Details
Published in Proceedings. International Database Engineering and Applications Symposium, 2004. IDEAS '04, pp. 97-102
Main Authors Silva, I.R., Souza, J.N., Santos, K.S.
Format Conference Proceeding
Language English
Published Los Alamitos CA IEEE 2004
IEEE Computer Society
Abstract The vector space model is a mathematically based model that represents terms, documents and queries as vectors and provides a ranking. In this model, the subspace of interest is formed by a set of pairwise orthogonal term vectors, indicating that terms are mutually independent. However, this is a simplification that does not correspond to reality. Given this scenario, we present in this work an extension to the vector space model that takes into account the correlation between terms. In the proposed model, term vectors are rotated in space, geometrically reflecting the semantic dependence among terms. We rotate terms based on a data mining technique called association rules. The retrieval effectiveness of the proposed model is evaluated, and the results show that our model improves average precision relative to the standard vector space model for all collections evaluated, with gains of up to 31%.
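The geometric idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the terms ("car", "auto", "bank"), the rule confidence of 0.8, and the mapping from confidence to rotation angle are all assumptions made only for illustration, since the record does not state how the rotation is computed.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine of the angle between two vectors; 0.0 if either is the zero vector.
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def embed(weights, term_vectors):
    # A document or query is the weighted sum of its term vectors.
    dim = len(next(iter(term_vectors.values())))
    out = [0.0] * dim
    for term, w in weights.items():
        for i, x in enumerate(term_vectors[term]):
            out[i] += w * x
    return out

def rotate_toward(anchor, vec, angle):
    # For orthonormal `anchor` and `vec`, return the unit vector in their
    # plane that makes the given angle (radians) with `anchor`.
    return [math.cos(angle) * a + math.sin(angle) * v for a, v in zip(anchor, vec)]

# Standard model: pairwise orthogonal term vectors (terms are hypothetical).
orthogonal = {
    "car": [1.0, 0.0, 0.0],
    "auto": [0.0, 1.0, 0.0],
    "bank": [0.0, 0.0, 1.0],
}

# Assumed association rule "auto -> car" with a made-up confidence.
confidence = 0.8
# Assumed mapping: confidence 0 keeps 90 degrees, confidence 1 gives 0 degrees.
angle = (1.0 - confidence) * math.pi / 2

rotated = dict(orthogonal)
rotated["auto"] = rotate_toward(orthogonal["car"], orthogonal["auto"], angle)

query = {"car": 1.0}
doc = {"auto": 1.0}

sim_standard = cosine(embed(query, orthogonal), embed(doc, orthogonal))
sim_rotated = cosine(embed(query, rotated), embed(doc, rotated))
print(sim_standard, sim_rotated)
```

With orthogonal term vectors, a query on "car" and a document containing only "auto" have zero similarity. Once the "auto" vector is rotated toward "car", the same pair receives a positive score equal to the cosine of the chosen angle.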
Author Santos, K.S.
Silva, I.R.
Souza, J.N.
Author_xml – sequence: 1
  givenname: I.R.
  surname: Silva
  fullname: Silva, I.R.
  organization: Univ. Fed. de Uberlandia, Brazil
– sequence: 2
  givenname: J.N.
  surname: Souza
  fullname: Souza, J.N.
  organization: Univ. Fed. de Uberlandia, Brazil
– sequence: 3
  givenname: K.S.
  surname: Santos
  fullname: Santos, K.S.
  organization: Univ. Fed. de Uberlandia, Brazil
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18024438 (View record in Pascal Francis)
ContentType Conference Proceeding
Copyright 2006 INIST-CNRS
Copyright_xml – notice: 2006 INIST-CNRS
DOI 10.1109/IDEAS.2004.1319782
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP All) 1998-Present
Pascal-Francis
Discipline Computer Science
Applied Sciences
EndPage 102
ExternalDocumentID 18024438
1319782
Genre orig-research
ISBN 9780769521688
0769521681
ISSN 1098-8068
IsPeerReviewed false
IsScholarly true
Keywords Data analysis
Correlation
Vector space
Database query
Hierarchical classification
Semantics
Association rule
Database
Information extraction
Data mining
Modeling
Language English
License CC BY 4.0
MeetingName International Database Engineering and Applications Symposium (Coimbra, Portugal, July 7-9, 2004)
PageCount 6
PublicationCentury 2000
PublicationDate 20040000
2004
PublicationDateYYYYMMDD 2004-01-01
PublicationDate_xml – year: 2004
  text: 20040000
PublicationDecade 2000
PublicationPlace Los Alamitos CA
PublicationPlace_xml – name: Los Alamitos CA
PublicationTitle Proceedings. International Database Engineering and Applications Symposium, 2004. IDEAS '04
PublicationTitleAbbrev IDEAS
PublicationYear 2004
Publisher IEEE
IEEE Computer Society
Publisher_xml – name: IEEE
– name: IEEE Computer Society
StartPage 97
SubjectTerms Applied sciences
Association rules
Computer science; control theory; systems
Data engineering
Data mining
Data processing. List processing. Character string processing
Exact sciences and technology
Information retrieval
Information systems. Data bases
Mathematical model
Memory organisation. Data processing
Proposals
Set theory
Software
Solid modeling
Thesauri
Title Dependence among terms in vector space model
URI https://ieeexplore.ieee.org/document/1319782