Improving Text Similarity Measurement by Critical Sentence Vector Model

Bibliographic Details
Published in Information Retrieval Technology, pp. 522-527
Main Authors Li, Wei; Wong, Kam-Fai; Yuan, Chunfa; Li, Wenjie; Xia, Yunqing
Format Book Chapter, Conference Proceeding
Language English
Published Berlin, Heidelberg: Springer Berlin Heidelberg, 2005
Springer
Edition 1st ed.
Series Lecture Notes in Computer Science
Abstract We propose the Critical Sentence Vector Model (CSVM), a novel model for measuring text similarity. CSVM accounts for both the structural and the semantic information of a document. In contrast to existing methods based on keyword vectors, e.g. the Vector Space Model (VSM), CSVM measures document similarity by measuring the similarity between critical sentence vectors extracted from the documents. Experiments show that CSVM outperforms VSM in calculating text similarity.
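For orientation, the keyword-vector baseline named in the abstract (VSM) is commonly realized as cosine similarity between term-frequency vectors. The following is a minimal sketch of that baseline only, assuming lowercase whitespace tokenization and raw term counts; the function names `term_vector` and `cosine_similarity` are hypothetical, and this is not the authors' CSVM, whose critical-sentence extraction is not described in this record.

```python
import math
from collections import Counter


def term_vector(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return Counter(text.lower().split())


def cosine_similarity(doc_a, doc_b):
    # Cosine of the angle between the two term-frequency vectors.
    vec_a, vec_b = term_vector(doc_a), term_vector(doc_b)
    # Counter returns 0 for terms missing from vec_b, so only shared terms contribute.
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document shares nothing with any other
    return dot / (norm_a * norm_b)
```

A sentence-level model such as the one the abstract describes would presumably apply a comparison of this kind to vectors built from selected sentences rather than to whole-document keyword vectors, but that step is an assumption here.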
Author_xml – sequence: 1
  givenname: Wei
  surname: Li
  fullname: Li, Wei
  email: wli@se.cuhk.edu.hk
  organization: Department of Systems Engineering, the Chinese University of Hong Kong, Shatin, N.T., Hong Kong
– sequence: 2
  givenname: Kam-Fai
  surname: Wong
  fullname: Wong, Kam-Fai
  email: kfwong@se.cuhk.edu.hk
  organization: Department of Systems Engineering, the Chinese University of Hong Kong, Shatin, N.T., Hong Kong
– sequence: 3
  givenname: Chunfa
  surname: Yuan
  fullname: Yuan, Chunfa
  email: ycf@tsinghua.edu.cn
  organization: State Key Laboratory of Intelligent Technology and System, Tsinghua University, Beijing, China
– sequence: 4
  givenname: Wenjie
  surname: Li
  fullname: Li, Wenjie
  email: cswjli@comp.polyu.edu.hk
  organization: Department of Computing, Hong Kong Polytechnic University, Hung Hom, Hong Kong
– sequence: 5
  givenname: Yunqing
  surname: Xia
  fullname: Xia, Yunqing
  email: yqxia@se.cuhk.edu.hk
  organization: Department of Systems Engineering, the Chinese University of Hong Kong, Shatin, N.T., Hong Kong
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17325814 (View record in Pascal Francis)
ContentType Book Chapter
Conference Proceeding
Copyright Springer-Verlag Berlin Heidelberg 2005
2006 INIST-CNRS
Copyright_xml – notice: Springer-Verlag Berlin Heidelberg 2005
– notice: 2006 INIST-CNRS
DOI 10.1007/11562382_44
DatabaseName Pascal-Francis
Discipline Library & Information Science
Computer Science
Applied Sciences
EISBN 3540320016
9783540320012
EISSN 1611-3349
Edition 1st ed.
Editor Myaeng, Sung Hyon
Lee, Gary Geunbae
Yamada, Akio
Meng, Helen
Editor_xml – sequence: 1
  givenname: Gary Geunbae
  surname: Lee
  fullname: Lee, Gary Geunbae
  email: gblee@postech.ac.kr
– sequence: 2
  givenname: Akio
  surname: Yamada
  fullname: Yamada, Akio
  email: a-yamada@da.jp.nec.com
– sequence: 3
  givenname: Helen
  surname: Meng
  fullname: Meng, Helen
  email: hmmeng@se.cuhk.edu.hk
– sequence: 4
  givenname: Sung Hyon
  surname: Myaeng
  fullname: Myaeng, Sung Hyon
  email: myaeng@icu.ac.kr
EndPage 527
ExternalDocumentID 17325814
ISBN 9783540291862
3540291865
ISSN 0302-9743
IsPeerReviewed true
IsScholarly true
Keywords Vector space
Keyword
Similarity
Semantics
Document structure
Information retrieval
Metric
Text
Vector method
Modeling
Sentence
Language English
License CC BY 4.0
MeetingName Information retrieval technology (Second Asia information retrieval symposium, AIRS 2005, Jeju Island, Korea, October 13-15, 2005, proceedings)
PageCount 6
PublicationDate 2005
PublicationDateYYYYMMDD 2005-01-01
PublicationDate_xml – year: 2005
  text: 2005
PublicationDecade 2000
PublicationPlace Berlin, Heidelberg
PublicationPlace_xml – name: Berlin, Heidelberg
– name: New York, NY
PublicationSeriesTitle Lecture Notes in Computer Science
PublicationSubtitle Second Asia Information Retrieval Symposium, AIRS 2005, Jeju Island, Korea, October 13-15, 2005. Proceedings
PublicationTitle Information Retrieval Technology
PublicationYear 2005
Publisher Springer Berlin Heidelberg
Springer
Publisher_xml – name: Springer Berlin Heidelberg
– name: Springer
RelatedPersons Kleinberg, Jon M.
Mattern, Friedemann
Nierstrasz, Oscar
Tygar, Dough
Steffen, Bernhard
Kittler, Josef
Vardi, Moshe Y.
Weikum, Gerhard
Sudan, Madhu
Naor, Moni
Mitchell, John C.
Terzopoulos, Demetri
Pandu Rangan, C.
Kanade, Takeo
Hutchison, David
RelatedPersons_xml – sequence: 1
  givenname: David
  surname: Hutchison
  fullname: Hutchison, David
  organization: Lancaster University, UK
– sequence: 2
  givenname: Takeo
  surname: Kanade
  fullname: Kanade, Takeo
  organization: Carnegie Mellon University, Pittsburgh, USA
– sequence: 3
  givenname: Josef
  surname: Kittler
  fullname: Kittler, Josef
  organization: University of Surrey, Guildford, UK
– sequence: 4
  givenname: Jon M.
  surname: Kleinberg
  fullname: Kleinberg, Jon M.
  organization: Cornell University, Ithaca, USA
– sequence: 5
  givenname: Friedemann
  surname: Mattern
  fullname: Mattern, Friedemann
  organization: ETH Zurich, Switzerland
– sequence: 6
  givenname: John C.
  surname: Mitchell
  fullname: Mitchell, John C.
  organization: Stanford University, CA, USA
– sequence: 7
  givenname: Moni
  surname: Naor
  fullname: Naor, Moni
  organization: Weizmann Institute of Science, Rehovot, Israel
– sequence: 8
  givenname: Oscar
  surname: Nierstrasz
  fullname: Nierstrasz, Oscar
  organization: University of Bern, Switzerland
– sequence: 9
  givenname: C.
  surname: Pandu Rangan
  fullname: Pandu Rangan, C.
  organization: Indian Institute of Technology, Madras, India
– sequence: 10
  givenname: Bernhard
  surname: Steffen
  fullname: Steffen, Bernhard
  organization: University of Dortmund, Germany
– sequence: 11
  givenname: Madhu
  surname: Sudan
  fullname: Sudan, Madhu
  organization: Massachusetts Institute of Technology, MA, USA
– sequence: 12
  givenname: Demetri
  surname: Terzopoulos
  fullname: Terzopoulos, Demetri
  organization: New York University, NY, USA
– sequence: 13
  givenname: Dough
  surname: Tygar
  fullname: Tygar, Dough
  organization: University of California, Berkeley, USA
– sequence: 14
  givenname: Moshe Y.
  surname: Vardi
  fullname: Vardi, Moshe Y.
  organization: Rice University, Houston, USA
– sequence: 15
  givenname: Gerhard
  surname: Weikum
  fullname: Weikum, Gerhard
  organization: Max-Planck Institute of Computer Science, Saarbruecken, Germany
StartPage 522
SubjectTerms Applied sciences
Combine Method
Common Noun
Computer science; control theory; systems
Exact sciences and technology
Information systems. Data bases
Memory organisation. Data processing
Proper Noun
Semantic Information
Software
Weight Assignment
Title Improving Text Similarity Measurement by Critical Sentence Vector Model
URI http://link.springer.com/10.1007/11562382_44