An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores

Bibliographic Details
Published in IEEE Access, Vol. 8, pp. 200063-200072
Main Authors Yang, Songchun; Zheng, Xiangwen; Yin, Xiangfei; Mao, Huajian; Zhao, Dongsheng
Format Journal Article
Language English
Published Piscataway IEEE 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Query expansion (QE) has been widely used in electronic medical record (EMR) retrieval for assisted diagnosis and clinical research. However, existing QE algorithms have not achieved satisfactory performance in Chinese EMR retrieval. One noticeable problem is that the expansion term weights and retrieval scores include unreasonable factors because clinical needs are not adequately considered. Here we propose a QE algorithm for Chinese EMR retrieval that improves expansion term weights and retrieval scores. First, the weights of expansion terms are assigned using semantic similarities, category weights, and co-occurrence frequencies between expansion terms and multiple query terms. Then, the retrieval scores calculated from expansion terms are limited to reduce the query drift caused by high-frequency expansion terms. Experimental results show that our method achieves a 33.3% increase in precision at top 10, a 90.4% increase in recall, and a 13.2% increase in MAP compared with four baselines. These results indicate that our improvement scheme makes expansion term weights more accurate and decreases the query drift caused by QE, which substantially improves the performance of Chinese EMR retrieval.
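The two steps named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's formulas, so everything below is an assumption made for illustration: the product form of the weight, the normalization by a maximum co-occurrence count, the `cap_ratio` parameter, and all class and function names are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExpansionCandidate:
    term: str
    similarities: List[float]  # semantic similarity to each query term, in [0, 1]
    cooccurrences: List[int]   # co-occurrence count with each query term
    category_weight: float     # hypothetical weight of the term's clinical category


def expansion_weight(c: ExpansionCandidate, max_cooccurrence: int) -> float:
    """Combine the three signals into one weight (assumed product form)."""
    # Averaging over all query terms means a candidate must relate to the
    # whole query, not just to one of its terms.
    mean_sim = sum(c.similarities) / len(c.similarities)
    mean_cooc = sum(c.cooccurrences) / len(c.cooccurrences)
    cooc_factor = mean_cooc / max_cooccurrence if max_cooccurrence else 0.0
    return c.category_weight * mean_sim * cooc_factor


def combined_score(original_score: float, expansion_score: float,
                   cap_ratio: float = 0.5) -> float:
    """Limit the expansion-term contribution to cap_ratio * original score.

    The cap keeps a high-frequency expansion term from outweighing the
    original query terms, which is one way to reduce query drift.
    """
    return original_score + min(expansion_score, cap_ratio * original_score)


if __name__ == "__main__":
    cand = ExpansionCandidate("example-term", [0.8, 0.6], [10, 30], 0.5)
    print(expansion_weight(cand, max_cooccurrence=40))
    print(combined_score(10.0, 8.0))  # expansion part is capped
    print(combined_score(10.0, 2.0))  # expansion part is below the cap
```

In this sketch the cap is tied to the score of the original query terms, so documents that match only expansion terms cannot outrank documents that match the query itself. Other limiting schemes are equally plausible.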
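For readers unfamiliar with the evaluation measures named in the abstract (precision at top 10, recall, and MAP), the following sketch shows their standard textbook definitions. It is not taken from the paper; the function names and the toy document identifiers are illustrative assumptions.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of all relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP: average precision averaged over all queries."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    ranked = ["d1", "d2", "d3", "d4"]
    relevant = {"d1", "d3", "d5"}
    print(precision_at_k(ranked, relevant, k=2))
    print(recall(ranked, relevant))
    print(average_precision(ranked, relevant))
```

Relevant documents that are never retrieved (here `d5`) still count in the denominators of recall and average precision, which is why both measures reward expansion terms that surface additional relevant records.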
Author Zhao, Dongsheng
Yang, Songchun
Zheng, Xiangwen
Yin, Xiangfei
Mao, Huajian
Author_xml – sequence: 1
  givenname: Songchun
  orcidid: 0000-0002-8424-0372
  surname: Yang
  fullname: Yang, Songchun
  organization: Academy of Military Medical Sciences, Beijing, China
– sequence: 2
  givenname: Xiangwen
  orcidid: 0000-0001-7940-0514
  surname: Zheng
  fullname: Zheng, Xiangwen
  organization: Academy of Military Medical Sciences, Beijing, China
– sequence: 3
  givenname: Xiangfei
  orcidid: 0000-0002-1139-7889
  surname: Yin
  fullname: Yin, Xiangfei
  organization: Academy of Military Medical Sciences, Beijing, China
– sequence: 4
  givenname: Huajian
  orcidid: 0000-0002-5609-6270
  surname: Mao
  fullname: Mao, Huajian
  organization: Academy of Military Medical Sciences, Beijing, China
– sequence: 5
  givenname: Dongsheng
  orcidid: 0000-0003-2616-8891
  surname: Zhao
  fullname: Zhao, Dongsheng
  email: dszhao@bmi.ac.cn
  organization: Academy of Military Medical Sciences, Beijing, China
CODEN IAECCG
CitedBy_id crossref_primary_10_1049_cmu2_12251
crossref_primary_10_1007_s41666_024_00159_4
crossref_primary_10_1007_s11042_024_19463_7
crossref_primary_10_1016_j_measurement_2022_111300
crossref_primary_10_3233_JIFS_241138
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI 10.1109/ACCESS.2020.3033017
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005-present
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998-Present
IEEE Xplore
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
METADEX
Technology Research Database
Materials Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
Materials Research Database
Engineered Materials Abstracts
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
METADEX
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Materials Research Database

Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
– sequence: 2
  dbid: RIE
  name: IEEE Xplore
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2169-3536
EndPage 200072
ExternalDocumentID oai_doaj_org_article_3603858544014416b0ecaa74799f653e
10_1109_ACCESS_2020_3033017
9250729
Genre orig-research
ISSN 2169-3536
IngestDate Thu Sep 05 15:43:48 EDT 2024
Thu Oct 10 19:12:49 EDT 2024
Fri Aug 23 01:13:14 EDT 2024
Wed Jun 26 19:26:34 EDT 2024
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
LinkModel DirectLink
ORCID 0000-0001-7940-0514
0000-0002-1139-7889
0000-0002-5609-6270
0000-0003-2616-8891
0000-0002-8424-0372
OpenAccessLink https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/document/9250729
PQID 2460159029
PQPubID 4845423
PageCount 10
ParticipantIDs proquest_journals_2460159029
ieee_primary_9250729
crossref_primary_10_1109_ACCESS_2020_3033017
doaj_primary_oai_doaj_org_article_3603858544014416b0ecaa74799f653e
PublicationCentury 2000
PublicationDate 20200000
2020-00-00
20200101
2020-01-01
PublicationDateYYYYMMDD 2020-01-01
PublicationDate_xml – year: 2020
  text: 20200000
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SourceID doaj
proquest
crossref
ieee
SourceType Open Website
Aggregation Database
Publisher
StartPage 200063
SubjectTerms Algorithms
BM25
co-occurrence
Drift
Electronic health records
Electronic medical record
Electronic medical records
Medical research
Performance enhancement
Queries
Query expansion
Retrieval
Semantics
Solids
word2Vec
Title An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores
URI https://ieeexplore.ieee.org/document/9250729
https://www.proquest.com/docview/2460159029
https://doaj.org/article/3603858544014416b0ecaa74799f653e
Volume 8