An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores
Published in | IEEE Access, Vol. 8, pp. 200063-200072 |
---|---|
Main Authors | Yang, Songchun; Zheng, Xiangwen; Yin, Xiangfei; Mao, Huajian; Zhao, Dongsheng |
Format | Journal Article |
Language | English |
Published | Piscataway, IEEE, 2020; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Abstract | Query expansion (QE) has been widely used in electronic medical record (EMR) retrieval for assisted diagnosis and clinical research. However, existing QE algorithms have not achieved satisfactory performance in Chinese EMR retrieval. One noticeable problem is that expansion term weights and retrieval scores are computed in ways that do not adequately account for clinical needs. Here we propose a QE algorithm for Chinese EMR retrieval that improves expansion term weights and retrieval scores. First, expansion terms are weighted using semantic similarities, category weights, and co-occurrence frequencies between the expansion terms and multiple query terms. Then, the retrieval scores contributed by expansion terms are limited to reduce the query drift caused by high-frequency expansion terms. Experimental results show that our method achieves a 33.3% increase in precision at top 10, a 90.4% increase in recall, and a 13.2% increase in MAP compared with four baselines. These results indicate that the improvement scheme yields more accurate expansion term weights and less query drift from QE, which substantially improves the performance of Chinese EMR retrieval. |
---|---|
Author | Zhao, Dongsheng; Yang, Songchun; Zheng, Xiangwen; Yin, Xiangfei; Mao, Huajian |
Author_xml | 1. Yang, Songchun (ORCID: 0000-0002-8424-0372), Academy of Military Medical Sciences, Beijing, China; 2. Zheng, Xiangwen (ORCID: 0000-0001-7940-0514), Academy of Military Medical Sciences, Beijing, China; 3. Yin, Xiangfei (ORCID: 0000-0002-1139-7889), Academy of Military Medical Sciences, Beijing, China; 4. Mao, Huajian (ORCID: 0000-0002-5609-6270), Academy of Military Medical Sciences, Beijing, China; 5. Zhao, Dongsheng (ORCID: 0000-0003-2616-8891, email: dszhao@bmi.ac.cn), Academy of Military Medical Sciences, Beijing, China |
CODEN | IAECCG |
CitedBy_id | crossref_primary_10_1049_cmu2_12251 crossref_primary_10_1007_s41666_024_00159_4 crossref_primary_10_1007_s11042_024_19463_7 crossref_primary_10_1016_j_measurement_2022_111300 crossref_primary_10_3233_JIFS_241138 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
DOI | 10.1109/ACCESS.2020.3033017 |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 200072 |
ExternalDocumentID | oai_doaj_org_article_3603858544014416b0ecaa74799f653e 10_1109_ACCESS_2020_3033017 9250729 |
Genre | orig-research |
ISSN | 2169-3536 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
ORCID | 0000-0001-7940-0514 0000-0002-1139-7889 0000-0002-5609-6270 0000-0003-2616-8891 0000-0002-8424-0372 |
OpenAccessLink | https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/document/9250729 |
PageCount | 10 |
PublicationCentury | 2000 |
PublicationDate | 2020-01-01 |
PublicationDateYYYYMMDD | 2020-01-01 |
PublicationDecade | 2020 |
PublicationPlace | Piscataway |
PublicationTitle | IEEE access |
PublicationTitleAbbrev | Access |
PublicationYear | 2020 |
Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 200063 |
SubjectTerms | Algorithms; BM25; co-occurrence; Drift; Electronic health records; Electronic medical record; Electronic medical records; Medical research; Performance enhancement; Queries; Query expansion; Retrieval; Semantics; Solids; word2Vec |
Title | An Algorithm of Query Expansion for Chinese EMR Retrieval by Improving Expansion Term Weights and Retrieval Scores |
URI | https://ieeexplore.ieee.org/document/9250729 https://www.proquest.com/docview/2460159029 https://doaj.org/article/3603858544014416b0ecaa74799f653e |
Volume | 8 |
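
The abstract describes two steps: weighting each expansion term by its semantic similarity, category weight, and co-occurrence frequency with multiple query terms, and then limiting the retrieval score contributed by expansion terms. The Python sketch below illustrates the general idea only. This record does not give the paper's formulas, so the way the three factors are combined, the saturating co-occurrence normalization, the `cap_ratio` parameter, and all names (`Candidate`, `expansion_weight`, `capped_score`) are assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """A candidate expansion term (all fields are illustrative assumptions)."""
    term: str
    category_weight: float        # hypothetical weight of the term's clinical category
    similarity: Dict[str, float]  # semantic similarity to each query term, in [0, 1]
    cooccurrence: Dict[str, int]  # co-occurrence count with each query term


def expansion_weight(cand: Candidate, query_terms: List[str]) -> float:
    """Combine similarity, co-occurrence and category weight over ALL query terms.

    Assumed combination: for each query term, similarity is scaled by a
    saturating co-occurrence factor co / (1 + co); the per-term products are
    averaged and multiplied by the category weight.
    """
    if not query_terms:
        return 0.0
    total = 0.0
    for q in query_terms:
        sim = cand.similarity.get(q, 0.0)
        co = cand.cooccurrence.get(q, 0)
        total += sim * (co / (1.0 + co))
    return cand.category_weight * total / len(query_terms)


def capped_score(original_score: float,
                 expansion_scores: Dict[str, float],
                 weights: Dict[str, float],
                 cap_ratio: float = 0.5) -> float:
    """Add the weighted expansion-term scores, limited to a share of the original score.

    Capping the expansion contribution (here at cap_ratio * original_score, an
    assumed rule) keeps a high-frequency expansion term from dominating the
    ranking, which is one simple way to reduce query drift.
    """
    expansion_total = sum(weights.get(t, 0.0) * s for t, s in expansion_scores.items())
    return original_score + min(expansion_total, cap_ratio * original_score)


if __name__ == "__main__":
    query = ["chest pain", "ECG"]
    cand = Candidate(
        term="myocardial infarction",
        category_weight=1.0,
        similarity={"chest pain": 0.8, "ECG": 0.6},
        cooccurrence={"chest pain": 3, "ECG": 1},
    )
    w = expansion_weight(cand, query)
    # The per-document scores (10.0 and 20.0) stand in for scores from a
    # ranking function such as BM25, which appears in the subject terms.
    print(w, capped_score(10.0, {cand.term: 20.0}, {cand.term: w}))
```

Averaging over every query term, rather than taking the best single match, reflects the abstract's emphasis on "multiple query terms": a candidate related to only one query term receives a lower weight than one related to all of them.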