An improved data augmentation approach and its application in medical named entity recognition

Bibliographic Details
Published in: BMC medical informatics and decision making, Vol. 24, no. 1, article 221 (13 pages)
Main Authors: Chen, Hongyu; Dan, Li; Lu, Yonghe; Chen, Minghong; Zhang, Jinxia
Format: Journal Article
Language: English
Published: England, BioMed Central Ltd (BMC), 05.08.2024
Abstract: Performing data augmentation in medical named entity recognition (NER) is crucial due to the unique challenges posed by this field. Medical data is characterized by high acquisition costs, specialized terminology, imbalanced distributions, and limited training resources. These factors make achieving high performance in medical NER particularly difficult. Data augmentation methods help to mitigate these issues by generating additional training samples, thus balancing data distribution, enriching the training dataset, and improving model generalization. This paper proposes two data augmentation methods, Contextual Random Replacement based on Word2Vec Augmentation (CRR) and Targeted Entity Random Replacement Augmentation (TER), aimed at addressing the scarcity and imbalance of data in the medical domain. When combined with a deep learning-based Chinese NER model, these methods can significantly enhance performance and recognition accuracy under limited resources. Experimental results demonstrate that both augmentation methods effectively improve the recognition capability of medical named entities. Specifically, the BERT-BiLSTM-CRF model achieved the highest F1 score of 83.587%, representing a 1.49% increase over the baseline model. This validates the importance and effectiveness of data augmentation in medical NER.
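The abstract names the two augmentation methods but does not describe their procedures, so the following Python sketch is only an illustration of the general idea of replacement-based augmentation on BIO-tagged, character-level data. It is not the authors' implementation. Every name in it (`extract_spans`, `replace_entity`, `replace_context`, the `lexicon` and `neighbours` tables, the `DIS` label, and the `rate` parameter) is hypothetical, and a hand-written neighbour table stands in for Word2Vec nearest-neighbour lookups.

```python
import random


def extract_spans(tags):
    """Return (start, end, type) entity spans from BIO tags; end is exclusive."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def replace_entity(chars, tags, lexicon, rng):
    """Entity-level replacement (in the spirit of TER, details assumed):
    swap one randomly chosen entity for another mention of the same type."""
    spans = [s for s in extract_spans(tags) if lexicon.get(s[2])]
    if not spans:
        return list(chars), list(tags)
    start, end, etype = rng.choice(spans)
    mention = rng.choice(lexicon[etype])  # assumed to be a non-empty string
    new_tags = ["B-" + etype] + ["I-" + etype] * (len(mention) - 1)
    return (list(chars[:start]) + list(mention) + list(chars[end:]),
            list(tags[:start]) + new_tags + list(tags[end:]))


def replace_context(chars, tags, neighbours, rng, rate=0.3):
    """Context-level replacement (in the spirit of CRR, details assumed):
    replace non-entity characters with a listed neighbour at probability `rate`.
    `neighbours` is a stand-in for an embedding nearest-neighbour lookup."""
    out = []
    for ch, tag in zip(chars, tags):
        if tag == "O" and ch in neighbours and rng.random() < rate:
            out.append(rng.choice(neighbours[ch]))
        else:
            out.append(ch)
    return out, list(tags)


if __name__ == "__main__":
    rng = random.Random(0)
    chars = list("患者有糖尿病史")
    tags = ["O", "O", "O", "B-DIS", "I-DIS", "I-DIS", "O"]
    lexicon = {"DIS": ["高血压", "冠心病"]}  # illustrative entries only
    neighbours = {"有": ["具"]}               # illustrative entries only
    print(replace_entity(chars, tags, lexicon, rng))
    print(replace_context(chars, tags, neighbours, rng, rate=1.0))
```

Both functions return a new sequence with labels that stay aligned to the characters, which is the property any replacement-based augmentation for sequence labeling has to preserve. In a real pipeline the neighbour table would presumably come from a trained embedding model and the lexicon from the training set's own annotated entities; both choices are assumptions here.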
ArticleNumber 221
Audience Academic
ContentType Journal Article
Copyright 2024. The Author(s).
COPYRIGHT 2024 BioMed Central Ltd.
2024. This work is licensed under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1186/s12911-024-02624-x
Discipline Medicine
EISSN 1472-6947
EndPage 13
ExternalDocumentID oai_doaj_org_article_e6c4591213904221bf92d917fde65c3b
PMC11302003
A803918262
39103849
10_1186_s12911_024_02624_x
Genre Journal Article
GeographicLocations China
GeographicLocations_xml – name: China
GrantInformation_xml – fundername: Guangzhou Science and Technology Planning Project
  grantid: 202002020036
ISSN 1472-6947
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Deep learning
Data augmentation
Text features
Replacement augmentation
Medical named entity recognition
Language English
License 2024. The Author(s).
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
OpenAccessLink http://journals.scholarsportal.info/openUrl.xqy?doi=10.1186/s12911-024-02624-x
PMID 39103849
PQID 3091289954
PQPubID 42572
PageCount 13
PublicationCentury 2000
PublicationDate 2024-08-05
PublicationDateYYYYMMDD 2024-08-05
PublicationDate_xml – month: 08
  year: 2024
  text: 2024-08-05
  day: 05
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
– name: London
PublicationTitle BMC medical informatics and decision making
PublicationTitleAlternate BMC Med Inform Decis Mak
PublicationYear 2024
Publisher BioMed Central Ltd
BioMed Central
BMC
Publisher_xml – name: BioMed Central Ltd
– name: BioMed Central
– name: BMC
StartPage 221
SubjectTerms Algorithms
Analysis
Annotations
Classification
Computational linguistics
Data augmentation
Data mining
Deep Learning
Electronic health records
Evaluation
Humans
Language
Language processing
Machine learning
Medical named entity recognition
Medical records
Methods
Natural language interfaces
Natural Language Processing
Neural networks
Recognition
Replacement augmentation
Semantics
Terminology
Text categorization
Text features
Training
Title An improved data augmentation approach and its application in medical named entity recognition
URI https://www.ncbi.nlm.nih.gov/pubmed/39103849
https://www.proquest.com/docview/3091289954
https://www.proquest.com/docview/3089514462
https://pubmed.ncbi.nlm.nih.gov/PMC11302003
https://doaj.org/article/e6c4591213904221bf92d917fde65c3b
Volume 24