How to improve information extraction from German medical records


Bibliographic Details
Published in Information technology (Munich, Germany) Vol. 59; no. 4; pp. 171-179
Main Authors Starlinger, Johannes; Kittner, Madeleine; Blankenstein, Oliver; Leser, Ulf
Format Journal Article
Language English
Published De Gruyter Oldenbourg 28.08.2017
ISSN 1611-2776
EISSN 2196-7032
DOI 10.1515/itit-2016-0027


Abstract Vast amounts of medical information are still recorded as unstructured text. The knowledge contained in this textual data has great potential to improve routine clinical care, to support clinical research, and to advance the personalization of medicine. To access this knowledge, the underlying data has to be semantically integrated, and information extraction from clinical documents is an essential prerequisite for that integration. A substantial body of work and a good selection of openly available tools for information extraction and semantic integration in the medical domain exist, yet almost exclusively for English-language documents. For German texts the situation is rather different: research is sparse, tools are proprietary or unpublished, and hardly any freely available textual resources exist. In this survey, we (1) describe the challenges of information extraction from German medical documents and the hurdles posed to research in this area, (2) specifically address the problems of missing German-language resources and privacy implications, and (3) identify the steps necessary to overcome these hurdles and fuel research on the semantic integration of textual clinical data.
Author Starlinger, Johannes
Blankenstein, Oliver
Leser, Ulf
Kittner, Madeleine
Author_xml – sequence: 1
  givenname: Johannes
  surname: Starlinger
  fullname: Starlinger, Johannes
  email: starling@informatik.hu-berlin.de
  organization: 1 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin Germany
– sequence: 2
  givenname: Madeleine
  surname: Kittner
  fullname: Kittner, Madeleine
  email: kittner@informatik.hu-berlin.de
  organization: 1 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin Germany
– sequence: 3
  givenname: Oliver
  surname: Blankenstein
  fullname: Blankenstein, Oliver
  email: Oliver.Blankenstein@charite.de
  organization: 2 Charité Universitätsmedizin Berlin, Pädiatrische Endokrinologie und Diabetologie, Augustenburger Platz 1, 13353 Berlin Germany
– sequence: 4
  givenname: Ulf
  surname: Leser
  fullname: Leser, Ulf
  email: leser@informatik.hu-berlin.de
  organization: 1 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin Germany
CitedBy_id crossref_primary_10_1371_journal_pdig_0000086
crossref_primary_10_1007_s10278_019_00303_2
crossref_primary_10_1155_2019_4292987
crossref_primary_10_1093_jamiaopen_ooab025
crossref_primary_10_1038_s41597_023_02128_9
Cites_doi 10.1136/jamia.2009.001560
10.1136/jamia.2010.004119
10.1016/j.ajhg.2008.09.017
ContentType Journal Article
DOI 10.1515/itit-2016-0027
Discipline Engineering
EISSN 2196-7032
EndPage 179
ExternalDocumentID 10_1515_itit_2016_0027
10_1515_itit_2016_0027594171
ISSN 1611-2776
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
PageCount 9
PublicationCentury 2000
PublicationDate 2017-08-28
PublicationDateYYYYMMDD 2017-08-28
PublicationDate_xml – month: 08
  year: 2017
  text: 2017-08-28
  day: 28
PublicationDecade 2010
PublicationTitle Information technology (Munich, Germany)
PublicationYear 2017
Publisher De Gruyter Oldenbourg
Publisher_xml – name: De Gruyter Oldenbourg
  publication-title: American Medical Informatics Association
SSID ssj0029781
Score 2.0844018
Snippet Vast amounts of medical information are still recorded as unstructured text. The knowledge contained in this textual data has a great potential to improve...
SourceID crossref
walterdegruyter
SourceType Enrichment Source
Index Database
Publisher
StartPage 171
SubjectTerms Applied computing → Document management and text processing → Document preparation → Annotation
Applied computing → Life and medical sciences → Health care information systems
information extraction
Information systems → Information retrieval → Retrieval tasks and goals → Information extraction
Medical text mining
semantic information integration
Title How to improve information extraction from German medical records
URI https://www.degruyter.com/doi/10.1515/itit-2016-0027
Volume 59