How to improve information extraction from German medical records


Bibliographic Details
Published in	Information technology (Munich, Germany) Vol. 59, no. 4, pp. 171–179
Main Authors	Starlinger, Johannes; Kittner, Madeleine; Blankenstein, Oliver; Leser, Ulf
Format	Journal Article
Language	English
Published	De Gruyter Oldenbourg 28.08.2017
Subjects	Applied computing → Document management and text processing → Document preparation → Annotation; Applied computing → Life and medical sciences → Health care information systems; information extraction; Information systems → Information Retrieval → Retrieval tasks and goals → Information extraction; Medical text mining; semantic information integration
ISSN	1611-2776 2196-7032
DOI	10.1515/itit-2016-0027

Abstract	Vast amounts of medical information are still recorded as unstructured text. The knowledge contained in this textual data has great potential to improve routine clinical care, to support clinical research, and to advance the personalization of medicine. To access this knowledge, the underlying data has to be semantically integrated, and information extraction from clinical documents is an essential prerequisite for this. A body of work and a good selection of openly available tools for information extraction and semantic integration in the medical domain exist, yet almost exclusively for English-language documents. For German texts the situation is rather different: research work is sparse, tools are proprietary or unpublished, and hardly any freely available textual resources exist. In this survey, we (1) describe the challenges of information extraction from German medical documents and the hurdles posed to research in this area, (2) specifically address the problems of missing German-language resources and privacy implications, and (3) identify the steps necessary to overcome these hurdles and fuel research in semantic integration of textual clinical data.
Author	Starlinger, Johannes; Blankenstein, Oliver; Leser, Ulf; Kittner, Madeleine
Author_xml	– sequence: 1; givenname: Johannes; surname: Starlinger; fullname: Starlinger, Johannes; email: starling@informatik.hu-berlin.de; organization: 1 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin, Germany
	– sequence: 2; givenname: Madeleine; surname: Kittner; fullname: Kittner, Madeleine; email: kittner@informatik.hu-berlin.de; organization: 1 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin, Germany
	– sequence: 3; givenname: Oliver; surname: Blankenstein; fullname: Blankenstein, Oliver; email: Oliver.Blankenstein@charite.de; organization: 2 Charité Universitätsmedizin Berlin, Pädiatrische Endokrinologie und Diabetologie, Augustenburger Platz 1, 13353 Berlin, Germany
	– sequence: 4; givenname: Ulf; surname: Leser; fullname: Leser, Ulf; email: leser@informatik.hu-berlin.de; organization: 1 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin, Germany
CitedBy_id	crossref_primary_10_1371_journal_pdig_0000086 crossref_primary_10_1007_s10278_019_00303_2 crossref_primary_10_1155_2019_4292987 crossref_primary_10_1093_jamiaopen_ooab025 crossref_primary_10_1038_s41597_023_02128_9
Cites_doi	10.1136/jamia.2009.001560 10.1136/jamia.2010.004119 10.1016/j.ajhg.2008.09.017
ContentType	Journal Article
DBID	AAYXX CITATION
DOI	10.1515/itit-2016-0027
DatabaseName	CrossRef
DatabaseTitle	CrossRef
DatabaseTitleList	CrossRef
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering
EISSN	2196-7032
EndPage	179
ExternalDocumentID	10_1515_itit_2016_0027 10_1515_itit_2016_0027594171
ISSN	1611-2776
IngestDate	Tue Jul 01 00:54:37 EDT 2025 Thu Apr 24 22:58:37 EDT 2025 Sat Sep 06 17:03:56 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	4
Language	English
LinkModel	OpenURL
PageCount	9
ParticipantIDs	crossref_citationtrail_10_1515_itit_2016_0027 crossref_primary_10_1515_itit_2016_0027 walterdegruyter_journals_10_1515_itit_2016_0027594171
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	2017-08-28
PublicationDateYYYYMMDD	2017-08-28
PublicationDate_xml	– month: 08 year: 2017 text: 2017-08-28 day: 28
PublicationDecade	2010
PublicationTitle	Information technology (Munich, Germany)
PublicationYear	2017
Publisher	De Gruyter Oldenbourg
Publisher_xml	– name: De Gruyter Oldenbourg
SSID	ssj0029781
Score	2.0844018
SourceID	crossref walterdegruyter
SourceType	Enrichment Source Index Database Publisher
StartPage	171
Title	How to improve information extraction from German medical records
URI	https://www.degruyter.com/doi/10.1515/itit-2016-0027
Volume	59
hasFullText	1
inHoldings	1
isFullTextHit
isPrint