A dataset for plain language adaptation of biomedical abstracts

Bibliographic Details
Published in: Scientific Data, Vol. 10, No. 1, pp. 8-11
Main Authors: Attal, Kush; Ondov, Brian; Demner-Fushman, Dina
Format: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 04.01.2023

Summary: Though exponentially growing health-related literature has been made available to a broad audience online, the language of scientific articles can be difficult for the general public to understand. Therefore, adapting this expert-level language into plain language versions is necessary for the public to reliably comprehend the vast health-related literature. Deep Learning algorithms for automatic adaptation are a possible solution; however, gold standard datasets are needed for proper evaluation. Proposed datasets thus far consist of either pairs of comparable professional- and general public-facing documents or pairs of semantically similar sentences mined from such documents. This leads to a trade-off between imperfect alignments and small test sets. To address this issue, we created the Plain Language Adaptation of Biomedical Abstracts dataset. This dataset is the first manually adapted dataset that is both document- and sentence-aligned. The dataset contains 750 adapted abstracts, totaling 7643 sentence pairs. Along with describing the dataset, we benchmark automatic adaptation on the dataset with state-of-the-art Deep Learning approaches, setting baselines for future research.
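
The benchmarking described above compares automatically adapted sentences against the manually adapted references. As an illustration only, the sketch below shows how sentence-aligned source/adaptation pairs might be scored with a standard corpus-level metric such as BLEU (via the sacrebleu library). The file name, the tab-separated layout, the column order, and the placeholder adapter are assumptions for this sketch, not the dataset's published format or the authors' evaluation setup.

    # Hypothetical sketch: scoring automatic adaptations against reference adaptations.
    # Assumes a TSV file with one "source_sentence<TAB>reference_adaptation" pair per
    # line; this layout is an assumption, not the dataset's documented format.
    import csv
    import sacrebleu

    def load_pairs(path):
        """Read (source, reference) sentence pairs from a tab-separated file."""
        sources, references = [], []
        with open(path, encoding="utf-8") as handle:
            for source, reference in csv.reader(handle, delimiter="\t"):
                sources.append(source)
                references.append(reference)
        return sources, references

    def dummy_adapter(sentence):
        """Placeholder for a trained adaptation model (identity baseline here)."""
        return sentence

    if __name__ == "__main__":
        sources, references = load_pairs("plaba_test.tsv")  # assumed file name
        hypotheses = [dummy_adapter(s) for s in sources]
        # Corpus-level BLEU of the system output against the manual adaptations.
        score = sacrebleu.corpus_bleu(hypotheses, [references])
        print(f"BLEU: {score.score:.2f}")

Because the identity baseline simply copies the source sentences, its score gives a floor that any trained adaptation model would be expected to exceed on metrics that reward overlap with the plain-language references.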
ISSN: 2052-4463
DOI: 10.1038/s41597-022-01920-3