RP15 - Summaries and lay translations of clinical documents with transformers

Clinical documents are often difficult for laypersons to understand, which strains the doctor-patient relationship. Projects such as the Befunddolmetscher (https://www.befunddolmetscher.de) help here, but only explain individual terms. Large language models such as the Generative Pretrained Transformer (GPT) [1] have been highly successful in recent years at translation, summarization, and other tasks. The goal of this dissertation project is to use transformer architectures to create summaries of German medical texts for clinicians on the one hand, and lay-comprehensible translations on the other.

One challenge is data protection, since network-based models cannot be used; another is the correctness of the summaries, which is of central importance. Further problems lie in the linguistic peculiarities of clinical texts (e.g. unusual abbreviations, staccato formulations), which must be overcome [2]. The basis of the work is a large clinical language model currently being developed by members of the research training group for the Essen site. The work also builds on PhD project 2 of the first cohort, in which an ontology/terminology extraction for melanoma of the skin was developed semi-automatically.

In addition to purely text-based summaries, other forms of presentation at the point of care (dashboard) will be evaluated, such as the timelines provided by the Apache cTAKES system. To convincingly demonstrate the correctness of the summaries/translations, an explainability component will additionally be developed and evaluated. A partial result of the project will be a synthetic corpus of pairs of findings and lay translations, or findings and summaries.
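As a minimal illustration of the term-level approach taken by tools like the Befunddolmetscher (which explains individual terms rather than whole documents), a glossary-based substitution baseline can be sketched. The glossary entries and function name below are invented for illustration and are not part of the actual project or the Befunddolmetscher:

```python
import re

# Hypothetical mini-glossary mapping clinical German terms to lay explanations.
# Real systems maintain curated glossaries; these entries are illustrative only.
GLOSSARY = {
    "Fraktur": "Knochenbruch (broken bone)",
    "Hypertonie": "Bluthochdruck (high blood pressure)",
    "benigne": "gutartig (not cancerous)",
}

def annotate_lay_terms(text: str, glossary: dict) -> str:
    """Append a lay explanation in brackets after each known clinical term."""
    # Match whole words only to avoid partial hits inside longer tokens.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, glossary)) + r")\b")
    return pattern.sub(lambda m: f"{m.group(1)} [{glossary[m.group(1)]}]", text)

report = "Die Aufnahme zeigt eine Fraktur; zudem besteht eine Hypertonie."
print(annotate_lay_terms(report, GLOSSARY))
```

Such a term-level baseline makes the limitation of existing tools concrete: it cannot resolve abbreviations, staccato formulations, or context-dependent meanings, which is why the project targets full transformer-based lay translations instead.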

[1] Brown et al. (2020), "Language Models are Few-Shot Learners", arXiv:2005.14165v4

[2] Starlinger, J., Kittner, M., Blankenstein, O. and Leser, U. (2016), “Information Extraction from German Medical Health Records”, it – Information Technology, DOI: 10.1515/itit-2016-0027