Annotating German Clinical Documents for De-Identification

Abstract:

We devised annotation guidelines for the de-identification of German clinical documents and assembled a corpus of 1,106 discharge summaries and transfer letters with 44K annotated protected health information (PHI) items. After three iteration rounds, our annotation team finally reached an inter-annotator agreement of 0.96 on the instance level and 0.97 on the token level of annotation (averaged pair-wise F1 score). To establish a baseline for automatic de-identification on our corpus, we trained a recurrent neural network (RNN) and achieved F1 scores greater than 0.9 on most major PHI categories.
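The agreement figures above are averaged pair-wise F1 scores. The following minimal sketch shows one way such a score could be computed at the instance level and at the token level. It assumes exact matching of span boundaries and labels, which the abstract does not specify; the annotator names, label names, and token indices are hypothetical and are not taken from the corpus.

```python
from itertools import combinations

# A span is (start_token, end_token_exclusive, label). Hypothetical format.

def f1(a: set, b: set) -> float:
    """F1 between two annotation sets; symmetric in a and b."""
    if not a and not b:
        return 1.0  # both annotators marked nothing: treat as full agreement
    return 2 * len(a & b) / (len(a) + len(b))

def to_token_items(spans: set) -> set:
    """Expand each span into one (token_index, label) item per covered token."""
    return {(i, label) for start, end, label in spans for i in range(start, end)}

def average_pairwise_f1(annotations: dict, token_level: bool = False) -> float:
    """Mean F1 over all unordered annotator pairs."""
    sets = [to_token_items(s) if token_level else s for s in annotations.values()]
    pairs = list(combinations(sets, 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(f1(a, b) for a, b in pairs) / len(pairs)

# Toy data: three hypothetical annotators labelling the same document.
annotations = {
    "annotator_1": {(0, 2, "NAME"), (5, 6, "DATE")},
    "annotator_2": {(0, 2, "NAME"), (5, 7, "DATE")},
    "annotator_3": {(0, 2, "NAME")},
}

instance_f1 = average_pairwise_f1(annotations)
token_f1 = average_pairwise_f1(annotations, token_level=True)
print(f"instance level: {instance_f1:.3f}, token level: {token_f1:.3f}")
```

In this toy example the token-level score is higher than the instance-level score because a span with a slightly different boundary still earns partial credit token by token, while it counts as a complete miss under exact instance matching.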

PubMed ID: 31437914

Projects: SMITH - Smart Medical Information Technology for Healthcare

Publication type: InProceedings

Journal: Stud Health Technol Inform

Human Diseases: No Human Disease specified

Citation: Stud Health Technol Inform. 2019 Aug 21;264:203-207. doi: 10.3233/SHTI190212.

Date Published: 21st Aug 2019

Registered Mode: by PubMed ID

Authors: T. Kolditz, C. Lohr, J. Hellrich, L. Modersohn, B. Betz, M. Kiehntopf, U. Hahn

Submitter

Christoph Beger

License

No license - no permission to use unless the owner grants a licence

Activity

Created: 7th Sep 2020 at 13:24

Last updated: 30th Jan 2023 at 12:00
