GRASCCO — The First Publicly Shareable, Multiply-Alienated German Clinical Text Corpus

Abstract:

We describe the creation of GRASCCO, a novel German-language corpus composed of some 60 clinical documents with more than 43,000 tokens. GRASCCO is a synthetic corpus resulting from a series of alienation steps to obfuscate privacy-sensitive information contained in real clinical documents, the true origin of all GRASCCO texts. Therefore, it is publicly shareable without any legal restrictions. We also explore whether this corpus still represents common clinical language use by comparison with a real (non-shareable) clinical corpus we developed as a contribution to the Medical Informatics Initiative in Germany (MII) within the SMITH consortium. We find evidence that such a claim can indeed be made.

PubMed ID: 36073490

DOI: 10.3233/SHTI220805

Projects: SMITH - Smart Medical Information Technology for Healthcare

Publication type: InProceedings

Journal: Studies in Health Technology and Informatics

Book Title: Volume 296: German Medical Data Sciences 2022 – Future Medicine: More Precise, More Integrative, More Sustainable!

Human Diseases: No Human Disease specified

Citation: Modersohn L, Schulz S, Lohr C, Hahn U. GRASCCO - The First Publicly Shareable, Multiply-Alienated German Clinical Text Corpus. Stud Health Technol Inform. 2022 Aug 17;296:66-72. doi: 10.3233/SHTI220805. PMID: 36073490.

Date Published: 17th Aug 2022

URL: https://ebooks.iospress.nl/doi/10.3233/SHTI220805

Registered Mode: manually

Citation
Modersohn, L., Schulz, S., Lohr, C., & Hahn, U. (2022). GRASCCO — The First Publicly Shareable, Multiply-Alienated German Clinical Text Corpus. In Studies in Health Technology and Informatics. IOS Press. https://doi.org/10.3233/shti220805
Activity

Views: 1487

Created: 26th Jan 2023 at 15:46

Last updated: 30th Jan 2023 at 10:39
