Publications

5 Publications matching the given criteria:
Author: Udo Hahn (5)

Abstract

We describe the creation of GRASCCO, a novel German-language corpus composed of some 60 clinical documents with more than 43,000 tokens. GRASCCO is a synthetic corpus resulting from a series of alienation steps to obfuscate privacy-sensitive information contained in real clinical documents, the true origin of all GRASCCO texts. Therefore, it is publicly shareable without any legal restrictions. We also explore whether this corpus still represents common clinical language use by comparison with a real (non-shareable) clinical corpus we developed as a contribution to the Medical Informatics Initiative in Germany (MII) within the SMITH consortium. We find evidence that such a claim can indeed be made.

Authors: Luise Modersohn, Stefan Schulz, Christina Lohr, Udo Hahn

Date Published: 1st Aug 2022

Publication Type: InCollection

Abstract

We describe the adaptation of a non-clinical pseudonymization system, originally developed for a German email corpus, for clinical use. This tool replaces previously identified Protected Health Information (PHI) items as carriers of privacy-sensitive information (original names for people, organizations, places, etc.) with semantic type-conformant yet fictitious surrogates. We evaluate the generated substitutes for grammatical correctness and for semantic and medical plausibility, and find particularly low numbers of error instances (less than 1%) on all of these dimensions.

Authors: Christina Lohr, Elisabeth Eder, Udo Hahn

Date Published: 1st May 2021

Publication Type: InCollection

Abstract

Aryl hydrocarbon receptor (AHR) activation by tryptophan (Trp) catabolites enhances tumor malignancy and suppresses anti-tumor immunity. The context specificity of AHR target genes has so far impeded systematic investigation of AHR activity and its upstream enzymes across human cancers. A pan-tissue AHR signature, derived by natural language processing, revealed that across 32 tumor entities, interleukin-4-induced-1 (IL4I1) associates more frequently with AHR activity than IDO1 or TDO2, hitherto recognized as the main Trp-catabolic enzymes. IL4I1 activates the AHR through the generation of indole metabolites and kynurenic acid. It associates with reduced survival in glioma patients, promotes cancer cell motility, and suppresses adaptive immunity, thereby enhancing the progression of chronic lymphocytic leukemia (CLL) in mice. Immune checkpoint blockade (ICB) induces IDO1 and IL4I1. As IDO1 inhibitors do not block IL4I1, IL4I1 may explain the failure of clinical studies combining ICB with IDO1 inhibition. Taken together, IL4I1 blockade opens new avenues for cancer therapy.

Authors: Ahmed Sadik, Luis F Somarribas Patterson, Selcen Öztürk, Soumya R Mohapatra, Verena Panitz, Philipp F Secker, Pauline Pfänder, Stefanie Loth, Heba Salem, Mirja Tamara Prentzell, Bianca Berdel, Murat Iskar, Erik Faessler, Friederike Reuter, Isabelle Kirst, Verena Kalter, Kathrin I Foerster, Evelyn Jäger, Carina Ramallo Guevara, Mansour Sobeh, Thomas Hielscher, Gernot Poschet, Annekathrin Reinhardt, Jessica C Hassel, Marc Zapatka, Udo Hahn, Andreas von Deimling, Carsten Hopf, Rita Schlichting, Beate I Escher, Jürgen Burhenne, Walter E Haefeli, Naveed Ishaque, Alexander Böhme, Sascha Schäuble, Kathrin Thedieck, Saskia Trump, Martina Seiffert, Christiane A Opitz

Date Published: 1st Sep 2020

Publication Type: Journal article

Abstract

OBJECTIVES: We survey recent developments in medical Information Extraction (IE) as reported in the literature from the past three years. Our focus is on the fundamental methodological paradigm shift from standard Machine Learning (ML) techniques to Deep Neural Networks (DNNs). We describe applications of this new paradigm concentrating on two basic IE tasks, named entity recognition and relation extraction, for two selected semantic classes, diseases and drugs (or medications), and relations between them. METHODS: For the time period from 2017 to early 2020, we searched for relevant publications from three major scientific communities: medicine and medical informatics, natural language processing, as well as neural networks and artificial intelligence. RESULTS: In the past decade, the field of Natural Language Processing (NLP) has undergone a profound methodological shift from symbolic to distributed representations based on the paradigm of Deep Learning (DL). Meanwhile, this trend is, although with some delay, also reflected in the medical NLP community. In the reporting period, overwhelming experimental evidence has been gathered, as illustrated in this survey for medical IE, that DL-based approaches outperform non-DL ones by often large margins. Still, small-sized and access-limited corpora create intrinsic problems for data-greedy DL as do special linguistic phenomena of medical sublanguages that have to be overcome by adaptive learning strategies. CONCLUSIONS: The paradigm shift from (feature-engineered) ML to DNNs changes the fundamental methodological rules of the game for medical NLP. This change is by no means restricted to medical IE but should also deeply influence other areas of medical informatics, either NLP- or non-NLP-based.

Authors: Udo Hahn, Michel Oleynik

Date Published: 1st Aug 2020

Publication Type: Journal article

Abstract

We here describe the evolution of annotation guidelines for major clinical named entities, namely Diagnosis, Findings and Symptoms, on a corpus of approximately 1,000 German discharge letters. Due to their intrinsic opaqueness and complexity, clinical annotation tasks require continuous guideline tuning, beginning from the initial definition of crucial entities and the subsequent iterative evolution of guidelines based on empirical evidence. We describe rationales for adaptation, with focus on several metrical criteria and task-centered clinical constraints.

Authors: Christina Lohr, Luise Modersohn, Johannes Hellrich, Tobias Kolditz, Udo Hahn

Date Published: 1st Jun 2020

Publication Type: Journal article

Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig
