Publications

959 Publications visible to you, out of a total of 959

Abstract

Machine learning (ML) models are developed on a learning dataset covering only a small part of the data of interest. If model predictions are accurate for the learning dataset but fail for unseen data then generalization error is considered high. This problem manifests itself within all major sub-fields of ML but is especially relevant in medical applications. Clinical data structures, patient cohorts, and clinical protocols may be highly biased among hospitals such that sampling of representative learning datasets to learn ML models remains a challenge. As ML models exhibit poor predictive performance over data ranges sparsely or not covered by the learning dataset, in this study, we propose a novel method to assess their generalization capability among different hospitals based on the convex hull (CH) overlap between multivariate datasets. To reduce dimensionality effects, we used a two-step approach. First, CH analysis was applied to find mean CH coverage between each of the two datasets, resulting in an upper bound of the prediction range. Second, 4 types of ML models were trained to classify the origin of a dataset (i.e., from which hospital) and to estimate differences in datasets with respect to underlying distributions. To demonstrate the applicability of our method, we used 4 critical-care patient datasets from different hospitals in Germany and USA. We estimated the similarity of these populations and investigated whether ML models developed on one dataset can be reliably applied to another one. We show that the strongest drop in performance was associated with the poor intersection of convex hulls in the corresponding hospitals’ datasets and with a high performance of ML methods for dataset discrimination. Hence, we suggest the application of our pipeline as a first tool to assess the transferability of trained models. 
We emphasize that datasets from different hospitals represent heterogeneous data sources, and the transfer from one database to another should be performed with utmost care to avoid implications during real-world applications of the developed models. Further research is needed to develop methods for the adaptation of ML models to new hospitals. In addition, more work should be aimed at the creation of gold-standard datasets that are large and diverse with data from varied application sites.
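The convex hull coverage idea from this abstract can be illustrated with a small sketch. It is not the authors' pipeline: it assumes only two features per patient (so the hull is a polygon), uses a plain monotone-chain hull, and the names `convex_hull`, `hull_coverage`, and `mean_coverage` are hypothetical. Coverage here means the fraction of one dataset's points that fall inside the hull of the other.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Monotone-chain convex hull in 2D, vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain because it repeats the other chain's start.
    return lower[:-1] + upper[:-1]


def inside_hull(hull: List[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:  # degenerate hull (point or segment): treated as covering nothing
        return False
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def hull_coverage(reference: List[Point], other: List[Point]) -> float:
    """Fraction of points in `other` covered by the convex hull of `reference`."""
    if not other:
        raise ValueError("`other` must contain at least one point")
    hull = convex_hull(reference)
    return sum(inside_hull(hull, p) for p in other) / len(other)


def mean_coverage(a: List[Point], b: List[Point]) -> float:
    """Symmetric mean of the two directed coverages (an assumed definition)."""
    return (hull_coverage(a, b) + hull_coverage(b, a)) / 2


if __name__ == "__main__":
    hospital_a = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]  # made-up toy data
    hospital_b = [(1, 1), (3, 3), (5, 5), (6, 1)]          # made-up toy data
    print(hull_coverage(hospital_a, hospital_b))
    print(mean_coverage(hospital_a, hospital_b))
```

A low coverage value in this toy setting would correspond to the situation the abstract warns about: a model trained on one hospital's data being asked to predict outside the range it has seen. Real clinical data are high-dimensional, which is why the abstract mentions steps to reduce dimensionality effects; this sketch does not address that.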

Authors: Konstantin Sharafutdinov, Jayesh S Bhat, Sebastian Johannes Fritsch, Kateryna Nikulina, Moein E Samadi, Richard Polzin, Hannah Mayer, Gernot Marx, Johannes Bickenbach, Andreas Schuppert

Date Published: 1st Oct 2022

Publication Type: Journal article

Abstract

Background: As the number of drugs a patient takes increases, so does the prevalence of medication risks. These include, for example, drug-drug interactions, which can weaken or intensify both the desired and the undesired effects of individual drugs. Objective: The collaborative project POLAR (POLypharmazie, Arzneimittelwechselwirkungen und Risiken; polypharmacy, drug interactions, and risks) aims to use the methods and processes of the Medical Informatics Initiative (MII) to contribute to the detection of medication risks in patients with polymedication on the basis of "real-world data" (inpatient treatment data from university hospitals). The article presents the specific clinical problems and illustrates them with a concrete analysis example. Materials and methods: Specific pharmacological questions are represented algorithmically and evaluated in distributed analyses at 13 data integration centers. An essential prerequisite for applying these algorithms is the MII core data set structure, which relies on international IT, interoperability, and terminology standards. Results: POLAR has shown for the first time that inpatient treatment data can be made usable across sites, in compliance with data protection requirements, for research questions on drug-related problems on the basis of agreed, interoperable data exchange formats. Conclusions: As an interim status of POLAR, a first preliminary result of an analysis is shown. In addition, more general technical, legal, and communicative opportunities and challenges are presented, with a focus on the use of inpatient treatment data as "real-world data" for research.

Authors: André Scherag, Wahram Andrikyan, Tobias Dreischulte, Pauline Dürr, Martin F Fromm, Jan Gewehr, Ulrich Jaehde, Miriam Kesselmeier, Renke Maas, Petra A Thürmann, Frank Meineke, Daniel Neumann, Julia Palm, Thomas Peschel, Editha Räuscher, Susann Schulze, Torsten Thalheim, Thomas Wendt, Markus Loeffler, D Ammon, W Andrikyan, U Bartz, B Bergh, T Bertsche, O Beyan, S Biergans, H Binder, M Boeker, H Bogatsch, R Böhm, A Böhmer, J Brandes, C Bulin, D Caliskan, I Cascorbi, M Coenen, F Dietz, F Dörje, T Dreischulte, J Drepper, P Dürr, A Dürschmid, F Eckelt, R Eils, A Eisert, C Engel, F Erdfelder, K Farker, M Federbusch, S Franke, N Freier, T Frese, M Fromm, K Fünfgeld, T Ganslandt, J Gewehr, D Grigutsch, W Haefeli, U Hahn, A Härdtlein, R Harnisch, S Härterich, M Hartmann, R Häuslschmid, C Haverkamp, O Heinze, P Horki, M Hug, T Iskra, U Jaehde, S Jäger, P Jürs, C Jüttner, J Kaftan, T Kaiser, K Karsten Dafonte, M Kesselmeier, S Kiefer, S Klasing, O Kohlbacher, D Kraska, S Krause, S Kreutzke, R Krock, K Kuhn, S Lederer, M Lehne, M Löbe, M Loeffler, C Lohr, V Lowitsch, N Lüneburg, M Lüönd, I Lutz, R Maas, U Mansmann, K Marquardt, A Medek, F Meineke, A Merzweiler, A Michel-Backofen, Y Mou, B Mussawy, D Neumann, J Neumann, C Niklas, M Nüchter, K Oswald, J Palm, T Peschel, H Prokosch, J Przybilla, E Räuscher, L Redeker, Y Remane, A Riedel, M Rottenkolber, F Rottmann, F Salman, J Schepers, A Scherag, F Schmidt, S Schmiedl, K Schmitz, G Schneider, A Scholtz, S Schorn, B Schreiweis, S Schulze, A K Schuster, M Schwab, H Seidling, S Semler, K Senft, M Slupina, R Speer, S Stäubert, D Steinbach, C Stelzer, H Stenzhorn, M Strobel, T Thalheim, M Then, P Thürmann, D Tiller, P Tippmann, Y Ucer, S Unger, J Vogel, J Wagner, J Wehrle, D Weichart, L Weisbach, S Welten, T Wendt, R Wettstein, I Wittenberg, R Woltersdorf, M Yahiaoui-Doktor, S Zabka, S Zenker, S Zeynalova, L Zimmermann, D Zöller, für das POLAR-Projekt

Date Published: 1st Sep 2022

Publication Type: Journal article

Abstract

We describe the creation of GRASCCO, a novel German-language corpus composed of some 60 clinical documents with more than 43,000 tokens. GRASCCO is a synthetic corpus resulting from a series of alienation steps to obfuscate privacy-sensitive information contained in real clinical documents, the true origin of all GRASCCO texts. Therefore, it is publicly shareable without any legal restrictions. We also explore whether this corpus still represents common clinical language use by comparison with a real (non-shareable) clinical corpus we developed as a contribution to the Medical Informatics Initiative in Germany (MII) within the SMITH consortium. We find evidence that such a claim can indeed be made.

Editor:

Date Published: 17th Aug 2022

Publication Type: InProceedings

Abstract

BACKGROUND: Clinical trials, epidemiological studies, clinical registries, and other prospective research projects, together with patient care services, are main sources of data in the medical research domain. They often serve as a basis for secondary research in evidence-based medicine and for prediction models of disease and its progression. These data are often neither sufficiently described nor accessible. Related models are often not accessible as a functional program tool for interested users from the health care and biomedical domains. OBJECTIVE: The interdisciplinary project Leipzig Health Atlas (LHA) was developed to close this gap. LHA is an online platform that serves as a sustainable archive providing medical data, metadata, models, and novel phenotypes from clinical trials, epidemiological studies, and other medical research projects. METHODS: Data, models, and phenotypes are described by semantically rich metadata. The platform prefers to share data and models presented in original publications but is also open for nonpublished data. LHA provides and associates unique permanent identifiers for each dataset and model. Hence, the platform can be used to share prepared, quality-assured datasets and models while they are referenced in publications. All managed data, models, and phenotypes in LHA follow the FAIR principles, with public availability or restricted access for specific user groups. RESULTS: The LHA platform is in productive mode (https://www.health-atlas.de/). It is already used by a variety of clinical trial and research groups and is also becoming increasingly popular in the biomedical community. LHA is an integral part of the forthcoming initiative building a national research data infrastructure for health in Germany.

Authors: T. Kirsten, F. A. Meineke, H. Loeffler-Wirth, C. Beger, A. Uciteli, S. Staubert, M. Lobe, R. Hansel, F. G. Rauscher, J. Schuster, T. Peschel, H. Herre, J. Wagner, S. Zachariae, C. Engel, M. Scholz, E. Rahm, H. Binder, M. Loeffler

Date Published: 3rd Aug 2022

Publication Type: Journal article

Abstract

We describe the creation of GRASCCO, a novel German-language corpus composed of some 60 clinical documents with more than 43,000 tokens. GRASCCO is a synthetic corpus resulting from a series of alienation steps to obfuscate privacy-sensitive information contained in real clinical documents, the true origin of all GRASCCO texts. Therefore, it is publicly shareable without any legal restrictions. We also explore whether this corpus still represents common clinical language use by comparison with a real (non-shareable) clinical corpus we developed as a contribution to the Medical Informatics Initiative in Germany (MII) within the SMITH consortium. We find evidence that such a claim can indeed be made.

Authors: Luise Modersohn, Stefan Schulz, Christina Lohr, Udo Hahn

Date Published: 1st Aug 2022

Publication Type: InCollection

Abstract

Numerous prediction models of the SARS-CoV-2 pandemic were proposed in the past. Unknown parameters of these models are often estimated based on observational data. However, lag in case-reporting, changing testing policy or incompleteness of data lead to biased estimates. Moreover, parametrization is time-dependent due to changing age-structures, emerging virus variants, non-pharmaceutical interventions, and vaccination programs. To cover these aspects, we propose a principled approach to parametrize a SIR-type epidemiologic model by embedding it as a hidden layer into an input-output non-linear dynamical system (IO-NLDS). Observable data are coupled to hidden states of the model by appropriate data models considering possible biases of the data. This includes data issues such as known delays or biases in reporting. We estimate model parameters including their time-dependence by a Bayesian knowledge synthesis process considering parameter ranges derived from external studies as prior information. We applied this approach to a specific SIR-type model and data of Germany and Saxony, demonstrating good prediction performance. Our approach can estimate and compare the relative effectiveness of non-pharmaceutical interventions and provide scenarios of the future course of the epidemic under specified conditions. It can be translated to other data sets, i.e., other countries and other SIR-type models.
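For readers unfamiliar with SIR-type models, the following is a minimal sketch of a textbook SIR model with a time-dependent transmission rate. It is not the IO-NLDS embedding or the Bayesian estimation described in the abstract; the function name `simulate_sir`, the explicit Euler integration, and all parameter values are assumptions chosen only for illustration.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float, float]  # (susceptible, infectious, removed)


def simulate_sir(
    beta_of_day: Callable[[int], float],  # transmission rate, may change over time
    gamma: float,                         # removal (recovery) rate per day
    s0: float,
    i0: float,
    r0: float,
    days: int,
    steps_per_day: int = 10,              # explicit Euler sub-steps per day
) -> List[State]:
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    dt = 1.0 / steps_per_day
    trajectory: List[State] = [(s, i, r)]
    for day in range(days):
        beta = beta_of_day(day)
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        trajectory.append((s, i, r))  # one recorded state per simulated day
    return trajectory


if __name__ == "__main__":
    # Illustrative values only: a constant rate versus a rate that drops on
    # day 20, mimicking a hypothetical non-pharmaceutical intervention.
    baseline = simulate_sir(lambda day: 0.3, 0.1, 9990, 10, 0, 120)
    intervention = simulate_sir(
        lambda day: 0.3 if day < 20 else 0.1, 0.1, 9990, 10, 0, 120
    )
    print(max(i for _, i, _ in baseline))
    print(max(i for _, i, _ in intervention))
```

Passing the transmission rate as a function of the day is one simple way to express the time dependence the abstract attributes to interventions, variants, and vaccination; estimating that function from biased observational data is the part the paper itself addresses.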

Authors: Y. Kheifetz, H. Kirsten, M. Scholz

Date Published: 2nd Jul 2022

Publication Type: Journal article

Human Diseases: COVID-19

Abstract

BACKGROUND: The secondary use of deidentified but not anonymized patient data is a promising approach for enabling precision medicine and learning health care systems. In most national jurisdictions (e.g., in Europe), this type of secondary use requires patient consent. While various ethical, legal, and technical analyses have stressed the opportunities and challenges for different types of consent over the past decade, no country has yet established a national consent standard accepted by the relevant authorities. METHODS: A working group of the national Medical Informatics Initiative in Germany conducted a requirements analysis and developed a GDPR-compliant broad consent standard. The development included consensus procedures within the Medical Informatics Initiative, a documented consultation process with all relevant stakeholder groups and authorities, and the ultimate submission for approval via the national data protection authorities. RESULTS: This paper presents the broad consent text together with a guidance document on mandatory safeguards for broad consent implementation. The mandatory safeguards comprise i) independent review of individual research projects, ii) organizational measures to protect patients from involuntary disclosure of protected information, and iii) comprehensive information for patients and public transparency. This paper further describes the key issues discussed with the relevant authorities, especially the position on additional or alternative consent approaches such as dynamic consent. DISCUSSION: Both the resulting broad consent text and the national consensus process are relevant for similar activities internationally. A key challenge of aligning consent documents with the various stakeholders was explaining and justifying the decision to use broad consent and the decision against using alternative models such as dynamic consent. 
Public transparency for all secondary use projects and their results emerged as a key factor in this justification. While currently largely limited to academic medicine in Germany, the first steps for extending this broad consent approach to wider areas of application, including smaller institutions and medical practices, are currently under consideration.

Authors: Sven Zenker, Daniel Strech, Kristina Ihrig, Roland Jahns, Gabriele Müller, Christoph Schickhardt, Georg Schmidt, Ronald Speer, Eva Winkler, Sebastian Graf von Kielmansegg, Johannes Drepper

Date Published: 1st Jul 2022

Publication Type: Journal article

Copyright © 2008 - 2021 The University of Manchester and HITS gGmbH
Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig
