Data mining of Clinical Databases – CDSS 1
- Electronic Health Records and Public Databases
- This module introduces MIMIC-III, the largest publicly available Electronic Health Record (EHR) database for benchmarking machine learning algorithms. In particular, you will learn about the design of this relational database and the tools available to query it, extract data and visualise descriptive analytics.
The schema and the International Classification of Diseases coding are important for understanding how to map research questions to data and how to extract key clinical outcomes in order to develop clinically useful machine learning algorithms.
- MIMIC-III as a relational database
- This week covers the basic structure of the MIMIC-III database, with practical exercises on how to extract and visualise summary statistics. We discuss why clinical outcomes are difficult to define, and we examine the clinical variables recorded for a specific patient.
- International Classification of Diseases System
- This week discusses the history of the International Classification of Diseases (ICD) system, which has been developed collaboratively so that the medical terms and information in death certificates can be grouped together for statistical purposes. Practical examples show how to extract ICD-9 codes from the MIMIC-III database and visualise them. Furthermore, we discuss the differences between the ICD-9, ICD-10 and ICD-11 systems.
- Concepts in MIMIC-III and an example of a patient inclusion flowchart
- This week gives an overview of clinical concepts, which are statistical tools that provide illness scores. They were first developed from expert opinion and subsequently extended with data-driven methods. These models are the precursors of machine learning models for precision medicine. Finally, this week's practical exercises provide the opportunity to implement a complex patient inclusion flowchart.
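
To give a flavour of the practical exercises, here is a minimal sketch of a patient inclusion flowchart that counts how many patients remain after each criterion. It is an illustration only, not the course's own solution. The rows are made up and live in an in-memory SQLite database. The table and column names follow the public MIMIC-III schema (`icustays`, `diagnoses_icd`), reduced to a few columns. The inclusion criteria (an ICU stay of at least one day, ICD-9 code 0389 for unspecified septicemia) are assumptions chosen for the example.

```python
import sqlite3

# Toy stand-in for two MIMIC-III tables. All rows below are invented
# for illustration and do not come from the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE icustays (subject_id INT, hadm_id INT, icustay_id INT, los REAL);
CREATE TABLE diagnoses_icd (subject_id INT, hadm_id INT, seq_num INT, icd9_code TEXT);
INSERT INTO icustays VALUES
  (1, 10, 100, 3.2), (2, 20, 200, 0.4), (3, 30, 300, 5.0),
  (3, 31, 301, 2.1), (4, 40, 400, 1.5);
INSERT INTO diagnoses_icd VALUES
  (1, 10, 1, '0389'), (1, 10, 2, '4280'), (2, 20, 1, '0389'),
  (3, 30, 1, '41071'), (3, 31, 1, '0389'), (4, 40, 1, '4280');
""")

# Each step is one box of the flowchart: a label plus a query that counts
# the distinct patients still included. The criteria are hypothetical.
STEPS = [
    ("All patients with an ICU stay",
     "SELECT COUNT(DISTINCT subject_id) FROM icustays"),
    ("ICU stay of at least 1 day",
     "SELECT COUNT(DISTINCT subject_id) FROM icustays WHERE los >= 1"),
    ("ICD-9 code 0389 recorded on the same admission",
     """SELECT COUNT(DISTINCT i.subject_id)
        FROM icustays i
        JOIN diagnoses_icd d ON d.hadm_id = i.hadm_id
        WHERE i.los >= 1 AND d.icd9_code = '0389'"""),
]

flow = [(label, conn.execute(sql).fetchone()[0]) for label, sql in STEPS]

# Print each box together with the number of patients excluded at that step.
previous = None
for label, n in flow:
    excluded = "" if previous is None else f" (excluded {previous - n})"
    print(f"{label}: n = {n}{excluded}")
    previous = n
```

Counting distinct `subject_id` values rather than rows matters here, because one patient can have several admissions and ICU stays. On the real database the same pattern applies, but the criteria and their order should come from the research question.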