Abstract
Extracting patient phenotypes from routinely collected health data (such as Electronic Health Records) requires translating clinically sound phenotype definitions into queries or computations executable on the underlying data sources by clinical researchers. This requires significant knowledge and skill to deal with heterogeneous and often imperfect data. Such translations are time-consuming, error-prone and, most importantly, hard to share and reproduce across different settings. This paper proposes a knowledge-driven framework that (1) decouples the specification of phenotype semantics from the underlying data sources and (2) can automatically populate and conduct phenotype computations on heterogeneous data spaces. We report preliminary results of deploying this framework on five Scottish health datasets.
Original language | English |
---|---|
Title of host publication | Digital Personalized Health and Medicine |
Subtitle of host publication | Proceedings of MIE 2020 |
Editors | Louise B. Pape-Haugaard, Christian Lovis, Inge Cort Madsen, Patrick Weber, Per Hostrup Nielsen, Philip Scott |
Place of Publication | Amsterdam |
Publisher | IOS Press |
Pages | 1327-1328 |
Number of pages | 2 |
ISBN (Print) | 9781643680828 |
Publication status | Published - 16 Jun 2020 |
Event | 30th Medical Informatics Europe Conference, MIE 2020 - Geneva, Switzerland. Duration: 28 Apr 2020 → 1 May 2020 |
Publication series
Name | Studies in Health Technology and Informatics |
---|---|
Volume | 270 |
ISSN (Print) | 0926-9630 |
ISSN (Electronic) | 1879-8365 |
Conference
Conference | 30th Medical Informatics Europe Conference, MIE 2020 |
---|---|
Country/Territory | Switzerland |
City | Geneva |
Period | 28/04/20 → 1/05/20 |
Funding
Working Group of Graph-Based Data Federation for Healthcare Data Science (Sprint Exemplar Project funded by Health Data Research, United Kingdom). This study was supported by Health Data Research UK (https://www.hdruk.ac.uk/projects/graph-based-data-federation-for-healthcare-data-science/) and the Medical Research Council [grant number MC PC 18029] as an exemplar to create a federation of distributed health data in Scotland. The framework described above has been deployed on 5 synthetic data sets generated using BadMedicine [3], which represents data/schema characteristics learnt from real data. Due to space limitations, the full benchmark and evaluation details are available on a GitHub page: https://github.com/Honghan/KGPhenotyping/tree/master/evaluation.
Keywords
- data integration
- health data
- ontology
- phenotype computation