Drexel University

College of Computing and Informatics

Clinical AI

Evaluating data set formats and how they impact the results of different ML approaches
The paper is about machine learning in the clinical domain. Patient data is very hard to obtain because hospitals are reluctant to make it available, and MIMIC is one of the very few data sets that are publicly accessible. MIMIC has significant limitations because it covers only patients in intensive care units. The data set contains information about patients' prescriptions, laboratory results, admissions, diagnoses, etc.
The paper presents ways to shape features from the data set for use in machine learning and shows how these choices produce different results. In addition, it shows how ML approaches such as logistic regression, SVM and XGBoost perform with different data set formats.
The conclusion is that certain data set formats perform better across multiple diseases for logistic regression and SVM, while other formats perform better for XGBoost.
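
To make the idea of a "data set format" concrete, here is a minimal sketch of how the same patient records could be shaped into two different feature formats. The summary above does not say which formats the paper compares, so the binary and count encodings, the `build_features` helper, and the sample records are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical long-format records: (patient_id, source_table, code).
# These are invented sample values, not real MIMIC data.
events = [
    (1, "prescriptions", "aspirin"),
    (1, "prescriptions", "aspirin"),
    (1, "labs", "glucose"),
    (2, "labs", "glucose"),
    (2, "diagnoses", "I10"),
]


def build_features(events, binary):
    """Pivot long-format events into one feature row per patient.

    binary=True  -> 1 if the code ever occurs for the patient, else 0
    binary=False -> number of times the code occurs for the patient
    """
    counts = defaultdict(Counter)
    for patient_id, table, code in events:
        counts[patient_id][f"{table}:{code}"] += 1

    # A fixed, sorted column order so every patient row lines up.
    columns = sorted({name for c in counts.values() for name in c})

    rows = {}
    for patient_id, c in counts.items():
        rows[patient_id] = [
            min(c[name], 1) if binary else c[name] for name in columns
        ]
    return columns, rows


columns, count_rows = build_features(events, binary=False)
_, binary_rows = build_features(events, binary=True)
```

Either matrix could then be passed to a classifier. The two formats differ only in whether repeated events, such as a drug prescribed twice, carry extra weight, which is the kind of choice that may affect linear models and tree-based models differently.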


Team Members


Behind The Scenes