An open source project from Data to AI Lab at MIT.

VBridge

VBridge is an interactive visualization system for explaining healthcare models. This project aims to help users understand machine learning models that use Electronic Health Records (e.g., MIMIC-III) as inputs for health predictions. The system is built upon Cardea and a number of AutoML tools developed under The MLBazaar Project at Data to AI Lab at MIT.

The related paper, "VBridge: Connecting the Dots Between Features and Data to Explain Healthcare Models," has been accepted to TVCG (IEEE VIS 2021) with an Honorable Mention Award (pdf).

Check out our 🎥 Video.

Quickstart

The VBridge project contains three parts: vbridge-core, vbridge-api, and vbridge-vis, where

  • vbridge-core is a machine learning library built upon Cardea that lets users 1) develop machine learning models from Electronic Health Record datasets and 2) generate explanations at different levels (see our paper).
  • vbridge-api is a collection of RESTful APIs built on top of vbridge-core that let users retrieve information (data, models, explanations) from a VBridge instance.
  • vbridge-vis is a React app, an interface that visualizes the information retrieved from vbridge-api.

Install from source

Ensure that Python (>=3.7) (for vbridge-core and vbridge-api) and Node.js (for vbridge-vis) are installed.

To use vbridge-core and vbridge-api, clone the repository and install it from source by running make install:

git clone git@github.com:sibyl-dev/VBridge.git
cd VBridge
make install

To use vbridge-vis, additionally run the following from the repository root (VBridge/):

cd client
npm install

Quickstart

In this short tutorial we will help you get started with VBridge.

Before starting, we first download a sample dataset mimic-iii-demo (13.4MB) by running the following command in the root directory of this project (VBridge/).

wget -r -N -c -np https://physionet.org/files/mimiciii-demo/1.4/ -P data/

You can also directly go to the dataset webpage and download the .zip file. Unzip and move it to VBridge/data/. Ensure that the table files (.csv) exist in data/physionet.org/files/mimiciii-demo/1.4/.
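Before moving on, it can help to confirm that the download landed where the tutorial expects it. The following is a minimal sketch using only the Python standard library. The helper name list_table_files is ours and is not part of VBridge; the directory is the one named in the instructions above, relative to the repository root.

```python
from pathlib import Path

# Directory named in the instructions above, relative to the repository root.
DATA_DIR = Path("data/physionet.org/files/mimiciii-demo/1.4")


def list_table_files(data_dir=DATA_DIR):
    """Return the sorted names of the .csv files found directly in data_dir.

    Hypothetical helper for illustration; it is not part of VBridge.
    """
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        return []
    return sorted(p.name for p in data_dir.glob("*.csv"))


if __name__ == "__main__":
    tables = list_table_files()
    if tables:
        print(f"Found {len(tables)} table files, e.g. {tables[0]}")
    else:
        print(f"No .csv files found in {DATA_DIR}; check the download step.")
```

An empty result usually means the command was run from a directory other than VBridge/ or the archive was unzipped to a different location.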

How to use vbridge-core

A step-by-step example

1. Load Task and Initialization. We first load a predefined task called mimic_48h_in_admission_mortality.

from vbridge.core import VBridge
from vbridge.dataset.mimic_demo.tasks.mortality import mimic_48h_in_admission_mortality_task

task = mimic_48h_in_admission_mortality_task()
vbridge = VBridge(task)

This task aims to predict the patient's mortality risk (i.e., die or survive) during the hospital admission according to the patient's demographics, lab tests, and vital signs in the first 48 hours after being admitted.
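To make the 48-hour observation window concrete, here is a minimal sketch that keeps only the events recorded in the first 48 hours after admission. It illustrates the idea and is not VBridge's implementation: the function name, the record layout, and the timestamps are made up for the example.

```python
from datetime import datetime, timedelta

# Observation window taken from the task description above.
WINDOW = timedelta(hours=48)


def events_in_window(admit_time, events, window=WINDOW):
    """Keep events whose timestamp falls in [admit_time, admit_time + window].

    `events` is a list of (timestamp, value) pairs. Hypothetical helper.
    """
    cutoff = admit_time + window
    return [(t, v) for t, v in events if admit_time <= t <= cutoff]


# Made-up example data: one admission and three readings.
admit = datetime(2020, 1, 1, 8, 0)
readings = [
    (datetime(2020, 1, 1, 9, 0), 82.0),  # 1 hour after admission: kept
    (datetime(2020, 1, 3, 7, 0), 75.0),  # 47 hours after admission: kept
    (datetime(2020, 1, 3, 9, 0), 90.0),  # 49 hours after admission: dropped
]
kept = events_in_window(admit, readings)
print(len(kept))  # 2
```

Restricting the inputs to a fixed window means the model only sees information that would have been available at prediction time.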

2. Load Entity Set. We load the tables and organize them into an Entityset.

vbridge.load_entity_set()

In brief, an Entityset is a collection of dataframes and the relationships between them. Check featuretools for more details.
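As a rough mental model, the idea can be sketched in plain Python without featuretools: a few named tables plus a list of parent-child key relationships. This is not the featuretools API. The class, the table contents, and the values below are illustrative assumptions; only the column names follow MIMIC-III conventions such as HADM_ID.

```python
class ToyEntitySet:
    """A toy stand-in for an entity set: named tables plus relationships."""

    def __init__(self):
        self.tables = {}          # table name -> list of row dicts
        self.relationships = []   # (parent, parent_key, child, child_key)

    def add_table(self, name, rows):
        self.tables[name] = rows

    def add_relationship(self, parent, parent_key, child, child_key):
        self.relationships.append((parent, parent_key, child, child_key))

    def children_of(self, parent, parent_id, child):
        """Return the child rows linked to one parent row."""
        for p, _p_key, c, c_key in self.relationships:
            if p == parent and c == child:
                return [r for r in self.tables[child] if r[c_key] == parent_id]
        raise KeyError(f"no relationship from {parent} to {child}")


# Made-up rows for illustration only.
es = ToyEntitySet()
es.add_table("ADMISSIONS", [{"HADM_ID": 1}, {"HADM_ID": 2}])
es.add_table("CHARTEVENTS", [
    {"HADM_ID": 1, "ITEMID": 220181, "VALUENUM": 70.0},
    {"HADM_ID": 1, "ITEMID": 220181, "VALUENUM": 80.0},
    {"HADM_ID": 2, "ITEMID": 220045, "VALUENUM": 95.0},
])
es.add_relationship("ADMISSIONS", "HADM_ID", "CHARTEVENTS", "HADM_ID")

print(len(es.children_of("ADMISSIONS", 1, "CHARTEVENTS")))  # 2
```

Recording the relationships explicitly is what allows feature generation to walk from an admission to its related records automatically.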

3. Generate Features. Then we use Deep Feature Synthesis to generate features.

feature_matrix, feature_list = vbridge.generate_features()
feature_matrix.head()
        ADMISSION_TYPE         ADMISSION_LOCATION  ...  MEAN(CHARTEVENTS.VALUENUM
                                                             WHERE ITEMID = 220181)
HADM_ID
171878        ELECTIVE  PHYS REFERRAL/NORMAL DELI  ...                          NaN
172454       EMERGENCY       EMERGENCY ROOM ADMIT  ...                    73.046512
167021       EMERGENCY       EMERGENCY ROOM ADMIT  ...                    80.250000
164869       EMERGENCY  CLINIC REFERRAL/PREMATURE  ...                          NaN
158100       EMERGENCY  CLINIC REFERRAL/PREMATURE  ...                    81.916667
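The last column name in the output above, MEAN(CHARTEVENTS.VALUENUM WHERE ITEMID = 220181), reads as an aggregation: for each admission, average VALUENUM over the chart events with that ITEMID. The following sketch computes such a feature by hand in plain Python. It only illustrates the idea and is not how featuretools implements Deep Feature Synthesis; the function name and the rows are made up.

```python
import math
from collections import defaultdict
from statistics import mean


def mean_where(rows, group_key, value_key, where_key, where_value, group_ids):
    """Per-group mean of value_key over rows where where_key == where_value.

    Groups with no matching rows get NaN, mirroring the NaN cells above.
    Hypothetical helper for illustration.
    """
    values = defaultdict(list)
    for row in rows:
        if row[where_key] == where_value and row[value_key] is not None:
            values[row[group_key]].append(row[value_key])
    return {g: mean(values[g]) if values[g] else math.nan for g in group_ids}


# Made-up chart events; they are not taken from the MIMIC-III demo data.
chartevents = [
    {"HADM_ID": 1, "ITEMID": 220181, "VALUENUM": 70.0},
    {"HADM_ID": 1, "ITEMID": 220181, "VALUENUM": 80.0},
    {"HADM_ID": 1, "ITEMID": 220045, "VALUENUM": 95.0},
    {"HADM_ID": 2, "ITEMID": 220045, "VALUENUM": 88.0},
]
feature = mean_where(chartevents, "HADM_ID", "VALUENUM", "ITEMID", 220181, [1, 2])
print(feature[1])              # 75.0
print(math.isnan(feature[2]))  # True
```

An admission without any matching chart events has nothing to average, which is one way a NaN can end up in the feature matrix.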

4. Train Models. We train a sample machine learning model (i.e., xgboost) for the mortality prediction task.

vbridge.train_model()

5. Generate Explanations. Finally, we explain the model predictions. In VBridge, we develop three types of explanations: feature contributions (i.e., SHAP values), what-if analysis, and influential records. We take feature contributions as an example.

shap_values = vbridge.feature_explain(X=feature_matrix, target='mortality')

You can also check notebooks/Getting Started.ipynb for this example.
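To build intuition for what feature contributions mean, the sketch below computes SHAP values for a small linear model, where they have a closed form under the assumption of independent features: the contribution of feature i is w_i * (x_i - mean(x_i)). This is not what vbridge.feature_explain does internally (the tutorial trains an xgboost model); the function name, weights, and data are made up for illustration.

```python
from statistics import mean


def linear_shap(weights, bias, background, x):
    """Feature contributions for a linear model f(x) = bias + sum(w_i * x_i).

    Assuming independent features, the SHAP value of feature i is
    w_i * (x_i - mean of feature i over the background data).
    Returns (base_value, contributions).
    """
    feature_means = [mean(col) for col in zip(*background)]
    base_value = bias + sum(w * m for w, m in zip(weights, feature_means))
    contributions = [w * (xi - m) for w, xi, m in zip(weights, x, feature_means)]
    return base_value, contributions


# Made-up model and data with two features.
weights, bias = [0.5, -2.0], 1.0
background = [[60.0, 1.0], [80.0, 3.0]]  # feature means: 70.0 and 2.0
x = [90.0, 1.0]

base, phi = linear_shap(weights, bias, background, x)
prediction = bias + sum(w * xi for w, xi in zip(weights, x))

print(phi)                          # [10.0, 2.0]
print(base + sum(phi), prediction)  # 44.0 44.0
```

The second print shows the additivity property: the base value plus the contributions equals the model's prediction for that input, which is what makes per-feature contributions readable as an explanation.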

How to use vbridge-api

Start the VBridge server by running

python vbridge/router/app.py

Check http://localhost:7777/apidocs/ in your browser for the RESTful API documentation.
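If you prefer to check from a script that the server is running, a minimal sketch with the Python standard library could look like the following. It only requests the documentation page mentioned above; the helper names are ours, and no specific API endpoints are assumed here.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:7777"  # address used in this README


def docs_url(base_url=BASE_URL):
    """Build the URL of the API documentation page."""
    return base_url.rstrip("/") + "/apidocs/"


def server_is_up(base_url=BASE_URL, timeout=2.0):
    """Return True if the documentation page answers with HTTP 200."""
    try:
        with urllib.request.urlopen(docs_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("VBridge server reachable:", server_is_up())
```

The available routes and their parameters are listed on the documentation page itself.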

How to use vbridge-vis

After starting the VBridge server, open another terminal, go to the VBridge/ folder, and run

cd client
npm start

Then navigate to http://localhost:3000/ in your browser to see vbridge-vis.

Citations

@ARTICLE{cheng2021vbridge,
  author={Cheng, Furui and Liu, Dongyu and Du, Fan and Lin, Yanna and Zytek, Alexandra and Li, Haomin and Qu, Huamin and Veeramachaneni, Kalyan},
  journal={IEEE Transactions on Visualization and Computer Graphics}, 
  title={VBridge: Connecting the Dots Between Features and Data to Explain Healthcare Models}, 
  year={2022},
  volume={28},
  number={1},
  pages={378-388},
  doi={10.1109/TVCG.2021.3114836}
 }