Estimating Conditional Mutual Information for Dynamic Feature Selection [Preprint]
This paper presents DIME (discriminative mutual information estimation), a new approach to dynamic feature selection that estimates the conditional mutual information in a discriminative fashion. The implementation uses PyTorch Lightning.

[Figure: visualization of the network training]
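At a high level, dynamic feature selection with a CMI estimator proceeds greedily: given the features observed so far, a value network scores each unobserved feature by its estimated conditional mutual information with the label, the highest-scoring feature is acquired, and a predictor network produces the final prediction. The sketch below illustrates only this selection loop; `greedy_select`, `value_network`, and `predictor` are illustrative placeholders, not this repository's API.

```python
import torch

@torch.no_grad()
def greedy_select(x, value_network, predictor, budget):
    """Greedy feature acquisition loop (illustrative sketch, not this repo's API).

    x:             (batch, num_features) fully observed inputs; values are only
                   revealed as features are "acquired" through the mask.
    value_network: maps (masked input, mask) -> per-feature CMI estimates.
    predictor:     maps (masked input, mask) -> class logits.
    """
    mask = torch.zeros_like(x)                             # 1 = feature observed
    for _ in range(budget):
        cmi = value_network(x * mask, mask)                # (batch, num_features) CMI estimates
        cmi = cmi.masked_fill(mask.bool(), float("-inf"))  # never re-pick an observed feature
        next_feature = cmi.argmax(dim=1)                   # greedy choice per example
        mask[torch.arange(x.size(0)), next_feature] = 1.0  # acquire that feature
    return predictor(x * mask, mask), mask                 # final prediction and selected mask
```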
After cloning the repo, run `cd DIME` followed by `pip install .` to install the package and related dependencies into the current Python environment.
The `experiments/` directory contains a subdirectory for each of the datasets used. In each subdirectory, the `greedy_cmi_estimation_pl.py` file can be run to jointly train the value network and the predictor network as described in the paper. Each subdirectory also contains a `*.ipynb` Jupyter notebook to evaluate the trained networks under different stopping criteria.
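The notebooks' exact stopping criteria are not reproduced here; as a rough illustration, two common choices are a fixed acquisition budget and a threshold on the largest estimated CMI (stop once no remaining feature is expected to add information). The helper below is a hedged sketch with the same hypothetical interfaces as the selection loop above.

```python
import torch

@torch.no_grad()
def acquire_until_stop(x, value_network, predictor, max_budget=20, cmi_threshold=1e-3):
    """Acquire features until a budget is exhausted or the largest estimated CMI
    falls below a threshold (illustrative sketch, hypothetical interfaces)."""
    mask = torch.zeros_like(x)                             # 1 = feature observed
    for _ in range(max_budget):                            # criterion 1: fixed budget
        cmi = value_network(x * mask, mask)
        cmi = cmi.masked_fill(mask.bool(), float("-inf"))
        best_cmi, next_feature = cmi.max(dim=1)
        if (best_cmi < cmi_threshold).all():               # criterion 2: CMI threshold
            break
        mask[torch.arange(x.size(0)), next_feature] = 1.0
    return predictor(x * mask, mask), mask
```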
Following are the publicly available datasets we used to evaluate DIME:
- MNIST: A standard digit classification dataset, downloaded directly through PyTorch (see the loading snippet after this list).
- ROSMAP: Complementary epidemiological studies designed to inform dementia research. The dataset can be accessed here.
- Imagenette: Subset of the ImageNet image classification dataset with 10 classes. Obtained from Fast.ai.
- Imagenet-100: Subset of the ImageNet image classification dataset with 100 classes. Obtained from Kaggle.
- MHIST: A downsampled histopathology dataset for image classification. It can be obtained here after filling out a Google form.
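For reference, MNIST can be loaded directly through torchvision (part of the PyTorch ecosystem); the snippet below is a generic loading example and is independent of this repository's data pipeline.

```python
from torchvision import datasets, transforms

# Download MNIST into ./data and convert images to tensors.
transform = transforms.ToTensor()
train_set = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
test_set = datasets.MNIST(root="./data", train=False, download=True, transform=transform)
print(len(train_set), len(test_set))  # 60000 training and 10000 test images
```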