Yuchen Cui*, Qiping Zhang*, Alessandro Allievi, Peter Stone, Scott Niekum, W. Bradley Knox
View paper on arXiv | Project Website
Overview of the EMPATHIC framework:
This repository contains code used to conduct experiments reported in the paper "The EMPATHIC Framework for Task Learning from Implicit Human Feedback" published at CoRL 2020.
If you find this repository useful in your research, please cite the paper:
@inproceedings{cui2020empathic,
  title={The EMPATHIC Framework for Task Learning from Implicit Human Feedback},
  author={Cui, Yuchen and Zhang, Qiping and Allievi, Alessandro and Stone, Peter and Niekum, Scott and Knox, W Bradley},
  booktitle={Conference on Robot Learning},
  year={2020},
  organization={PMLR}
}
To clone this repository along with its submodules, run:
git clone --recursive https://github.com/Pearl-UTexas/EMPATHIC.git
All modules require Python 3.6 or above.
To install all Python dependencies, run:
python -m pip install --upgrade -r requirements.txt
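Since all modules require Python 3.6 or above, you can confirm that your interpreter meets the requirement before installing (a generic check, not part of the repository):

```python
import sys

# Abort early with a clear message if the interpreter is too old.
version = ".".join(map(str, sys.version_info[:3]))
if sys.version_info < (3, 6):
    raise RuntimeError("EMPATHIC requires Python 3.6 or above, found " + version)
print("Python version OK:", version)
```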
Specify the path to your OpenFace installation in start_openface.bash.
Run online learning (a webcam that can see your face is required):
python online_learning.py
(You may need to kill the process manually after it finishes.)
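If online_learning.py does not exit on its own, one option is to launch it through a small wrapper that force-kills it after a time limit. The wrapper below is a generic sketch, not part of the repository, and the 10-minute limit in the usage comment is an arbitrary example:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run a command, force-killing it if it runs past timeout_s seconds."""
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # the session is over; stop the lingering process
        proc.wait()
    return proc.returncode


# Usage (allow the online-learning session at most 10 minutes):
#   run_with_timeout([sys.executable, "online_learning.py"], timeout_s=600)
```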
Download the pre-processed dataset from here, and extract the files into a directory called detected/.
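Each subject's processed data lives in a subdirectory of detected/ named after their subject ID. A small hypothetical helper (illustrative only, not part of the repository) to sanity-check that a subject's data is in place before training:

```python
import os


def has_processed_data(subject_id, data_root="detected"):
    """Return True if <data_root>/<subject_id>/ exists and is non-empty.

    Illustrative helper only; not part of the EMPATHIC repository.
    """
    subject_dir = os.path.join(data_root, subject_id)
    return os.path.isdir(subject_dir) and bool(os.listdir(subject_dir))
```

For example, has_processed_data("WkOsToXr9v") should return True once the dataset has been extracted.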
python train_mlp_net_facs.py WkOsToXr9v
This will generate a model file "WkOsToXr9v_[lowest_test_loss].pkl" of the trained MLP in the directory MLP_facs_reward_models/, for testing on the human subject data with ID "WkOsToXr9v".
Note that to train a model for another subject, the processed data for that subject must already exist in a subdirectory named after their ID under the detected/ folder.
Per-subject models are used for the random search over hyperparameters.
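Since the random search can leave several "subject-ID_test-loss.pkl" files in MLP_facs_reward_models/, a helper like the following can pick the one with the lowest test loss. This is a sketch, not part of the repository, and it assumes the filename suffix after the subject ID is the numeric test loss, as described for the output of train_mlp_net_facs.py:

```python
import glob
import os


def best_model_path(subject_id, model_dir="MLP_facs_reward_models"):
    """Return the saved model with the lowest test loss for one subject.

    Assumes filenames of the form '<subject_id>_<test_loss>.pkl'.
    """
    candidates = glob.glob(os.path.join(model_dir, subject_id + "_*.pkl"))
    if not candidates:
        raise FileNotFoundError(
            "no models for %s in %s" % (subject_id, model_dir))

    def test_loss(path):
        stem = os.path.splitext(os.path.basename(path))[0]
        return float(stem[len(subject_id) + 1:])  # part after '<id>_'

    return min(candidates, key=test_loss)
```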
python train_mlp_net_facsall.py
This will generate a model file "allsubjects_[lowest_test_loss].pkl" of the trained MLP in the directory MLP_facs_reward_models/.
The trained model is used for evaluating data in the holdout set.
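The saved .pkl files are Python pickles; a minimal loading sketch follows, assuming a standard pickle and that any classes referenced inside it (e.g. the repository's MLP class) are importable at load time:

```python
import pickle


def load_reward_model(path):
    """Load a trained reward model saved by the training scripts.

    Unpickling fails unless the classes stored in the file are
    importable, so run this from the repository root.
    """
    with open(path, "rb") as f:
        return pickle.load(f)
```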
python test_facs.py WkOsToXr9v
To play the Robotaxi game, update the environment, replay recorded trajectories, or collect new user-study data, refer to the Robotaxi repository.