[Paper]
Official code base for Dual RL: Unification and New Methods for Reinforcement and Imitation Learning by Harshit Sikchi, Qinqing Zheng, Amy Zhang, and Scott Niekum.
This repository contains the code for the Dual V-Learning (DVL) framework for reinforcement learning proposed in our paper.
Please refer to the instructions inside the offline folder to install and run the code.
✅ Fixes the instability of Extreme Q Learning (XQL)
✅ Directly models V* in continuous action spaces
✅ Implicit: no OOD sampling or actor-critic formulation
✅ Conservative with respect to the induced behavior policy distribution
✅ Improves performance on the D4RL benchmark versus similar approaches
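As background for the first point in the list above, here is a minimal sketch of why XQL's objective can be unstable. XQL fits V with a Gumbel regression loss of the form exp(z) - z - 1, where z = (q - v) / beta, and the exponential term grows very quickly for large positive residuals. This is not code from this repository and it does not show the DVL objective (see the paper and the offline folder for that). The function names and the values of `beta` and the residuals are assumptions chosen purely for illustration.

```python
import math


def gumbel_loss(q, v, beta=1.0):
    """Per-sample Gumbel regression loss used by XQL: exp(z) - z - 1, z = (q - v) / beta.

    Hypothetical helper for illustration; not part of this repository.
    """
    z = (q - v) / beta
    return math.exp(z) - z - 1.0


def gumbel_loss_grad_v(q, v, beta=1.0):
    """Derivative of the per-sample loss with respect to v: (1 - exp(z)) / beta."""
    z = (q - v) / beta
    return (1.0 - math.exp(z)) / beta


# Illustrative residuals q - v; the loss and its gradient scale like exp(residual / beta).
for residual in (0.0, 1.0, 5.0, 10.0):
    loss = gumbel_loss(residual, 0.0)
    grad = gumbel_loss_grad_v(residual, 0.0)
    print(f"residual={residual:5.1f}  loss={loss:12.3f}  dloss/dv={grad:12.3f}")
```

A single sample with a large positive residual can dominate a batch gradient under this loss, which is one common explanation for the training instability that motivates a bounded or polynomially growing alternative.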
@misc{sikchi2023dual,
title={Dual RL: Unification and New Methods for Reinforcement and Imitation Learning},
author={Harshit Sikchi and Qinqing Zheng and Amy Zhang and Scott Niekum},
year={2023},
eprint={2302.08560},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Please feel free to email us if you have any questions.
Harshit Sikchi ([email protected])
This repository builds heavily on the XQL (https://github.com/Div99/xql) and IQL (https://github.com/ikostrikov/implicit_q_learning) codebases. Please cite them as well if you use this code.