This repository is the official open-source implementation of the paper "TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning", accepted at DAC 2020.
TrojDRL is a method for installing backdoors in deep reinforcement learning agents with discrete action spaces that are trained with Advantage Actor-Critic methods.
- The implementation is based on paac, the Parallel Advantage Actor-Critic method from Efficient Parallel Methods for Deep Reinforcement Learning, which uses TensorFlow 1.13.1 and the Arcade Learning Environment.
- We recommend installing the dependencies using the env.yml file:
  - Install Anaconda.
  - Install the Arcade Learning Environment.
  - Open env.yml from our repository and change the prefix at the end of the file from /home/penny/anaconda/envs/backdoor to the directory where your Anaconda environments are installed.
  - Run:
    $ conda env create -f env.yml
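
The prefix change described above can also be scripted. The following is a minimal sketch, assuming env.yml contains a single top-level line of the form `prefix: <path>`, which is the usual layout of an exported conda environment file. The helper name `set_env_prefix` is hypothetical and is not part of this repository.

```python
def set_env_prefix(yaml_text, new_prefix):
    """Return yaml_text with its top-level `prefix:` line set to new_prefix.

    Hypothetical helper: assumes the file has at most one line that
    starts with `prefix:`. If none is found, one is appended.
    """
    lines = yaml_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("prefix:"):
            lines[i] = "prefix: " + new_prefix
            break
    else:
        # No prefix line found, so add one at the end of the file.
        lines.append("prefix: " + new_prefix)
    return "\n".join(lines) + "\n"


# Example usage (the target path is a placeholder, adjust it to your system):
#
#   with open("env.yml") as f:
#       text = f.read()
#   with open("env.yml", "w") as f:
#       f.write(set_env_prefix(text, "/path/to/anaconda/envs/backdoor"))
```

Editing the file by hand works just as well; the sketch only helps when the setup has to be repeated on several machines.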
- Train:
$ python3 train.py --game=breakout --debugging_folder=data/strong_targeted/breakout/ --poison --color=100 --attack_method=strong_targeted --pixels_to_poison_h=3 --pixels_to_poison_v=3 --target_action=2 --start_position="0,0" --budget=20000 --when_to_poison=uniformly
- Test without attack:
$ python3 test.py --folder=data/strong_targeted/breakout/ --no-poison --index=80000000 --gif_name=breakout
- Test with attack:
$ python3 test.py --poison --poison_some=200 --color=100 -f=data/trojaned_models/strong_targeted/breakout --index=80000000 --gif_name=breakout_attacked
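
To give an intuition for the trigger flags in the train command (--color, --pixels_to_poison_h, --pixels_to_poison_v, --start_position), the sketch below stamps a small constant-valued patch onto a grayscale frame. It is an illustration under assumptions, not the code used in this repository: it assumes that the h and v flags give the horizontal and vertical size of the patch in pixels, that the start position is a (row, column) pair, and that a frame is a 2D grid of intensities. The function name `apply_trigger` is hypothetical.

```python
def apply_trigger(frame, color=100, pixels_h=3, pixels_v=3, start=(0, 0)):
    """Return a copy of `frame` with a pixels_v x pixels_h patch set to `color`.

    Hypothetical illustration: `frame` is a list of rows of grayscale
    intensities, and `start` is the (row, column) of the patch's top-left
    corner. The patch is clipped at the frame border.
    """
    row0, col0 = start
    poisoned = [list(row) for row in frame]  # copy so the input is untouched
    for r in range(row0, min(row0 + pixels_v, len(poisoned))):
        for c in range(col0, min(col0 + pixels_h, len(poisoned[r]))):
            poisoned[r][c] = color
    return poisoned


# Example: a blank 4x5 frame with the default 3x3 trigger in the top-left corner.
blank = [[0] * 5 for _ in range(4)]
for row in apply_trigger(blank):
    print(row)
```

With the defaults, which mirror the values in the train command, the top-left 3x3 block of the frame is set to 100 and every other pixel is left unchanged.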