BU-DEPEND-Lab/TrojDRL

TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning

This repository is the official open-source implementation of the paper "TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning", accepted at DAC 2020.

TrojDRL is a method for installing backdoors in deep reinforcement learning agents that have discrete action spaces and are trained with advantage actor-critic methods.

Installation

Run

  • train: $ python3 train.py --game=breakout --debugging_folder=data/strong_targeted/breakout/ --poison --color=100 --attack_method=strong_targeted --pixels_to_poison_h=3 --pixels_to_poison_v=3 --target_action=2 --start_position="0,0" --budget=20000 --when_to_poison=uniformly

  • test without attack: $ python3 test.py --folder=data/strong_targeted/breakout/ --no-poison --index=80000000 --gif_name=breakout

  • test with attack: $ python3 test.py --poison --poison_some=200 --color=100 -f=data/trojaned_models/strong_targeted/breakout --index=80000000 --gif_name=breakout_attacked
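
The trigger flags in the commands above (--color, --pixels_to_poison_h, --pixels_to_poison_v, --start_position) suggest a small solid patch stamped onto each poisoned observation. The following is a minimal sketch of that idea, not the repository's code: the helper name apply_trigger, the 84x84 grayscale frame shape, and the mapping of the flags to rows and columns are all assumptions made for illustration.

```python
import numpy as np


def apply_trigger(frame, color=100, pixels_h=3, pixels_v=3, start=(0, 0)):
    """Return a copy of `frame` with a solid patch of `color` stamped on it.

    Hypothetical helper. Assumes pixels_h is the patch width in columns,
    pixels_v is the patch height in rows, and `start` is the (row, column)
    of the patch's top-left corner.
    """
    row, col = start
    poisoned = frame.copy()  # leave the clean observation untouched
    poisoned[row:row + pixels_v, col:col + pixels_h] = color
    return poisoned


# Assumed observation: a single 84x84 grayscale frame (all black here).
frame = np.zeros((84, 84), dtype=np.uint8)
poisoned = apply_trigger(frame)

# Show the top-left corner, where the 3x3 patch was written.
print(poisoned[:4, :4])
```

With the values used in the training command (color 100, a 3x3 patch, start position 0,0), the patch covers the top-left corner of the frame, which matches the gray square described under Results.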

Results

  • breakout: The target action is to move to the right. The trigger is a gray square in the top-left corner.

    Strong Targeted-Attacked Agent

    Untargeted-Attacked Agent
  • seaquest:

    Weak Targeted-Attacked Agent
  • (More results under pretrained_models)
