Adversarial Alignment: breaking the trade-off between the strength of an attack and its relevance to human perception
Drew Linsley*, Pinyuan Feng*, Thibaut Boissin, Alekh Karkada Ashok, Thomas Fel, Stephanie Olaiya, Thomas Serre
Read our paper »
Website · Results · Model Info · Harmonization · ClickMe · Serre Lab @ Brown
We ran our experiments on the ClickMe dataset, a large-scale effort to collect feature importance maps from human participants that highlight the image regions that are relevant and irrelevant for recognition. For our experiment, we created a subset of ClickMe with one image per category. If you want to replicate our experiment, please place the TF-Record file in ./datasets.
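The exact schema of the bundled TF-Record is not documented here, but a minimal parsing sketch could look like the following; the feature keys (`image`, `heatmap`, `label`) and the filename `clickme_subset.tfrecord` are assumptions for illustration and may not match the actual file.

```python
# Minimal sketch of reading the ClickMe subset.
# NOTE: the feature keys and filename below are assumptions, not the repo's actual schema.
import tensorflow as tf

def parse_example(serialized):
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),    # assumed key: JPEG-encoded image
        "heatmap": tf.io.FixedLenFeature([], tf.string),  # assumed key: JPEG-encoded ClickMe map
        "label": tf.io.FixedLenFeature([], tf.int64),     # assumed key: ImageNet class index
    }
    parsed = tf.io.parse_single_example(serialized, features)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    heatmap = tf.io.decode_jpeg(parsed["heatmap"], channels=1)
    return image, heatmap, parsed["label"]

dataset = tf.data.TFRecordDataset("./datasets/clickme_subset.tfrecord")
for image, heatmap, label in dataset.map(parse_example).take(1):
    print(image.shape, heatmap.shape, int(label))
```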
conda create -n adv python=3.8 -y
conda activate adv
conda install pytorch==1.13.1 torchvision==0.14.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install tensorflow==2.12.0
pip install timm==0.8.10.dev0
pip install harmonization
pip install numpy matplotlib scipy tqdm pandas
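After installation, a quick sanity check (a sketch, not part of the repo) can confirm that the pinned versions resolved and that CUDA is visible to PyTorch:

```python
# Verify that the environment above imports cleanly and matches the pinned versions.
import torch
import torchvision
import tensorflow as tf
import timm
import harmonization  # harmonized models used in the benchmark

print("torch:", torch.__version__)              # expected 1.13.1
print("torchvision:", torchvision.__version__)  # expected 0.14.1
print("tensorflow:", tf.__version__)            # expected 2.12.0
print("timm:", timm.__version__)                # expected 0.8.10.dev0
print("CUDA available:", torch.cuda.is_available())
```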
- Terminal: you can run the experiment with the following command (the alignment score behind the --spearman flag is sketched below)
python main.py --model "resnet" --cuda 0 --spearman 1
- Google Colab: if you run into installation issues, you can instead run the two .ipynb notebooks in ./scripts
- There are 10 example images in ./images. They include ImageNet images, human feature importance maps from ClickMe, and adversarial attacks for a variety of DNNs.
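Given one of these images, its ClickMe map, and an adversarial perturbation, the alignment score implied by the --spearman flag can be sketched as follows. This is an illustration only: the channel pooling, smoothing, and downsampling used in main.py may differ.

```python
# Sketch of adversarial alignment: rank-correlate an attack's perturbation
# magnitude map with a human ClickMe importance map.
import numpy as np
from scipy.stats import spearmanr

def alignment_score(perturbation, human_map):
    """perturbation: (H, W, C) adversarial delta; human_map: (H, W) ClickMe map."""
    saliency = np.abs(perturbation).sum(axis=-1)  # collapse channels to a 2D magnitude map
    rho, _ = spearmanr(saliency.ravel(), human_map.ravel())
    return rho

# Hypothetical usage with random stand-ins for a real attack and ClickMe map
delta = np.random.randn(224, 224, 3) * 1e-3
clickme_map = np.random.rand(224, 224)
print(alignment_score(delta, clickme_map))
```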
- We tested 283 models in our experiment:
- 125 PyTorch CNN models from the timm library
- 121 PyTorch ViT models from the timm library
- 15 PyTorch ViT/CNN hybrid architectures from the timm library
- 14 TensorFlow harmonized models from the harmonization library
- 4 baseline models
- 4 models trained for robustness to adversarial examples
- The top-1 ImageNet accuracy for each model is taken from the Hugging Face results
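Any of these models can be instantiated in a few lines; the snippet below is an illustrative sketch using timm's public API, with "resnet50" standing in for one of the architectures above, not the repo's full evaluation pipeline.

```python
# Instantiate one timm model and run a single ImageNet-sized input through it.
import timm
import torch

model = timm.create_model("resnet50", pretrained=True).eval()
x = torch.randn(1, 3, 224, 224)  # dummy ImageNet-sized batch of one
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([1, 1000])
```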
If you use or build on our work as part of your workflow in a scientific publication, please consider citing the official paper:
@article{linsley2023adv,
title={Adversarial Alignment: breaking the trade-off between the strength of an attack and its relevance to human perception},
author={Linsley, Drew and Feng, Pinyuan and Boissin, Thibaut and Ashok, Alekh Karkada and Fel, Thomas and Olaiya, Stephanie and Serre, Thomas},
year={2023}
}
If you have any questions about the paper, please contact Drew at [email protected].
This paper relies heavily on previous work from the Serre Lab, notably Harmonization and ClickMe.
@article{fel2022aligning,
title={Harmonizing the object recognition strategies of deep neural networks with humans},
author={Fel, Thomas and Felipe, Ivan and Linsley, Drew and Serre, Thomas},
journal={Advances in Neural Information Processing Systems (NeurIPS)},
year={2022}
}
@article{linsley2018learning,
title={Learning what and where to attend},
author={Linsley, Drew and Shiebler, Dan and Eberhardt, Sven and Serre, Thomas},
journal={International Conference on Learning Representations (ICLR)},
year={2019}
}
The code is released under the MIT license.