StyTr^2: Image Style Transfer with Transformers (CVPR 2022)

Authors: Yingying Deng, Fan Tang, Xingjia Pan, Weiming Dong, Chongyang Ma, Changsheng Xu

This paper proposes a transformer-based model for unbiased image style transfer that improves the stylization effect compared with state-of-the-art methods. This repository is the official implementation of StyTr^2: Image Style Transfer with Transformers.

Results presentation

Compared with several state-of-the-art algorithms, our method is better at avoiding content leakage and has a stronger feature representation ability.

Framework

The overall pipeline of our StyTr^2 framework. We split the content and style images into patches and use a linear projection to obtain image sequences. The content sequences, with content-aware positional encoding (CAPE) added, are fed into the content transformer encoder, while the style sequences are fed into the style transformer encoder. After the two transformer encoders, a multi-layer transformer decoder stylizes the content sequences according to the style sequences. Finally, a progressive upsampling decoder produces the high-resolution stylized images.
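The first step of the pipeline (patch splitting followed by a linear projection) can be illustrated with a small sketch. This is not the repository's code, which runs on PyTorch tensors; it is a pure-Python illustration on a single-channel image, and the names `split_into_patches` and `linear_projection` are hypothetical. It also omits the bias term and CAPE.

```python
from typing import List

Image = List[List[float]]      # single-channel image, H rows x W columns
Sequence = List[List[float]]   # one flattened vector ("token") per patch


def split_into_patches(image: Image, patch: int) -> Sequence:
    """Cut the image into non-overlapping patch x patch tiles, row-major."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    sequence = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one tile into a vector of length patch * patch.
            token = [image[top + dy][left + dx]
                     for dy in range(patch) for dx in range(patch)]
            sequence.append(token)
    return sequence


def linear_projection(sequence: Sequence, weights: List[List[float]]) -> Sequence:
    """Multiply every token by a (patch*patch) x embed_dim weight matrix."""
    embed_dim = len(weights[0])
    return [[sum(value * row[j] for value, row in zip(token, weights))
             for j in range(embed_dim)]
            for token in sequence]


# Toy example: a 4x4 image with pixel values 0..15 and 2x2 patches.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
tokens = split_into_patches(image, patch=2)
# Assumed toy weights: 4 inputs -> 2 embedding dimensions.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
embedded = linear_projection(tokens, weights)
```

A 4x4 image with 2x2 patches yields a sequence of four tokens; the projection then maps each 4-value token to the chosen embedding size. In the real model the same idea applies per image with learned weights.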

Experiment

Requirements

  • Python 3.6
  • PyTorch 1.4.0
  • PIL, numpy, scipy
  • tqdm

Another possible setup was tested using Python 3.7:

  certifi==2024.2.2
  charset-normalizer==3.3.2
  cycler==0.11.0
  fonttools==4.38.0
  future==1.0.0
  idna==3.7
  kiwisolver==1.4.5
  matplotlib==3.5.3
  numpy==1.21.6
  nvidia-cublas-cu11==11.10.3.66
  nvidia-cuda-nvrtc-cu11==11.7.99
  nvidia-cuda-runtime-cu11==11.7.99
  nvidia-cudnn-cu11==8.5.0.96
  packaging==24.0
  Pillow==9.5.0
  pyparsing==3.1.2
  python-dateutil==2.9.0.post0
  requests==2.31.0
  scipy==1.7.3
  six==1.16.0
  torch==1.6.0
  torchvision==0.7.0
  typing_extensions==4.7.1
  urllib3==2.0.7

NOTE: On newer Ubuntu releases (24.04), it may be easier to install Python 3.7 from the deadsnakes PPA using the following commands:

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.7
sudo apt install python3.7-venv
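
To confirm that an environment matches the pinned versions listed above, a small check along these lines may help. It is a sketch, not part of the repository: the helper names `parse_pins`, `installed_version`, and `find_mismatches` are hypothetical, and it assumes the pins are available as `name==version` lines.

```python
from typing import Callable, Dict, Optional, Tuple


def parse_pins(text: str) -> Dict[str, str]:
    """Parse 'name==version' lines into a {name: version} mapping."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for Python 3.7 (part of setuptools)
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each mismatching package to (wanted version, found version)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

For example, saving the list above to a file (say `requirements.txt`, a name assumed here) and calling `find_mismatches(parse_pins(open("requirements.txt").read()))` returns an empty dict when every pinned package is installed at the listed version.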

Testing

Pretrained models: vgg-model, vit_embedding, decoder, Transformer_module
Please download them and put them into the folder ./experiments/

python test.py --content_dir input/content/ --style_dir input/style/ --output out
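
If the test run is launched from another script, the command above can be wrapped as follows. This is a minimal sketch: `build_test_command` and `run_test` are hypothetical helpers, only the three flags shown above are used, and the check on `./experiments/` merely verifies that the folder is non-empty because the exact weight file names are not listed here.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_test_command(content_dir: str, style_dir: str, output: str) -> List[str]:
    """Build the argument list for test.py using the flags from the README."""
    return [
        sys.executable, "test.py",
        "--content_dir", str(content_dir),
        "--style_dir", str(style_dir),
        "--output", str(output),
    ]


def run_test(content_dir: str, style_dir: str, output: str,
             weights_dir: str = "experiments") -> None:
    """Run test.py after checking that the weights folder is not empty."""
    folder = Path(weights_dir)
    if not folder.is_dir() or not any(folder.iterdir()):
        raise FileNotFoundError(
            f"no pretrained models found in {folder}/; download them first")
    # check=True raises CalledProcessError if test.py exits with an error.
    subprocess.run(build_test_command(content_dir, style_dir, output), check=True)
```

Calling `run_test("input/content/", "input/style/", "out")` from the repository root is then equivalent to the command line above, with an early error if the pretrained models have not been downloaded.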

Training

The style dataset is WikiArt, collected from WIKIART.

The content dataset is COCO2014.

python train.py --style_dir ../../datasets/Images/ --content_dir ../../datasets/train2014 --save_dir models/ --batch_size 8
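
Before starting a long training run, it can be useful to confirm that both dataset folders actually contain images. The sketch below is not part of the repository; `count_images` is a hypothetical helper, the set of file extensions is an assumption, and it searches subfolders recursively, which may or may not match how train.py reads the data.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the downloaded datasets.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(folder: str) -> int:
    """Count image files under folder, including subfolders."""
    return sum(
        1
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

For instance, `count_images("../../datasets/train2014")` and `count_images("../../datasets/Images/")` should both return a positive number before train.py is launched with the paths shown above.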

Reference

If you find our work useful in your research, please cite our paper using the following BibTeX entry. Thank you! Paper link: pdf

@inproceedings{deng2021stytr2,
      title={StyTr^2: Image Style Transfer with Transformers}, 
      author={Yingying Deng and Fan Tang and Weiming Dong and Chongyang Ma and Xingjia Pan and Lei Wang and Changsheng Xu},
      booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      year={2022},
}
