Spatially Incremental Generative Engine (SIGE)

Paper | Project | Slides | YouTube | Bilibili

[NEW!] SIGE has been accepted to NeurIPS 2022! Our code and benchmark datasets are publicly available!

We introduce Spatially Sparse Inference, a general-purpose method that selectively performs computation only at the edited regions in image editing applications. For the examples above, our method reduces the computation of DDIM by 4-6x and of GauGAN by 15x while preserving image quality. When combined with existing compression methods such as GAN Compression, it further reduces the computation of GauGAN by 47x.

On Stable Diffusion + SDEdit, we also achieve an 8x computation reduction and a 7x speedup on an NVIDIA RTX 3090.

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
Muyang Li, Ji Lin, Chenlin Meng, Stefano Ermon, Song Han, and Jun-Yan Zhu
CMU, MIT, and Stanford
In NeurIPS 2022.

Overview

Tiling-based sparse convolution overview. We wrap each convolution F_l in the network into SIGE Conv_l. The activations of the original image are pre-computed. Given the edited image, we first compute a difference mask between the original and edited images and reduce the mask to the active block indices, which locate the edited regions. In each SIGE Conv_l, we gather the active blocks directly from the edited activation A_l^edited according to the reduced indices, stack the blocks along the batch dimension, and feed them into F_l. The gathered blocks overlap by a width of 2 if F_l is a 3×3 convolution. After getting the output blocks from F_l, we scatter them back into F_l(A_l^original) to get the edited output, which approximates F_l(A_l^edited).
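The mask, gather, convolve, and scatter steps described above can be sketched in a few lines of NumPy. This is a minimal single-channel illustration under stated assumptions, not the SIGE implementation: the helper names (`conv3x3_valid`, `active_block_indices`, `sparse_conv3x3`), the block size of 4, and the requirement that the image size be a multiple of the block size are all hypothetical choices made for this example.

```python
import numpy as np

def conv3x3_valid(x, w):
    """Single-channel 3x3 'valid' cross-correlation over the last two axes.

    x: (..., H, W), w: (3, 3) -> output of shape (..., H-2, W-2).
    """
    h, wd = x.shape[-2] - 2, x.shape[-1] - 2
    out = np.zeros(x.shape[:-2] + (h, wd), dtype=x.dtype)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * x[..., i:i + h, j:j + wd]
    return out

def active_block_indices(original, edited, block):
    """Reduce the per-pixel difference mask to top-left corners of active blocks."""
    mask = original != edited
    H, W = mask.shape  # assumed to be multiples of `block`
    per_block = mask.reshape(H // block, block, W // block, block).any(axis=(1, 3))
    return np.argwhere(per_block) * block

def sparse_conv3x3(edited, cached_output, indices, w, block):
    """Recompute a zero-padded 3x3 convolution only on the active blocks."""
    out = cached_output.copy()
    if len(indices) == 0:
        return out  # nothing was edited, the cached output is still valid
    padded = np.pad(edited, 1)  # zero padding keeps the output the same size
    # Gather: each block is read with a 1-pixel halo on every side, so the
    # gathered regions of adjacent blocks overlap by 2 pixels. The blocks are
    # stacked along a new leading (batch) axis.
    blocks = np.stack([padded[r:r + block + 2, c:c + block + 2] for r, c in indices])
    out_blocks = conv3x3_valid(blocks, w)  # shape: (num_active, block, block)
    # Scatter: overwrite the cached original output at the active blocks.
    for (r, c), blk in zip(indices, out_blocks):
        out[r:r + block, c:c + block] = blk
    return out

# Toy usage: a 16x16 "activation" with a 2x2 edit inside one 4x4 block.
rng = np.random.default_rng(0)
original = rng.standard_normal((16, 16))
kernel = rng.standard_normal((3, 3))
cached = conv3x3_valid(np.pad(original, 1), kernel)  # pre-computed F(A_original)
edited = original.copy()
edited[5:7, 5:7] += 1.0
indices = active_block_indices(original, edited, block=4)
sparse_out = sparse_conv3x3(edited, cached, indices, kernel, block=4)
```

In this toy case the edit sits in the interior of a single block, so the scattered result matches a full recomputation. An edit touching a block border would also change a one-pixel ring in the neighboring block, which this sketch leaves at its cached value.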

Performance

Efficiency

With edits covering 1.2% of the image area, SIGE reduces the computation of DDIM, Progressive Distillation, and GauGAN by 7-18x, achieving a 2-4x speedup on an NVIDIA RTX 3090 and a 4-14x speedup on an Apple M1 Pro CPU. When combined with GAN Compression, it reduces the computation of GauGAN by 50x, achieving a 38x speedup on an Apple M1 Pro CPU. Please see our paper for more details and results.
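As a rough intuition for how edit size relates to the work done by a tiled scheme like the one in the Overview, the sketch below counts the share of pixels recomputed when an edit is rounded out to whole blocks. Everything in it is an assumption made for illustration: the function name `recomputed_fraction`, the 256×256 image, the 28×28 edit (about 1.2% of the area), and the block size of 32 are hypothetical and are not taken from the paper or the benchmark.

```python
import math

def recomputed_fraction(image_size, edit_box, block):
    """Share of pixels recomputed when an axis-aligned edit is rounded out to whole blocks.

    image_size: (H, W); edit_box: (top, left, height, width); block: block side length.
    """
    H, W = image_size
    top, left, h, w = edit_box
    rows = math.ceil((top + h) / block) - top // block    # blocks touched vertically
    cols = math.ceil((left + w) / block) - left // block  # blocks touched horizontally
    return rows * cols * block * block / (H * W)

edited_share = 28 * 28 / (256 * 256)                                # about 0.012
one_block = recomputed_fraction((256, 256), (100, 100, 28, 28), 32)   # edit fits in 1 block
four_blocks = recomputed_fraction((256, 256), (120, 120, 28, 28), 32)  # edit straddles 4 blocks
```

Under these assumptions the same 1.2% edit costs either 1/64 or 1/16 of a dense pass for that layer, depending on how it aligns with the block grid, so the per-layer saving is bounded by block granularity rather than by the edited area alone.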

Quality

Qualitative results under different edit sizes. PD is Progressive Distillation. Our method preserves the visual fidelity of the original model well without losing global context.

More qualitative results of Stable Diffusion on both image inpainting and editing, measured on NVIDIA RTX 3090.

References:

Denoising Diffusion Implicit Model (DDIM), Song et al., ICLR 2021
Progressive Distillation for Fast Sampling of Diffusion Models, Salimans et al., ICLR 2022
Semantic Image Synthesis with Spatially-Adaptive Normalization (GauGAN), Park et al., CVPR 2019
GAN Compression: Efficient Architectures for Interactive Conditional GANs, Li et al., CVPR 2020
High-Resolution Image Synthesis with Latent Diffusion Models, Rombach et al., CVPR 2022

Prerequisites

Python3
CPU or NVIDIA GPU + CUDA CuDNN
PyTorch >= 1.7

Getting Started

Installation

After installing PyTorch, you can install SIGE from PyPI:

pip install sige

or via GitHub:

pip install git+https://github.com/lmxyy/sige.git

or locally for development:

git clone git@github.com:lmxyy/sige.git
cd sige
pip install -e .

Usage Example

See example.py for a minimal SIGE convolution example. First install SIGE following the instructions above, then install torchprofile with:

pip install torchprofile

Then you can run it with:

python example.py [--use_cuda]

A notebook version of the example, example.ipynb, is also included in the repository.

Benchmark

To reproduce the results of DDIM and Progressive Distillation or download the LSUN Church editing datasets, please follow the instructions in diffusion/README.md.

To reproduce the results of GauGAN and GAN Compression or download the Cityscapes editing datasets, please follow the instructions in gaugan/README.md.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{li2022efficient,
  title={Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models},
  author={Li, Muyang and Lin, Ji and Meng, Chenlin and Ermon, Stefano and Han, Song and Zhu, Jun-Yan},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2022}
}

Acknowledgments

Our code is developed based on SDEdit, ddim, diffusion_distillation and gan-compression. We refer to sbnet for the tiling-based sparse convolution algorithm implementation. Our work is also inspired by the gather/scatter implementations in torchsparse.

We thank torchprofile for MACs measurement, clean-fid for FID computation and drn for Cityscapes mIoU computation.

We thank Yaoyao Ding, Zihao Ye, Lianmin Zheng, Haotian Tang, and Ligeng Zhu for the helpful comments on the engine design. We also thank George Cazenavette, Kangle Deng, Ruihan Gao, Daohan Lu, Sheng-Yu Wang and Bingliang Zhang for their valuable feedback. The project is partly supported by NSF, MIT-IBM Watson AI Lab, Kwai Inc, and Sony Corporation.
