🤗[Paper] 🔥[Project Page]
- [2024.03.22]: Important Note: The base model is Stable Diffusion v2-1-base-512.
- [2024.03.01]: Release code for StableIdentity & ModelScopeT2V (Identity-Driven Video Generation)!
- [2024.03.01]: Release code for StableIdentity & LucidDreamer (Identity-Driven 3D Generation)!
- [2024.02.29]: Release code for StableIdentity & ControlNet!
- [2024.02.25]: Release training and inference code!
Click the GIF to access the high-resolution videos.
More results can be found on our Project Page and in our Paper.
- Requirements (only 9 GB of VRAM is needed for training): if you want to run StableIdentity & LucidDreamer, clone this repo with

  ```bash
  git clone https://github.com/qinghew/StableIdentity.git --recursive
  ```

  to download the submodules into `LucidDreamer/submodules/`. Then set up the environment:

  ```bash
  conda create -n stableid python=3.8.5
  conda activate stableid   # activate the new environment before installing
  pip install -r requirements_StableIdentity.txt
  ```
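  Optionally, a quick sanity check that the environment sees a GPU with enough memory (this assumes only that PyTorch is installed by `requirements_StableIdentity.txt`):

  ```python
  # Verify that CUDA is available and report the GPU's total memory
  # (training needs roughly 9 GB of VRAM).
  import torch

  assert torch.cuda.is_available(), "No CUDA GPU visible"
  props = torch.cuda.get_device_properties(0)
  print(props.name, f"{props.total_memory / 1024**3:.1f} GB VRAM")
  ```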
- Download the pretrained models: Stable Diffusion v2-1_512 and the face recognition ViT.
- Set the paths of the pretrained models as the defaults at Line 94 of `train.py`, or pass them on the command line with

  ```
  --pretrained_model_name_or_path **sd2.1_path** --vit_face_recognition_model_path **face_vit_path**
  ```
- Download the face parsing model into `models/face_parsing/res/cp/`.
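  For example (the checkpoint file name below is a placeholder for whatever file the face parsing repo provides):

  ```bash
  # Create the expected directory and place the downloaded face parsing checkpoint in it.
  mkdir -p models/face_parsing/res/cp
  mv <downloaded_face_parsing_checkpoint>.pth models/face_parsing/res/cp/
  ```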
- Train for a single test image:

  ```bash
  CUDA_VISIBLE_DEVICES=0 accelerate launch --machine_rank 0 --num_machines 1 --main_process_port 11135 --num_processes 1 --gpu_ids 0 train.py \
    --face_img_path=datasets_face/test_data_demo/00059.png \
    --output_dir="experiments512/save_00059" \
    --resolution=512 \
    --train_batch_size=1 \
    --checkpointing_steps=50 \
    --gradient_accumulation_steps=1 \
    --seed=42 \
    --learning_rate=5e-5 \
    --l_hair_diff_lambda=0.1
  ```
- Train for your own test dataset (preprocess the images with FFHQ-Alignment, or crop the headshots yourself); a batching sketch follows the command:

  ```bash
  bash train_for_testset.sh
  ```
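  If you want to adapt the loop yourself, here is a minimal sketch of batching `train.py` over a folder of aligned images (the folder name is an assumption; `train_for_testset.sh` is the authoritative script):

  ```bash
  # Hypothetical batch loop: run train.py once per aligned test image.
  for img in datasets_face/test_data_demo/*.png; do
    name=$(basename "$img" .png)
    CUDA_VISIBLE_DEVICES=0 accelerate launch --num_processes 1 --gpu_ids 0 train.py \
      --face_img_path="$img" \
      --output_dir="experiments512/save_${name}" \
      --resolution=512 --train_batch_size=1 --gradient_accumulation_steps=1 \
      --checkpointing_steps=50 --seed=42 --learning_rate=5e-5 --l_hair_diff_lambda=0.1
  done
  ```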
- Test StableIdentity: we provide three test modes in `test.ipynb` for developers: "test a single image with a single prompt", "test a single image with prompts", and "test all images with prompts". The results will be generated in `results/{index}/`.
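  If you want to script the "single image, single prompt" mode outside the notebook, the sketch below shows one possible shape with `diffusers`: load the SD 2.1 base pipeline, register the learned identity embeddings as placeholder tokens, and sample. The checkpoint file name, dictionary layout, and placeholder tokens are all assumptions for illustration; `test.ipynb` is the authoritative implementation.

  ```python
  # Hedged sketch only: the checkpoint path, dict layout, and token strings are
  # hypothetical; follow test.ipynb for the exact loading logic used by StableIdentity.
  import torch
  from diffusers import StableDiffusionPipeline

  pipe = StableDiffusionPipeline.from_pretrained(
      "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
  ).to("cuda")

  # Register two placeholder tokens and copy in the learned identity embeddings.
  learned = torch.load("experiments512/save_00059/learned_embeds.bin", map_location="cpu")
  tokens = ["<v1*>", "<v2*>"]
  pipe.tokenizer.add_tokens(tokens)
  pipe.text_encoder.resize_token_embeddings(len(pipe.tokenizer))
  embed_weight = pipe.text_encoder.get_input_embeddings().weight
  with torch.no_grad():
      for tok, emb in zip(tokens, learned.values()):
          token_id = pipe.tokenizer.convert_tokens_to_ids(tok)
          embed_weight.data[token_id] = emb.to(embed_weight.device, embed_weight.dtype)

  image = pipe("a photo of <v1*> <v2*> person wearing a suit, in the snow",
               num_inference_steps=50).images[0]
  image.save("identity_in_the_snow.png")
  ```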
- Test StableIdentity & ControlNet: download OpenPose's `facenet.pth`, `body_pose_model.pth`, and `hand_pose_model.pth` from ControlNet's Annotators into `models/openpose_models/`, and download the ControlNet-SD21 weights.

  ```bash
  # Requirement for ControlNet:
  pip install controlnet_aux
  ```

  The test code is `test_with_controlnet_openpose.ipynb`. The results will be generated in `results/{index}/with_controlnet/`.
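  If you want to script the same pipeline outside the notebook, the sketch below shows one possible shape using `controlnet_aux` and `diffusers`. It is only a hedged sketch: it pulls the OpenPose annotator weights from the Hugging Face Hub instead of `models/openpose_models/`, the ControlNet-SD21 checkpoint path is a placeholder, and the identity tokens are the hypothetical ones from the sketch above; `test_with_controlnet_openpose.ipynb` is the authoritative implementation.

  ```python
  # Hedged sketch: OpenPose-conditioned generation with a ControlNet-SD21 checkpoint.
  import torch
  from PIL import Image
  from controlnet_aux import OpenposeDetector
  from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

  # Pose extraction (this sketch downloads the annotator weights from the Hub).
  openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
  pose = openpose(Image.open("datasets_face/test_data_demo/00059.png"))

  controlnet = ControlNetModel.from_pretrained(
      "/path/to/ControlNet-SD21-openpose", torch_dtype=torch.float16  # placeholder path
  )
  pipe = StableDiffusionControlNetPipeline.from_pretrained(
      "stabilityai/stable-diffusion-2-1-base", controlnet=controlnet, torch_dtype=torch.float16
  ).to("cuda")
  # ...insert the learned identity embeddings into pipe.tokenizer / pipe.text_encoder as above...

  image = pipe("a photo of <v1*> <v2*> person dancing", image=pose,
               num_inference_steps=50).images[0]
  image.save("with_controlnet_dancing.png")
  ```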
- Test StableIdentity & LucidDreamer:

  ```bash
  # Requirements for LucidDreamer
  # (clone this repo with `git clone https://github.com/qinghew/StableIdentity.git --recursive`
  #  to download the submodules in `LucidDreamer/submodules/`):
  pip install -r requirements_LucidDreamer.txt
  pip install LucidDreamer/submodules/diff-gaussian-rasterization/
  pip install LucidDreamer/submodules/simple-knn/

  # Test
  python LucidDreamer/train.py --opt 'LucidDreamer/configs/stableid.yaml'
  ```

  You can also refer to LucidDreamer's preparation instructions. To insert the learned identity into 3D generation with LucidDreamer, we only edit the code at Line 130 of `LucidDreamer/train.py` and set the SD2.1 path and prompts in `LucidDreamer/configs/stableid.yaml`. The 3D videos will be generated in `LucidDreamer/output/stableid_{index}/videos/`.
- Test StableIdentity & ModelScopeT2V: download the ModelScopeT2V pretrained models into `modelscope_t2v_files/`.

  ```bash
  # Requirement for ModelScopeT2V:
  pip install -r requirements_modelscope.txt
  ```

  The test code is `test_with_modelscope.ipynb`. Since the ModelScope library lacks some functions for the tokenizer and embedding layer, you need to replace `anaconda3/envs/**your_envs**/lib/python3.8/site-packages/modelscope/models/multi_modal/video_synthesis/text_to_video_synthesis_model.py` with `modelscope_t2v_files/text_to_video_synthesis_model.py`, for example as shown below. The videos will be generated in `results/{index}/with_modelscope/`.
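  The replacement is a single copy (adjust the Anaconda path and environment name for your installation):

  ```bash
  # Overwrite ModelScope's text_to_video_synthesis_model.py with the patched version from this repo.
  cp modelscope_t2v_files/text_to_video_synthesis_model.py \
     ~/anaconda3/envs/<your_env>/lib/python3.8/site-packages/modelscope/models/multi_modal/video_synthesis/text_to_video_synthesis_model.py
  ```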
- Release training and inference code
- Release code for StableIdentity & ControlNet
- Release code for StableIdentity & LucidDreamer for Identity-Driven 3D Generation
- Release code for StableIdentity & ModelScopeT2V for Identity-Driven Video Generation
❤️ Thanks to the authors of all the repos and pretrained models we build on. Let's push AIGC forward together!
@article{wang2024stableidentity,
title={StableIdentity: Inserting Anybody into Anywhere at First Sight},
author={Wang, Qinghe and Jia, Xu and Li, Xiaomin and Li, Taiqing and Ma, Liqian and Zhuge, Yunzhi and Lu, Huchuan},
journal={arXiv preprint arXiv:2401.15975},
year={2024}
}