- 07/2022: The code of TPS [ECCV 2022] is available here. TPS is 3x faster than DA-VSN during training and notably surpasses DA-VSN during testing.
Domain Adaptive Video Segmentation via Temporal Consistency Regularization
Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu
School of Computer Science and Engineering, Nanyang Technological University, Singapore
International Conference on Computer Vision, 2021.
If you find this code useful for your research, please cite our paper:
@inproceedings{guan2021domain,
title={Domain adaptive video segmentation via temporal consistency regularization},
author={Guan, Dayan and Huang, Jiaxing and Xiao, Aoran and Lu, Shijian},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={8053--8064},
year={2021}
}
Video semantic segmentation is an essential task for the analysis and understanding of videos. Recent efforts largely focus on supervised video segmentation by learning from fully annotated data, but the learnt models often suffer a clear performance drop when applied to videos of a different domain. This paper presents DA-VSN, a domain adaptive video segmentation network that addresses domain gaps in videos by temporal consistency regularization (TCR) for consecutive frames of target-domain videos. DA-VSN consists of two novel and complementary designs. The first is cross-domain TCR, which guides the predictions of target frames to have temporal consistency similar to that of source frames (learnt from annotated source data) via adversarial learning. The second is intra-domain TCR, which guides unconfident predictions of target frames to have temporal consistency similar to that of confident predictions of target frames. Extensive experiments demonstrate the superiority of the proposed domain adaptive video segmentation network, which consistently outperforms multiple baselines by large margins.
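To make the intra-domain TCR idea more concrete, here is a minimal sketch in plain Python. It is not the implementation in this repository. It assumes that confidence is measured by per-pixel entropy, that the previous frame's prediction has already been warped to the current frame (for example with optical flow), and that consistency is enforced with a squared difference. The names `entropy` and `intra_domain_tcr_loss` are hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy of one pixel's class distribution (lower = more confident)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def intra_domain_tcr_loss(cur_frame, prev_warped):
    """Mean squared gap over pixels where the propagated prediction is more confident.

    cur_frame, prev_warped: lists of per-pixel class distributions.
    prev_warped is ASSUMED to be the previous frame's prediction, already
    warped to the current frame. Entropy as the confidence measure and the
    squared difference are assumptions of this sketch.
    """
    total, count = 0.0, 0
    for cur, prev in zip(cur_frame, prev_warped):
        if entropy(prev) < entropy(cur):  # the propagated prediction acts as target
            total += sum((c - p) ** 2 for c, p in zip(cur, prev))
            count += 1
    return total / count if count else 0.0


# Two pixels, two classes. Only the first pixel is unconfident in the
# current frame, so only that pixel contributes to the loss.
current = [[0.5, 0.5], [0.9, 0.1]]
previous_warped = [[0.9, 0.1], [0.6, 0.4]]
loss = intra_domain_tcr_loss(current, previous_warped)
```

Plain lists keep the sketch dependency-free; an actual training loop would operate on PyTorch tensors and backpropagate only through the current-frame prediction.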
- Conda environment:
conda create -n DA-VSN python=3.6
conda activate DA-VSN
conda install -c menpo opencv
pip install torch==1.2.0 torchvision==0.4.0
- Clone ADVENT:
git clone https://github.com/valeoai/ADVENT.git
pip install -e ./ADVENT
- Clone the repo:
git clone https://github.com/Dayan-Guan/DA-VSN.git
pip install -e ./DA-VSN
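After the installation steps, a quick sanity check of the pinned versions can save time later. The script below is a small sketch, not part of this repository; the helper names are hypothetical. It reads each package's `__version__` attribute and ignores a local build suffix such as `+cu92`.

```python
import importlib
import sys

# Versions pinned in the installation steps.
PINNED = {"torch": "1.2.0", "torchvision": "0.4.0"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_matches(installed, pinned):
    """True if installed equals the pin, ignoring a local suffix such as '+cu92'."""
    return installed is not None and installed.split("+")[0] == pinned


def check_environment():
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    if sys.version_info[:2] != (3, 6):
        problems.append("python %d.%d (expected 3.6)" % sys.version_info[:2])
    for name, pinned in PINNED.items():
        found = installed_version(name)
        if not version_matches(found, pinned):
            problems.append("%s %s (expected %s)" % (name, found, pinned))
    return problems
```

Running `print(check_environment())` inside the activated `DA-VSN` environment should print an empty list if the versions match the pins above.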
- Dataset:
DA-VSN/data/Cityscapes/ % Cityscapes dataset root
DA-VSN/data/Cityscapes/leftImg8bit_sequence % leftImg8bit_sequence_trainvaltest
DA-VSN/data/Cityscapes/gtFine % gtFine_trainvaltest
DA-VSN/data/Viper/ % VIPER dataset root
DA-VSN/data/Viper/train/img % Modality: Images; Frames: *[0-9]; Sequences: 00-77; Format: jpg
DA-VSN/data/Viper/train/cls % Modality: Semantic class labels; Frames: *0; Sequences: 00-77; Format: png
DA-VSN/data/SynthiaSeq/ % SYNTHIA-Seq dataset root
DA-VSN/data/SynthiaSeq/SEQS-04-DAWN % SYNTHIA-SEQS-04-DAWN
- Pre-trained models:
Download the pre-trained models and put them in
DA-VSN/pretrained_models
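The directory layout above can be verified before training. The following is a minimal sketch using only the standard library; the function name `missing_dirs` is hypothetical, and the paths are exactly those listed above, taken relative to the `DA-VSN` clone directory.

```python
from pathlib import Path

# Directories listed in the preparation steps, relative to the repository root.
EXPECTED_DIRS = [
    "data/Cityscapes/leftImg8bit_sequence",
    "data/Cityscapes/gtFine",
    "data/Viper/train/img",
    "data/Viper/train/cls",
    "data/SynthiaSeq/SEQS-04-DAWN",
    "pretrained_models",
]


def missing_dirs(repo_root):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # "DA-VSN" is the clone directory from the installation steps.
    for d in missing_dirs("DA-VSN"):
        print("missing:", d)
```

The check only confirms that the directories exist; it does not validate the number of frames or the file formats inside them.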
- For quick preparation: the estimated optical flow can be downloaded here; unzip it in
DA-VSN/data
- Clone flownet2-pytorch:
git clone https://github.com/NVIDIA/flownet2-pytorch.git
- Download the pre-trained FlowNet2 model and put it in
flownet2-pytorch/pretrained_models
DA-VSN/data/Cityscapes_val_optical_flow_scale512/ % unzip Cityscapes_val_optical_flow_scale512.zip
- Use flownet2-pytorch to estimate the optical flow
- VIPER → Cityscapes-Seq:
cd DA-VSN/davsn/scripts
python test.py --cfg configs/davsn_viper2city_pretrained.yml
- SYNTHIA-Seq → Cityscapes-Seq:
python test.py --cfg configs/davsn_syn2city_pretrained.yml
- VIPER → Cityscapes-Seq:
cd DA-VSN/davsn/scripts
python train.py --cfg configs/davsn_viper2city.yml
python test.py --cfg configs/davsn_viper2city.yml
- SYNTHIA-Seq → Cityscapes-Seq:
python train.py --cfg configs/davsn_syn2city.yml
python test.py --cfg configs/davsn_syn2city.yml
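The train-then-test sequence above can also be scripted. The wrapper below is a hypothetical convenience sketch, not part of this repository; it only assembles the same commands shown above and runs them from the scripts directory.

```python
import subprocess

# Config files as they appear in the commands above.
CONFIGS = {
    "viper2city": "configs/davsn_viper2city.yml",
    "syn2city": "configs/davsn_syn2city.yml",
}


def build_commands(task):
    """Return the train and test commands for one adaptation task."""
    cfg = CONFIGS[task]
    return [
        ["python", "train.py", "--cfg", cfg],
        ["python", "test.py", "--cfg", cfg],
    ]


def run_task(task, scripts_dir="DA-VSN/davsn/scripts"):
    """Run training, then testing; stop at the first command that fails."""
    for cmd in build_commands(task):
        subprocess.run(cmd, cwd=scripts_dir, check=True)
```

For example, `run_task("viper2city")` would run training and then testing for VIPER → Cityscapes-Seq, raising `subprocess.CalledProcessError` if either step exits with a non-zero status.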
This codebase borrows heavily from ADVENT and flownet2-pytorch.
If you have any questions, please contact: [email protected]