🔥[08/07/2024]🚀 The detection code of PØDA was released in the detection branch.
Mohammad Fahes1,
Tuan-Hung Vu1,2,
Andrei Bursuc1,2,
Patrick Pérez1,2,
Raoul de Charette1
1 Inria, Paris, France.
2 valeo.ai, Paris, France.
TL;DR: PØDA (or PODA) is a simple feature augmentation method for zero-shot domain adaptation, guided by a single textual description of the target domain.
Project page: https://astra-vision.github.io/PODA/
Paper: https://arxiv.org/abs/2212.03241
```bibtex
@InProceedings{fahes2023poda,
  title={P{\O}DA: Prompt-driven Zero-shot Domain Adaptation},
  author={Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and P{\'e}rez, Patrick and de Charette, Raoul},
  booktitle={ICCV},
  year={2023}
}
```
Overview of PØDA
We propose Prompt-driven Instance Normalization (PIN) to augment feature styles based on the similarity between the features and the textual description of the target domain.
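The core operation can be sketched in a few lines of PyTorch. This is a hedged illustration of the PIN idea, not the repository's actual code:

```python
# Minimal sketch of Prompt-driven Instance Normalization (PIN):
# normalize a feature map with its own channel-wise statistics, then
# re-inject learnable target statistics. Illustration only.
import torch

def pin(f: torch.Tensor, mu_t: torch.Tensor, sigma_t: torch.Tensor,
        eps: float = 1e-5) -> torch.Tensor:
    # f: low-level features (B, C, H, W); mu_t, sigma_t: (B, C, 1, 1)
    mu = f.mean(dim=(2, 3), keepdim=True)
    sigma = f.std(dim=(2, 3), keepdim=True) + eps
    return sigma_t * (f - mu) / sigma + mu_t
```

The target statistics mu_t and sigma_t are optimized so that the CLIP embedding of the restyled features aligns with the embedding of the target-domain prompt; see the feature augmentation step below.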
Test on an unseen YouTube video of night driving:
Training dataset: Cityscapes
Prompt: "driving at night"
- News
- Installation
- Running PODA
- Inference & Visualization
- Qualitative Results
- PODA for Object Detection
- License
- Acknowledgement
- 28/10/2024: The extension of PØDA, Domain Adaptation with a Single Vision-Language Embedding, is on arXiv.
- 29/11/2023: Check out our recent CVPR 2024 work A Simple Recipe for Language-guided Domain Generalized Segmentation, which builds upon PIN to achieve state-of-the-art results in domain-generalized segmentation.
- 19/08/2023: The camera-ready version is on arXiv.
- 14/07/2023: PØDA is accepted at ICCV 2023.
First, create a new conda environment with the required packages:
```bash
conda env create --file environment.yml
```
Then activate the environment:
```bash
conda activate poda_env
```
- CITYSCAPES: Follow the instructions in Cityscapes to download the images and semantic segmentation ground truths. Please follow the dataset directory structure:
```
<CITYSCAPES_DIR>/            % Cityscapes dataset root
├── leftImg8bit/             % input images (leftImg8bit_trainvaltest.zip)
└── gtFine/                  % semantic segmentation labels (gtFine_trainvaltest.zip)
```
- ACDC: Download ACDC images and ground truths from ACDC. Please follow the dataset directory structure:
```
<ACDC_DIR>/                  % ACDC dataset root
├── rgb_anon/                % input images (rgb_anon_trainvaltest.zip)
└── gt/                      % semantic segmentation labels (gt_trainval.zip)
```
- GTA5: Download GTA5 images and ground truths from GTA5. Please follow the dataset directory structure (a small script to sanity-check these layouts is sketched after this list):
```
<GTA5_DIR>/                  % GTA5 dataset root
├── images/                  % input images
└── labels/                  % semantic segmentation labels
```
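As a hypothetical helper (not part of the repository), the expected layouts above can be verified before training:

```python
# Hypothetical sanity check for the dataset layouts described above.
from pathlib import Path

EXPECTED = {
    "cityscapes": ("leftImg8bit", "gtFine"),
    "acdc": ("rgb_anon", "gt"),
    "gta5": ("images", "labels"),
}

def check_layout(root: str, dataset: str) -> None:
    missing = [d for d in EXPECTED[dataset] if not (Path(root) / d).is_dir()]
    if missing:
        raise FileNotFoundError(f"{dataset}: missing {missing} under {root}")

# Example: check_layout("/data/cityscapes", "cityscapes")
```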
The source models are available here.
First, train the source model:
```bash
python3 main.py \
  --dataset <source_dataset> \
  --data_root <path_to_source_dataset> \
  --data_aug \
  --lr 0.1 \
  --crop_size 768 \
  --batch_size 2 \
  --freeze_BB \
  --ckpts_path saved_ckpts
```
Then, run prompt-driven feature augmentation (PIN) to optimize and save target-styled feature statistics:
```bash
python3 PIN_aug.py \
  --dataset <source_dataset> \
  --data_root <path_to_source_dataset> \
  --total_it 100 \
  --resize_feat \
  --domain_desc <target_domain_description> \
  --save_dir <directory_for_saved_statistics>
```
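For intuition, here is a hedged PyTorch sketch of what this step optimizes; `encode_rest` and `text_emb` are placeholders for the remaining CLIP image-encoder stages and the prompt embedding, not the repo's actual API:

```python
# Hedged sketch of the PIN optimization: learn target statistics
# (mu_t, sigma_t) so that restyled features, once embedded by CLIP,
# align with the target-domain prompt embedding.
import torch
import torch.nn.functional as F

def augment_statistics(f, text_emb, encode_rest, iters=100, lr=1.0):
    # f: frozen low-level CLIP features (B, C, H, W); text_emb: (1, D)
    mu = f.mean(dim=(2, 3), keepdim=True)
    sigma = f.std(dim=(2, 3), keepdim=True) + 1e-5
    mu_t = mu.clone().requires_grad_(True)        # init at source stats
    sigma_t = sigma.clone().requires_grad_(True)
    opt = torch.optim.SGD([mu_t, sigma_t], lr=lr)
    for _ in range(iters):
        restyled = sigma_t * (f - mu) / sigma + mu_t   # PIN (see overview)
        img_emb = encode_rest(restyled)                # (B, D)
        loss = (1 - F.cosine_similarity(img_emb, text_emb)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # the resulting statistics are what gets saved to --save_dir
    return mu_t.detach(), sigma_t.detach()
```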
Finally, adapt the model by fine-tuning on the augmented statistics:
```bash
python3 main.py \
  --dataset <source_dataset> \
  --data_root <path_to_source_dataset> \
  --ckpt <path_to_source_checkpoint> \
  --batch_size 8 \
  --lr 0.01 \
  --ckpts_path adapted \
  --freeze_BB \
  --train_aug \
  --total_itrs 2000 \
  --path_mu_sig <path_to_augmented_statistics>
```
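Conceptually, the adaptation step restyles each source feature map with saved target statistics before it reaches the segmentation head. A hedged sketch follows; the storage format and the random sampling shown here are assumptions, not the repo's exact implementation:

```python
# Hedged sketch: restyle frozen-backbone source features with a randomly
# drawn pair of saved target statistics before the segmentation head.
import random
import torch

def apply_saved_stats(f, saved_stats, eps=1e-5):
    # f: (B, C, H, W); saved_stats: list of (mu_t, sigma_t), each (1, C, 1, 1)
    mu_t, sigma_t = random.choice(saved_stats)   # assumption: one pair per image
    mu = f.mean(dim=(2, 3), keepdim=True)
    sigma = f.std(dim=(2, 3), keepdim=True) + eps
    return sigma_t * (f - mu) / sigma + mu_t
```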
To evaluate a trained model:
```bash
python3 main.py \
  --dataset <dataset_name> \
  --data_root <dataset_path> \
  --ckpt <path_to_tested_model> \
  --test_only \
  --val_batch_size 1 \
  --ACDC_sub <ACDC_subset_if_tested_on_ACDC>
```
To test any model on any image and visualize the output, please add the images to the predict_test directory and run:
```bash
python3 predict.py \
  --ckpt <ckpt_path> \
  --save_val_results_to <directory_for_saved_output_images>
```
PØDA for uncommon driving situations
Our feature augmentation is task-agnostic, as it operates at the feature-extractor level. We show some results of PØDA for object detection. The reported metric is mAP (%).
❗❗❗ [08/07/2024] For the Night-Clear and Day-Foggy results, we corrected the evaluation bug from the original paper, where the test split was mistakenly used instead of the train split for testing.
PØDA is released under the Apache 2.0 license.
The code heavily borrows from this implementation of DeepLabv3+ and uses code from CLIP.