lzyhha/AODRaw-mmdetection

Towards RAW Object Detection in Diverse Conditions

Paper link

Dataset link

Introduction

Existing object detection methods typically take sRGB input, which is compressed from RAW data by an image signal processor (ISP) originally designed for visualization. However, such compression can discard information that is crucial for detection, especially under complex light and weather conditions. We introduce the AODRaw dataset, which offers 7,785 high-resolution real RAW images with 135,601 annotated instances spanning 62 categories, capturing a broad range of indoor and outdoor scenes under 9 distinct light and weather conditions. Because AODRaw supports both RAW and sRGB object detection, we use it to provide a comprehensive benchmark for evaluating current detection methods. We find that sRGB pre-training constrains the potential of RAW object detection because of the domain gap between sRGB and RAW, which prompts us to pre-train directly in the RAW domain. However, camera noise makes it harder for RAW pre-training to learn rich representations than for sRGB pre-training. To assist RAW pre-training, we distill knowledge from an off-the-shelf model pre-trained in the sRGB domain. As a result, we achieve substantial improvements under diverse and adverse conditions without relying on extra pre-processing modules.

Dataset

Please refer to AODRaw to download and preprocess our AODRaw dataset.

Install

Please refer to the README of mmdetection.

Training and Evaluation

Configs and pre-trained weights

  • We provide training and evaluation for RAW and sRGB object detection.
  • The images in AODRaw are recorded at a resolution of $6000\times 4000$. Feeding such large images directly into the detectors is impractical, so we adopt two experimental settings: 1) down-sampling the images to a lower resolution of $2000\times 1333$, corresponding to the configs under configs/aodraw, and 2) slicing the images into a collection of $1280\times 1280$ patches, corresponding to the configs under configs/aodraw_slice. Please preprocess the AODRaw dataset yourself, or directly download the processed files, as described in the dataset and downloading instructions.
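The arithmetic behind the two settings can be sketched as follows. This is a minimal illustration, not the repository's preprocessing script: the helper names `downsampled_size` and `slice_origins` are hypothetical, the factor of 3 is inferred from the stated $6000\times 4000$ and $2000\times 1333$ sizes, and the non-overlapping stride of 1280 is an assumption, since the stride or overlap actually used for slicing is not stated here.

```python
def downsampled_size(width, height, factor=3):
    """Integer size after shrinking both sides by `factor` (assumed to be 3)."""
    return width // factor, height // factor


def slice_origins(size, patch=1280, stride=1280):
    """Top-left offsets of patches along one image axis.

    `stride` is an assumed value. The last patch is shifted back so that
    it ends exactly at the image border instead of running past it.
    """
    if size <= patch:
        return [0]
    origins = list(range(0, size - patch + 1, stride))
    if origins[-1] != size - patch:
        origins.append(size - patch)
    return origins


full_w, full_h = 6000, 4000
print(downsampled_size(full_w, full_h))  # (2000, 1333)

xs = slice_origins(full_w)  # offsets along the width
ys = slice_origins(full_h)  # offsets along the height
print(len(xs) * len(ys))    # 20 patches under the assumed stride
```

With a smaller stride the patches would overlap and the patch count would grow, so the count printed here holds only for the assumed stride.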

Training and evaluation using down-sampled AODRaw:

| Task | Pre-training domain | Config path |
| --- | --- | --- |
| sRGB object detection | sRGB | configs/aodraw/..._aodraw_srgb.py |
| RAW object detection | sRGB | configs/aodraw/..._aodraw_raw.py |
| RAW object detection | RAW | configs/aodraw/..._aodraw_raw_raw-pretraining.py |

Training and evaluation using sliced AODRaw:

| Task | Pre-training domain | Config path |
| --- | --- | --- |
| sRGB object detection | sRGB | configs/aodraw_slice/..._aodraw_srgb_slice.py |
| RAW object detection | sRGB | configs/aodraw_slice/..._aodraw_raw_slice.py |
| RAW object detection | RAW | configs/aodraw_slice/..._aodraw_raw_slice_raw-pretraining.py |

Pre-trained weights for ConvNeXt-T and Swin-T:

| Architecture | Pre-training domain | Download link |
| --- | --- | --- |
| ConvNeXt-T | sRGB | Google and Baidu |
| ConvNeXt-T | RAW | Google and Baidu |
| Swin-T | RAW | Google and Baidu |

Training

Single GPU
python tools/train.py ${CONFIG_FILE} [optional arguments]
Multi GPU
bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

For more training and evaluation command details, please refer to mmdetection.

Evaluation

Single GPU
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
Multi GPU
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [optional arguments]

For more training and evaluation command details, please refer to mmdetection.
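When launching many runs, the training and evaluation commands above can be assembled programmatically. The sketch below only mirrors the argument order shown in this README; `build_command` is a hypothetical helper, not part of mmdetection, and the paths passed to it are placeholders.

```python
import shlex


def build_command(mode, config, checkpoint=None, gpus=1, extra=()):
    """Return the argument list for a training or evaluation run.

    mode: "train" or "test". A checkpoint is required for "test".
    gpus: with more than one GPU the distributed shell scripts are used.
    """
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    if mode == "test" and checkpoint is None:
        raise ValueError("evaluation needs a checkpoint file")

    if gpus > 1:
        argv = ["bash", f"tools/dist_{mode}.sh", config]
    else:
        argv = ["python", f"tools/{mode}.py", config]
    if mode == "test":
        argv.append(checkpoint)  # checkpoint follows the config
    if gpus > 1:
        argv.append(str(gpus))   # GPU count comes last before options
    return argv + list(extra)


# Placeholder paths: substitute a real config and checkpoint.
print(shlex.join(build_command("train", "CONFIG_FILE", gpus=4)))
print(shlex.join(build_command("test", "CONFIG_FILE", "CHECKPOINT_FILE")))
```

The returned list can be passed to `subprocess.run` from the repository root; any optional arguments go in `extra`.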

ModelZoo

Models using down-sampled AODRaw

Please follow the down-sampling instructions to preprocess the images, or download the preprocessed images directly.

| Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Faster RCNN | ResNet-50 | sRGB | sRGB | 23.3 | Config | Google and Baidu | - |
| RetinaNet | ResNet-50 | sRGB | sRGB | 19.1 | Config | Google and Baidu | - |
| GFocal | ResNet-50 | sRGB | sRGB | 24.2 | Config | Google and Baidu | - |
| Sparse RCNN | ResNet-50 | sRGB | sRGB | 15.6 | Config | Google and Baidu | - |
| Deformable DETR | ResNet-50 | sRGB | sRGB | 16.6 | Config | Google and Baidu | - |
| Cascade RCNN | ResNet-50 | sRGB | sRGB | 25.6 | Config | Google and Baidu | - |
| Faster RCNN | Swin-T | sRGB | sRGB | 28.4 | Config | Google and Baidu | - |
| Faster RCNN | ConvNeXt-T | sRGB | sRGB | 29.7 | Config | Google and Baidu | Google and Baidu |
| GFocal | Swin-T | sRGB | sRGB | 30.1 | Config | Google and Baidu | - |
| GFocal | ConvNeXt-T | sRGB | sRGB | 32.1 | Config | Google and Baidu | Google and Baidu |
| Cascade RCNN | Swin-T | sRGB | sRGB | 32.0 | Config | Google and Baidu | - |
| Cascade RCNN | ConvNeXt-T | sRGB | sRGB | 34.0 | Config | Google and Baidu | Google and Baidu |

The directory images_downsampled_srgb is required for the above experiments.

| Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GFocal | Swin-T | sRGB | RAW | 29.9 | Config | Google and Baidu | - |
| GFocal | ConvNeXt-T | sRGB | RAW | 31.5 | Config | Google and Baidu | Google and Baidu |
| Cascade RCNN | Swin-T | sRGB | RAW | 31.7 | Config | Google and Baidu | - |
| Cascade RCNN | ConvNeXt-T | sRGB | RAW | 33.7 | Config | Google and Baidu | Google and Baidu |

The directory images_downsampled_raw is required for the above experiments.

| Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GFocal | Swin-T | RAW | RAW | 30.7 | Config | Google and Baidu | Google and Baidu |
| GFocal | ConvNeXt-T | RAW | RAW | 32.1 | Config | Google and Baidu | Google and Baidu |
| Cascade RCNN | Swin-T | RAW | RAW | 32.2 | Config | Google and Baidu | Google and Baidu |
| Cascade RCNN | ConvNeXt-T | RAW | RAW | 34.8 | Config | Google and Baidu | Google and Baidu |

The directory images_downsampled_raw is required for the above experiments.
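To make the effect of the pre-training domain easier to read off, the short script below compares the AP values for RAW fine-tuning on down-sampled AODRaw under sRGB and RAW pre-training. The numbers are copied from the two RAW fine-tuning tables above; the variable names are illustrative only.

```python
# AP with RAW fine-tuning on down-sampled AODRaw, keyed by (detector, backbone).
ap_srgb_pretrain = {
    ("GFocal", "Swin-T"): 29.9,
    ("GFocal", "ConvNeXt-T"): 31.5,
    ("Cascade RCNN", "Swin-T"): 31.7,
    ("Cascade RCNN", "ConvNeXt-T"): 33.7,
}
ap_raw_pretrain = {
    ("GFocal", "Swin-T"): 30.7,
    ("GFocal", "ConvNeXt-T"): 32.1,
    ("Cascade RCNN", "Swin-T"): 32.2,
    ("Cascade RCNN", "ConvNeXt-T"): 34.8,
}

# Gain from switching the pre-training domain from sRGB to RAW,
# rounded to one decimal to match the precision of the tables.
gains = {
    key: round(ap_raw_pretrain[key] - ap_srgb_pretrain[key], 1)
    for key in ap_srgb_pretrain
}

for (detector, backbone), gain in gains.items():
    print(f"{detector} + {backbone}: {gain:+.1f} AP")
```

In every listed pairing, RAW pre-training gives a higher AP than sRGB pre-training, with gains between 0.5 and 1.1 AP.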

Models using sliced AODRaw

Please follow the slicing instructions to preprocess the images, or download the preprocessed images directly.

| Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cascade RCNN | Swin-T | sRGB | RAW | 29.2 | Config | Google and Baidu | - |
| Cascade RCNN | ConvNeXt-T | sRGB | RAW | 29.7 | Config | Google and Baidu | Google and Baidu |

The directory images_slice_raw is required for the above experiments.

| Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cascade RCNN | Swin-T | RAW | RAW | 29.8 | Config | Google and Baidu | - |
| Cascade RCNN | ConvNeXt-T | RAW | RAW | 30.7 | Config | Google and Baidu | Google and Baidu |

The directory images_slice_raw is required for the above experiments.

Citation

@article{li2024aodraw,
  title={Towards RAW Object Detection in Diverse Conditions}, 
  author={Zhong-Yu Li and Xin Jin and Boyuan Sun and Chun-Le Guo and Ming-Ming Cheng},
  journal={arXiv preprint arXiv:2411.15678},
  year={2024},
}

License

The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for NonCommercial use only.

Acknowledgement

This repo is modified from mmdetection.
