Existing object detection methods often take sRGB images as input, which are compressed from RAW data by an image signal processor (ISP) originally designed for visualization. However, such compression might lose information that is crucial for detection, especially under complex lighting and weather conditions. We introduce the AODRaw dataset, which offers 7,785 high-resolution real RAW images with 135,601 annotated instances spanning 62 categories, capturing a broad range of indoor and outdoor scenes under 9 distinct lighting and weather conditions. Based on AODRaw, which supports both RAW and sRGB object detection, we provide a comprehensive benchmark for evaluating current detection methods. We find that sRGB pre-training constrains the potential of RAW object detection due to the domain gap between sRGB and RAW, prompting us to pre-train directly on the RAW domain. However, it is harder for RAW pre-training to learn rich representations than for sRGB pre-training because of camera noise. To assist RAW pre-training, we distill knowledge from an off-the-shelf model pre-trained on the sRGB domain. As a result, we achieve substantial improvements under diverse and adverse conditions without relying on extra pre-processing modules.
Please refer to AODRaw to download and preprocess our AODRaw dataset.
For installation, please refer to the README of mmdetection.
- We provide training and evaluation for RAW and sRGB object detection.
- The images in AODRaw are recorded at a resolution of $6000\times 4000$. It is impractical to feed such large images into the detectors directly. Thus, we adopt two experiment settings: 1) down-sampling the images to a lower resolution of $2000\times 1333$, corresponding to the configs under configs/aodraw/, and 2) slicing the images into a collection of $1280\times 1280$ patches, corresponding to the configs under configs/aodraw_slice/. Please preprocess the AODRaw dataset or directly download the processed files in datasets and downloading.
Training and evaluation using down-sampled AODRaw:
Task | Pre-training domain | Config path |
---|---|---|
sRGB object detection | sRGB | configs/aodraw/..._aodraw_srgb.py |
RAW object detection | sRGB | configs/aodraw/..._aodraw_raw.py |
RAW object detection | RAW | configs/aodraw/..._aodraw_raw_raw-pretraining.py |
Training and evaluation using sliced AODRaw:
Task | Pre-training domain | Config path |
---|---|---|
sRGB object detection | sRGB | configs/aodraw_slice/..._aodraw_srgb_slice.py |
RAW object detection | sRGB | configs/aodraw_slice/..._aodraw_raw_slice.py |
RAW object detection | RAW | configs/aodraw_slice/..._aodraw_raw_slice_raw-pretraining.py |
Pre-trained weights for ConvNeXt-T and Swin-T:
Architecture | Pre-training domain | Downloading link |
---|---|---|
ConvNeXt-T | sRGB | Google and Baidu |
ConvNeXt-T | RAW | Google and Baidu |
Swin-T | RAW | Google and Baidu |
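For readers who want to plug a downloaded backbone checkpoint into their own config, a config fragment might look like the following. It relies on the `init_cfg=dict(type='Pretrained', ...)` convention used by mmdetection backbones; the local path is hypothetical, and the configs shipped with this repository may already set the checkpoint, in which case no change is needed.

```python
# Hypothetical local path to one of the downloaded pre-trained weight files.
checkpoint = "path/to/downloaded_backbone_weights.pth"

# Fragment of an mmdetection-style config: only the backbone initialization
# is overridden, everything else would come from the base config.
model = dict(
    backbone=dict(
        init_cfg=dict(type="Pretrained", checkpoint=checkpoint),
    ),
)
```

Whether the checkpoint keys match the backbone directly depends on how the weights were saved, so this should be treated as a starting point rather than a verified recipe.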
python tools/train.py ${CONFIG_FILE} [optional arguments]
bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
For more training and evaluation command details, please refer to mmdetection.
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [optional arguments]
For more training and evaluation command details, please refer to mmdetection.
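When several configs need to be trained or evaluated in sequence, the commands above can be assembled programmatically. The sketch below only builds the argument lists shown above. The function name `build_command` and the example paths are hypothetical, and the choice to switch to the distributed script when more than one GPU is requested is an assumption of this sketch.

```python
from typing import List, Optional, Sequence


def build_command(mode: str, config: str, checkpoint: Optional[str] = None,
                  gpus: int = 1, extra: Sequence[str] = ()) -> List[str]:
    """Build the argv list for tools/{train,test}.py or tools/dist_{train,test}.sh."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    if mode == "test" and checkpoint is None:
        raise ValueError("testing requires a checkpoint file")

    # Positional arguments: CONFIG_FILE, plus CHECKPOINT_FILE when testing.
    args = [config] + ([checkpoint] if mode == "test" else [])
    if gpus > 1:
        # Distributed launcher: GPU_NUM follows the positional arguments.
        cmd = ["bash", f"tools/dist_{mode}.sh", *args, str(gpus)]
    else:
        cmd = ["python", f"tools/{mode}.py", *args]
    return cmd + list(extra)


if __name__ == "__main__":
    # Placeholder paths; substitute a real config and checkpoint.
    print(" ".join(build_command("train", "path/to/config.py", gpus=4)))
    print(" ".join(build_command("test", "path/to/config.py", "path/to/model.pth")))
    # To execute: subprocess.run(cmd, check=True) from the repository root.
```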
Please follow downsampling to preprocess the images or download preprocessed images in download.
Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
---|---|---|---|---|---|---|---|
Faster RCNN | ResNet-50 | sRGB | sRGB | 23.3 | Config | Google and Baidu | - |
RetinaNet | ResNet-50 | sRGB | sRGB | 19.1 | Config | Google and Baidu | - |
GFocal | ResNet-50 | sRGB | sRGB | 24.2 | Config | Google and Baidu | - |
Sparse RCNN | ResNet-50 | sRGB | sRGB | 15.6 | Config | Google and Baidu | - |
Deformable DETR | ResNet-50 | sRGB | sRGB | 16.6 | Config | Google and Baidu | - |
Cascade RCNN | ResNet-50 | sRGB | sRGB | 25.6 | Config | Google and Baidu | - |
Faster RCNN | Swin-T | sRGB | sRGB | 28.4 | Config | Google and Baidu | - |
Faster RCNN | ConvNeXt-T | sRGB | sRGB | 29.7 | Config | Google and Baidu | Google and Baidu |
GFocal | Swin-T | sRGB | sRGB | 30.1 | Config | Google and Baidu | - |
GFocal | ConvNeXt-T | sRGB | sRGB | 32.1 | Config | Google and Baidu | Google and Baidu |
Cascade RCNN | Swin-T | sRGB | sRGB | 32.0 | Config | Google and Baidu | - |
Cascade RCNN | ConvNeXt-T | sRGB | sRGB | 34.0 | Config | Google and Baidu | Google and Baidu |
The directory images_downsampled_srgb is required for the above experiments.
Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
---|---|---|---|---|---|---|---|
GFocal | Swin-T | sRGB | RAW | 29.9 | Config | Google and Baidu | - |
GFocal | ConvNeXt-T | sRGB | RAW | 31.5 | Config | Google and Baidu | Google and Baidu |
Cascade RCNN | Swin-T | sRGB | RAW | 31.7 | Config | Google and Baidu | - |
Cascade RCNN | ConvNeXt-T | sRGB | RAW | 33.7 | Config | Google and Baidu | Google and Baidu |
The directory images_downsampled_raw is required for the above experiments.
Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
---|---|---|---|---|---|---|---|
GFocal | Swin-T | RAW | RAW | 30.7 | Config | Google and Baidu | Google and Baidu |
GFocal | ConvNeXt-T | RAW | RAW | 32.1 | Config | Google and Baidu | Google and Baidu |
Cascade RCNN | Swin-T | RAW | RAW | 32.2 | Config | Google and Baidu | Google and Baidu |
Cascade RCNN | ConvNeXt-T | RAW | RAW | 34.8 | Config | Google and Baidu | Google and Baidu |
The directory images_downsampled_raw is required for the above experiments.
Please follow slicing to preprocess the images or download preprocessed images in download.
Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
---|---|---|---|---|---|---|---|
Cascade RCNN | Swin-T | sRGB | RAW | 29.2 | Config | Google and Baidu | - |
Cascade RCNN | ConvNeXt-T | sRGB | RAW | 29.7 | Config | Google and Baidu | Google and Baidu |
The directory images_slice_raw is required for the above experiments.
Detector | Backbone | Pre-training domain | Fine-tuning domain | AP | Config | Model | Pre-trained weights |
---|---|---|---|---|---|---|---|
Cascade RCNN | Swin-T | RAW | RAW | 29.8 | Config | Google and Baidu | - |
Cascade RCNN | ConvNeXt-T | RAW | RAW | 30.7 | Config | Google and Baidu | Google and Baidu |
The directory images_slice_raw is required for the above experiments.
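The effect of RAW pre-training on RAW object detection can be read off the tables above by pairing rows that differ only in the pre-training domain. The short script below does this arithmetic. The AP values are copied from the tables, while the dictionary layout and the function name `ap_gains` are only illustrative.

```python
# (setting, detector, backbone) -> (AP with sRGB pre-training, AP with RAW pre-training)
# All models are fine-tuned on RAW; values are copied from the tables above.
results = {
    ("down-sampled", "GFocal", "Swin-T"): (29.9, 30.7),
    ("down-sampled", "GFocal", "ConvNeXt-T"): (31.5, 32.1),
    ("down-sampled", "Cascade RCNN", "Swin-T"): (31.7, 32.2),
    ("down-sampled", "Cascade RCNN", "ConvNeXt-T"): (33.7, 34.8),
    ("sliced", "Cascade RCNN", "Swin-T"): (29.2, 29.8),
    ("sliced", "Cascade RCNN", "ConvNeXt-T"): (29.7, 30.7),
}


def ap_gains(table):
    """AP difference (RAW pre-training minus sRGB pre-training), rounded to 0.1."""
    return {key: round(raw - srgb, 1) for key, (srgb, raw) in table.items()}


if __name__ == "__main__":
    for (setting, detector, backbone), gain in ap_gains(results).items():
        print(f"{setting:12s} {detector:13s} {backbone:11s} {gain:+.1f} AP")
```

In every listed pairing, RAW pre-training yields a higher AP than sRGB pre-training, with gains between 0.5 and 1.1 AP.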
@article{li2024aodraw,
title={Towards RAW Object Detection in Diverse Conditions},
author={Zhong-Yu Li and Xin Jin and Boyuan Sun and Chun-Le Guo and Ming-Ming Cheng},
journal={arXiv preprint arXiv:2411.15678},
year={2024},
}
The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for NonCommercial use only.
This repo is modified from mmdetection.