This is the code for our papers:
- Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression
- Enhancing Geometric Factors into Model Learning and Inference for Object Detection and Instance Segmentation
@Inproceedings{zheng2020diou,
author = {Zheng, Zhaohui and Wang, Ping and Liu, Wei and Li, Jinze and Ye, Rongguang and Ren, Dongwei},
title = {Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression},
booktitle = {The AAAI Conference on Artificial Intelligence (AAAI)},
year = {2020},
}
@Article{zheng2021ciou,
author = {Zheng, Zhaohui and Wang, Ping and Ren, Dongwei and Liu, Wei and Ye, Rongguang and Hu, Qinghua and Zuo, Wangmeng},
title = {Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation},
booktitle = {IEEE Transactions on Cybernetics},
year = {2021},
}
Distance-IoU Loss into other SOTA detection methods can be found here.
This repository is a fork of generalized-iou/Detectron.pytorch and roytseng-tw/Detectron.pytorch, with an implementation of IoU, GIoU, DIoU and CIoU losses while keeping the code as close to the original as possible. It is also possible to train the network with SmoothL1 loss as in the original code. See the options below.
The loss can be chosen with the MODEL.LOSS_TYPE
option in the configuration file. The valid options are currently: [iou|giou|diou|ciou|sl1]
. At this moment, we apply bounding box loss only on final bounding box refinement layer, just as in the paper.
MODEL:
LOSS_TYPE: 'diou'
Please take a look at compute_iou
function of lib/utils/net.py for our DIoU and CIoU loss implementation in PyTorch.
We also implement a normalizer of final bounding box refinement loss. This can be specified with the MODEL.LOSS_BBOX_WEIGHT
parameter in the configuration file. The default value is 1.0
. We use MODEL.LOSS_BBOX_WEIGHT
of 12.
for the four experiments.
MODEL:
LOSS_BBOX_WEIGHT: 12.
Of course, as we observed that for dense anchor algorithms, increasing the LOSS_BBOX_WEIGHT
appropriately can improve the performance, and the same argument is obtained in GHM (AAAI 2019)and Libra R-CNN (CVPR 2019). So, if you want to get a higher AP, just increasing it. But this may also cause unstable training, because this is equivalent to increase the learning rate.
NMS can be chosen with the TEST.DIOU_NMS
option in the lib/core/config.py
file. If set it to False
, it means using greedy-NMS.
Besides that, we also found that for Faster R-CNN, we introduce beta1 for DIoU-NMS, that is DIoU = IoU - R_DIoU ^ {beta1}. With this operation, DIoU-NMS can perform better than default beta1=1.0
. In our constrained search, the following values appear to work well for the DIoU-NMS in Faster R-CNN. Of course, the default beta1=1.0
is good enough.
TEST.DIOU_NMS.BETA1=0.9
We add sample configuration files used for our experiment in config/baselines
. Our experiments in the paper are based on e2e_faster_rcnn_R-50-FPN_1x.yaml
as following:
e2e_faster_rcnn_R-50-FPN_diou_1x.yaml # Faster R-CNN + DIoU loss
e2e_faster_rcnn_R-50-FPN_ciou_1x.yaml # Faster R-CNN + CIoU loss
git clone https://github.com/Zzh-tju/DIoU-pytorch-detectron.git
Tested under python3.
- python packages
- pytorch>=0.3.1
- torchvision>=0.2.0
- cython
- matplotlib
- numpy
- scipy
- opencv
- pyyaml
- packaging
- pycocotools — for COCO dataset, also available from pip.
- tensorboardX — for logging the losses in Tensorboard
- An NVIDAI GPU and CUDA 8.0 or higher. Some operations only have gpu implementation.
- NOTICE: different versions of Pytorch package have different memory usages.
Compile the CUDA code:
cd lib # please change to this directory
sh make.sh
Create a data folder under the repo,
cd {repo_root}
mkdir data
-
COCO: Download the coco images and annotations from coco website.
And make sure to put the files as the following structure:
coco ├── annotations | ├── instances_minival2014.json │ ├── instances_train2014.json │ ├── instances_train2017.json │ ├── instances_val2014.json │ ├── instances_val2017.json │ ├── instances_valminusminival2014.json │ ├── ... | └── images ├── train2014 ├── train2017 ├── val2014 ├──val2017 ├── ...
Download coco mini annotations from here. Please note that minival is exactly equivalent to the recently defined 2017 val set. Similarly, the union of valminusminival and the 2014 train is exactly equivalent to the 2017 train set.
Feel free to put the dataset at any place you want, and then soft link the dataset under the
data/
folder:ln -s path/to/coco data/coco
Recommend to put the images on a SSD for possible better training performance
For detailed installation instruction and network training options, please take a look at the README file or issue of roytseng-tw/Detectron.pytorch. Following is a sample command we used for training and testing Faster R-CNN with DIoU and CIoU.
python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_faster_rcnn_R-50-FPN_ciou_1x.yaml --use_tfboard
python tools/test_net.py --dataset coco2017 --cfg configs/baselines/e2e_faster_rcnn_R-50-FPN_ciou_1x.yaml --load_ckpt {full_path_of_the_trained_weight}
The following example is the evaluation of our CIoU loss:
INFO json_dataset_evaluator.py: 232: ~~~~ Summary metrics ~~~~
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.387
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.586
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.420
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.213
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.418
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.515
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.320
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.502
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.527
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.325
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.560
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.680
INFO json_dataset_evaluator.py: 199: Wrote json eval results to: test/detection_results.pkl
INFO task_evaluation.py: 61: Evaluating bounding boxes is done!
INFO task_evaluation.py: 180: copypaste: Dataset: coco_2017_val
INFO task_evaluation.py: 182: copypaste: Task: box
INFO task_evaluation.py: 185: copypaste: AP,AP50,AP75,APs,APm,APl
INFO task_evaluation.py: 186: copypaste: 0.3865,0.5856,0.4196,0.2132,0.4183,0.5151
If you want to resume training from a specific iteration's weight file, please run:
python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_faster_rcnn_R-50-FPN_ciou_1x.yaml --resume --use_tfboard --load_ckpt {full_path_of_the_trained_weight}
Here are the trained models using the configurations in this repository.