# SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects
Abhinav Kumar1 ·
Yuliang Guo2 ·
Xinyu Huang2 ·
Liu Ren2 ·
Xiaoming Liu1
1Michigan State University, 2Bosch Research North America, Bosch Center for AI
in CVPR 2024
Monocular 3D detectors achieve remarkable performance on cars and smaller objects. However, their performance drops on larger objects, leading to fatal accidents. Some attribute the failures to training data scarcity or the receptive field requirements of large objects. In this paper, we highlight this understudied problem of generalization to large objects. We find that modern frontal detectors struggle to generalize to large objects even on nearly balanced datasets. We argue that the cause of failure is the sensitivity of depth regression losses to noise of larger objects. To bridge this gap, we comprehensively investigate regression and dice losses, examining their robustness under varying error levels and object sizes. We mathematically prove that the dice loss leads to superior noise-robustness and model convergence for large objects compared to regression losses for a simplified case. Leveraging our theoretical insights, we propose SeaBird (Segmentation in Bird's View) as the first step towards generalizing to large objects. SeaBird effectively integrates BEV segmentation on foreground objects for 3D detection, with the segmentation head trained with the dice loss. SeaBird achieves SoTA results on the KITTI-360 leaderboard and improves existing detectors on the nuScenes leaderboard, particularly for large objects.
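SeaBird's key ingredient is training the BEV segmentation head with the dice loss rather than a regression loss. For reference, below is a minimal soft dice loss sketch in PyTorch; the tensor shapes, number of classes, and smoothing constant `eps` are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal soft dice loss sketch for BEV foreground segmentation (PyTorch).
# Shapes and eps are illustrative assumptions; see the training code in this
# repo for the exact formulation used by SeaBird.
import torch

def dice_loss(pred, target, eps=1e-6):
    """pred: sigmoid probabilities (B, C, H, W); target: binary masks, same shape."""
    pred = pred.flatten(2)      # (B, C, H*W)
    target = target.flatten(2)  # (B, C, H*W)
    inter = (pred * target).sum(-1)
    denom = pred.sum(-1) + target.sum(-1)
    dice = (2 * inter + eps) / (denom + eps)
    return 1.0 - dice.mean()

# Example: 3 foreground classes on a 104x104 BEV grid (sizes are arbitrary).
pred = torch.rand(2, 3, 104, 104)
target = (torch.rand(2, 3, 104, 104) > 0.5).float()
print(dice_loss(pred, target))
```

Because the overlap term is normalized by the total predicted and ground-truth areas, the loss and its gradients do not grow with object size the way unnormalized regression losses do, which is the intuition behind the noise-robustness result for large objects.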
Much of the codebase is based on PanopticBEV. Some implementations are from BBAVectors and our DEVIANT.
If you find our work useful in your research, please consider starring the repo and citing:
```bibtex
@inproceedings{kumar2024seabird,
  title={{SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular $3$D Detection of Large Objects}},
  author={Kumar, Abhinav and Guo, Yuliang and Huang, Xinyu and Ren, Liu and Liu, Xiaoming},
  booktitle={CVPR},
  year={2024}
}
```
### Requirements

- Python 3.7
- PyTorch 1.11
- Torchvision 0.12
- CUDA 11.3
- Ubuntu 20.04

This setup is tested with an NVIDIA RTX6000 (48 GB) GPU. Other platforms have not been tested. Clone the repo first. Unless otherwise stated, the scripts and instructions below assume the working directory is `SeaBird/PanopticBEV`:

```bash
git clone https://github.com/abhi1kumar/SeaBird.git
cd SeaBird/PanopticBEV
```
### Cuda & Python

Create a python conda environment and activate it:

```bash
conda create -n panoptic python=3.7 -y
conda activate panoptic
conda install -c conda-forge ipython -y
```

Point to CUDA:

```bash
source cuda_11.3_env
```

For MSU HPCC, use the command

```bash
module load CUDA/11.0.2 cuDNN/8.0.4.30-CUDA-11.0.2
```

instead of `source cuda_11.3_env`.

Install the python dependencies using the `requirements.txt` file:

```bash
pip install -r requirements.txt
```

Compile the PanopticBEV code with CUDA:

```bash
python3 setup.py develop
```

Compile the DOTA devkit for polygon NMS:

```bash
cd panoptic_bev/data/DOTA_devkit
swig -c++ -python polyiou.i
python setup.py build_ext --inplace
cd ../../..
```
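For intuition about what the compiled devkit provides, here is a pure-Python sketch of polygon NMS using `shapely` (an extra dependency assumed only for this illustration); the corner-list box format and the IoU threshold are hypothetical, and the repo uses the compiled C++ `polyiou` extension instead.

```python
# Illustrative polygon NMS in pure Python using shapely; the repo itself uses
# the compiled C++ polyiou extension built above. Box format (lists of (x, y)
# corners) and the IoU threshold are assumptions for this sketch.
from shapely.geometry import Polygon

def poly_iou(p, q):
    """IoU of two polygons, each given as a list of (x, y) corners."""
    a, b = Polygon(p), Polygon(q)
    inter = a.intersection(b).area
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0

def poly_nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS over oriented boxes; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if poly_iou(boxes[i], boxes[j]) < iou_thresh]
    return keep
```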
### KITTI-360 Data

1. Download the KITTI-360 and KITTI-360 PanopticBEV datasets.
2. Download the processed KITTI-360 `train_val` and dummy `testing` labels, and extract them.
3. Arrange the datasets as follows:
```
SeaBird/PanopticBEV
├── data
│   └── kitti_360
│       ├── ImageSets
│       ├── KITTI-360
│       │   ├── calibration
│       │   ├── data_2d_raw
│       │   ├── data_2d_semantics
│       │   ├── data_3d_boxes
│       │   └── data_poses
│       ├── kitti360_panopticbev
│       │   ├── bev_msk
│       │   ├── class_weights
│       │   ├── front_msk_seam
│       │   ├── front_msk_trainid
│       │   ├── img
│       │   ├── split
│       │   ├── metadata_front.bin
│       │   └── metadata_ortho.bin
│       ├── train_val
│       │   ├── calib
│       │   ├── label
│       │   └── label_dota
│       └── testing
│           ├── calib
│           ├── label
│           └── label_dota
│       ...
```
Copy the `metadata_ortho.bin` file:

```bash
cp data/kitti_360/kitti360_panopticbev/metadata_ortho.bin data/kitti_360/
```

Next, extract the BEV segmentation labels and link the corresponding images:

```bash
python data/kitti_360/generate_BEV_semantic_seg_GT.py
python data/kitti_360/setup_split.py
```
You should see the following structure, with 61056 samples in each sub-folder of the `train_val` split and 910 samples in each sub-folder of the `testing` split:
```
SeaBird/PanopticBEV
├── data
│   └── kitti_360
│       ├── ImageSets
│       ├── train_val
│       │   ├── calib
│       │   ├── image
│       │   ├── label
│       │   ├── label_dota
│       │   ├── panop
│       │   └── seman
│       ├── testing
│       │   ├── calib
│       │   ├── image
│       │   ├── label
│       │   ├── label_dota
│       │   ├── panop
│       │   └── seman
│       └── metadata_ortho.bin
```
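Optionally, the sample counts can be verified with a short script. This is not part of the repo; it is a minimal sketch assuming the layout above and should be run from `SeaBird/PanopticBEV`.

```python
# Hypothetical sanity check (not part of the repo): verify 61056 train_val and
# 910 testing samples per sub-folder, following the directory tree above.
import os

EXPECTED = {"train_val": 61056, "testing": 910}
SUBFOLDERS = ("calib", "image", "label", "label_dota", "panop", "seman")

for split, expected in EXPECTED.items():
    for sub in SUBFOLDERS:
        folder = os.path.join("data", "kitti_360", split, sub)
        count = len(os.listdir(folder))
        flag = "OK" if count == expected else f"MISMATCH (expected {expected})"
        print(f"{folder}: {count} -> {flag}")
```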
Train the model:

```bash
chmod +x scripts_training.sh
bash scripts_training.sh
```

The configuration parameters of the model are in the `experiments/` folder, where you can modify them.
We provide logs, models, and predictions for the main experiments on the KITTI-360 Val and KITTI-360 Test data splits, available to download here.
In the table, (50) and (25) denote AP at 50% and 25% IoU overlap respectively, while the Seg columns report BEV segmentation IoU.

| Data Split | Method | Config (Run) | Weight / Pred | Metrics | Lrg (50) | Car (50) | Mean (50) | Lrg (25) | Car (25) | Mean (25) | Lrg Seg | Car Seg | Mean Seg |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KITTI-360 Val | Stage 1 | seabird_val_stage1 | gdrive | IoU | - | - | - | - | - | - | 23.83 | 48.54 | 36.18 |
| KITTI-360 Val | PBEV+SeaBird | seabird_val | gdrive | AP | 13.22 | 42.46 | 27.84 | 37.15 | 52.53 | 44.84 | 24.30 | 48.04 | 36.17 |
| KITTI-360 Test | PBEV+SeaBird | seabird_test | gdrive | AP | - | - | 4.64 | - | - | 37.12 | - | - | - |
Make an `output` folder in the `SeaBird/PanopticBEV` directory:

```bash
mkdir output
```

Place the models in the `output` folder as follows:
```
SeaBird/PanopticBEV
├── output
│   ├── pbev_seabird_kitti360_val_stage1
│   │   └── saved_models
│   │       └── model_19.pth
│   │
│   ├── pbev_seabird_kitti360_val
│   │   └── saved_models
│   │       └── model_19.pth
│   │
│   └── pbev_seabird_kitti360_test
│       └── saved_models
│           └── model_9.pth
```
To test, execute the following commands:

```bash
chmod +x scripts_inference.sh
bash scripts_inference.sh
```
To get qualitative plots and visualize the predicted and GT boxes, type:

```bash
python plot/plot_qualitative_output.py --dataset kitti_360 --folder output/pbev_seabird_kitti360_val/results_test/data
```

The above script visualizes boxes in the frontal and BEV views. To visualize BEV segmentation results together with the predicted and GT boxes, run `scripts_inference.sh` with the `--save_seg` flag, and then run the above plotting command.
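For a rough idea of what such an overlay involves, here is an illustrative matplotlib sketch; the mask and corner-list box formats are assumptions, and `plot/plot_qualitative_output.py` remains the supported path.

```python
# Illustration only: overlay a BEV segmentation mask with predicted (red) and
# GT (green) box corners. Mask/box formats here are assumptions; use the repo's
# plot/plot_qualitative_output.py for the real visualization.
import numpy as np
import matplotlib.pyplot as plt

def show_bev(mask, pred_boxes, gt_boxes):
    """mask: (H, W) class-id array; *_boxes: lists of (4, 2) BEV corner arrays."""
    plt.imshow(mask, cmap="tab20")
    for corners, style in [(b, "r-") for b in pred_boxes] + [(b, "g-") for b in gt_boxes]:
        poly = np.vstack([corners, corners[:1]])  # close the polygon
        plt.plot(poly[:, 0], poly[:, 1], style, linewidth=1)
    plt.axis("off")
    plt.show()
```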
Type the following to reproduce our other plots:

```bash
python plot/plot_teaser_histogram.py
python plot/plot_convergence_analysis.py
python plot/plot_lengthwise_analysis.py
python plot/plot_category_wise_stats.py
```
We thank the authors of the following awesome codebases: PanopticBEV, BBAVectors, and DEVIANT. Please also consider citing them.
We welcome contributions to the SeaBird repo. Feel free to raise a pull request.
SeaBird and BBAVectors code are under the MIT license. The PanopticBEV code is under the GPLv3 license for academic usage. For commercial usage of PanopticBEV, please contact Nikhil Gosala.
For questions, feel free to post here or drop an email to this address: [email protected]