TFDS object detection to coco format #1813

davidslater · 2022-12-06T19:22:08Z

The object detection datasets are not in standard COCO format. We should change them to output standard COCO format.
This file is a good model:
https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/object_detection/coco.py

See here for the spec (under 1. Object Detection):
https://cocodataset.org/#format-data

We will want a format similar to:

'objects': tfds.features.Sequence({
    'id': np.int64,  # coco has unique ID for each detection. We may omit this field, I think.
    'image_id': np.int64,
    'area': np.int64,
    'bbox': tfds.features.BBoxFeature(),
    'label': tfds.features.ClassLabel(num_classes=80),  # this maps down to the set of categories actually used
    'iscrowd': np.bool_,
}),

This requires ensuring that each dict describes exactly one detection.

One challenge is that the internal TFDS format does not necessarily correspond to COCO format. For instance, BBoxFeature uses normalized floats [ymin, xmin, ymax, xmax] instead of COCO's unnormalized floats [x, y, width, height].

Currently, some datasets use different field names (e.g., xview uses boxes instead of bbox; the carla datasets do too, I think).
Also, some datasets (xview again) have a dict of arrays instead of a sequence of dicts.
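
To illustrate the two mismatches above (field naming and dict-of-arrays layout), here is a minimal sketch of a normalizing step. The function name `to_detection_list`, the `rename` mapping, and the `labels` key in the sample record are assumptions for illustration, not existing Armory or xview code.

```python
from typing import Any, Dict, List, Optional


def to_detection_list(
    columns: Dict[str, list], rename: Optional[Dict[str, str]] = None
) -> List[Dict[str, Any]]:
    """Turn a dict of equal-length arrays into a list with one dict per detection."""
    rename = rename or {}
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError(f"fields have mismatched lengths: {sorted(lengths)}")
    n = lengths.pop() if lengths else 0
    # Build one dict per index, renaming keys (e.g. 'boxes' -> 'bbox') on the way.
    return [
        {rename.get(key, key): values[i] for key, values in columns.items()}
        for i in range(n)
    ]


# Hypothetical xview-style record: parallel arrays under 'boxes' and 'labels'.
record = {"boxes": [[0.1, 0.2, 0.3, 0.4], [0.5, 0.5, 0.9, 0.8]], "labels": [3, 7]}
detections = to_detection_list(record, rename={"boxes": "bbox", "labels": "label"})
print(detections[0])  # {'bbox': [0.1, 0.2, 0.3, 0.4], 'label': 3}
```

Raising on mismatched lengths is a design choice here: silently truncating would hide malformed records.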

We should be very explicit about what we expect / allow for use in Armory, and be consistent.
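
One way to make the expectation explicit would be a small check that runs over each detection. The sketch below assumes the field names from the proposed 'objects' feature above; `check_detection` and `REQUIRED_FIELDS` are hypothetical names, and treating 'id' as optional follows the note that it may be omitted.

```python
from typing import Any, Dict, List

# Field names taken from the proposed 'objects' feature; 'id' is left optional
# because the proposal suggests it may be omitted.
REQUIRED_FIELDS = ("image_id", "area", "bbox", "label", "iscrowd")


def check_detection(det: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one detection dict (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in det]
    bbox = det.get("bbox")
    if bbox is not None and len(bbox) != 4:
        problems.append(f"bbox must have 4 values, got {len(bbox)}")
    return problems


# A detection that uses 'boxes' instead of 'bbox' is flagged.
print(check_detection({"image_id": 1, "area": 10, "boxes": [0, 0, 1, 1], "label": 2, "iscrowd": False}))
```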

lcadalzo · 2022-12-06T20:32:34Z

From discussion with @davidslater: the intention is to keep building the raw datasets in the TF format of normalized [y1, x1, y2, x2]. In preprocessing, instead of converting to the PyTorch format of unnormalized [x1, y1, x2, y2] (which is what we do currently), we'll convert to the COCO format of unnormalized [x1, y1, width, height]. This should enable use of pycocotools.
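
The preprocessing conversion described here could look roughly like the following. This is a sketch only: `tfds_bbox_to_coco` is a hypothetical helper name, and it assumes the image height and width in pixels are available at preprocessing time.

```python
from typing import List, Sequence


def tfds_bbox_to_coco(bbox: Sequence[float], height: int, width: int) -> List[float]:
    """Convert normalized [y1, x1, y2, x2] to unnormalized COCO [x1, y1, width, height]."""
    y1, x1, y2, x2 = bbox
    return [
        x1 * width,          # left edge in pixels
        y1 * height,         # top edge in pixels
        (x2 - x1) * width,   # box width in pixels
        (y2 - y1) * height,  # box height in pixels
    ]


# A box covering the right half of an image 100 pixels high and 200 pixels wide.
print(tfds_bbox_to_coco([0.0, 0.5, 1.0, 1.0], height=100, width=200))
# [100.0, 0.0, 100.0, 100.0]
```

Note that the x and y components swap position as well as being rescaled, which is easy to miss when porting from the current [x1, y1, x2, y2] conversion.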

davidslater added the datasets label Dec 6, 2022

lcadalzo added this to the TFDS v4 Integration and Data Framework Update milestone Dec 6, 2022
