Images lost when updating dataset created from a cvat task #1423

Open
nik123 opened this issue Apr 9, 2024 · 0 comments

I have two CVAT tasks with overlapping, but not identical, sets of images. When I use the update method to merge the two datasets, some images appear to be lost.

Here is a code snippet to demonstrate the issue:

import os
import shutil

import datumaro as dm

ds1 = dm.Dataset.import_from("dataset1", "cvat")
print(f"Size of ds1: {len(ds1)}")

ds2 = dm.Dataset.import_from("dataset2", "cvat")
print(f"Size of ds2: {len(ds2)}")

ds3 = ds1.update(ds2)
print(f"Size of ds3 (before export): {len(ds3)}")

if os.path.exists("dataset3"):
    # Remove any previous export so the experiment starts clean
    shutil.rmtree("dataset3")
ds3.export("dataset3", "cvat", save_media=True)

# Import dataset again:
ds3 = dm.Dataset.import_from("dataset3", "cvat")
print(f"Size of ds3 (after import): {len(ds3)}")

For the code above I get the following output:

Size of ds1: 3
Size of ds2: 2
Size of ds3 (before export): 3
Size of ds3 (after import): 2
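
To see which items disappear rather than only how many, the item ids can be compared before the export and after the re-import. The following is a minimal sketch, not part of the original report: the helper name `missing_after_roundtrip` and the sample ids are hypothetical. With Datumaro, the two lists could presumably be collected as `[item.id for item in ds3]` before the export and again after the re-import.

```python
def missing_after_roundtrip(ids_before, ids_after):
    """Return the item ids present before export but absent after re-import."""
    return sorted(set(ids_before) - set(ids_after))


# Hypothetical ids used only for illustration; real ids would come from
# the dataset items before export and after re-import.
before = ["img_a", "img_b", "img_c"]
after = ["img_a", "img_c"]
print(missing_after_roundtrip(before, after))
```

Note that the ids have to be collected before `ds3` is reassigned by the second import.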

The issue appears to be caused by the id attribute of the image tag inside the CVAT annotations.xml file. If the same filename has a different id in the two datasets, it seems the corresponding image is lost. I've managed to work around the issue by manually overriding the frame attribute of each item in ds3 before the export:

for idx, item in enumerate(ds3):
    item.attributes["frame"] = idx
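
To check whether two tasks actually disagree on image ids, the two annotations.xml files can be compared directly. This is a rough sketch using only the standard library; the function names and the trimmed XML snippets are hypothetical, and it assumes each file contains `image` elements with `id` and `name` attributes, as in the CVAT for images export.

```python
import xml.etree.ElementTree as ET


def image_ids(xml_text):
    """Map image name -> id for every <image> element in the given XML text."""
    root = ET.fromstring(xml_text)
    return {img.get("name"): int(img.get("id")) for img in root.iter("image")}


def conflicting_ids(xml_a, xml_b):
    """Return {name: (id_a, id_b)} for names whose id differs between the files."""
    ids_a, ids_b = image_ids(xml_a), image_ids(xml_b)
    return {
        name: (ids_a[name], ids_b[name])
        for name in sorted(ids_a.keys() & ids_b.keys())
        if ids_a[name] != ids_b[name]
    }


# Hypothetical, heavily trimmed annotations.xml contents for two tasks.
TASK1 = """<annotations>
  <image id="0" name="a.jpg" width="10" height="10"/>
  <image id="1" name="b.jpg" width="10" height="10"/>
  <image id="2" name="c.jpg" width="10" height="10"/>
</annotations>"""
TASK2 = """<annotations>
  <image id="0" name="b.jpg" width="10" height="10"/>
  <image id="1" name="c.jpg" width="10" height="10"/>
</annotations>"""

print(conflicting_ids(TASK1, TASK2))
```

For real data, the XML text would be read from the annotations.xml of each exported task. Any filename reported here is a candidate for the collision described above.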

P.S. I've created a separate repository with the full code and data to reproduce the issue.
