Remove extra label column #3014
Labels:
blocked-by-upstream: The issue must be fixed in a dependency
bug: Something isn't working
P2: Nice to have

In the example dataset https://huggingface.co/datasets/datasets-examples/doc-audio-4, we have an "unexpected" label column with only `null` values.

I think it's due to a "collision" between the heuristics that define splits and/or classes based on the directories. There is a `drop_labels=True` option in the datasets library, if it helps.

Ideally, in this case, we should have two splits (train and test), and no additional `label` column.

I think the issue also exists with image datasets.
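
To make the expected behaviour concrete, here is a minimal sketch of directory-based inference in which a directory that names a split is not also treated as a class. It is not the datasets library's actual code: `SPLIT_NAMES`, `infer_split_and_label`, and `build_rows` are hypothetical names, and the file paths are made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical set of split keywords; the real heuristics may use a different list.
SPLIT_NAMES = {"train", "test", "validation"}


def infer_split_and_label(path):
    """Return (split, label) for a file path relative to the dataset root."""
    dirs = PurePosixPath(path).parts[:-1]  # directory components only
    split = dirs[0] if dirs and dirs[0] in SPLIT_NAMES else None
    # Only directories below the split directory may encode a class.
    rest = dirs[1:] if split else dirs
    label = rest[-1] if rest else None
    return split, label


def build_rows(paths):
    """Build one row per file, omitting `label` when no file has a class directory."""
    rows = [(p, *infer_split_and_label(p)) for p in paths]
    keep_label = any(label is not None for _, _, label in rows)
    return [
        {"file": p, "split": s, **({"label": l} if keep_label else {})}
        for p, s, l in rows
    ]


# Layout like the one described in this issue: split directories, no class directories.
rows = build_rows(["train/a.wav", "train/b.wav", "test/c.wav"])

# Layout where class directories sit inside a split directory.
labeled_rows = build_rows(["train/cat/a.wav", "train/dog/b.wav"])
```

With the first layout, the rows carry a `split` but no `label` key at all, rather than a `label` column filled with nulls. With the second, the `label` column is kept because class directories exist.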