OPUS files and ground truth for sampling rate #157

hagenw · 2024-12-17T11:09:54Z

The GigaSpeech dataset contains audio in opus files stored at 16,000 Hz.

I attached the file POD0000002525.opus to this issue.

>>> import audiofile
>>> file = "POD0000002525.opus"
>>> audiofile.sampling_rate(file)
16000
>>> audiofile.duration(file, sloppy=True)
536.144

But when reading the file in the usual way, the sampling rate no longer matches:

>>> signal, sampling_rate = audiofile.read(file)
>>> signal.shape
(25734790,)
>>> sampling_rate
48000

This also affects audiofile.samples() and audiofile.duration(), as both use audiofile.convert_to_wav(), which relies on audiofile.read() to get the ground truth for the sampling rate:

>>> audiofile.samples(file)
25734790
>>> audiofile.duration(file)
1608.424375

This could be fixed by relying on audiofile.sampling_rate() inside audiofile.convert_to_wav(), but maybe we should also check why the opus file behaves like this.
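As a quick sanity check, the numbers reported above are consistent with a sample count at 48000 Hz being combined with the 16000 Hz rate from the file header. The following sketch only redoes that arithmetic with the values from the session above; the variable names are made up for illustration, and it does not show how audiofile computes the duration internally.

```python
# Values copied from the interactive session above.
samples_read = 25734790  # signal.shape[0] returned by audiofile.read()
rate_read = 48000        # sampling rate returned by audiofile.read()
rate_header = 16000      # value returned by audiofile.sampling_rate()

# Sample count and rate from the same source: roughly 536.14 s,
# close to the result of audiofile.duration(file, sloppy=True).
duration_consistent = samples_read / rate_read

# 48000 Hz sample count divided by the 16000 Hz header rate:
# 1608.424375 s, the value reported by audiofile.duration(file).
duration_mixed = samples_read / rate_header

print(duration_consistent)
print(duration_mixed)
```

The mixed result is three times too long, which matches the ratio 48000 / 16000.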

POD0000002525.zip

hagenw · 2024-12-17T14:29:35Z

ffmpeg always returns opus files with a sampling rate of 48000 Hz unless told otherwise. This is known and marked as not to be fixed, see https://trac.ffmpeg.org/ticket/5240. This means that, in order to fix it, we need to provide ffmpeg with the desired sampling rate when converting an opus file to wav.
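A minimal sketch of that idea, assuming the conversion is done by calling the ffmpeg command line tool through subprocess: the `-ar` output option sets the sampling rate of the written file. The helper names `build_ffmpeg_command` and `convert_opus_to_wav` are hypothetical and are not part of audiofile; the actual fix in the library may look different.

```python
import subprocess


def build_ffmpeg_command(infile, outfile, sampling_rate):
    """Hypothetical helper: ffmpeg call that forces the output sampling rate."""
    return [
        "ffmpeg",
        "-y",                       # overwrite outfile if it already exists
        "-i", infile,               # input file, e.g. an opus file
        "-ar", str(sampling_rate),  # desired output sampling rate in Hz
        outfile,
    ]


def convert_opus_to_wav(infile, outfile, sampling_rate):
    """Hypothetical helper: run the conversion, raise if ffmpeg fails."""
    cmd = build_ffmpeg_command(infile, outfile, sampling_rate)
    subprocess.run(cmd, check=True, capture_output=True)
```

For the attached file, the rate could be taken from the header, for example `convert_opus_to_wav(file, "out.wav", audiofile.sampling_rate(file))`, so that the resulting wav file is written at 16000 Hz instead of ffmpeg's default of 48000 Hz.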

hagenw added the bug (Something isn't working) label Dec 17, 2024

hagenw linked a pull request Dec 17, 2024 that will close this issue

Fix sampling rate for reading opus files #158

Open
