Using WaveformToFbankConverter with variable sample rates is impossible #341

Open
avidale opened this issue Feb 19, 2024 · 0 comments

Labels
bug Something isn't working

avidale (Contributor) commented Feb 19, 2024

Describe the bug:
The converter seems to lock onto the first sample rate that is fed into it, and refuses to convert audio with any other sample rate.

Describe how to reproduce:

import torch
from fairseq2.data.audio import WaveformToFbankConverter

# Because the two converters are initialized identically, I expect them to behave identically
converter1 = WaveformToFbankConverter()
converter2 = WaveformToFbankConverter()

# Define two equivalent audio inputs; the second is the first, downsampled by 3x.
input1 = {
    "waveform": torch.randn([2, 90_000]),
    "sample_rate": 48000,
    "format": -1,
}
input2 = {
    "waveform": input1['waveform'][:, ::3],
    "sample_rate": 16000,
    "format": -1,
}

converted1_1 = converter1(input1)
converted2_2 = converter2(input2)
# the above conversions work fine, just as expected

# expect the same output as converted2_2
converted1_2 = converter1(input2) 
# ValueError: The input waveform must have a sample rate of 48000, but has a sample rate of 16000 instead.

# expect the same output as converted1_1
converted2_1 = converter2(input1) 
# ValueError: The input waveform must have a sample rate of 16000, but has a sample rate of 48000 instead.
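One way to live with the current behavior (a purely illustrative workaround, not part of the report or of fairseq2; `PerRateConverter` and `FakeConverter` are hypothetical names) is to keep one converter instance per observed sample rate, so that each instance only ever sees the rate it was locked to:

```python
# Hypothetical workaround sketch: cache one converter per sample rate.
# FakeConverter stands in for WaveformToFbankConverter to keep the
# example self-contained; it mimics the lock-to-first-rate behavior.

class FakeConverter:
    def __init__(self):
        self._rate = None  # locked on first call, like the real converter

    def __call__(self, data):
        rate = data["sample_rate"]
        if self._rate is None:
            self._rate = rate
        elif self._rate != rate:
            raise ValueError(
                f"The input waveform must have a sample rate of {self._rate}, "
                f"but has a sample rate of {rate} instead."
            )
        return {"fbank": data["waveform"], "sample_rate": rate}


class PerRateConverter:
    """Dispatch each input to a dedicated converter for its sample rate."""

    def __init__(self, factory=FakeConverter):
        self._factory = factory
        self._converters = {}

    def __call__(self, data):
        rate = data["sample_rate"]
        if rate not in self._converters:
            self._converters[rate] = self._factory()
        return self._converters[rate](data)


convert = PerRateConverter()
out48 = convert({"waveform": [0.0] * 4, "sample_rate": 48000})
out16 = convert({"waveform": [0.0] * 2, "sample_rate": 16000})
```

With the real converter, `factory` would be a lambda constructing `WaveformToFbankConverter` with the desired options; the trade-off is one converter object per distinct rate.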

Describe the expected behavior:
This implicit dependence on the first input is unexpected; more appropriate behavior would be either to specify the desired sample rate explicitly when initializing the converter, or to support inputs with any sample rate.
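Another way around the issue (again purely illustrative, not a fairseq2 API) is to normalize every input to a single target rate before it reaches the converter. The crude stride-based decimation below mirrors the `[:, ::3]` trick from the repro and is only valid when the source rate is an integer multiple of the target rate; a real pipeline would use a proper resampler:

```python
def naive_resample(waveform, src_rate, dst_rate):
    # Crude decimation: keep every (src_rate // dst_rate)-th sample.
    # Only valid when src_rate is an integer multiple of dst_rate
    # (e.g. 48000 -> 16000); no anti-aliasing filter is applied.
    if src_rate % dst_rate != 0:
        raise ValueError("source rate must be an integer multiple of target rate")
    step = src_rate // dst_rate
    return [channel[::step] for channel in waveform]


wave = [[float(i) for i in range(12)]]  # 1 channel, 12 samples at 48 kHz
resampled = naive_resample(wave, 48000, 16000)
# every input now shares sample_rate=16000, so a single converter suffices
```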

Environment:
Python 3.8, fairseq2==0.2.0, PyTorch 2.1.1+cu118. But I believe the bug is independent of these versions.

