Describe the bug:
The converter seems to stick to the first sample rate that is fed into it, and refuses to convert audio with any other sample rate.
Describe how to reproduce:
```python
import torch

from fairseq2.data.audio import WaveformToFbankConverter

# Because the two converters are initialized identically, I expect them to behave identically.
converter1 = WaveformToFbankConverter()
converter2 = WaveformToFbankConverter()

# Define two equivalent audios; the second is the first, downsampled.
input1 = {
    "waveform": torch.randn([2, 90_000]),
    "sample_rate": 48000,
    "format": -1,
}
input2 = {
    "waveform": input1["waveform"][:, ::3],
    "sample_rate": 16000,
    "format": -1,
}

# The following two conversions work fine, just as expected.
converted1_1 = converter1(input1)
converted2_2 = converter2(input2)

# Expect the same output as converted2_2:
converted1_2 = converter1(input2)
# ValueError: The input waveform must have a sample rate of 48000, but has a sample rate of 16000 instead.

# Expect the same output as converted1_1:
converted2_1 = converter2(input1)
# ValueError: The input waveform must have a sample rate of 16000, but has a sample rate of 48000 instead.
```
Describe the expected behavior:
This implicit dependence on the first input is unexpected; a more appropriate behavior would be either to require the desired sample rate explicitly when initializing the converter, or to support inputs with any sample rate.
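Until one of those options exists, a possible workaround is a thin wrapper that lazily keeps one converter per sample rate, so each rate only ever sees the rate it was first fed. This is only a sketch; the `PerRateConverter` name is made up, and the factory is assumed to be any zero-argument callable such as `WaveformToFbankConverter` itself.

```python
from typing import Any, Callable, Dict


class PerRateConverter:
    """Hypothetical wrapper: lazily creates one converter per sample rate.

    `make_converter` is any zero-argument factory, e.g. fairseq2's
    `WaveformToFbankConverter`.
    """

    def __init__(self, make_converter: Callable[[], Any]) -> None:
        self._make_converter = make_converter
        self._converters: Dict[Any, Any] = {}

    def __call__(self, data: dict) -> Any:
        rate = data["sample_rate"]
        # Each sample rate gets its own converter instance, so the
        # first-input lock-in described above never mixes rates.
        if rate not in self._converters:
            self._converters[rate] = self._make_converter()
        return self._converters[rate](data)
```

With this, `converter = PerRateConverter(WaveformToFbankConverter)` accepts both `input1` and `input2` from the repro above, at the cost of holding one underlying converter per distinct rate.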
Environment:
I am using Python 3.8, fairseq2==0.2.0, and PyTorch 2.1.1+cu118, but I believe this is irrelevant.