segmentation fault in 'generate_with_fallback' when temperature != 0 #223

fquirin opened this issue May 10, 2023 · 4 comments

fquirin commented May 10, 2023

Hi @guillaumekln,

I've mentioned this bug here for the first time: #71 (comment)

A short summary:
Whisper (faster-whisper) is running on my Windows x86 CPU inside my SEPIA STT Server test environment in a "streaming" mode where I basically feed chunks of audio (numpy float32 arrays) to WhisperModel.transcribe every time the VAD system detects a reasonable speech sequence.
When I set temperature=0 everything is stable, but when I leave temperature at its default value I get a segmentation fault pretty reliably by making, for example, a coughing sound. For normal speech input it works fine.
The same error does not appear on my Linux aarch64 system.
Unfortunately, all attempts to reproduce the error with pre-recorded audio have failed so far.
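
For context, the feeding part looks roughly like this (heavily simplified sketch; the callback name and model size are placeholders, my actual server has more VAD and state handling):

```python
import numpy as np
from faster_whisper import WhisperModel

model = WhisperModel("tiny", device="cpu", compute_type="int8")

def on_speech_sequence(chunks):
    # called when the VAD system decides a speech sequence has ended;
    # chunks is a list of float32 numpy arrays (16 kHz mono)
    audio = np.concatenate(chunks).astype(np.float32)
    # temperature=0 -> stable; default temperature -> segfault on x86
    segments, info = model.transcribe(audio)
    return " ".join(segment.text for segment in segments)
```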

Some further investigation shows that the error happens inside generate_with_fallback while iterating over the temperature values and calling self.model.generate (ctranslate2.models.Whisper).
The exact sequence is:

  • final_temperature = 0.0
  • tokens = [50364, 4064, 0, 50414] (example)
  • text = "Ha!" (example)
  • final_temperature = 0.2
  • segmentation fault

If it crashes (with the right "cough" sound), it always crashes after final_temperature = 0.2.
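
My mental model of the fallback loop is roughly this (simplified sketch of how I understand generate_with_fallback, not the actual faster-whisper code; result_is_acceptable stands in for the compression-ratio / log-probability checks):

```python
temperatures = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]  # default temperature schedule

for final_temperature in temperatures:
    if final_temperature > 0:
        # sampling mode
        kwargs = dict(num_hypotheses=5, sampling_topk=0,
                      sampling_temperature=final_temperature)
    else:
        # beam search mode
        kwargs = dict(beam_size=5)

    # this is the call that segfaults for me at final_temperature = 0.2
    result = model.generate(encoder_output, [prompt], **kwargs)

    if result_is_acceptable(result):  # placeholder for the threshold checks
        break
```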

The parameters at this step are:

```
encoder_output = 0.0445115 0.0431825 -0.0302493 ... 0.264346 -0.422786 -0.118524
                 [cpu:0 float32 storage viewed as 1x1500x384]
[prompt] = [[50258, 50259, 50359]]
length_penalty = 1
max_length = 448
return_scores = True
return_no_speech_prob = True
suppress_blank = True
suppress_tokens = [-1]
max_initial_timestamp_index = 50
```
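
Mapped onto the ctranslate2 API, the failing call should be roughly equivalent to this (reconstructed from the dump above; the model path and the sampling keyword for this step are my assumption):

```python
import ctranslate2

model = ctranslate2.models.Whisper("tiny-ct2")  # placeholder model path

# encoder_output: the 1x1500x384 float32 StorageView from the dump above
# (i.e. the result of model.encode(...) for this audio chunk)
result = model.generate(
    encoder_output,
    [[50258, 50259, 50359]],        # prompt
    length_penalty=1,
    max_length=448,
    return_scores=True,
    return_no_speech_prob=True,
    suppress_blank=True,
    suppress_tokens=[-1],
    max_initial_timestamp_index=50,
    sampling_temperature=0.2,       # the fallback step where it segfaults
)
```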

This is as far as I could follow the error. The rest happens in CTranslate2's C++ code, I think.
Hope this helps to understand the issue 🤔.

Cu,
Florian

@guillaumekln
Contributor

Thanks for the info.

Is it possible to save and share the audio array causing the issue? You could use np.save("audio.npy", audio_array).
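
For example, right before the transcribe call:

```python
import numpy as np

# save the exact array that is passed to WhisperModel.transcribe
np.save("audio.npy", audio_array)

# later, to check whether the saved array alone reproduces the crash:
audio_array = np.load("audio.npy")
```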

fquirin commented May 14, 2023

Hi @guillaumekln, I did exactly that (used soundfile to export the exact same chunks to a WAV file), but ... I still can't reproduce the error.
The exported WAV generates the exact same encoder output and decoder result, then skips to temperature 0.2 ... and does NOT crash. When I load the WAV via the server, though, I can reproduce the crash consistently.
I'm not at my PC right now but will upload the test file here later.
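
For reference, the export was roughly this (simplified):

```python
import numpy as np
import soundfile as sf

# chunks: the same float32 arrays that were fed to transcribe
audio = np.concatenate(chunks)
sf.write("cough_seg_fault.wav", audio, 16000)  # 16 kHz mono
```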

The only conclusion I can draw from this is that there must be a conflict with my STT server, one that happens only on x86 CPUs and only when temperature is not 0 🤷‍♂️😵‍💫.
I wonder if there is some threading issue, but the weird thing is that I can run two inference processes in parallel without any problem.

Btw, I previously mentioned Windows, but it's actually Debian Linux via WSL2.
I'll try to run it on "native" Linux x86 as well.

fquirin commented May 15, 2023

I was able to reproduce the same error on a Fedora Linux system with x86 CPU.
Here is the WAV file I used: cough_seg_fault.zip

So to sum this up:

  • temperature > 0 works when I run the script directly (w/o server), for example: this
  • when I run the same function from inside my STT server (fastapi, uvicorn, websocket) I get the segmentation fault as soon as temperature != 0 (see the sketch after this list)
  • it can be reproduced on WSL2 Debian Linux and Fedora on two x86 CPUs
  • it cannot be reproduced on my aarch64 Debian Linux system
  • running 2 instances of Whisper in parallel on my x86 CPUs works (tried this to rule out some threading issues, but if there were any they should happen on aarch64 as well)
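
Minimal sketch of the server-side setup where it crashes (heavily simplified; endpoint, port and message framing are placeholders, not my actual SEPIA code):

```python
import numpy as np
import uvicorn
from fastapi import FastAPI, WebSocket
from faster_whisper import WhisperModel

app = FastAPI()
model = WhisperModel("tiny", device="cpu", compute_type="int8")

@app.websocket("/stt")
async def stt(ws: WebSocket):
    await ws.accept()
    while True:
        # one message = raw float32 PCM for one VAD-detected speech sequence
        data = await ws.receive_bytes()
        audio = np.frombuffer(data, dtype=np.float32)
        # with the default temperature schedule this segfaults on x86
        # as soon as it falls back to 0.2; with temperature=0 it is stable
        segments, info = model.transcribe(audio)
        await ws.send_text(" ".join(segment.text for segment in segments))

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```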

@alvin-chalon

The same error appears on my system: Ubuntu on Windows WSL2.
