
assert l1 == l2 or l1 == 0, f"Inconsistent number of segments: whisper_segments ({l1}) != timestamped_word_segments ({l2})" #205

Open
JoLiu-ai opened this issue Aug 13, 2024 · 2 comments

Comments

@JoLiu-ai

JoLiu-ai commented Aug 13, 2024

I've run into this problem several times; what can I do to fix it? Thanks.
Perhaps we should implement a feature to temporarily save transcribed files, so we can double-check the results and make sure previous work isn't lost.


WARNING:whisper_timestamped:Inconsistent number of segments: whisper_segments (339) != timestamped_word_segments (340)
Traceback (most recent call last):
  File "/usr/local/bin/whisper_timestamped", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/dist-packages/whisper_timestamped/transcribe.py", line 3097, in cli
    result = transcribe_timestamped(
  File "/usr/local/lib/python3.10/dist-packages/whisper_timestamped/transcribe.py", line 296, in transcribe_timestamped
    (transcription, words) = _transcribe_timestamped_efficient(model, audio,
  File "/usr/local/lib/python3.10/dist-packages/whisper_timestamped/transcribe.py", line 920, in _transcribe_timestamped_efficient
    assert l1 == l2 or l1 == 0, f"Inconsistent number of segments: whisper_segments ({l1}) != timestamped_word_segments ({l2})"
AssertionError: Inconsistent number of segments: whisper_segments (339) != timestamped_word_segments (340)
@KillerX

KillerX commented Aug 27, 2024

I just started seeing this. Did you by any chance recently start using a different Whisper model?

@Jeronymous
Member

Jeronymous commented Aug 27, 2024

There is an open discussion on this: #79 (reply in thread)

It seems to be a corner case that happens when the Whisper model predicts a transcript consisting only of special language tokens, up to the maximum token length (e.g. <|0.00|><|de|><|de|><|de|><|de|><|de|>...).

I am just waiting for a quick way to reproduce this corner case, so that I can fix it safely.
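Purely as an illustration of the corner case described above, a segment whose text contains nothing but `<|...|>` special tokens can be detected by stripping those markers and checking whether anything remains. The helper below is hypothetical and not part of whisper_timestamped:

```python
import re

# Matches one Whisper-style special token, e.g. <|0.00|> or <|de|>.
SPECIAL_TOKEN = re.compile(r"<\|[^|]*\|>")

def is_special_only(segment_text: str) -> bool:
    """True if the text is only special tokens and whitespace,
    e.g. '<|0.00|><|de|><|de|>...'."""
    return SPECIAL_TOKEN.sub("", segment_text).strip() == ""
```

Such a check could, in principle, flag the degenerate transcripts before segment counts are compared, though the proper fix belongs upstream once the case is reproducible.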
