Crash in subtitle generation - IndexError: list index out of range #282

nebehr · 2024-07-17T12:17:52Z

Program (r192.3.4) crashes at the end of execution, before generating a subtitle file, on some videos with the tiny model, but usually exits correctly with other models on the same video (it may not be directly related to the model used, only to whether its output has some offending attribute).

I think this is different from the crashes that may happen at the end of processing, which were also reported in the original faster-whisper.

Traceback (most recent call last):
  File "D:\whisper-fast\_XXL\__main__.py", line 1668, in <module>
  File "D:\whisper-fast\_XXL\__main__.py", line 1652, in cli
  File "D:\whisper-fast\_XXL\__main__.py", line 320, in __call__
  File "D:\whisper-fast\_XXL\__main__.py", line 859, in write_result
  File "D:\whisper-fast\_XXL\__main__.py", line 802, in iterate_result_alt
  File "D:\whisper-fast\_XXL\__main__.py", line 785, in iterate_subtitles_alt
IndexError: list index out of range
[11796] Failed to execute script '__main__' due to unhandled exception!


Purfview · 2024-07-17T13:16:46Z

Does that happen when used with "--highlight_words"?
Can you repeatedly reproduce it on some file?

Share the whole command used.

nebehr · 2024-07-17T14:33:28Z

No, the only command args I use are --model, --language and file name. And yes, it is consistently reproducible on the file I use.

faster-whisper-xxl.exe --model tiny -l is 101.avi

In fact, I see that some characters in the console output look like question marks (copied here as 很 or ル), which obviously do not occur in the audio and cannot occur in the selected language. Perhaps they break something during output into file?
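One way to check this hunch would be to scan the transcript for characters that do not belong to the expected script. Below is a minimal sketch, assuming the transcript is available as a list of segment dicts with a `text` key (that layout is an assumption, not something confirmed in this thread). `find_foreign_chars` and `SUSPECT_PREFIXES` are hypothetical names made up for illustration.

```python
import unicodedata

# Unicode name prefixes that should not appear in Icelandic text (illustrative list).
SUSPECT_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "HANGUL")


def find_foreign_chars(segments):
    """Return (segment_index, char, unicode_name) for each suspect character.

    `segments` is assumed to be a list of dicts with a "text" key.
    """
    hits = []
    for i, seg in enumerate(segments):
        for ch in seg.get("text", ""):
            # Unnamed code points (e.g. control characters) yield "" and are skipped.
            name = unicodedata.name(ch, "")
            if name.startswith(SUSPECT_PREFIXES):
                hits.append((i, ch, name))
    return hits


if __name__ == "__main__":
    sample = [{"text": "Góðan daginn"}, {"text": "takk 很 ル"}]
    for idx, ch, name in find_foreign_chars(sample):
        print(idx, repr(ch), name)
```

If such characters show up in the segment that precedes the crash, that would support the idea; if they also appear in files that convert fine, it probably would not.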

Purfview · 2024-07-17T14:42:52Z

Can you share the json file produced with --output_format json?

nebehr · 2024-07-17T15:01:46Z

101.json

This time it crashed AFTER producing the output and "Operation finished in:" ... line. Apparently this last crash is a case of SYSTRAN/faster-whisper#71 or something similar, but seems to be unrelated to this issue.

Purfview · 2024-07-17T15:03:32Z

Can you share the message of this new crash?

nebehr · 2024-07-17T15:16:06Z

There is no message in the console, it's just a standard Windows popup saying that "program has stopped working". For each of these new crashes Windows Event Viewer contains pairs of error messages like these:

Faulting application name: faster-whisper-xxl.exe, version: 192.3.4.0, time stamp: 0x6626da66
Faulting module name: KERNELBASE.dll, version: 10.0.17763.6054, time stamp: 0xc9a93043
Exception code: 0xe06d7363
Fault offset: 0x0000000000041b39

Faulting application name: faster-whisper-xxl.exe, version: 192.3.4.0, time stamp: 0x6626da66
Faulting module name: ucrtbase.dll, version: 10.0.17763.1490, time stamp: 0x48ac8393
Exception code: 0xc0000409
Fault offset: 0x000000000006e77e

Note that by the time it happens everything is already done and the program is exiting, and at no point does it max out memory. For this reason this new crash is not so bad, just inconvenient.
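For reference, both exception codes are generic and do not identify the faulting component on their own. 0xE06D7363 is the code the Microsoft C++ runtime uses for a thrown C++ exception (its low three bytes spell "msc"), and 0xC0000409 is `STATUS_STACK_BUFFER_OVERRUN`, which is also raised for fast-fail aborts. The sketch below only splits the codes into their fields; the `KNOWN` table and `describe` helper are illustrative names, not part of any tool mentioned here.

```python
# Exception codes copied from the Event Viewer entries above.
CODES = (0xE06D7363, 0xC0000409)

# Short meanings for the two codes; this table is illustrative, not exhaustive.
KNOWN = {
    0xE06D7363: "C++ exception thrown by the Microsoft C++ runtime",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN (also raised by fast-fail aborts)",
}


def describe(code):
    """Split a 32-bit exception code into its NTSTATUS-style fields."""
    return {
        "hex": f"0x{code:08X}",
        "severity": code >> 30,                      # top two bits; 3 means "error"
        "customer_defined": bool(code & (1 << 29)),  # bit 29: code defined by software, not the OS
        "low_bytes": (code & 0xFFFFFF).to_bytes(3, "big"),
        "meaning": KNOWN.get(code, "unknown"),
    }


if __name__ == "__main__":
    for code in CODES:
        print(describe(code))
```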

Purfview · 2024-07-17T15:44:02Z

> IndexError: list index out of range

Can reproduce it with faster-whisper-xxl.exe 101.json command, I'll investigate it later.

> This time it crashed AFTER producing the output and "Operation finished in:" ... line. Apparently this last crash is a case of SYSTRAN/faster-whisper#71 or something similar, but seems to be unrelated to this issue.

There is "beep" sound code after "Operation finished in:" ... line.
Could you try --beep_off? Do you get this crash only on this file or on all files?

nebehr · 2024-07-17T16:13:57Z

By default this second crash comes after the beep. With --beep_off it just happens in silence. The crash is reproducible with many other files, and with larger models. I have not found the pattern yet. I am running it with CUDA 12.5, not sure if it is related.

qscwdv65 · 2024-08-01T06:09:54Z

I encountered a similar error message on Ubuntu 22.04 using Faster-Whisper-XXL_r192.3.1_linux.
This is the command I used and the output:

mis@ai-ai:~/下載/Faster-Whisper-XXL_r192.3.1_linux/Whisper-Faster-XXL$ sudo ./whisper-faster-xxl "2024-08-01 09-32-20.mkv" --language Chinese --initial_prompt "這是一段主要是繁體中文(台灣)的影片：" --model large-v2
[sudo] password for mis:

Standalone Faster-Whisper-XXL r192.3.1 running on: CUDA

Starting work on: 2024-08-01 09-32-20.mkv

[00:00.520 --> 00:02.800] 但是其實呢
[00:03.560 --> 00:04.520] 然後呢
(skip......)
[01:32:05.960 --> 01:32:06.540] 好
[01:32:06.540 --> 01:32:06.940] 拜拜

Transcription speed: 36.67 audio seconds/s

Traceback (most recent call last):
  File "__main__.py", line 1633, in <module>
  File "__main__.py", line 1617, in cli
  File "__main__.py", line 310, in __call__
  File "__main__.py", line 849, in write_result
  File "__main__.py", line 792, in iterate_result_alt
  File "__main__.py", line 775, in iterate_subtitles_alt
IndexError: list index out of range
[34535] Failed to execute script '__main__' due to unhandled exception!

Additional information
I was able to successfully generate an SRT file without errors using the same command but with a different, shorter (2-minute) MP4 file.

ClaireCJS · 2024-10-31T15:24:41Z

I've been randomly getting these too.

I think one was reproducible, but a power failure made me lose track of it.

I'll keep my eye out

Pasted post from another thread:

I'm wondering why I get these errors when I run whisper-faster-xxl.exe

Particularly since I don't have a `d:\whisper-fast\_XXL` folder

They happen... for certain songs (1 out of 10-15), but not for others.

I can't say the exact cause, and I also can't fathom why it would be referencing a folder that doesn't exist on my D: drive ...

Transcription speed: 6.66 audio seconds/s

Traceback (most recent call last):
  File "D:\whisper-fast\_XXL\__main__.py", line 1668, in <module>
  File "D:\whisper-fast\_XXL\__main__.py", line 1652, in cli
  File "D:\whisper-fast\_XXL\__main__.py", line 320, in __call__
  File "D:\whisper-fast\_XXL\__main__.py", line 859, in write_result
  File "D:\whisper-fast\_XXL\__main__.py", line 802, in iterate_result_alt
  File "D:\whisper-fast\_XXL\__main__.py", line 785, in iterate_subtitles_alt
IndexError: list index out of range
[17684] Failed to execute script '__main__' due to unhandled exception!

Purfview · 2024-11-03T01:42:16Z

> Particularly since I don't have a D:\whisper-fast\_XXL\__main__.py folder

@ClaireCJS Those are internal paths inside exe, not on your PC.
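The effect is easy to demonstrate in plain Python: the file name shown in a traceback is whatever was recorded in the code object when it was compiled, and it does not have to exist on the machine that runs it. The sketch below compiles a made-up snippet under the path from the traceback; `SOURCE`, `pick_fourth` and `run_and_capture` are hypothetical and unrelated to the real program's code.

```python
import traceback

# Path recorded at compile time; it does not need to exist on this machine.
BUILD_PATH = r"D:\whisper-fast\_XXL\__main__.py"

# Made-up source that fails on purpose with the same kind of error.
SOURCE = """
def pick_fourth(items):
    return items[3]  # deliberately out of range for a 3-item list

pick_fourth([1, 2, 3])
"""


def run_and_capture():
    """Compile SOURCE under BUILD_PATH, run it, and return the traceback text."""
    code = compile(SOURCE, BUILD_PATH, "exec")
    try:
        exec(code, {"__name__": "__main__"})
    except IndexError:
        return traceback.format_exc()
    return ""


if __name__ == "__main__":
    print(run_and_capture())
```

The printed traceback names `D:\whisper-fast\_XXL\__main__.py` even when no D: drive is present, which matches what a frozen executable built on someone else's machine would show.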

ClaireCJS · 2024-11-03T03:18:45Z

> > Particularly since I don't have a D:\whisper-fast\_XXL\__main__.py folder
>
> @ClaireCJS Those are internal paths inside exe, not on your PC.

I know. It's just weird. I don't even have whisper on my D: ... I understand it's not real, it's just... weird. It's failing and knowing why would be nice? Sorry 😅

Purfview · 2024-11-06T19:34:46Z

Fixed in v193.1

nebehr · 2024-11-09T17:32:49Z

Unfortunately, it is still reproducible in v193.1, albeit with a slightly different stacktrace, but the error appears to be the same.

  File "D:\whisper-fast\_XXL\__main__.py", line 1765, in <module>
  File "D:\whisper-fast\_XXL\__main__.py", line 1732, in cli
  File "D:\whisper-fast\_XXL\__main__.py", line 750, in write_all
  File "D:\whisper-fast\_XXL\__main__.py", line 365, in __call__
  File "D:\whisper-fast\_XXL\__main__.py", line 689, in write_result
  File "D:\whisper-fast\_XXL\__main__.py", line 529, in iterate_result
IndexError: string index out of range
[4460] Failed to execute script '__main__' due to unhandled exception!

This happened on an attempt to use --output_format all; apparently it failed halfway through the vtt (otherwise it fails at the same point in srt). The media file is rather big, though, and takes long to process, which isn't conducive to more detailed investigation. I will see if I can get more details.

Purfview · 2024-11-09T17:35:40Z

Can you share json file?

nebehr · 2024-11-09T17:41:28Z

I was actually hoping to do that by asking for all formats, to save time on transcription, but apparently the "bad" one comes earlier in the queue. In what sequence are they processed with --output_format all?

Purfview · 2024-11-09T17:48:42Z

I think json is the last, I'll put it as first in the next release.

Purfview · 2024-11-09T17:56:18Z

> Unfortunately, it is still reproducible in v193.1

It's not, because it's not the same bug.
Try faster-whisper-xxl.exe 101.json -f all

nebehr · 2024-11-09T18:36:42Z

Indeed, this may be related to the length of produced chunks. The model I am using does not split the text into sentences properly for some reason, therefore I am using --max_line_width with some other parameters. So, conversion from JSON to SRT fails with values of --max_line_width up to 128 (I wonder if the boundary being a power of 2 plays a factor here), but passes without it or with higher ones. The chunk where it fails (at [25:36.630 --> 26:04.010]) does appear to be the longest of the lot.
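To check whether the failing chunk really is the longest one, the segments can be ranked by text length. This is a minimal sketch, assuming the JSON has a top-level `segments` list whose items carry `start`, `end` and `text` fields (an assumed layout, not confirmed in this thread). `longest_segments` is a hypothetical helper and the sample data is invented.

```python
import json


def longest_segments(data, top=3):
    """Return (start, end, text_length) for the `top` longest segments.

    `data` is assumed to be the parsed JSON with a "segments" list.
    """
    segs = data.get("segments", [])
    ranked = sorted(segs, key=lambda s: len(s.get("text", "")), reverse=True)
    return [(s.get("start"), s.get("end"), len(s.get("text", ""))) for s in ranked[:top]]


if __name__ == "__main__":
    # Invented sample standing in for the real file contents.
    sample = json.loads(
        '{"segments": ['
        '{"start": 0.0, "end": 2.0, "text": "short"},'
        '{"start": 2.0, "end": 30.0, "text": "a much longer run of text without breaks"}'
        "]}"
    )
    for start, end, length in longest_segments(sample):
        print(f"{start} --> {end}: {length} characters")
```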

x.zip

Do you want me to create a separate issue for this?

Purfview · 2024-11-09T18:54:51Z

Share your command.

> Do you want me to create a separate issue for this?

Nah.

nebehr · 2024-11-09T18:59:36Z

The one to reproduce with the attached JSON file is faster-whisper-xxl.exe x.json --max_line_width 35 -f srt.

The one where I encountered it originally in this release is faster-whisper-xxl.exe --model <CUSTOM_MODEL> -l is --max_line_width 35 --max_line_count 2 --sentence --max_comma_cent 50 <FILE_NAME>.
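Since converting from JSON is quick, the range of failing `--max_line_width` values could be narrowed automatically. The rough sketch below reuses only the reproduction command given above. It assumes that the crash produces a non-zero exit code and that failures are monotonic in the width (failing below some threshold, passing at or above it); neither assumption is confirmed here. `converts_ok` and `smallest_passing_width` are hypothetical helpers.

```python
import subprocess


def converts_ok(width, json_path="x.json", exe="faster-whisper-xxl.exe"):
    """Run the JSON-to-SRT conversion; True if the process exits with code 0."""
    cmd = [exe, json_path, "--max_line_width", str(width), "-f", "srt"]
    return subprocess.run(cmd, capture_output=True).returncode == 0


def smallest_passing_width(passes, lo=1, hi=1024):
    """Binary search for the smallest width in [lo, hi] where passes(width) is True.

    Assumes passes() is False below some threshold and True at or above it.
    Returns None if even `hi` fails.
    """
    if not passes(hi):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if passes(mid):
            hi = mid       # mid works, so the threshold is at or below mid
        else:
            lo = mid + 1   # mid fails, so the threshold is above mid
    return lo


# Usage against the real executable (not run here):
#   print(smallest_passing_width(converts_ok))
```

If the result lands exactly one past a power of two, that would support the boundary suspicion; if failures turn out not to be monotonic, a linear scan over the widths would be the safer choice.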

Purfview changed the title from "Crash before subtitle generation" to "Crash in subtitle generation - IndexError: list index out of range" on Jul 17, 2024

Purfview closed this as completed Nov 6, 2024
