Resetting stream without resetting encoder states results in deletion errors during streaming decoding #1700

Open
srinivasakm opened this issue Jan 12, 2025 · 0 comments

https://github.com/k2-fsa/sherpa-onnx/blob/ecc653871d305c79002d2630c7cf0d0e1d6bf1ed/sherpa-onnx/csrc/online-recognizer-transducer-impl.h#L382C1-L383C53
@csukuangfj
We have observed an issue since version 1.10.0 of sherpa-onnx. Specifically, the encoder state reset functionality was commented out in the following file:
Path: sherpa-onnx/sherpa-onnx/csrc/online-recognizer-transducer-impl.h

// reset encoder states
// s->SetStates(model_->GetEncoderInitStates());

(The encoder state reset was added in 1.9.26 as part of issue924 and was commented out in 1.10.0, i.e. after 1.9.29.)

In our experiments using models trained with the Zipformer-2 encoder and stateless transducer across multiple Indic languages, we noticed the following behavior:

  1. When we reset the stream object during streaming decoding (link), the encoder states are not reset because that code is commented out (as in link). This results in deletion errors in the transcriptions that follow endpoint detection.
  2. The issue occurs when decoding real-time conversational audio with silence gaps of varying length between utterances.

However, when we uncomment the encoder state reset code, the deletion errors are resolved or reduced, and transcription accuracy improves.
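To make the difference between the two behaviours concrete, here is a minimal, self-contained sketch. It is not the sherpa-onnx implementation: `ToyModel`, `ToyStream`, `ProcessChunk`, `Reset`, and `HasInitialStates` are hypothetical stand-ins, the state update is made-up arithmetic, and only the names `SetStates` and `GetEncoderInitStates` are borrowed from the snippet above. The real reset may clear more than the toy one does.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the cached encoder states of a streaming model.
using EncoderStates = std::vector<float>;

// Hypothetical model: the initial encoder states are simply all zeros here.
struct ToyModel {
  EncoderStates GetEncoderInitStates() const { return EncoderStates(4, 0.0f); }
};

// Hypothetical stream: holds cached encoder states and the tokens decoded so far.
struct ToyStream {
  EncoderStates states;
  std::vector<int> tokens;
  void SetStates(EncoderStates s) { states = std::move(s); }
};

// Simulates decoding one chunk: the cached states drift away from their
// initial values and one token is emitted. The arithmetic is illustrative only.
void ProcessChunk(ToyStream *s, float value) {
  for (auto &x : s->states) x = 0.5f * x + value;
  s->tokens.push_back(static_cast<int>(s->tokens.size()));
}

// Resets the stream after an endpoint. The flag models the two variants
// discussed in this issue: with the encoder state reset (as in 1.9.26) and
// with that line commented out (as in 1.10.0).
void Reset(const ToyModel &model, ToyStream *s, bool reset_encoder_states) {
  s->tokens.clear();
  if (reset_encoder_states) {
    s->SetStates(model.GetEncoderInitStates());
  }
}

// True if the stream's cached states equal the model's initial states.
bool HasInitialStates(const ToyModel &model, const ToyStream &s) {
  return s.states == model.GetEncoderInitStates();
}
```

In this sketch, calling `Reset(model, &stream, false)` after a few chunks leaves the next utterance starting from states carried over from the previous segment, while `Reset(model, &stream, true)` starts it from the initial states, which is the variant that reduced deletion errors in our experiments.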

Questions:

  1. Is there a specific reason for commenting out the encoder reset functionality?
  2. Was this change driven by certain experimental results?
  3. Are there any recommended workarounds to handle this scenario effectively?