#1647
Hello again.
I tried to reproduce the example with hotwords for std::unique_ptr<OnlineStream> CreateStream(const std::string &hotwords) const;
* @param The hotwords for this string, it might contain several hotwords,
* the hotwords are separated by "/". In each of the hotwords, there
* are cjkchars or bpes, the bpe/cjkchar are separated by space (" ").
* For example, hotwords I LOVE YOU and HELLO WORLD, looks like:
*
* "▁I ▁LOVE ▁YOU/▁HE LL O ▁WORLD"
with these parameters:
Encoder : encoder-epoch-99-avg-1.onnx
Decoder : decoder-epoch-99-avg-1.onnx
Joiner : joiner-epoch-99-avg-1.onnx
from https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
Vocab : bpe.model, converted to .vocab
Tokens : tokens.txt
from https://huggingface.co/desh2608/icefall-asr-librispeech-pruned-transducer-stateless7-streaming-small/tree/main/data/lang_bpe_500
Provider : "cpu"
DecodingMethod : "modified_beam_search"
EnableEndpoint : 1
Rule1MinTrailingSilence : 2.4f
Rule2MinTrailingSilence : 0.8f
Rule3MinUtteranceLength : 20f
ModelingUnit : "bpe"
HotwordsScore : 4.0f
ModelType : "zipformer"
All other model config parameters are left at their defaults.
Hotwords string: "▁I ▁LOVE ▁YOU/▁HE LL O ▁WORLD"
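To make the format concrete: tokens inside one hotword are separated by a space, and hotwords are separated by "/". Below is a minimal sketch of how such a string could be assembled from tokens that are already split. JoinHotwords is a hypothetical helper written for illustration, it is not part of sherpa-onnx, and it assumes the caller has done the BPE tokenization already.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (NOT part of sherpa-onnx): joins already-tokenized
// hotwords into one string. Tokens within a hotword are separated by " ",
// hotwords are separated by "/".
std::string JoinHotwords(
    const std::vector<std::vector<std::string>> &hotwords) {
  std::string out;
  for (std::size_t i = 0; i < hotwords.size(); ++i) {
    if (i > 0) out += '/';  // separator between hotwords
    for (std::size_t j = 0; j < hotwords[i].size(); ++j) {
      if (j > 0) out += ' ';  // separator between tokens
      out += hotwords[i][j];
    }
  }
  return out;
}
```

Called with {{"▁I", "▁LOVE", "▁YOU"}, {"▁HE", "LL", "O", "▁WORLD"}}, it yields the string shown above, which is what gets passed to CreateStream.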
First question: how can I tell whether hotwords are actually being applied? It recognizes the phrases "I LOVE YOU" and "HELLO WORLD" both with and without hotwords set. I don't know, maybe it's a matter of accent or pronunciation; English is not my native language.
Second question: what am I doing wrong when I pass an abbreviation such as "▁K F C" while creating the online stream? A token for each of the separate characters is present in the tokens file.
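One thing that can be ruled out independently of the recognizer is a token in the hotwords string that is missing from tokens.txt. The sketch below does only that check. LoadTokens and MissingTokens are hypothetical helpers, not sherpa-onnx functions, and the sketch assumes tokens.txt holds one "token id" pair per line. It says nothing about how sherpa-onnx itself treats unknown tokens.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: reads the first whitespace-delimited field of each
// line as a token. Assumes one "<token> <id>" pair per line.
std::unordered_set<std::string> LoadTokens(std::istream &in) {
  std::unordered_set<std::string> tokens;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string token;
    if (fields >> token) tokens.insert(token);
  }
  return tokens;
}

// Hypothetical helper: splits a hotwords string on " " and "/" and returns
// every piece that is not in the token set, in order of appearance.
std::vector<std::string> MissingTokens(
    const std::string &hotwords,
    const std::unordered_set<std::string> &tokens) {
  std::vector<std::string> missing;
  std::string current;
  auto flush = [&]() {
    if (!current.empty() && tokens.count(current) == 0) {
      missing.push_back(current);
    }
    current.clear();
  };
  for (char c : hotwords) {
    if (c == ' ' || c == '/') {
      flush();
    } else {
      current += c;
    }
  }
  flush();  // handle the last token
  return missing;
}
```

Opening tokens.txt with std::ifstream, passing it to LoadTokens, and then calling MissingTokens with "▁K F C" returns an empty vector if all three tokens are known.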
Third question: are you going to simplify the API for SherpaOnnxCreateOnlineStreamWithHotwords? I assume that OnlineRecognizer applies the same transformation to the hotwords file, using the vocab or tokens file passed to it, to bring it into the form "▁I ▁LOVE ▁YOU/▁HE LL O ▁WORLD". Would it be possible to encapsulate this logic inside SherpaOnnxCreateOnlineStreamWithHotwords?
Thank you.