Reduce model initialization time for online speech recognition #215
Implemented the `model_type` parameter for online transducer models, as suggested by @csukuangfj in this issue and based largely on this PR. This adds a new argument, `--model-type`, so that the model only needs to be loaded once. Otherwise, the model is loaded twice, where the first load serves only to determine the model type.
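The double-load problem can be illustrated with a toy sketch (plain Python, not actual sherpa-onnx code; `parse_model` and `create_recognizer` are hypothetical stand-ins): when the caller does not supply the model type, the model file must be parsed once just to read its metadata, then parsed again to build the recognizer. Supplying the type up front skips the first parse.

```python
# Toy illustration of why passing model_type up front roughly halves
# initialization time. None of these names are sherpa-onnx APIs.

LOAD_COUNT = {"n": 0}  # counts expensive model loads


def parse_model(path):
    """Stand-in for an expensive ONNX model load."""
    LOAD_COUNT["n"] += 1
    # Pretend the model's metadata records its architecture.
    return {"meta": {"model_type": "zipformer2"}, "path": path}


def create_recognizer(path, model_type=""):
    if not model_type:
        # First load: only to inspect metadata and detect the type.
        model_type = parse_model(path)["meta"]["model_type"]
    # Second (or only) load: build the recognizer for the known type.
    model = parse_model(path)
    return {"type": model_type, "model": model}


create_recognizer("encoder.onnx")  # type unknown: loads twice
slow_loads = LOAD_COUNT["n"]

LOAD_COUNT["n"] = 0
create_recognizer("encoder.onnx", model_type="zipformer2")  # loads once
fast_loads = LOAD_COUNT["n"]

print(slow_loads, fast_loads)  # prints "2 1"
```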
I have tested csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-06-26 via the Python API on Linux, specifying `model_type="zipformer2"` during initialization. Model loading time is reduced from ~6 seconds to ~3 seconds for the fp32 model, and from 4.4 seconds to 2.1 seconds for the int8 model.