Can we train an Emformer stateless transducer with phoneme-based units? #585

Answered by yfyeung

trangtv57 asked this question in Q&A

trangtv57
Sep 26, 2022

I can see that all of your scripts train the transducer with a SentencePiece model. With traditional Kaldi models such as TDNN-LSTM, I trained on phonemes, which lets the system decode with LG later and lets me control the output with an n-gram LM plus a lexicon, as in Kaldi. I'm not sure whether this is a good idea here, though. Could you give me some advice?


Replies: 1 comment

yfyeung
Apr 3, 2023
Collaborator

icefall does support training a model with a phoneme lexicon.
Take LibriSpeech as an example:
To prepare the data: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/prepare.sh#L161-L189
To generate a unique lexicon: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/local/generate_unique_lexicon.py
To use the unique lexicon during training or decoding: https://github.com/k2-fsa/icefall/blob/master/icefall/lexicon.py#L200
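
To illustrate what a "unique lexicon" gives you, here is a minimal, self-contained sketch in plain Python. It is not icefall's code and does not call icefall or k2: the function names, the toy lexicon, the `<blk>` entry at ID 0, and the `<UNK>` fallback are all assumptions made for this example. The idea is that each word keeps a single pronunciation, so a transcript maps to exactly one phone ID sequence that can serve as the training target.

```python
from typing import Dict, List, Tuple

# A lexicon is a list of (word, phones) entries; a word may appear more than once.
Lexicon = List[Tuple[str, List[str]]]


def parse_lexicon(text: str) -> Lexicon:
    """Parse lines of the form 'WORD PHONE1 PHONE2 ...'."""
    entries: Lexicon = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip empty or malformed lines
        entries.append((fields[0], fields[1:]))
    return entries


def keep_first_pronunciation(lexicon: Lexicon) -> Lexicon:
    """Keep only the first pronunciation listed for each word."""
    seen = set()
    unique: Lexicon = []
    for word, phones in lexicon:
        if word in seen:
            continue
        seen.add(word)
        unique.append((word, phones))
    return unique


def build_phone_table(lexicon: Lexicon) -> Dict[str, int]:
    """Assign integer IDs to phones in order of first appearance."""
    table = {"<blk>": 0}  # assumption: ID 0 is reserved for the blank symbol
    for _, phones in lexicon:
        for phone in phones:
            if phone not in table:
                table[phone] = len(table)
    return table


def text_to_phone_ids(
    text: str, lexicon: Lexicon, table: Dict[str, int], oov: str = "<UNK>"
) -> List[int]:
    """Map a transcript to phone IDs; unknown words use the OOV entry."""
    word2phones = dict(lexicon)
    ids: List[int] = []
    for word in text.split():
        phones = word2phones.get(word, word2phones[oov])
        ids.extend(table[p] for p in phones)
    return ids


# Hypothetical toy lexicon with two pronunciations of HELLO.
LEXICON_TEXT = """\
<UNK> SPN
HELLO HH AH L OW
HELLO HH EH L OW
WORLD W ER L D
"""

lexicon = keep_first_pronunciation(parse_lexicon(LEXICON_TEXT))
table = build_phone_table(lexicon)
ids = text_to_phone_ids("HELLO WORLD FOO", lexicon, table)
print(ids)
```

In this sketch the second pronunciation of HELLO is dropped, and the out-of-lexicon word FOO falls back to the `<UNK>` entry. For real training, use the linked icefall scripts and `icefall/lexicon.py` rather than this toy version.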


Answer selected by marcoyang1998
