Add modified beam search for pruned rnn-t. #248

Merged

Collaborator

@csukuangfj csukuangfj commented Mar 12, 2022

Training command

./pruned_transducer_stateless/train.py \
  --world-size 8 \
  --num-epochs 60 \
  --start-epoch 0 \
  --exp-dir pruned_transducer_stateless/exp \
  --full-libri 1 \
  --max-duration 300 \
  --prune-range 5 \
  --lr-factor 5 \
  --lm-scale 0.25

TensorBoard log: https://tensorboard.dev/experiment/WKRFY5fYSzaVBHahenpNlA/

Decoding commands

for epoch in 42; do
  for avg in 11; do
    ./pruned_transducer_stateless/decode.py \
      --epoch $epoch \
      --avg $avg \
      --exp-dir ./pruned_transducer_stateless/exp \
      --max-duration 100 \
      --decoding-method greedy_search \
      --max-sym-per-frame 1
  done
done

for epoch in 42; do
  for avg in 11; do
    ./pruned_transducer_stateless/decode.py \
      --epoch $epoch \
      --avg $avg \
      --exp-dir ./pruned_transducer_stateless/exp \
      --max-duration 100 \
      --decoding-method modified_beam_search \
      --beam-size 4
  done
done

for epoch in 42; do
  for avg in 11; do
    ./pruned_transducer_stateless/decode.py \
      --epoch $epoch \
      --avg $avg \
      --exp-dir ./pruned_transducer_stateless/exp \
      --max-duration 100 \
      --decoding-method beam_search \
      --beam-size 4
  done
done
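
The three loops above differ only in the decoding method and its method-specific flag. As a minimal sketch, the same invocations can be assembled from one table of flags. The helper name `build_decode_command` and the `METHOD_FLAGS` table are hypothetical and introduced only for illustration; the flags themselves are copied from the commands above. The sketch only builds and prints the command lines, it does not run them.

```python
import shlex
from typing import Dict, List

# Paths copied from the decoding commands above.
DECODE_SCRIPT = "./pruned_transducer_stateless/decode.py"
EXP_DIR = "./pruned_transducer_stateless/exp"

# Method-specific flags, taken from the three loops above.
# (Hypothetical table, not part of the repository.)
METHOD_FLAGS: Dict[str, List[str]] = {
    "greedy_search": ["--max-sym-per-frame", "1"],
    "modified_beam_search": ["--beam-size", "4"],
    "beam_search": ["--beam-size", "4"],
}


def build_decode_command(method: str, epoch: int = 42, avg: int = 11) -> List[str]:
    """Return the argv list for one decoding run (hypothetical helper)."""
    return [
        DECODE_SCRIPT,
        "--epoch", str(epoch),
        "--avg", str(avg),
        "--exp-dir", EXP_DIR,
        "--max-duration", "100",
        "--decoding-method", method,
        *METHOD_FLAGS[method],
    ]


if __name__ == "__main__":
    # Print one shell-quoted command line per decoding method.
    for method in METHOD_FLAGS:
        print(shlex.join(build_decode_command(method)))
```

Each printed line can be pasted into a shell, or the argv list can be passed to `subprocess.run` if running from Python is preferred.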

Decoding results:

| decoding method | test-clean (WER %) | test-other (WER %) | comment |
|---|---|---|---|
| greedy search (--max-sym-per-frame 1) | 2.62 | 6.37 | --epoch 42 --avg 11 --max-duration 100 |
| greedy search (--max-sym-per-frame 2) | 2.62 | 6.37 | --epoch 42 --avg 11 --max-duration 100 |
| greedy search (--max-sym-per-frame 3) | 2.62 | 6.37 | --epoch 42 --avg 11 --max-duration 100 |
| modified beam search (--beam-size 4) | 2.56 | 6.27 | --epoch 42 --avg 11 --max-duration 100 |
| beam search (--beam-size 4) | 2.57 | 6.27 | --epoch 42 --avg 11 --max-duration 100 |
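
To put the differences in the results above into perspective, the following sketch computes the relative reduction of modified beam search over greedy search. It assumes the numbers are word error rates in percent, as is usual for LibriSpeech; the function name `relative_reduction` is made up for this example.

```python
# Results copied from the table above: (test-clean, test-other).
results = {
    "greedy_search": (2.62, 6.37),
    "modified_beam_search": (2.56, 6.27),
    "beam_search": (2.57, 6.27),
}


def relative_reduction(baseline: float, new: float) -> float:
    """Relative reduction of `new` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - new) / baseline


greedy = results["greedy_search"]
modified = results["modified_beam_search"]

clean_gain = relative_reduction(greedy[0], modified[0])
other_gain = relative_reduction(greedy[1], modified[1])

print(f"test-clean: {clean_gain:.2f}% relative reduction")  # 2.29%
print(f"test-other: {other_gain:.2f}% relative reduction")  # 1.57%
```

Under that assumption, modified beam search with a beam of 4 improves on greedy search by roughly 2% relative on test-clean and 1.6% relative on test-other.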

Notes:

  1. The model was not trained with the modified transducer.
  2. Modified beam search is beam search with --max-sym-per-frame=1 hardcoded, i.e., each hypothesis emits at most one symbol per frame.
  3. The current implementation of beam search is very slow (roughly 8 to 10 times slower than modified beam search in the timings below), so we recommend using only modified beam search.
  4. Decoding times for test-clean and test-other are listed in the following table:
| decoding method | test-clean (seconds) | test-other (seconds) |
|---|---|---|
| greedy search (--max-sym-per-frame=1) | 160 | 159 |
| greedy search (--max-sym-per-frame=2) | 184 | 177 |
| greedy search (--max-sym-per-frame=3) | 210 | 213 |
| modified beam search (--beam-size 4) | 273 | 269 |
| beam search (--beam-size 4) | 2741 | 2221 |
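
The idea in note 2 can be illustrated with a toy sketch. This is not the code added by this PR: it is a pure-Python illustration under stated assumptions. The names `modified_beam_search`, `log_add`, and `toy_log_probs` are invented for the example, the blank symbol is assumed to have ID 0, and the toy model ignores the token history, whereas a real transducer conditions on it through the decoder. Because every hypothesis emits either blank or exactly one symbol and then moves to the next frame, there is no inner loop over symbols within a frame.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

BLANK = 0  # assumption: blank has ID 0 in this toy example

Hyp = Tuple[int, ...]


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def modified_beam_search(
    num_frames: int,
    log_prob_fn: Callable[[int, Hyp], Sequence[float]],
    beam: int = 4,
) -> Tuple[List[int], float]:
    """Beam search that emits at most one symbol per frame (toy sketch)."""
    hyps: Dict[Hyp, float] = {(): 0.0}  # token sequence -> log-probability
    for t in range(num_frames):
        candidates: Dict[Hyp, float] = {}
        for tokens, score in hyps.items():
            for v, lp in enumerate(log_prob_fn(t, tokens)):
                # Blank keeps the sequence; any other symbol extends it by one.
                new_tokens = tokens if v == BLANK else tokens + (v,)
                # Merge paths that lead to the same token sequence.
                prev = candidates.get(new_tokens, -math.inf)
                candidates[new_tokens] = log_add(prev, score + lp)
        # Keep only the `beam` best hypotheses, then advance to the next frame.
        best = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        hyps = dict(best[:beam])
    tokens, score = max(hyps.items(), key=lambda kv: kv[1])
    return list(tokens), score


# Toy per-frame distributions over {blank, 1, 2}; history is ignored here.
TOY_PROBS = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]


def toy_log_probs(t: int, tokens: Hyp) -> List[float]:
    return [math.log(p) for p in TOY_PROBS[t]]


best_tokens, best_score = modified_beam_search(len(TOY_PROBS), toy_log_probs, beam=4)
print(best_tokens, round(math.exp(best_score), 3))  # [1, 2] 0.472
```

In this sketch the per-frame cost is bounded by the beam size times the vocabulary size. Regular beam search may emit several symbols per frame and therefore needs an additional inner loop, which is one plausible reason for the much longer decoding times in the table above.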

I will update RESULTS.md and upload the pre-trained model to Hugging Face later today.

@pkufool
Collaborator

pkufool commented Mar 12, 2022

Does this model contain extra nn.Linear()?

@csukuangfj
Collaborator Author

> Does this model contain extra nn.Linear()?

Yes. It uses the code from the master branch.

@csukuangfj
Collaborator Author

A pre-trained model, the decoding logs, and the decoding results have been uploaded to
https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless-2022-03-12

@csukuangfj csukuangfj added ready and removed ready labels Mar 12, 2022
@csukuangfj csukuangfj merged commit bb7f6ed into k2-fsa:master Mar 12, 2022
@csukuangfj csukuangfj deleted the modified-beam-search-for-pruned-rnnt branch March 12, 2022 08:16
@danpovey
Collaborator

A great feature!
