
Refactoring #4

Merged · 6 commits · Aug 4, 2021

Conversation

csukuangfj
Collaborator

TODOs

  • Add tests and documentation to transformer.py and conformer.py; fix their style issues.

@danpovey
Collaborator

That was fast! Thanks!

# TODO: Use eos_id as ignore_id.
# tgt_key_padding_mask = decoder_padding_mask(ys_in_pad, ignore_id=eos_id)
Collaborator Author


It is commented out because the existing models were trained with it disabled; enabling it makes the WER worse.
We should enable it when we start training a new model.
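
For reference, here is a minimal sketch of what such a padding mask could look like, assuming ys_in_pad holds padded target-token IDs and positions equal to ignore_id should be masked. The helper name follows the snippet above, but the body is illustrative rather than the actual code in transformer.py:

```python
import torch


def decoder_padding_mask(ys_in_pad: torch.Tensor, ignore_id: int = -1) -> torch.Tensor:
    """Return a bool mask of shape (batch, time) that is True at positions
    equal to `ignore_id`, i.e. the padded positions the decoder should ignore.

    Illustrative sketch; the actual helper in transformer.py may differ.
    """
    return ys_in_pad == ignore_id


# Example: using eos_id as ignore_id also masks trailing eos/padding positions
# (eos_id = 1 here is a made-up value for illustration).
ys_in_pad = torch.tensor([[3, 5, 7, 1, 1],
                          [2, 4, 1, 1, 1]])
tgt_key_padding_mask = decoder_padding_mask(ys_in_pad, ignore_id=1)
```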

@csukuangfj changed the title from "WIP: Refactoring" to "Refactoring" on Aug 3, 2021
@csukuangfj
Collaborator Author

csukuangfj commented Aug 3, 2021

The following are the WERs from the model trained in #3 and decoded with this pull request
(with n-gram LM rescoring and attention-decoder rescoring; the model was trained for 26 epochs):

For test-clean, WER of different settings are:
ngram_lm_scale_0.7_attention_scale_0.6  2.96    best for test-clean
ngram_lm_scale_0.9_attention_scale_0.5  2.96
ngram_lm_scale_0.7_attention_scale_0.5  2.97
ngram_lm_scale_0.7_attention_scale_0.7  2.97
ngram_lm_scale_0.9_attention_scale_0.6  2.97
ngram_lm_scale_0.9_attention_scale_0.7  2.97
ngram_lm_scale_0.9_attention_scale_0.9  2.97
ngram_lm_scale_1.0_attention_scale_0.7  2.97
ngram_lm_scale_1.0_attention_scale_0.9  2.97
ngram_lm_scale_1.0_attention_scale_1.0  2.97
ngram_lm_scale_1.0_attention_scale_1.1  2.97
ngram_lm_scale_1.0_attention_scale_1.2  2.97
ngram_lm_scale_1.0_attention_scale_1.3  2.97
ngram_lm_scale_1.1_attention_scale_0.9  2.97

---

For test-other, WER of different settings are:
ngram_lm_scale_1.0_attention_scale_0.9  6.65    best for test-other
ngram_lm_scale_1.1_attention_scale_1.1  6.65
ngram_lm_scale_0.9_attention_scale_0.7  6.66
ngram_lm_scale_1.0_attention_scale_1.0  6.66
ngram_lm_scale_1.0_attention_scale_1.1  6.66
ngram_lm_scale_0.9_attention_scale_1.0  6.67
ngram_lm_scale_1.0_attention_scale_0.7  6.67
ngram_lm_scale_1.0_attention_scale_1.2  6.67
ngram_lm_scale_1.0_attention_scale_1.3  6.67
ngram_lm_scale_0.9_attention_scale_0.5  6.68
ngram_lm_scale_0.9_attention_scale_0.6  6.68
ngram_lm_scale_0.9_attention_scale_0.9  6.68
ngram_lm_scale_0.9_attention_scale_1.1  6.68
ngram_lm_scale_0.9_attention_scale_1.3  6.68
ngram_lm_scale_0.9_attention_scale_1.5  6.68
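
For context, the two scales being swept above weight the n-gram LM and attention-decoder scores during rescoring. The sketch below shows the kind of combination involved; the variable names are illustrative, not the actual decode.py code:

```python
def combined_score(am_score: float,
                   ngram_lm_score: float,
                   attention_score: float,
                   ngram_lm_scale: float,
                   attention_scale: float) -> float:
    # Per-path score used to pick the best hypothesis during rescoring:
    # acoustic/CTC score plus scaled n-gram LM and attention-decoder scores.
    # The exact formula in decode.py may differ; this only illustrates the
    # role of the two scales in the tables above.
    return am_score + ngram_lm_scale * ngram_lm_score + attention_scale * attention_score
```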

Epochs 14-26 are used in model averaging.
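
A minimal sketch of that kind of checkpoint averaging, assuming each epoch-*.pt file stores a state dict (possibly under a "model" key) with identical keys and shapes; this is not the repo's actual averaging code:

```python
import torch


def average_checkpoints(filenames):
    """Element-wise average of model parameters across checkpoints (sketch)."""
    avg = None
    for f in filenames:
        state = torch.load(f, map_location="cpu")
        if "model" in state:  # some checkpoints wrap the state dict
            state = state["model"]
        if avg is None:
            avg = {k: v.detach().clone().float() for k, v in state.items()}
        else:
            for k in avg:
                avg[k] += state[k].float()
    for k in avg:
        avg[k] /= len(filenames)
    return avg


# e.g. average over epochs 14-26:
# avg = average_checkpoints([f"conformer_ctc/exp/epoch-{i}.pt" for i in range(14, 27)])
```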


I have uploaded the above checkpoints to
https://huggingface.co/csukuangfj/conformer_ctc/tree/main

To reproduce the decoding results:

  1. Clone the above repo containing the checkpoints and put it into conformer_ctc/exp/
  2. After step 1, you should have conformer_ctc/exp/epoch-{14,15,...,26}.pt
  3. Run:
     ./prepare.sh
     ./conformer_ctc/decode.py --epoch 26 --avg 13 --max-duration=50
  4. You should get the above results.
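
(Here --avg 13 corresponds to averaging the 13 checkpoints epoch-14.pt through epoch-26.pt, i.e. the epochs 14-26 mentioned above.)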

The results are expected to improve if the model is trained for more epochs.
I will rerun the training with the bug from k2-fsa/snowfall#242 fixed.

@danpovey
Collaborator

danpovey commented Aug 3, 2021 via email

@pzelasko
Collaborator

pzelasko commented Aug 3, 2021

Nice! I'm curious -- did you ever try to run the same thing but with MMI instead of CTC?

@csukuangfj
Collaborator Author

> Nice! I'm curious -- did you ever try to run the same thing but with MMI instead of CTC?

Yes, I am planning to do that with a pretrained P. All the related code can be found in snowfall.

@csukuangfj
Collaborator Author

Merging it to avoid conflicts.
