Training data does not improve initial Custom Speech model compared to the baseline model #58

Open
KatieProchilo opened this issue Jun 3, 2020 · 1 comment
Assignees
Labels
P1 Good to fix.

Comments

@KatieProchilo
Contributor

KatieProchilo commented Jun 3, 2020

When trained on the data in the training folder, which was provided by the Custom Speech team, the first Custom Speech model that a user trains does not improve on the baseline model.

The pipeline rightly fails because the word error rate (WER) does not improve, but this means no release is ever created, which is a significant loss for users who want to learn how releases work.
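For context, the kind of check that fails here can be sketched as below. This is a minimal illustration, not the pipeline's actual code: it assumes WER is the word-level edit distance divided by the number of reference words, and that the gate requires a strict improvement over the baseline. The names `word_error_rate` and `passes_gate` are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def passes_gate(baseline_wer: float, candidate_wer: float) -> bool:
    """Assumed gate: the new model must have a strictly lower WER."""
    return candidate_wer < baseline_wer
```

Under these assumptions, a model whose WER merely equals the baseline's still fails the gate, which matches the behavior described above: training completes, but no release follows.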

Solution

The training data at that link should improve recognition against the evaluation test data (audio + human-labeled transcripts) in the testing folder at that link.

@KatieProchilo KatieProchilo added the P1 Good to fix. label Jun 3, 2020
@KatieProchilo KatieProchilo self-assigned this Jun 3, 2020
@KatieProchilo
Contributor Author

This has been filed as CSE feedback.
