Finetuning/Training code? #64
-
I see that the code used to train the model is not included in the repo. Are there any plans to publish it? It would be very useful for fine-tuning the model and getting benchmarks beyond the 0-shot transfer. Thank you very much!
-
Definitely, especially for languages other than English which have had much less training so far!
-
We currently don't have plans to release training/fine-tuning code, but there might be an implementation from the community soon.
-
Found some fine-tuning code on HF (not tested).
-
Fine-tuning code for Japanese kana: https://colab.research.google.com/drive/1P4ClLkPmfsaKn2tBbRp0nVjGMRKR-EWz?usp=sharing I haven't gone through the code in detail, but it appears to work.
-
I have a question about how to train with the prompt (not implemented by @k-washi as far as I understand). If I understand correctly, a training sample in the batch could look like this?

decoder_input = [PREV, p[0], p[1], ..., p[n], EN, TRANSCRIBE, NO_TIMESTAMPS, t[0], t[1], ..., t[m]]
label         = [  -1,   -1,  ...,   -1,      EN, TRANSCRIBE, NO_TIMESTAMPS, t[0], t[1], ..., t[m], EOT]

where p[0..n] are the tokens of the prompt (previous context), t[0..m] are the tokens of the target transcript, and -1 marks positions that are ignored when computing the loss.
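For what it's worth, here is a minimal sketch of how such a sample could be assembled with the openai-whisper tokenizer. This is only my assumption, not the official training code: build_prompt_sample, prompt_text and target_text are hypothetical names, and I include the full SOT sequence (<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>) between the prompt and the target rather than just EN/TRANSCRIBE.

```python
import torch
from whisper.tokenizer import get_tokenizer

IGNORE_INDEX = -1  # positions masked out of the cross-entropy loss


def build_prompt_sample(prompt_text: str, target_text: str):
    """Hypothetical helper: one (decoder_input, label) pair for prompt-conditioned training."""
    tokenizer = get_tokenizer(multilingual=True, language="en", task="transcribe")

    prompt_tokens = tokenizer.encode(" " + prompt_text.strip())   # p[0] ... p[n]
    target_tokens = tokenizer.encode(" " + target_text.strip())   # t[0] ... t[m]
    sot_sequence = list(tokenizer.sot_sequence_including_notimestamps)

    # <|startofprev|> + prompt + SOT sequence + target
    decoder_input = [tokenizer.sot_prev] + prompt_tokens + sot_sequence + target_tokens

    # Labels are the decoder inputs shifted left by one, with EOT appended.
    # The prompt region is masked with -1 so no loss is computed on the context.
    labels = decoder_input[1:] + [tokenizer.eot]
    num_masked = len(prompt_tokens)
    labels[:num_masked] = [IGNORE_INDEX] * num_masked

    return torch.tensor(decoder_input), torch.tensor(labels)
```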
-
Check out this blog for fine-tuning Whisper for multilingual ASR with Hugging Face Transformers: https://huggingface.co/blog/fine-tune-whisper It provides a step-by-step guide to fine-tuning, right from data preparation to evaluation 🤗 There's a Google Colab so you can also run it as a notebook 😉
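For readers who only want the shape of the training loop from that post, a heavily condensed sketch is below. train_dataset and data_collator are placeholders for the objects the blog builds during data preparation, and the hyperparameters are only indicative.

```python
from transformers import (
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    WhisperForConditionalGeneration,
    WhisperProcessor,
)

# Processor pairs the feature extractor (audio -> log-Mel) with the tokenizer (text -> labels).
processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="Hindi", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-hi",
    per_device_train_batch_size=16,
    learning_rate=1e-5,
    max_steps=4000,
    fp16=True,
    predict_with_generate=True,
)

# train_dataset and data_collator are assumed to be prepared as in the blog post.
trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator,
    tokenizer=processor.feature_extractor,
)
trainer.train()
```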
-
@sanchit-gandhi
-
How is it possible to add a new layer on top of Whisper itself and change the task to something more specific? I still can't understand how to do that.
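Not an official recipe, but one common pattern is to take Whisper's encoder and put a small task-specific head on top of it. The sketch below assumes Hugging Face Transformers; WhisperWithClassifier, freezing the encoder, and the mean pooling are all illustrative choices.

```python
import torch
import torch.nn as nn
from transformers import WhisperModel


class WhisperWithClassifier(nn.Module):
    """Hypothetical example: frozen Whisper encoder + a linear classification head,
    e.g. for a domain-specific audio-classification task."""

    def __init__(self, num_labels: int, checkpoint: str = "openai/whisper-base"):
        super().__init__()
        self.encoder = WhisperModel.from_pretrained(checkpoint).encoder
        for param in self.encoder.parameters():  # optionally keep Whisper frozen
            param.requires_grad = False
        self.head = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_features: torch.Tensor) -> torch.Tensor:
        # input_features: (batch, 80, 3000) log-Mel spectrograms from WhisperFeatureExtractor
        hidden = self.encoder(input_features).last_hidden_state  # (batch, frames, d_model)
        pooled = hidden.mean(dim=1)                               # simple mean pooling
        return self.head(pooled)                                  # (batch, num_labels)
```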
-
Whisper recognizes audio to text almost perfectly for me! But it doesn't know some very specific terms, names, and abbreviations from my domain. How can I supply Whisper with a vocabulary list, or train (fine-tune) it so that it recognizes special terms as well as general English words? How can the "initial_prompt" parameter help me with that?
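As far as I understand, initial_prompt only biases decoding for the current file (it is fed to the decoder as preceding context), so it can nudge Whisper toward your spellings but does not permanently teach it new words; for that you would fine-tune on in-domain data. A small sketch with the openai-whisper package, where the glossary string and file name are made up:

```python
import whisper

model = whisper.load_model("medium")

# Hypothetical domain terms: the prompt biases decoding toward this vocabulary
# for this transcription only; it does not modify the model weights.
domain_terms = "Kubernetes, etcd, kubelet, Istio, Prometheus, Grafana"

result = model.transcribe(
    "meeting.wav",
    initial_prompt=f"Glossary: {domain_terms}.",
)
print(result["text"])
```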
-
Hi!
-
@sanchit-gandhi How can I make sure that sequential fine-tuning works (fine-tuning on language X followed by fine-tuning on language Y)? I have tried fine-tuning on Language 1 and then on Language 2. Since during fine-tuning we are only learning to predict the tokens of the given language, it shouldn't affect the performance of other languages if they don't share a similar script. However, on evaluation with test data, the WER for Language 1 increased after fine-tuning its last checkpoint on Language 2, and I can see that the predictions come out in Language 2 instead of Language 1.
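Not a fix for the forgetting itself, but when evaluating Language 1 it may help to pin the language token at generation time so decoding cannot drift into Language 2. A rough sketch with Hugging Face Transformers, using Hindi as a stand-in for Language 1 and a zero tensor in place of real features:

```python
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

checkpoint = "openai/whisper-small"  # replace with your checkpoint after the Language 2 stage
processor = WhisperProcessor.from_pretrained(checkpoint)
model = WhisperForConditionalGeneration.from_pretrained(checkpoint)

# Force <|hi|><|transcribe|> at the start of generation so the decoder
# transcribes in Language 1 rather than drifting into Language 2.
forced_ids = processor.get_decoder_prompt_ids(language="hindi", task="transcribe")

# Real features would come from processor(audio_array, sampling_rate=16000,
# return_tensors="pt").input_features; a zero tensor keeps the sketch self-contained.
input_features = torch.zeros(1, 80, 3000)

predicted_ids = model.generate(input_features, forced_decoder_ids=forced_ids)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True))
```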
-
Regarding the Japanese kana notebook: does anyone have a small sample of the dataset format? (I won't be training on Japanese, so downloading the entire dataset just for reference seems excessive....)
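If it is only the expected schema you are after, one option is to stream a handful of rows instead of downloading the whole dataset. A sketch with 🤗 Datasets; the Common Voice Japanese config is just a guess at what the notebook uses, and it requires accepting the dataset terms on the Hub:

```python
from itertools import islice

from datasets import load_dataset

# Placeholder dataset/config: substitute whatever the kana notebook actually loads.
stream = load_dataset(
    "mozilla-foundation/common_voice_11_0", "ja", split="train", streaming=True
)

# Pull just a few rows to inspect the schema without downloading the full set.
for example in islice(stream, 3):
    print({key: type(value).__name__ for key, value in example.items()})
    print(example["sentence"])
```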
-
@sanchit-gandhi, is it possible to fine-tune for Japanese if my dataset is Japanese audio paired with English translations? Can one fine-tune just for task=translation? Or am I totally missing the right flow :) Thanks
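In principle that should work: with the Hugging Face fine-tuning flow, the main change from the transcription recipe is to build the processor with task="translate" and to use the English translations as the label text. A rough sketch, where the "audio" and "translation" column names are placeholders for your dataset:

```python
from transformers import WhisperProcessor

# Japanese audio in, English text out: the tokenizer will prefix <|ja|> <|translate|>.
processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="Japanese", task="translate"
)


def prepare_example(batch):
    # "audio" and "translation" are placeholder column names.
    audio = batch["audio"]
    batch["input_features"] = processor.feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    batch["labels"] = processor.tokenizer(batch["translation"]).input_ids
    return batch
```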