Model parallelism of pretrained models. #4452
ryuryu18yaki asked this question in Q&A (unanswered)
Hi, I am trying to fine-tune the Whisper large model with the AWS SageMaker model parallelism library, but I cannot figure out how to write the training script. Could you help me with the two questions below?

1. How do I wrap a pretrained model with `smp.DistributedModel`?

I want to load the model from OpenAI's whisper repository like this:
```python
import whisper

model = whisper.load_model("large")
```
However, if I need to use a different library instead, I am willing to do so.
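For reference, here is a rough, untested sketch of what I imagine the training script should look like. I am assuming that `smdistributed.modelparallel.torch` is the right entry point and that the loaded model can simply be wrapped; the optimizer choice, learning rate, and placeholder loss are my own guesses, so please correct me if this is the wrong approach.

```python
# Rough, untested sketch of the training script I have in mind.
# Assumption: the whisper model can be loaded on CPU and then wrapped
# with smp.DistributedModel so the library partitions it across GPUs.
import smdistributed.modelparallel.torch as smp
import torch
import whisper

smp.init()  # initialize the SageMaker model parallelism library

# Load the pretrained OpenAI Whisper model on CPU and let smp place the partitions.
model = whisper.load_model("large", device="cpu")
model = smp.DistributedModel(model)

optimizer = smp.DistributedOptimizer(
    torch.optim.AdamW(model.parameters(), lr=1e-5)
)

@smp.step
def train_step(mel, tokens):
    # Forward/backward for one batch. The loss below is only a placeholder;
    # the real script would compute a cross-entropy loss over the decoder output.
    logits = model(mel, tokens)
    loss = logits.mean()
    model.backward(loss)  # smp uses model.backward() instead of loss.backward()
    return loss
```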
2. The model parallelism library offers three approaches (sharded data parallelism, pipeline parallelism, and tensor parallelism). Which one should I use?

At the moment I am thinking of using sharded data parallelism; a rough sketch of the launcher configuration I have in mind is below.
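Is something like the following estimator configuration the right direction for sharded data parallelism? The parameter names and values here (`sharded_data_parallel_degree`, the instance type, the framework version, and so on) are only my guesses from reading the documentation, not a verified setup.

```python
# My guess at the launcher-side configuration for sharded data parallelism.
# All parameter values below are assumptions based on the docs, not tested.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",           # the training script from question 1
    role="<my-sagemaker-role>",       # placeholder
    instance_type="ml.p4d.24xlarge",
    instance_count=1,
    framework_version="1.13",
    py_version="py39",
    distribution={
        "smdistributed": {
            "modelparallel": {
                "enabled": True,
                "parameters": {
                    "sharded_data_parallel_degree": 8,  # shard states across 8 GPUs
                    "ddp": True,
                },
            }
        },
        "mpi": {"enabled": True, "processes_per_host": 8},
    },
)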
My English is poor, but I hope you can answer me.