Model parallelism of pretrained models. #4452
ryuryu18yaki asked this question in Q&A (unanswered)
Hi, I am trying to fine-tune the Whisper large model with the AWS SageMaker model parallelism library, but I cannot figure out how to write the training script. Could you help me with the two questions below?

1. How do I wrap a pretrained model with `smp.DistributedModel`?

I want to load the model from OpenAI's whisper repository like this:
```python
import whisper

model = whisper.load_model("large")
```
However, if I need to use a different library instead, I am willing to do so.
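For reference, here is a rough, untested sketch of what I imagine the training script should look like. I am assuming that `smdistributed.modelparallel.torch` is the right entry point and that the loaded model can simply be wrapped; the optimizer choice, learning rate, and placeholder loss are my own guesses, so please correct me if this is the wrong approach.

```python
# Rough, untested sketch of the training script I have in mind.
# Assumption: the whisper model can be loaded on CPU and then wrapped
# with smp.DistributedModel so the library partitions it across GPUs.
import smdistributed.modelparallel.torch as smp
import torch
import whisper

smp.init()  # initialize the SageMaker model parallelism library

# Load the pretrained OpenAI Whisper model on CPU and let smp place the partitions.
model = whisper.load_model("large", device="cpu")
model = smp.DistributedModel(model)

optimizer = smp.DistributedOptimizer(
    torch.optim.AdamW(model.parameters(), lr=1e-5)
)

@smp.step
def train_step(mel, tokens):
    # Forward/backward for one batch. The loss below is only a placeholder;
    # the real script would compute a cross-entropy loss over the decoder output.
    logits = model(mel, tokens)
    loss = logits.mean()
    model.backward(loss)  # smp uses model.backward() instead of loss.backward()
    return loss
```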
2. The model parallelism library offers three approaches (sharded data parallelism, pipeline parallelism, and tensor parallelism). Which one should I use?

At the moment I am thinking of using sharded data parallelism; a rough sketch of the launcher configuration I have in mind is below.
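Is something like the following estimator configuration the right direction for sharded data parallelism? The parameter names and values here (`sharded_data_parallel_degree`, the instance type, the framework version, and so on) are only my guesses from reading the documentation, not a verified setup.

```python
# My guess at the launcher-side configuration for sharded data parallelism.
# All parameter values below are assumptions based on the docs, not tested.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",           # the training script from question 1
    role="<my-sagemaker-role>",       # placeholder
    instance_type="ml.p4d.24xlarge",
    instance_count=1,
    framework_version="1.13",
    py_version="py39",
    distribution={
        "smdistributed": {
            "modelparallel": {
                "enabled": True,
                "parameters": {
                    "sharded_data_parallel_degree": 8,  # shard states across 8 GPUs
                    "ddp": True,
                },
            }
        },
        "mpi": {"enabled": True, "processes_per_host": 8},
    },
)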
My English is poor, but I hope you can answer me.