How to reproduce

Running python ./tests/transformers/models/mbart/test_training.py produces:

Reusing dataset glue (/root/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
100%|███████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 682.67it/s]
100%|██████████████████████████████████████████████████████████████████████████████| 68/68 [00:01<00:00, 52.15ba/s]
100%|████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 43.15ba/s]
100%|████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 42.94ba/s]
You are using a model of type bart to instantiate a model of type mbart. This is not supported for all configurations of models and can yield errors.
Some weights of MBartForConditionalGeneration were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['encoder.layer_norm.bias', 'decoder.layer_norm.weight', 'encoder.layer_norm.weight', 'decoder.layer_norm.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
You are using a model of type bart to instantiate a model of type mbart. This is not supported for all configurations of models and can yield errors.
Some weights of MBartForConditionalGeneration were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['encoder.layer_norm.bias', 'decoder.layer_norm.weight', 'encoder.layer_norm.weight', 'decoder.layer_norm.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
PyTorch: setting up devices
Traceback (most recent call last):
File "./tests/transformers/models/mbart/test_training.py", line 94, in <module>
fp16=False,
File "./tests/transformers/models/mbart/test_training.py", line 44, in train
eval_dataset=dataset["validation"],
File "/opt/conda/lib/python3.7/site-packages/oslo_core-3.0.0-py3.7.egg/oslo/transformers/trainer.py", line 186, in __init__
if len(args.parallel_mode) > 0:
AttributeError: 'TrainingArguments' object has no attribute 'parallel_mode'

The problem seems to be that the parallel_mode property in training_args.py is commented out (line 989):
```python
# @property
# def parallel_mode(self):
#     """
#     The current mode used for parallelism if multiple GPUs/TPU cores are available. One of:
#
#     - ParallelMode.NOT_PARALLEL: no parallelism (CPU or one GPU).
#     - ParallelMode.NOT_DISTRIBUTED: several GPUs in one single process (uses torch.nn.DataParallel).
#     - ParallelMode.DISTRIBUTED: several GPUs, each having its own process (uses
#       torch.nn.DistributedDataParallel).
#     - ParallelMode.TPU: several TPU cores.
#     """
#     # if is_torch_tpu_available():
#     #     return ParallelMode.TPU
#     # elif is_sagemaker_mp_enabled():
#     #     return ParallelMode.SAGEMAKER_MODEL_PARALLEL
#     # elif is_sagemaker_dp_enabled():
#     #     return ParallelMode.SAGEMAKER_DATA_PARALLEL
#     if self.local_rank != -1:
#         return ParallelMode.DISTRIBUTED
#     elif self.n_gpu > 1:
#         return ParallelMode.NOT_DISTRIBUTED
#     else:
#         return ParallelMode.NOT_PARALLEL
```
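Until the property is restored in training_args.py, a possible stopgap is to attach a parallel_mode value to the arguments object before handing it to the oslo Trainer, so the len(args.parallel_mode) > 0 check at trainer.py line 186 has something to read. The sketch below is untested; the import path, the constructor arguments, and the empty-list default (read as "no parallelism requested") are assumptions based on the traceback above, not a verified oslo API.

```python
# Hypothetical stopgap, not a verified oslo API: the parallel_mode property is
# commented out in training_args.py, so the attribute is simply missing on the
# instance. Import path and constructor arguments are guesses from the traceback.
from oslo.transformers.training_args import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",
    fp16=False,
)

# Give the instance an empty parallel_mode so the trainer's
# `if len(args.parallel_mode) > 0:` check reads it as "no parallelism"
# instead of raising AttributeError.
if not hasattr(args, "parallel_mode"):
    args.parallel_mode = []
```

The real fix is presumably to uncomment the property (or have the trainer tolerate its absence); note that whatever the property is meant to return has to support len() for the check in trainer.py to work.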