Hello,
Thank you for releasing the code for pretraining MPNet!
I am trying to continue training the language modeling task on a custom dataset, starting from the released checkpoint via the --restore-file argument. However, the checkpoint fails to load with the following error:
File "MPNet/pretraining/fairseq/checkpoint_utils.py", line 307, in _upgrade_state_dict
    registry.set_defaults(state['args'], tasks.TASK_REGISTRY[state['args'].task])
KeyError: 'mixed_position_lm'
In case it helps, here are the details of the training command:
TOTAL_UPDATES=125000   # Total number of training updates (value assumed; adjust as needed)
WARMUP_UPDATES=50000   # Warm up the learning rate over this many updates
PEAK_LR=0.0005         # Peak learning rate, adjust as needed
TOKENS_PER_SAMPLE=512  # Max sequence length
MAX_POSITIONS=512      # Num. positional embeddings (usually same as above)
MAX_SENTENCES=35       # Number of sequences per batch (batch size)
UPDATE_FREQ=16         # Increase the effective batch size 16x via gradient accumulation
DATA_DIR=data-bin
fairseq-train --fp16 $DATA_DIR \
--task masked_permutation_lm --criterion masked_permutation_cross_entropy \
--arch mpnet_base --sample-break-mode none --tokens-per-sample $TOKENS_PER_SAMPLE \
--optimizer adam --adam-betas '(0.9,0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
--lr-scheduler polynomial_decay --lr $PEAK_LR --warmup-updates $WARMUP_UPDATES \
--total-num-update $TOTAL_UPDATES --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
--max-sentences $MAX_SENTENCES --update-freq $UPDATE_FREQ --skip-invalid-size-inputs-valid-test \
--max-update $TOTAL_UPDATES --log-format simple --log-interval 1 --input-mode 'mpnet' \
--restore-file mpnet.base/mpnet.pt --save-interval-updates 10 --ddp-backend no_c10d
I would appreciate any insights on how to resolve this error. Thank you!
@ast123 I got hold of the checkpoint file; give this a try if you are still trying to pretrain.
The error occurs because the checkpoint state stores the name of the task it was trained with, which in this checkpoint is 'mixed_position_lm', a task that is not registered in the released code.
As a quick fix, I popped both the task and the criterion from the state['args'] loaded from this checkpoint, just before line 307 (the line that throws the error), so that the task and criterion passed on the command line are used instead.
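For reference, here is a minimal sketch of that workaround in MPNet/pretraining/fairseq/checkpoint_utils.py. It overwrites the stale names rather than deleting them, since the line that raises the KeyError reads state['args'].task immediately afterwards; the replacement task and criterion names are assumptions taken from the training command above, so substitute whatever you pass via --task and --criterion.
# Inside _upgrade_state_dict, just before the registry.set_defaults(...)
# call at line 307 that raises the KeyError.
if getattr(state['args'], 'task', None) not in tasks.TASK_REGISTRY:
    # The released checkpoint records task='mixed_position_lm', which is not
    # registered in this code base; overwrite it (and the matching criterion)
    # so the registry lookup below succeeds and the command-line choices apply.
    # These names are assumptions; match them to your --task/--criterion flags.
    state['args'].task = 'masked_permutation_lm'
    state['args'].criterion = 'masked_permutation_cross_entropy'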