prefix-tuning does not work with pytorch >= 1.5.0 #358

Closed
2 of 4 tasks
alexanderhanboli opened this issue Jun 3, 2022 · 0 comments · Fixed by #359
Labels
bug Something isn't working

Comments

@alexanderhanboli

Environment info

  • adapter-transformers version: 3.0.1
  • Platform: Linux-4.14.256-197.484.amzn2.x86_64-x86_64-with-glibc2.10
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.11.0+cu102 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: parallel

Information

Model I am using (Bert, XLNet ...): T5

Language I am using the model on (English, Chinese ...): English

Adapter setup I am using (if any): prefix-tuning

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/pytorch/summarization/run_summarization.py --model_name_or_path "google/t5-v1_1-base" --do_train --train_adapter --adapter_config prefix_config.json

This is the prefix_config.json:

{
    "architecture": "prefix_tuning",
    "encoder_prefix": true,
    "cross_prefix": true,
    "leave_out": [],
    "flat": false,
    "prefix_length": 30,
    "bottleneck_size": 512,
    "non_linearity": "tanh",
    "dropout": 0.0
}

This line (https://github.com/adapter-hub/adapter-transformers/blob/master/src/transformers/adapters/prefix_tuning.py#L49) does not work with data parallel training on PyTorch >= 1.5.0, which is what the command above uses when four GPUs are visible. It needs to be updated to something similar to (https://github.com/huggingface/transformers/blob/1c220ced8ecc5f12bc979239aa648747411f9fc4/src/transformers/modeling_utils.py#L121).
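To make the suggestion concrete, here is a minimal sketch of the kind of fallback I have in mind. It assumes the failing line reads the device from the module's first parameter (the linked line is not quoted here, so that is an assumption). Since PyTorch 1.5, `nn.DataParallel` replicas store their weights as plain tensor attributes, so `parameters()` can be empty inside a replica and `next(...)` raises `StopIteration`. The name `get_module_device` is hypothetical and not part of either library.

```python
import torch
from torch import nn


def get_module_device(module: nn.Module) -> torch.device:
    """Return the device of `module` without assuming it exposes parameters.

    Hypothetical helper, not the actual adapter-transformers or transformers code.
    """
    try:
        # Normal case: take the device of the first registered parameter.
        return next(module.parameters()).device
    except StopIteration:
        # Fallback for nn.DataParallel replicas (PyTorch >= 1.5), where the
        # weights live in the module's __dict__ as plain tensors.
        for submodule in module.modules():
            for value in vars(submodule).values():
                if torch.is_tensor(value):
                    return value.device
        raise ValueError("module holds no parameters or tensor attributes")
```

Calling `get_module_device(self)` in place of a direct `next(self.parameters()).device` lookup should behave the same on a single device, while no longer failing inside a replica.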

@alexanderhanboli alexanderhanboli added the bug Something isn't working label Jun 3, 2022
@calpt calpt linked a pull request Jun 14, 2022 that will close this issue
@calpt calpt closed this as completed in #359 Jul 5, 2022