
[LoRA] Adds support for bias in LoRA #5733

Open

followumesh wants to merge 29 commits into main
Conversation

@followumesh (Author)

Motivation
PEFT (used by https://github.com/foundation-model-stack/fms-hf-tuning) includes support for tuning a LoRA bias term. This PR enables bias for LoRA, so adapters trained with bias will work with vLLM.

Changes Included

  • LoRA bias support for different types of modules.
  • LoRA bias support for fully sharded LoRA.
  • Test file test-lora-bias.py
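
For context, a minimal sketch of the computation this adds, assuming the standard LoRA formulation plus an optional per-adapter bias term (the names below are illustrative, not vLLM internals):

```python
import torch

def lora_forward_with_bias(
    x: torch.Tensor,          # (num_tokens, in_features) input activations
    base_out: torch.Tensor,   # (num_tokens, out_features) frozen base-layer output
    lora_a: torch.Tensor,     # (in_features, rank)
    lora_b: torch.Tensor,     # (rank, out_features)
    scaling: float,
    lora_bias: torch.Tensor | None = None,  # (out_features,) optional bias
) -> torch.Tensor:
    # Standard LoRA low-rank update, scaled.
    delta = (x @ lora_a @ lora_b) * scaling
    # The new part: adapters trained with bias also add a learned bias term.
    if lora_bias is not None:
        delta = delta + lora_bias
    return base_out + delta
```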

@Yard1 (Collaborator) left a comment

Could we add an engine argument, enable_lora_bias, and avoid initializing the bias tensors when it's false? If the user knows none of their LoRAs will have bias, we can save memory.
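
For illustration, the kind of gating being asked for might look roughly like this (the function and argument names are assumptions, not the PR's actual code):

```python
import torch

def maybe_create_bias_tensor(
    enable_lora_bias: bool,
    max_loras: int,
    out_features: int,
    dtype: torch.dtype,
    device: torch.device,
) -> torch.Tensor | None:
    # Skip the allocation entirely when the user has told the engine that
    # no adapter will carry a bias term; this is where the memory saving
    # comes from.
    if not enable_lora_bias:
        return None
    return torch.zeros(max_loras, out_features, dtype=dtype, device=device)
```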

@followumesh (Author)

@Yard1 Thanks for reviewing the PR. I have added the enable_lora_bias flag (default false), which prevents the allocation of LoRA bias tensors when it is disabled.

@njhill (Member) commented Jun 27, 2024

Related: #5930

@Yard1 (Collaborator) left a comment

Looks good, can we also add an e2e test?

@DarkLight1337 (Member)

To speed up the CI queue for #5905, I've cancelled the distributed tests for the latest CI run in this PR since they won't pass anyway until #5905 has been merged. Please merge main into your branch after that happens so that the CI can pass once again.

@followumesh (Author)

@Yard1 Thanks for reviewing. I've added an e2e test for the lora_bias support.
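
For reference, a rough sketch of what such an e2e test could look like, assuming the enable_lora_bias engine flag from this PR and placeholder model/adapter paths (the actual test added in the PR may differ):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholder paths: the real test would use a small base model and an
# adapter that was trained with LoRA bias enabled.
BASE_MODEL = "meta-llama/Llama-2-7b-hf"
BIAS_LORA_PATH = "/path/to/lora-adapter-with-bias"


def test_lora_bias():
    llm = LLM(
        model=BASE_MODEL,
        enable_lora=True,
        enable_lora_bias=True,  # assumed flag introduced by this PR
        max_loras=1,
    )
    prompts = ["The capital of France is"]
    params = SamplingParams(temperature=0.0, max_tokens=8)

    base_out = llm.generate(prompts, params)
    lora_out = llm.generate(
        prompts,
        params,
        lora_request=LoRARequest("bias-adapter", 1, BIAS_LORA_PATH),
    )

    # With the bias-carrying adapter applied, greedy output should differ
    # from the base model for this prompt.
    assert base_out[0].outputs[0].text != lora_out[0].outputs[0].text
```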

@njhill (Member) commented Jul 29, 2024

@followumesh you need to run ./format.sh to fix the linting errors

@njhill (Member) commented Aug 1, 2024

@followumesh apologies, this needs another conflict resolution!

@followumesh (Author)

@Yard1 @njhill I have included the e2e test and merged the recent changes. Can you please review the commit? Thanks

```
@@ -64,6 +64,64 @@ def dec(*args, **kwargs):
    return dec


def apply_bias(
```
Contributor

Would it be possible to add this inside PunicaWrapper.add_lora()? There could be an optional bias argument in add_lora() and then the logic of testing whether the bias is None and doing the index computation could be moved inside this function. It seems to me that it would eliminate repeated code lines in this file, but I don't know all the details.
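
For illustration, a rough sketch of the shape of that suggestion: a single entry point that takes an optional bias and keeps the None check and the per-token index gather in one place (tensor layouts and names here are assumptions, not PunicaWrapper's actual interface):

```python
from typing import Optional

import torch


def add_lora_with_optional_bias(
    y: torch.Tensor,                    # (num_tokens, out_features), updated in place
    x: torch.Tensor,                    # (num_tokens, in_features)
    lora_a_stacked: torch.Tensor,       # (max_loras, in_features, rank)
    lora_b_stacked: torch.Tensor,       # (max_loras, rank, out_features)
    token_lora_indices: torch.Tensor,   # (num_tokens,) adapter id per token
    scale: float,
    bias_stacked: Optional[torch.Tensor] = None,  # (max_loras, out_features)
) -> None:
    # Per-token low-rank update, gathered by each token's adapter index.
    a = lora_a_stacked[token_lora_indices]            # (num_tokens, in, rank)
    b = lora_b_stacked[token_lora_indices]            # (num_tokens, rank, out)
    y += torch.bmm(torch.bmm(x.unsqueeze(1), a), b).squeeze(1) * scale
    # The point of the suggestion: the "is bias present?" check and index
    # gather live here once, instead of being repeated at every call site.
    if bias_stacked is not None:
        y += bias_stacked[token_lora_indices]
```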

@followumesh (Author)

@maxdebayser I don't have a strong opinion, but this was done to keep the changes out of the Punica wrapper, since the bias handling is not directly related to Punica.

```python
assert parts[0] == "base_model"
assert parts[1] == "model"
if parts[-1] == "weight":
    assert parts[-2] == "lora_A" or parts[-2] == "lora_B"
```
Contributor

This assertion is failing for a couple of the lora_modules that we have:

```python
>>> lora_modules[0].split(".")
['base_model', 'model', 'lm_head', 'weight']
>>> lora_modules[1].split(".")
['base_model', 'model', 'model', 'embed_tokens', 'weight']
```

Still investigating whether this would have thrown an error with the previous code as well...

Member

@followumesh actually it looks like this was a bad merge; it's reverting a recent change that was made to improve the error message.

@followumesh (Author)

@prashantgupta24 Can you point to a lora module I can test with?

Contributor

Let me try to find one. Also, reverting this change gives me the error:

```
ValueError: base_model.model.lm_head.weight is unsupported LoRA weight
```

@prashantgupta24 (Contributor) commented Aug 19, 2024

To summarize: since vLLM expects only LoRA weights in the safetensors file, it was actually an error in our LoRA adapter that lm_head was included in it. However, the original error message is being reverted by this change. Ideally, the error should be

```
ValueError: base_model.model.lm_head.weight is unsupported LoRA weight
```

Instead, an assertion error from `assert parts[-2] == "lora_A" or parts[-2] == "lora_B"` is now raised, which doesn't provide the right detail.
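
For illustration, a minimal sketch of the kind of check being asked for, raising the descriptive ValueError for unexpected weight names instead of a bare assert (the bias key naming is an assumption; the real parsing code in vLLM may differ):

```python
def parse_lora_weight_name(name: str) -> tuple[str, bool, bool]:
    """Return (module_name, is_lora_a, is_bias), or raise a descriptive error."""
    parts = name.split(".")
    if len(parts) >= 4 and parts[0] == "base_model" and parts[1] == "model":
        if parts[-1] == "weight" and parts[-2] in ("lora_A", "lora_B"):
            return ".".join(parts[2:-2]), parts[-2] == "lora_A", False
        # Assumed naming for adapters saved with a LoRA bias term.
        if parts[-1] == "bias" and parts[-2] == "lora_B":
            return ".".join(parts[2:-2]), False, True
    # e.g. base_model.model.lm_head.weight ends up here with a clear message
    # rather than an opaque AssertionError.
    raise ValueError(f"{name} is unsupported LoRA weight")
```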

Member

Yes, @followumesh, I think the changes here are not actually related to the main PR changes. Could you revert them (and any other changes in the same category)?

@followumesh (Author)

@prashantgupta24 Can you check now?

Contributor

Yeah this looks better. I'll let you know when I get a chance to test it!
