
ExtractAdapters: Extract lora adapters and use them as model inputs or external initializers #1064

Merged
merged 10 commits into main on Apr 10, 2024

Conversation

jambayk
Contributor

@jambayk jambayk commented Apr 9, 2024

Describe your changes

  • Introduce a new pass, ExtractAdapters, which extracts the LoRA adapter weights (float or statically quantized) and saves them in a separate file. The model graph is also modified in one of the following ways:
    1. Adapter weights are set as external tensors pointing to a non-existent file. The ONNX model is thus invalid by itself, since it cannot be loaded directly. To create an inference session from this model, the adapter weights must be added to a session options object using add_initializer or add_external_initializers.
    2. Adapter weights are converted into model inputs. The ONNX model is valid. During inference, the adapter weights must be provided as part of the inputs. We call them constant inputs since these weights don't change between runs when using one set of adapters.
  • olive.scripts.export_adapters is provided as a standalone script to directly export pre-existing adapter weights into the same format as the ExtractAdapters pass.
  • OnnxDAG utils were added to perform ONNX graph manipulations. The methods are generic and can be reused in other new passes.
  • ONNXModelHandler has two corresponding new attributes, each of which must be the name of a file in the model_path directory:
    1. external_initializers_file_name
    2. constant_inputs_file_name
  • Both olive.common.ort_inference.get_ort_inference_session and OrtInferenceSession can handle the external initializers and constant inputs.
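The constant-inputs mode (option 2 above) can be sketched with plain numpy: the exported adapter weights live in an .npz archive, and at inference time they are merged into the regular input feed. The helper name and the session.run usage below are illustrative assumptions, not Olive API:

```python
import numpy as np

def load_constant_inputs(npz_path):
    """Load extracted adapter weights from an .npz archive into a dict
    of numpy arrays keyed by the corresponding model input name."""
    with np.load(npz_path) as data:
        return {name: data[name] for name in data.files}

# Illustrative usage with an ONNX Runtime session (not run here):
#   constant_inputs = load_constant_inputs("adapter_weights.npz")
#   outputs = session.run(None, {**run_inputs, **constant_inputs})
```

Because the adapter weights are ordinary inputs in this mode, swapping adapters between runs is just a matter of loading a different .npz file.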

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Does this PR include example changes? If yes, please remember to update the example documentation in a follow-up PR.

(Optional) Issue link

@guotuofeng
Collaborator

The external_initializers_name description needs to be updated.

@guotuofeng
Collaborator

LGTM. I'll let others add comments.

```python
@classmethod
def _default_config(cls, accelerator_spec: AcceleratorSpec) -> Dict[str, PassConfigParam]:
    return {
        "make_inputs": PassConfigParam(
```
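For context, a workflow config enabling this pass might look like the sketch below; make_inputs toggles between the two modes described in the PR description. The surrounding dict shape follows common Olive pass-config conventions and is an assumption here, not taken from this PR:

```python
# Hedged sketch: "make_inputs" appears in the pass's default config above;
# the surrounding structure is assumed for illustration.
pass_config = {
    "type": "ExtractAdapters",
    "config": {
        # False -> adapter weights become external initializers (mode 1)
        # True  -> adapter weights become model inputs (mode 2)
        "make_inputs": True,
    },
}
```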
Contributor

Do we need to extend it with external data config?

Contributor

It seems the external-data-related config would always be enabled, right?

Contributor Author

Yes, as it stands the model will always be saved with external data, so we don't expose user config options for it.

```python
# raw_data is required for set_external_data
if not new_initializer.HasField("raw_data"):
    new_initializer.raw_data = b""
set_external_data(new_initializer, location="dummy-location.bin")
```
Contributor

Will this external data be stored under the path where the user runs Olive? It seems to be a dummy file. Where do we remove it?

Contributor

Let us assume the passes flow as:

  1. model M -> Finetuning A -> conversion -> extract external lora weights A1
  2. model M -> Finetuning B -> conversion -> extract external lora weights B1

Will the dummy-location.bin be overwritten?

Contributor Author

@jambayk jambayk Apr 10, 2024

This is actually just a placeholder. Since the raw_data field is cleared, model save doesn't write anything to this location. https://github.com/onnx/onnx/blob/main/onnx/external_data_helper.py#L304

This is why the PR description says the ONNX model produced by extracting the adapters as initializers is invalid: it points to a non-existent file. This is okay since we only intend to use this model through an ORT inference session where the missing initializers have already been added to the session options.

```shell
python -m olive.scripts.export_adapters --adapter_path Mikael110/llama-2-7b-guanaco-qlora --dtype float16 --pack_weights --output_path models/guanaco_fp16_packed.npz
```

The snippet below shows example runs of the generated fine-tuned model using two different adapters.
Contributor

As a follow-up PR, we want to move this snippet into a working Jupyter notebook.

olive/passes/onnx/onnx_dag.py
```diff
@@ -0,0 +1,100 @@
+import argparse
```
Contributor

We need to cover this script in Olive docs.

@jambayk jambayk merged commit fc189f9 into main Apr 10, 2024
33 checks passed
@jambayk jambayk deleted the jambayk/adapters branch April 10, 2024 19:55
4 participants