
Add encoder decoder model #851
Merged: 16 commits merged into huggingface:main on Sep 1, 2023

Conversation

mht-sharma (Contributor) commented on Mar 3, 2023:

What does this PR do?

Support encoder-decoder export and inference in ORT

Fixes #367 (Add support for encoder-decoder models)
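
For context, a minimal usage sketch of what this enables on the user side, assuming the standard ORTModelForSeq2SeqLM entry point; the checkpoint name and generation call are illustrative, not taken from this PR:

    # Hedged sketch: export a composite encoder-decoder checkpoint to ONNX and
    # run generation through ONNX Runtime. Checkpoint name is illustrative.
    from transformers import AutoTokenizer
    from optimum.onnxruntime import ORTModelForSeq2SeqLM

    model_id = "patrickvonplaten/bert2bert_cnn_daily_mail"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = ORTModelForSeq2SeqLM.from_pretrained(model_id, export=True)

    inputs = tokenizer("ONNX Runtime is a cross-platform inference engine.", return_tensors="pt")
    summary_ids = model.generate(**inputs)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))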

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

HuggingFaceDocBuilderDev commented on Mar 3, 2023:

The documentation is not available anymore as the PR was closed or merged.

mht-sharma requested review from fxmarty and michaelbenayoun and removed the review request for fxmarty on Mar 6, 2023.
optimum/exporters/onnx/config.py (review thread resolved)
@@ -304,7 +304,7 @@ def generate_dummy_inputs_for_validation(self, reference_model_inputs: Dict[str,
return reference_model_inputs


- class EncoderDecoderOnnxConfig(OnnxSeq2SeqConfigWithPast):
+ class DummyEncoderDecoderOnnxConfig(OnnxSeq2SeqConfigWithPast):
Member: Why "Dummy"?

Contributor (author): I want this to be the base class that all encoder-decoder type models inherit from. Do you have other naming suggestions? Maybe EncoderDecoderBaseOnnxConfig.
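
For illustration only, a sketch of the hierarchy being discussed; the class names and the import path are assumptions, not the merged code:

    # Hypothetical sketch of the suggested naming: one shared base config that
    # concrete encoder-decoder ONNX configs inherit from.
    from optimum.exporters.onnx.base import OnnxSeq2SeqConfigWithPast  # import path assumed

    class EncoderDecoderBaseOnnxConfig(OnnxSeq2SeqConfigWithPast):
        """Shared export behavior for composite encoder-decoder models."""

    class SomeEncoderDecoderOnnxConfig(EncoderDecoderBaseOnnxConfig):
        """A concrete model type would subclass the base."""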

Contributor: Yes.

tests/exporters/onnx/test_exporters_onnx_cli.py (outdated; review thread resolved)
tests/exporters/onnx/test_onnx_export.py (review thread resolved)
hljjjmssyh commented: This feature is very useful. I'm looking forward to it being merged into master as soon as possible.

Comment on lines 57 to 62
if model_type == "encoder-decoder" and task == "seq2seq-lm-with-past":
# The model uses bert as decoder and does not support past key values
continue
Contributor: Not sure I understand this. This means that pkv is not tested, but may be supported depending on which arch is used as decoder?

Contributor (author): Yes.
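
To make the reviewer's point concrete: whether the with-past export applies depends on the decoder inside the composite model. A hedged sketch, with the decoder choice illustrative rather than taken from this PR:

    # Past-KV support depends on the decoder architecture. Per the test comment
    # above, the bert decoder has no usable cache in this export path, while a
    # GPT-2 decoder does, so a bert-to-gpt2 model could exercise seq2seq-lm-with-past.
    from transformers import EncoderDecoderModel

    bert2gpt2 = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "gpt2"  # encoder checkpoint, decoder checkpoint
    )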

Comment on lines 85 to 128
"vision-encoder-decoder",
"encoder-decoder",
Contributor: Can you add a comment for those as well (like above for segformer, etc.)?

tests/onnxruntime/test_modeling.py (outdated; review thread resolved)
Comment on lines 3130 to 3242
if model_arch == "encoder-decoder" and use_cache is True:
return
Contributor: Same as above.

tests/onnxruntime/test_modeling.py (outdated; review thread resolved)
tests/onnxruntime/utils_onnxruntime_tests.py (outdated; review thread resolved)
michaelbenayoun (Member) left a review: Left a few comments, LGTM otherwise!

optimum/exporters/onnx/config.py (review thread resolved)
@@ -3172,6 +3175,8 @@ def test_merge_from_onnx_and_save(self, model_arch):

@parameterized.expand(grid_parameters(FULL_GRID))
def test_compare_to_transformers(self, test_name: str, model_arch: str, use_cache: bool, use_merged: bool):
if model_arch == "encoder-decoder" and use_cache is True:
Member: Suggested change:
- if model_arch == "encoder-decoder" and use_cache is True:
+ if model_arch == "encoder-decoder" and use_cache:

@@ -3232,6 +3238,9 @@ def test_compare_to_transformers(self, test_name: str, model_arch: str, use_cach

@parameterized.expand(grid_parameters(FULL_GRID))
def test_pipeline_text_generation(self, test_name: str, model_arch: str, use_cache: bool, use_merged: bool):
if model_arch == "encoder-decoder" and use_cache is True:
Member: Suggested change:
- if model_arch == "encoder-decoder" and use_cache is True:
+ if model_arch == "encoder-decoder" and use_cache:

mht-sharma merged commit a39b1f5 into huggingface:main on Sep 1, 2023. 64 of 68 checks passed.
Comment on lines +358 to +359
# TODO: validate the axis name for attention_mask
# common_inputs["attention_mask"][1] = "past_encoder_sequence_length + sequence_length"
Contributor: ?

Contributor (author): This was copied from class TextSeq2SeqOnnxConfig(OnnxSeq2SeqConfigWithPast), so if a change is made in the future it can be applied in both places.
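
For readers without the surrounding file, a hedged sketch of the convention the commented-out line follows: these ONNX configs map each input to a dict of dynamic-axis index to symbolic axis name, and with-past variants rename the sequence axis. The exact strings below are illustrative:

    # Sketch of the dynamic-axes convention in these ONNX configs:
    # input name -> {axis index: symbolic axis name}.
    common_inputs = {
        "input_ids": {0: "batch_size", 1: "sequence_length"},
        "attention_mask": {0: "batch_size", 1: "sequence_length"},
    }
    # The commented-out line under discussion would rename the mask's sequence
    # axis once past key values are in play:
    # common_inputs["attention_mask"][1] = "past_encoder_sequence_length + sequence_length"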

Comment on lines +3126 to +3127
if model_arch == "encoder-decoder":
self.skipTest("encoder-decoder model type with use_merged=True is not supported for bert as a decoder")
Contributor: Does this mean that encoder-decoder is not tested for merged ONNX?

Contributor (author): The test uses a bert-bert model, so only use_cache=False is exercised, which cannot work with use_merged=True.
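
A hedged sketch of how such a bert-bert test model is typically assembled with transformers (checkpoint names illustrative):

    # Illustrative bert-to-bert composite model like the one used in the test.
    # With no usable decoder cache, only use_cache=False applies, and the merged
    # export (which folds the with-past and without-past decoder ONNX files into
    # one) has nothing to merge.
    from transformers import EncoderDecoderModel

    bert2bert = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "bert-base-uncased"
    )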

Contributor: My question remains. There are some encoder-decoder models that support past KV. I am wondering if this is tested anywhere.

Contributor (author): OK, got it. There was no suitable model to add to the tests for such a model type. I will create a custom model and open a new PR now.
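
A hedged sketch of what such a custom test model could look like: a tiny, randomly initialized composite model whose decoder supports a KV cache. All sizes and the GPT-2 decoder choice are assumptions, not from this PR:

    # Hypothetical tiny encoder-decoder with a cache-capable decoder, small
    # enough to serve as a lightweight test checkpoint for the with-past paths.
    from transformers import BertConfig, GPT2Config, EncoderDecoderConfig, EncoderDecoderModel

    encoder_cfg = BertConfig(hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64)
    decoder_cfg = GPT2Config(n_embd=32, n_layer=2, n_head=2)
    # from_encoder_decoder_configs marks the decoder as a decoder and enables cross-attention.
    config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_cfg, decoder_cfg)
    tiny_model = EncoderDecoderModel(config=config)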

Comment on lines +3101 to +3102
if model_arch == "encoder-decoder" and use_cache is True:
self.skipTest("encoder-decoder model type with use_cache=True is not supported for bert as a decoder")
Contributor: Same question, with cache.

Contributor (author): Same as above; see this comment: discussion_r1162759972.

Comment on lines +3525 to +3526
if model_arch == "encoder-decoder":
use_cache = False
Contributor: There should rather be a skipTest for the use_cache case.

Contributor (author), Sep 1, 2023: In this particular test, only use_cache=True is tested, so the model was never going to be exercised. Hence, for this particular model I changed use_cache to False:

    @parameterized.expand(
        grid_parameters({"model_arch": SUPPORTED_ARCHITECTURES, "use_cache": [True], "use_merged": [False, True]})
    )
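
To spell out why the override is needed, here is a hedged stand-in for the grid expansion, written with itertools (grid_parameters itself lives in the test utilities):

    # Illustrative expansion of the grid above: use_cache is pinned to True,
    # so an arch that only works without cache never gets a test case unless
    # the test overrides use_cache itself.
    from itertools import product

    grid = {"model_arch": ["encoder-decoder"], "use_cache": [True], "use_merged": [False, True]}
    cases = [dict(zip(grid, values)) for values in product(*grid.values())]
    # -> [{'model_arch': 'encoder-decoder', 'use_cache': True, 'use_merged': False},
    #     {'model_arch': 'encoder-decoder', 'use_cache': True, 'use_merged': True}]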
