
Add Blip2 model in VQA pipeline #25532

Merged

Conversation

jpizarrom
Contributor

@jpizarrom jpizarrom commented Aug 16, 2023

What does this PR do?

Add Blip2ForConditionalGeneration model in VisualQuestionAnsweringPipeline.
Fixes part of #21110 and is based on #23348 and #21227.

Who can review?

Hi @NielsRogge, what do you think of this?
Thanks!

TODOs

  • add Blip2 model in VQA pipeline
  • use require_torch_gpu in test
  • use can_generate in vqa pipeline for Blip2ForConditionalGeneration (see the sketch after this list)
  • use float16 in the test_large_model_pt_blip2
  • check if it is necessary to cast the input in torch.float16 inside _forward
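
A minimal sketch of the dispatch this boils down to inside VisualQuestionAnsweringPipeline (a method-level sketch for illustration; the exact structure and generate_kwargs handling are assumptions, not the final diff):

def _forward(self, model_inputs, **generate_kwargs):
    # Generative VQA models such as Blip2ForConditionalGeneration produce the
    # answer with generate(), while classifier-style models such as
    # ViltForQuestionAnswering score a fixed answer vocabulary with a plain
    # forward pass.
    if self.model.can_generate():
        model_outputs = self.model.generate(**model_inputs, **generate_kwargs)
    else:
        model_outputs = self.model(**model_inputs)
    return model_outputs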

@ArthurZucker
Collaborator

cc @amyeroberts and @younesbelkada

@jpizarrom jpizarrom marked this pull request as ready for review August 16, 2023 15:17
@jpizarrom
Contributor Author

jpizarrom commented Aug 26, 2023

Hi @amyeroberts and @younesbelkada, this PR is ready for review. Could you please take a look? Thanks :)

Contributor

@younesbelkada younesbelkada left a comment


Looking great to me, I left one comment to make sure the slow test will not blow up GPU memory in our daily CI runners!
Let's also wait for Amy's and @Narsil's reviews before merging.

@slow
@require_torch
def test_large_model_pt_blip2(self):
    vqa_pipeline = pipeline("visual-question-answering", model="Salesforce/blip2-opt-2.7b")
Contributor

@younesbelkada younesbelkada Aug 28, 2023


Suggested change
vqa_pipeline = pipeline("visual-question-answering", model="Salesforce/blip2-opt-2.7b")
vqa_pipeline = pipeline("visual-question-answering", model="Salesforce/blip2-opt-2.7b", model_kwargs={"torch_dtype": torch.float16}, device=0)

Also make sure to cast the input to torch.float16 inside _forward if needed.
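
A minimal sketch of such a cast (the helper name is hypothetical and the pixel_values key is assumed from BLIP-2's processor output; this is not the exact code of this PR):

import torch

def _maybe_cast_inputs(model_inputs, model):
    # If the model was loaded in float16, match the floating-point image
    # tensor to that dtype before calling forward/generate.
    if model.dtype == torch.float16:
        model_inputs["pixel_values"] = model_inputs["pixel_values"].to(torch.float16)
    return model_inputs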

Contributor Author

Hi @younesbelkada , thanks for your feedback :)
model_kwargs were updated in test_large_model_pt_blip2 as you recommended, but I am not sure how to check whether casting to torch.float16 inside _forward is needed. Could you please give me some hints about what I should check? Thanks

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Contributor

@Narsil Narsil left a comment


LGTM.

I'm not a big fan of using the names of the configs directly to detect if generative or not, I feel like using the ForXXX should be a better hint.

We also used model.can_generate() as a hint in other pipelines.

Pinging @ylacombe who used that flag. (Just FYI no need to do anything).

@jpizarrom jpizarrom force-pushed the add_blip2_model_to_vqa_pipeline branch 2 times, most recently from f4848e1 to 1754398 Compare August 28, 2023 15:49
@jpizarrom
Contributor Author

jpizarrom commented Aug 28, 2023

> LGTM.
>
> I'm not a big fan of using the names of the configs directly to detect if generative or not, I feel like using the ForXXX should be a better hint.
>
> We also used model.can_generate() as a hint in other pipelines.
>
> Pinging @ylacombe who used that flag. (Just FYI no need to do anything).

@Narsil, Thanks a lot for your feedback :)

At the moment model.can_generate() returns False for Blip2ForConditionalGeneration; that is the reason why I was following the proposal of this unmerged PR: https://github.com/huggingface/transformers/pull/23348/files#diff-620bada7977c3d0040ed961581379598e53a9ef02fdbb26c570cac738c279c0eR64

Could it be expected that the can_generate method returns True for Blip2ForConditionalGeneration? If so, we could use it (I will take a look at it).

can_generate returns True for another model (see below); does it make sense to do the same in Blip2ForConditionalGeneration, or could it affect something else?

def can_generate(self) -> bool:
    """
    Returns True. This model can `generate` and must therefore have this property set to True in order to be used
    in the TTS pipeline.
    """
    return True

@jpizarrom jpizarrom force-pushed the add_blip2_model_to_vqa_pipeline branch 3 times, most recently from 9cce838 to ff5db0c Compare August 28, 2023 19:20
@Narsil
Contributor

Narsil commented Aug 29, 2023

I'm not the best person to comment on how can_generate works and what should or shouldn't be done.

The main thing about pipelines:

  • They should be as model agnostic as possible, so that newer models work out of the box.

But the current code is acceptable.

Collaborator

@amyeroberts amyeroberts left a comment


Thanks for adding this!

Before approving, let's get @gante's opinion on the best/canonical way to detect whether the model should generate the answer or not within the pipeline.

Comment on lines 110 to 113
self.assertEqual(
    outputs,
    [{"answer": ANY(str)}],
)
Collaborator

nit: can go on one line

Suggested change
self.assertEqual(
outputs,
[{"answer": ANY(str)}],
)
self.assertEqual(outputs, [{"answer": ANY(str)}])

Member

@gante gante left a comment


Regarding detection of generative models: can_generate() was built precisely to confirm whether the model can safely call generate() (in theory, all models can call it due to the inheritance structure; in practice only a few can use it). This includes pipeline uses 👍

However, I don't think we should overload the function in the class -- see my comment below, I'm going to open a PR with a more general solution :)

cc @amyeroberts @jpizarrom
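
For illustration, a hedged snippet of how a caller (or a pipeline) can rely on this capability flag instead of matching architecture names (not code from this PR):

from transformers import pipeline

vqa = pipeline("visual-question-answering", model="Salesforce/blip2-opt-2.7b")
# can_generate() is the generic capability check exposed by PreTrainedModel.
if vqa.model.can_generate():
    print("answers will be produced with generate()")
else:
    print("answers will be scored with a plain classification forward pass")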

src/transformers/models/blip_2/modeling_blip_2.py (review comment, outdated and resolved)
@gante
Member

gante commented Aug 29, 2023

Generalizable solution here ☝️

@jpizarrom jpizarrom force-pushed the add_blip2_model_to_vqa_pipeline branch from ff5db0c to d309aa4 Compare August 29, 2023 20:55
Contributor

@NielsRogge NielsRogge left a comment


Very clean and minimal PR, I like it!

Collaborator

@amyeroberts amyeroberts left a comment


Thanks for adding this functionality!

@amyeroberts amyeroberts merged commit 09dc995 into huggingface:main Aug 30, 2023
3 checks passed
@NielsRogge
Contributor

NielsRogge commented Aug 30, 2023

Feel free to tweet/linkedin about it @jpizarrom and we'll amplify :)

@jpizarrom jpizarrom deleted the add_blip2_model_to_vqa_pipeline branch August 30, 2023 20:17
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
* Add Blip2 model in VQA pipeline

* use require_torch_gpu for test_large_model_pt_blip2

* use can_generate in vqa pipeline

* test Blip2ForConditionalGeneration using float16

* remove custom can_generate from Blip2ForConditionalGeneration
blbadger pushed a commit to blbadger/transformers that referenced this pull request Nov 8, 2023
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 18, 2023
@RainyLayx

I am using the newest version of the transformers library, but it still reports "The model 'Blip2ForConditionalGeneration' is not supported for vqa. Supported models are ['ViltForQuestionAnswering'].". How can I use Blip2 in a pipeline for the VQA task? Is there any test code for BLIP-2 in the pipeline?

@jpizarrom
Contributor Author

Hi @RainyLayx, I was able to run this sample with transformers==4.37.1:

from transformers import pipeline
import requests
from PIL import Image

vqa_pipeline = pipeline("visual-question-answering", model="Salesforce/blip2-opt-2.7b")

image = Image.open(requests.get("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", stream=True).raw)
question = "Question: Is there a parrot? Answer:"

print(vqa_pipeline(image, question, top_k=1))
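
If GPU memory is a concern, the same pipeline can also be created in half precision, mirroring the kwargs suggested for the slow test earlier in this thread (a hedged variant; it assumes a CUDA device is available):

import torch
from transformers import pipeline

# Load BLIP-2 in float16 and place it on the first CUDA device.
vqa_pipeline = pipeline(
    "visual-question-answering",
    model="Salesforce/blip2-opt-2.7b",
    model_kwargs={"torch_dtype": torch.float16},
    device=0,
)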
