
Add support for GIT model in VQA pipelines #23348

Conversation

marechaux

What does this PR do?

This PR implements support for generative models in the VQA pipeline (more precisely, the GIT model).

Fixes part of #21110

This is my first contribution here, and I am uncertain whether my approach is correct. Please advise me if any modifications are necessary 😃

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@NielsRogge

@marechaux marechaux force-pushed the add-support-for-git-in-vqa-pipeline branch from c174002 to 0ec5d88 on May 13, 2023 12:22
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@marechaux marechaux force-pushed the add-support-for-git-in-vqa-pipeline branch from 0ec5d88 to 0cb5fb5 on May 15, 2023 10:05
@marechaux (Author)

I have two remarks here:

@marechaux marechaux force-pushed the add-support-for-git-in-vqa-pipeline branch 2 times, most recently from 8512048 to f799040 on May 16, 2023 15:50
@marechaux marechaux requested a review from NielsRogge May 16, 2023 16:14
@marechaux marechaux force-pushed the add-support-for-git-in-vqa-pipeline branch from f799040 to c3c9b03 on May 22, 2023 08:51
@marechaux (Author) commented May 22, 2023

Thanks for the review, I have updated my changes following your comments. However, I have several doubts about my approach.

Beam search for scores

I use beam scores to provide a score, in order to follow the "signature" of the pipeline described here. Is this correct?

Beam search is so slow that it makes the pipeline test time out: it runs locally, but takes more than 120 s, which makes me think I'm not on the right track here.
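
For context, a minimal sketch of the kind of generate call involved, adapted from the GIT model documentation example; the checkpoint, image URL, and question are illustrative:

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# Illustrative checkpoint; mirrors the GIT VQA usage from the model docs.
processor = AutoProcessor.from_pretrained("microsoft/git-base-textvqa")
model = AutoModelForCausalLM.from_pretrained("microsoft/git-base-textvqa")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
pixel_values = processor(images=image, return_tensors="pt").pixel_values

question = "what is in the picture?"
input_ids = processor(text=question, add_special_tokens=False).input_ids
input_ids = torch.tensor([[processor.tokenizer.cls_token_id] + input_ids])

outputs = model.generate(
    pixel_values=pixel_values,
    input_ids=input_ids,
    num_beams=2,                    # beam search is what produces sequences_scores
    return_dict_in_generate=True,
    output_scores=True,
)
# sequences_scores holds one log-probability score per returned sequence.
print(outputs.sequences_scores)
print(processor.batch_decode(outputs.sequences, skip_special_tokens=True))
```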

Tokenizer padding

Also, when I use the pipeline with microsoft/git-base-textvqa in batch mode, I get this warning:

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.

This warning is legitimate, but I don't know how to fix it, as padding_side can only be set when the tokenizer is initialized.
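
For reference, a sketch of setting padding_side, assuming direct access to the tokenizer; note that it is also an ordinary attribute, so it can be reassigned after initialization (whether a pipeline should mutate a user-supplied tokenizer is a separate question):

```python
from transformers import AutoTokenizer

# Set left padding at init time...
tokenizer = AutoTokenizer.from_pretrained("microsoft/git-base-textvqa", padding_side="left")

# ...or reassign it afterwards: padding_side is a plain attribute.
tokenizer.padding_side = "left"
```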

Broken Vilt tests

Due to this change, the model used in the Vilt unit tests is now hf-internal-testing/tiny-random-ViltForQuestionAnswering instead of hf-internal-testing/tiny-vilt-random-vqa.

It seems the new model's image processor can't process images; I am not sure how to fix it.

Should I fix the model directly on the Hub?

@marechaux marechaux requested a review from NielsRogge May 22, 2023 12:59
@marechaux (Author)

Gentle ping here @NielsRogge :-)

Comment on lines +136 to +157
```python
generate_kwargs = copy.deepcopy(generate_kwargs)
generate_kwargs["return_dict_in_generate"] = True
generate_kwargs["output_scores"] = True
if "num_beams" in generate_kwargs:
    if top_k > generate_kwargs["num_beams"]:
        pass  # raise
elif top_k > 1:
    generate_kwargs["num_beams"] = top_k
else:
    # activate beam search with two beams to compute scores
    generate_kwargs["num_beams"] = 2
generate_kwargs["num_return_sequences"] = top_k
if "max_new_tokens" not in generate_kwargs:
    # defaulting max_new_tokens to 100
    generate_kwargs["max_new_tokens"] = 100
generate_outputs = self.model.generate(**model_inputs, **generate_kwargs)
model_outputs = {
    "sequences_scores": generate_outputs.sequences_scores.reshape(
        (model_inputs["input_ids"].shape[0], top_k)
    ),
    "sequences": generate_outputs.sequences.reshape((model_inputs["input_ids"].shape[0], top_k, -1)),
}
```
@NielsRogge (Contributor) commented on Jun 3, 2023
No need for all of this, I'd just run a generate as done here. No need to check things like num_beams, num_return_sequences, output_scores, etc.

For a generative model, it will just generate an answer, so there is no need for any top-k scores. For a classifier, top-k scores make sense, since we can, for instance, take the top 5 predicted classes.

Also the forward here should only add **generate_kwargs
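
For illustration, a minimal sketch of the simplified flow being suggested, assuming the PR's ModelType enum and the standard pipeline _forward hook (the exact final shape is not confirmed by this thread):

```python
def _forward(self, model_inputs, **generate_kwargs):
    # Hypothetical sketch: for a generative model, delegate straight to
    # generate() and forward the user's generate_kwargs untouched.
    if self.model_type == ModelType.GENERATIVE:
        model_outputs = self.model.generate(**model_inputs, **generate_kwargs)
    else:
        model_outputs = self.model(**model_inputs)
    return model_outputs
```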

Comment on lines +166 to +174
```python
if self.model_type == ModelType.CLASSIFIER:
    probs = model_outputs.logits.sigmoid()[0]
    scores, ids = probs.topk(top_k)
    ids = ids.tolist()
    answers = [self.model.config.id2label[_id] for _id in ids]
elif self.model_type == ModelType.GENERATIVE:
    scores = model_outputs["sequences_scores"][0]
    decoded_outputs = self.tokenizer.batch_decode(model_outputs["sequences"][0], skip_special_tokens=False)
    answers = [self.postprocess_git_output_single(o) for o in decoded_outputs]
```
This looks OK

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
