
Add tests for batching support #29297

Merged
31 commits merged into huggingface:main on Mar 12, 2024

Conversation

zucchini-nlp
Member

@zucchini-nlp zucchini-nlp commented Feb 26, 2024

What does this PR do?

Fixes #29179. Currently the tests pass for all models, and batched and unbatched runs produce nearly identical outputs.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@gante

@zucchini-nlp zucchini-nlp marked this pull request as draft February 26, 2024 13:23
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante
Member

gante commented Feb 26, 2024

@zucchini-nlp Running your test script on my setup I get

tensor(0., device='cuda:0')
tensor(4.7684e-07, device='cuda:0')

Could it be due to different hardware and/or versions? 🤔 For context, I'm running with CUDA 12.1, torch==2.3.0.dev20240208+cu121, and an NVIDIA RTX 3090
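
For illustration only, a minimal sketch of the kind of batched-vs-single-row comparison being discussed, assuming a model whose output exposes `.logits`; the helper name `max_batched_vs_single_row_diff` is hypothetical and this is not the test script referenced above. Numbers like the two tensors quoted can shift with hardware, CUDA, and torch versions.

```python
import torch

@torch.no_grad()
def max_batched_vs_single_row_diff(model, batched_inputs):
    """Run the model on a full batch and on each row separately,
    then return the maximum absolute difference between the results."""
    model.eval()
    batched_logits = model(**batched_inputs).logits

    batch_size = next(iter(batched_inputs.values())).shape[0]
    row_logits = []
    for i in range(batch_size):
        # slice every input down to a single row
        row_inputs = {k: v[i : i + 1] for k, v in batched_inputs.items()}
        row_logits.append(model(**row_inputs).logits)
    single_row_logits = torch.cat(row_logits, dim=0)

    return (batched_logits - single_row_logits).abs().max()
```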

Member

@gante gante left a comment


Thank you for working on this important test case 🔥

Review threads (resolved): src/transformers/models/tvp/modeling_tvp.py, tests/models/timm_backbone/test_modeling_timm_backbone.py, tests/test_modeling_common.py (5 threads)
@zucchini-nlp
Member Author

Using cosine distance to check CV/audio models helped solve the issue. The way we check whether a model is conv-based might not be the best option; I will see if I can improve it.

Now everything is passing the test, yay!
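
For illustration, a minimal sketch of a cosine-distance check of the kind described above, assuming two flattened output tensors; the function name and tolerance are illustrative, not the code that landed in tests/test_modeling_common.py.

```python
import torch

def cosine_distance_close(batched: torch.Tensor, single_row: torch.Tensor, atol: float = 1e-5) -> bool:
    """Compare outputs by cosine distance rather than element-wise difference.

    Conv-based CV/audio models can show larger element-wise drift between batched
    and single-row runs even when the output direction is essentially identical.
    """
    cos = torch.nn.functional.cosine_similarity(
        batched.flatten().float(), single_row.flatten().float(), dim=0
    )
    return bool((1.0 - cos).abs() < atol)
```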

Member

@gante gante left a comment


The end result is very cool! Thank you for iterating on this 🔥

Review thread (resolved): tests/models/vilt/test_modeling_vilt.py
@gante
Member

gante commented Feb 28, 2024

Next steps:

  • make fixup :D
  • LayoutLMv2ModelTest seems to be failing. Perhaps skip it as well?
  • Update the PR header (we no longer have issues with CV models)
  • Hit ready to review so it is no longer a draft PR

Then:

  • Tag both core maintainers (Arthur to review since he's on rotation, Amy to have a look at the CV model skips)

@zucchini-nlp zucchini-nlp marked this pull request as ready for review February 28, 2024 13:49
@zucchini-nlp
Member Author

@amyeroberts @ArthurZucker ready for review

@gante
Member

gante commented Feb 28, 2024

@amyeroberts: you're not on rotation, but this PR includes a new test for which, to make CI green, @zucchini-nlp had to a) make minor CV model changes and b) add skips on CV models 🤗

Let us know if you find anything wrong! If the skips are not supposed to be skips, I'd like to keep them as TODOs so they can be addressed by someone with CV expertise 🙏

Collaborator

@amyeroberts amyeroberts left a comment


Thanks for adding this - this is a great addition to making sure our models are well behaved 💪

Just a few small comments

Review threads (resolved): tests/models/timm_backbone/test_modeling_timm_backbone.py, tests/models/vilt/test_modeling_vilt.py, tests/models/mra/test_modeling_mra.py, tests/models/layoutlmv2/test_modeling_layoutlmv2.py, tests/test_modeling_common.py (2 threads)
Collaborator

@amyeroberts amyeroberts left a comment


Thanks for adding - great work! 💪

Review threads (resolved): tests/test_modeling_common.py (3 threads)
Comment on lines 767 to 768
if "head_mask" in key:
single_row_input[key] = value
Collaborator


I don't think we need this condition given the if/elif structure and L776

Member Author


Done. I fixed the cases where batch_size == num_hidden_layers in certain testers, so the extra check is no longer needed; removed this if.
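
For context, a minimal sketch of the kind of single-row slicing discussed in this thread, under the assumption that only tensors whose first dimension equals the batch size should be sliced (which is why a head_mask shaped [num_hidden_layers, num_heads] needed care when batch_size happened to equal num_hidden_layers); names are illustrative, not the merged test code.

```python
import torch

def prepare_single_row_input(batched_input: dict, batch_size: int) -> dict:
    """Build a single-row version of a batched input dict.

    Tensors whose first dimension equals the batch size are sliced down to one
    row; everything else (e.g. a head_mask of shape [num_hidden_layers,
    num_heads]) is passed through unchanged.
    """
    single_row_input = {}
    for key, value in batched_input.items():
        if isinstance(value, torch.Tensor) and value.shape[0] == batch_size:
            single_row_input[key] = value[:1]
        else:
            single_row_input[key] = value
    return single_row_input
```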

@gante
Member

gante commented Mar 5, 2024

@zucchini-nlp the change in batch size has made the Whisper tests unhappy :D once that is cleared up, we can merge this PR 🤗

Collaborator

@ArthurZucker ArthurZucker left a comment


Very nice PR! Thanks

FYI @ydshieh 😉

@zucchini-nlp
Member Author

This PR is now ready to be merged. An unexpected bug was found yesterday, but it is already fixed. I will also merge main to resolve conflicts.
@gante

@gante
Member

gante commented Mar 8, 2024

@zucchini-nlp this needs a make fixup, and then it is ready to be merged

@gante gante merged commit 8e64ba2 into huggingface:main Mar 12, 2024
18 checks passed
@ydshieh
Collaborator

ydshieh commented Mar 13, 2024

Hi @zucchini-nlp, thank you for your work. Our CI reports 2 failing tests:

tests/models/mamba/test_modeling_mamba.py::MambaModelTest::test_batching_equivalence
(line 726)  AttributeError: 'MambaCache' object has no attribute 'dim'

and

tests/models/seggpt/test_modeling_seggpt.py::SegGptModelTest::test_batching_equivalence
(line 765)  AssertionError: tensor(False, device='cuda:0') is not true : Batched and Single row outputs are not equal in SegGptModel for key=hidden_states. Difference=0.1510644555091858.

If you could take a look at these, that would be great 🙏 Thank you.
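
For illustration, a minimal sketch of how a recursive output comparison might guard against non-tensor entries such as a MambaCache object (which has no .dim()); this is an assumption about the shape of a possible fix, not the actual patch.

```python
import torch

def recursive_check(batched_object, single_row_object, atol: float = 1e-5):
    """Recursively compare model outputs, descending into tuples/lists/dicts
    and skipping non-tensor leaves (e.g. cache objects like MambaCache)."""
    if isinstance(batched_object, (tuple, list)):
        for batched, single_row in zip(batched_object, single_row_object):
            recursive_check(batched, single_row, atol)
    elif isinstance(batched_object, dict):
        for key in batched_object:
            recursive_check(batched_object[key], single_row_object[key], atol)
    elif batched_object is None or not isinstance(batched_object, torch.Tensor):
        return  # skip caches and other non-tensor outputs instead of calling .dim()
    else:
        # compare the first batched row against the single-row run
        assert torch.allclose(batched_object[:1], single_row_object, atol=atol)
```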

@gante
Member

gante commented Mar 13, 2024

@ydshieh (and @amyeroberts, since you're working on the test fetcher): in this case, the new failures arise because the branch merged in this PR, which adds a new mixin test, was out of sync with main with respect to new models. Is there any way we can prevent this without too much effort?

I'm not seeing an automated way to do it. The only thing that comes to mind is to ask for a rebase before merging this type of PR :D

@ydshieh
Collaborator

ydshieh commented Mar 14, 2024

@gante Yeah, I don't see a satisfying way either.

One possibility is to change the CircleCI config file to rebase (on the fly) onto the latest main during CI runs.

@gante
Member

gante commented Mar 14, 2024

One possibility is to change the CircleCI config file to rebase (on the fly) onto the latest main during CI runs.

@ydshieh That could be... interesting! Try to rebase onto main, roll back if a conflict arises, then run the tests. If an informative message is added to the report (e.g. "the tests were run after rebasing on main"), we might save ourselves a few situations like this :D

@amyeroberts
Collaborator

One possibility is to change the CircleCI config file to rebase (on the fly) onto the latest main during CI runs.

I'm not sure this is a great idea, for two reasons:

  • the code version being used to run tests is not the version on the branch, which can lead to all sorts of confusion
  • It's pretty common in development not to rebase for a few commits. Even though technically PRs should be opened when the branch is review-ready, this isn't what happens in practice. It would end up forcing people to constantly rebase and force-push even when no meaningful commits are being added. As someone who gets a notification for every pushed commit, I'd really rather avoid this.

To test this out, I'd rather have an optional feature, similar to the run-all-tests command we can put in commits, e.g. a run-with-base command that we can comment on GitHub or ask users to run as a final step.

@gante
Member

gante commented Mar 14, 2024

that we can comment on GitHub

That would also be quite cool, and could save the core maintainers some time if other reviewers diligently test wide-reaching changes against main before merging 🤗

@ydshieh
Collaborator

ydshieh commented Mar 14, 2024

Totally agree with @amyeroberts; that's why I said I don't see a satisfying way.

itazap pushed a commit that referenced this pull request May 14, 2024
* add tests for batching support

* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py

Co-authored-by: Joao Gante <[email protected]>

* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py

Co-authored-by: Joao Gante <[email protected]>

* Update tests/test_modeling_common.py

Co-authored-by: Joao Gante <[email protected]>

* Update tests/test_modeling_common.py

Co-authored-by: Joao Gante <[email protected]>

* Update tests/test_modeling_common.py

Co-authored-by: Joao Gante <[email protected]>

* fixes and comments

* use cosine distance for conv models

* skip mra model testing

* Update tests/models/vilt/test_modeling_vilt.py

Co-authored-by: Joao Gante <[email protected]>

* finzalize  and make style

* check model type by input names

* Update tests/models/vilt/test_modeling_vilt.py

Co-authored-by: amyeroberts <[email protected]>

* fixed batch size for all testers

* Revert "fixed batch size for all testers"

This reverts commit 525f3a0.

* add batch_size for all testers

* dict from model output

* do not skip layoutlm

* bring back some code from git revert

* Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <[email protected]>

* Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <[email protected]>

* clean-up

* where did minus go in tolerance

* make whisper happy

* deal with consequences of losing minus

* deal with consequences of losing minus

* maskformer needs its own test for happiness

* fix more models

* tag flaky CV models from Amy's approval

* make codestyle

---------

Co-authored-by: Joao Gante <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
Development

Successfully merging this pull request may close these issues.

Test batched equivalence
6 participants