[DONOTMERGE] add CI tests for torch.compile'ing the transforms.v2 kernels #8127

Draft
wants to merge 24 commits into
base: main

pmeier
Collaborator

@pmeier pmeier commented Nov 21, 2023

I don't intend to merge this. Rather, it should serve as a baseline to get a feeling for how far we are from achieving our goal in #8056.


pytorch-bot bot commented Nov 21, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/8127

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures, 1 Unrelated Failure

As of commit 90ab254 with merge base 6e18cea (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pmeier
Collaborator Author

pmeier commented Nov 21, 2023

With torch-2.2.0.dev20231121+cpu and torchvision @ 893b4ab:

  • backend='eager', dynamic=True: 259 failed
  • backend='inductor', dynamic=True: 307 failed
  • backend='eager', dynamic=False: 98 failed
  • backend='inductor', dynamic=False: 110 failed

@pmeier
Collaborator Author

pmeier commented Nov 23, 2023

The pad_mask failures with fullgraph are caused by an error in the test, which is fixed in #8132.

@pmeier
Collaborator Author

pmeier commented Nov 23, 2023

In c5c72ab and 29ea48a, I've added atol=1, rtol=0 for uint8 / bilinear resize. With this, the tests now pass in the most lenient setting, i.e. static shapes, eager backend, and graph breaks allowed: https://github.com/pytorch/vision/actions/runs/6971806789/job/18972632003?pr=8127 🎉
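
To illustrate what atol=1, rtol=0 permits: assuming the closeness criterion documented for `torch.testing.assert_close`, a pair of values is accepted when |actual - expected| <= atol + rtol * |expected|, so with these settings each uint8 value may be off by at most one intensity level. A pure-Python sketch of that check follows; `is_close` is a hypothetical helper for illustration, not the code used in the test suite.

```python
def is_close(actual, expected, *, atol, rtol):
    """Check |a - e| <= atol + rtol * |e| for every pair of values."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


# Made-up uint8 pixel values, only for illustration.
expected = [0, 127, 255]
off_by_one = [1, 126, 255]
off_by_two = [2, 127, 255]

print(is_close(off_by_one, expected, atol=1, rtol=0))  # True
print(is_close(off_by_two, expected, atol=1, rtol=0))  # False
```

With rtol=0 the bound does not scale with the pixel value, so the same one-level slack applies to dark and bright pixels alike.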

@pmeier
Collaborator Author

pmeier commented Dec 18, 2023

With torch-2.3.0.dev20231218+cpu and torchvision @ 6c2e0ae:

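| dynamic | backend | fullgraph | failing tests |
|---|---|---|---|
| False | eager | False | 0 |
| False | eager | True | 63 |
| False | inductor | False | 8 |
| False | inductor | True | 71 |
| True | eager | False | 72 |
| True | eager | True | 192 |
| True | inductor | False | 86 |
| True | inductor | True | 206 |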
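
The eight rows above are the Cartesian product of three `torch.compile` options. A minimal sketch of enumerating that matrix with the standard library only; the helper name `compile_configs` is hypothetical and not taken from this PR.

```python
from itertools import product


def compile_configs():
    """Yield one kwargs dict per combination of dynamic, backend, and fullgraph."""
    for dynamic, backend, fullgraph in product(
        (False, True), ("eager", "inductor"), (False, True)
    ):
        yield {"dynamic": dynamic, "backend": backend, "fullgraph": fullgraph}


configs = list(compile_configs())
for config in configs:
    print(config)
```

Each dict could presumably be forwarded as `torch.compile(kernel, **config)`, where `kernel` stands for any of the transforms.v2 kernels under test.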

@pmeier
Collaborator Author

pmeier commented Dec 19, 2023

I've factored out e6a54bf into #8171.

@pmeier
Collaborator Author

pmeier commented Dec 22, 2023

With torch-2.3.0.dev20231222+cpu and torchvision @ 26fb5ef:

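| dynamic | backend | fullgraph | failing tests | diff to previous |
|---|---|---|---|---|
| False | eager | False | 0 | 0 |
| False | eager | True | 23 | -40 |
| False | inductor | False | 8 | 0 |
| False | inductor | True | 31 | -40 |
| True | eager | False | 72 | 0 |
| True | eager | True | 95 | -97 |
| True | inductor | False | 80 | -6 |
| True | inductor | True | 103 | -103 |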
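
The "diff to previous" column can be reproduced from the counts reported on Dec 18 and Dec 22. A small sketch with the numbers copied from the two comments; the variable names are only illustrative.

```python
# Failing-test counts keyed by (dynamic, backend, fullgraph).
dec_18 = {
    (False, "eager", False): 0,
    (False, "eager", True): 63,
    (False, "inductor", False): 8,
    (False, "inductor", True): 71,
    (True, "eager", False): 72,
    (True, "eager", True): 192,
    (True, "inductor", False): 86,
    (True, "inductor", True): 206,
}
dec_22 = {
    (False, "eager", False): 0,
    (False, "eager", True): 23,
    (False, "inductor", False): 8,
    (False, "inductor", True): 31,
    (True, "eager", False): 72,
    (True, "eager", True): 95,
    (True, "inductor", False): 80,
    (True, "inductor", True): 103,
}

# Negative values mean fewer failing tests than in the previous run.
diff = {key: dec_22[key] - dec_18[key] for key in dec_18}
for (dynamic, backend, fullgraph), delta in diff.items():
    print(dynamic, backend, fullgraph, delta)
```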

@NicolasHug NicolasHug marked this pull request as ready for review January 15, 2024 10:12
@NicolasHug NicolasHug marked this pull request as draft January 15, 2024 10:12
@NicolasHug
Member

Some great progress with torch-2.3.0.dev20240117+cpu and torchvision @ 1de7a74, where 72 failing tests were resolved in all of the dynamic=True jobs.

@vfdev-5 vfdev-5 closed this Apr 2, 2024
@vfdev-5 vfdev-5 reopened this Apr 2, 2024
@vfdev-5
Collaborator

vfdev-5 commented Apr 2, 2024

With torch-2.4.0.dev20240401+cpu and torchvision 5181a85 we have the following failures:

  • test.test_transforms_v2.TestPerspective: image/bbox - due to torch._dynamo.exc.Unsupported: dynamic shape operator: aten.linalg_lstsq.default - can't be fixed for now
  • test.test_transforms_v2.TestSanitizeBoundingBoxes.test_kernel - due to torch._dynamo.exc.Unsupported: dynamic shape operator: aten.nonzero.default - probably can't be fixed either
  • test.test_transforms_v2.TestJPEG: image/video - due to torch._dynamo.exc.TorchRuntimeError: Failed running call_function image.encode_jpeg(*(FakeTensor(..., size=(3, 17, 11), dtype=torch.uint8), 5), **{}):
