Make index_add_ error if input source shape is wrong #100321
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/100321
✅ 1 Unrelated Failure: as of commit 6f2954e, UNSTABLE. The following job failed but was likely due to flakiness present on trunk and has been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
The test fallback_masked_fill might need to be changed -- the implementation looks roughly something like this (excluding the batching using vmap).
However, when we compute the backward mode gradient computation, we see the same error as described in the original issue (all done in a fresh colab notebook).
Since we'd like to do this check in the forward, I think we ought to change this test to expect the runtime error this PR throws. I suspect whatever's failing in the onnx tests is similar, so I'll look at those too. I would greatly appreciate your input @lezcano (apologies as well for the previous PR, which was botched, but I'm dedicated to getting this issue fixed).
Those errors are failing in CUDA. You modified the CPU version of the function; now you just need to modify the CUDA version accordingly. You can either repeat the code or have a function that's shared between the two. Many functions have their checks shared, but yeah, this one repeats them, which is not great.
Okay, just added -- hoping this works, but am of course happy to iterate on it if not.
Re: failing tests. I think the remaining ones still fall under this analysis, in that they violate the invariant we're checking for. In the case of fallback_masked_fill, we are given a self of shape (7, 11, 13) and a source of shape (3, 13). Omitting self[dim] and source[dim] gives shapes (11, 13) and (13,), which aren't broadcastable. The same issue exists for the onnx tests, where both index_add_dim_size_differ and index_add_dynamic_axes try to broadcast a self of shape (5, 4, 3) with a source of shape (1, 3, 3) along dim = 1. I'm not sure if we should rewrite them to throw the expected error, or if there's a better course of action here.
These shapes are broadcastable. This means that the check needs to be a bit more generic to account for broadcasting.
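For instance, dropping dim from the shapes above and checking the remainder with torch.broadcast_shapes (a quick Python illustration; broadcast_shapes raises a RuntimeError when shapes are incompatible) confirms they broadcast:

```python
import torch

# fallback_masked_fill case: self (7, 11, 13), source (3, 13), dim = 0.
# Dropping dim from each leaves (11, 13) and (13,), which do broadcast:
print(torch.broadcast_shapes((11, 13), (13,)))  # torch.Size([11, 13])

# onnx tests: self (5, 4, 3), source (1, 3, 3), dim = 1.
# Dropping dim leaves (5, 3) and (1, 3), which also broadcast:
print(torch.broadcast_shapes((5, 3), (1, 3)))   # torch.Size([5, 3])
```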
The changes should just be done in the meta impl. The meta impl is common to both the CPU and CUDA parts.
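One practical consequence, sketched below in Python under the assumption that the check ends up in the shared meta function: even meta-device tensors, which never run a real CPU or CUDA kernel, should hit the same shape validation.

```python
import torch

# Assuming the shape checks live in the shared meta function, the same
# validation fires for meta, CPU, and CUDA tensors alike.
x = torch.zeros(5, 4, 3, device="meta")
index = torch.zeros(2, dtype=torch.long, device="meta")
source = torch.ones(2, 9, 9, device="meta")  # deliberately ill-shaped

try:
    x.index_add_(1, index, source)
except RuntimeError as e:
    print("rejected by the meta function:", e)
```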
@pytorchbot rebase
@pytorchbot successfully started a rebase job. Check the current status here
Successfully rebased

@pytorchbot rebase
@pytorchbot successfully started a rebase job. Check the current status here
Successfully rebased
auto source_sizes = source.sizes().vec();
self_sizes.erase(self_sizes.begin() + dim);
source_sizes.erase(source_sizes.begin() + dim);
TORCH_CHECK(!(!self_sizes.empty() && source_sizes.empty()), "source tensor shape must match self tensor shape (excluding the specified dimension)");
This is not the check that we want. This does not check for broadcasting of any kind.
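A small Python sketch of why the quoted emptiness test is too weak: it only rejects the case where self keeps dimensions while source loses all of them, and it accepts shapes whose remaining dimensions neither match nor broadcast.

```python
import torch

def check_passes(self_shape, source_shape, dim):
    # Mirrors the quoted TORCH_CHECK: only fails when the reduced self
    # shape is non-empty while the reduced source shape is empty.
    self_sizes = [s for i, s in enumerate(self_shape) if i != dim]
    source_sizes = [s for i, s in enumerate(source_shape) if i != dim]
    return not (len(self_sizes) > 0 and len(source_sizes) == 0)

# Both reduced shapes are non-empty, so the check passes ...
print(check_passes((7, 11, 13), (3, 5, 13), dim=0))  # True
# ... even though (11, 13) and (5, 13) neither match nor broadcast:
try:
    torch.broadcast_shapes((11, 13), (5, 13))
except RuntimeError as e:
    print("not broadcastable:", e)
```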
I may be mistaken, but I believe infer_size_impl in ExpandUtils.cpp checks for broadcasting before reaching TensorAdvancedIndexing. Therefore, I believe this should check the only potential failure case with dimensionality, for if we reach this point in the program flow, we can assume the shapes are broadcastable.
If I'm incorrect, I'll need help on how to vectorize the check for broadcastability as you requested earlier (in which case we should update the code for infer_size_impl as it checks using a for loop).
This function is called from
TORCH_PRECOMPUTE_META_FUNC(index_add)
(const Tensor& self, int64_t dim, const Tensor& index, const Tensor& source, const Scalar& alpha) {
dim = maybe_wrap_dim(dim, self.dim());
index_func_meta_impl(*this, self, dim, index, source, "index_add");
return TORCH_PRECOMPUTE_STRUCT(index_add)().set_dim(dim);
}
(among other places), so there is no check here that source has the correct size.
You don't need to implement the broadcasting check yourself. There are utility functions to do this; I don't remember their names offhand, but they shouldn't be difficult to find.
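In Python terms, the intended validation looks roughly like the sketch below (a hypothetical helper, not code from this PR; on the C++ side the analogous utility would be something like infer_size from ExpandUtils.cpp, mentioned above):

```python
import torch

def validate_index_add_source(self_t, dim, index, source):
    # Hypothetical Python analogue of the meta-function check being discussed:
    # source.shape[dim] must match len(index), and the remaining dims of
    # source must broadcast against the remaining dims of self.
    if source.dim() > 0 and source.size(dim) != index.numel():
        raise RuntimeError("source.shape[dim] must equal the number of indices")
    self_rest = [s for i, s in enumerate(self_t.shape) if i != dim]
    source_rest = [s for i, s in enumerate(source.shape) if i != dim]
    # broadcast_shapes raises a RuntimeError if the shapes are incompatible.
    torch.broadcast_shapes(tuple(source_rest), tuple(self_rest))

x = torch.zeros(5, 4, 3)
validate_index_add_source(x, 1, torch.tensor([0]), torch.ones(5, 1, 3))  # ok
try:
    validate_index_add_source(x, 1, torch.tensor([0]), torch.ones(5, 1, 7))
except RuntimeError:
    print("rejected: trailing dims 7 vs 3 don't broadcast")
```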
@pytorchbot rebase
@pytorchbot successfully started a rebase job. Check the current status here
Successfully rebased
auto self_sizes_tensor = at::tensor(self_sizes, index.options());
auto source_sizes_tensor = at::tensor(source_sizes, index.options());
Why create a tensor just to call .sizes()? Pass the sizes directly.
I tried converting the sizes into IntArrayRefs with a constructor, but this wasn't working well. Is there a better way to go from a vector representation to the one that calling .sizes() on a tensor produces?
If you just pass the vectors, they should be cast automatically to IntArrayRefs via the implicit constructor.
The changes LGTM so far. Now there are a number of other tests that are failing. These are failing because, for a given function, there are many other functions implemented in terms of it (e.g. those associated with
Again, run the tests locally and make sure they pass before pushing if possible. Good luck!
Thank you for your review. I've fixed the regex issue, and am having some trouble with the broadcasting check (see comment above). I've been looking into the batching rule fix, and I've narrowed it down to index_add_batch_rule in BatchRulesScatterOps. I've been erasing the batching dimension of self and other along with the given dimension and then conducting a check for broadcastability, but this doesn't seem to work; erasing just the given dimension doesn't trigger the error. I've added an xfail to catch the bugs for this CI run after seeing the same existed for index_put -- perhaps this approach is acceptable. EDIT: this doesn't seem to be working, and I think I'll need most of your input here. Finally, I might need to go to this Friday's office hours, as my run_test.py isn't showing the same errors as the CI -- I'm trying to be mindful of CI runs, but my feedback loop feels faulty without them.
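For context on the batching-rule side, here is a minimal Python illustration (assuming torch.func.vmap is available) of how index_add gains an extra leading batch dimension under vmap, which is why any shape check written against the unbatched signature also has to skip that dimension:

```python
import torch
from torch.func import vmap

index = torch.arange(4)

def add_rows(self_t, source):
    # Unbatched call: self (5, 3), source (4, 3), dim = 0.
    return self_t.index_add(0, index, source)

batched_self = torch.zeros(2, 5, 3)   # leading dim 2 is the vmap batch dim
batched_source = torch.ones(2, 4, 3)
out = vmap(add_rows)(batched_self, batched_source)
print(out.shape)  # torch.Size([2, 5, 3])
```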
Successfully rebased
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Fixes #92576, checking the following invariant as described in the documentation:
"source.shape[dim] == len(index) and source.shape[i] == self.shape[i] for i != dim"
Would be happy to iterate on this if there are any issues, and to implement the check for the CUDA and MPS implementations of index_add_ as well.
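For reference, a minimal Python repro of the kind of misuse this PR is meant to reject up front (shapes chosen so that the leftover dimensions neither match nor broadcast; the exact error message is whatever the TORCH_CHECK produces):

```python
import torch

x = torch.zeros(7, 11, 13)
index = torch.tensor([0, 2, 4])
bad_source = torch.ones(3, 5, 13)  # (5, 13) vs (11, 13) after dropping dim 0

try:
    x.index_add_(0, index, bad_source)
except RuntimeError as e:
    print("rejected:", e)
```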