Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add specifications for returning upper (triu) and lower (tril) triangular matrices #243

Merged
merged 4 commits into from
Sep 20, 2021

Conversation

kgryte
Copy link
Contributor

@kgryte kgryte commented Aug 23, 2021

This PR

  • resolves gh-237 by adding specifications for returning the lower (tril) and upper (triu) triangular matrix for one or more matrices.
  • includes a minor textual change to improve grammar.

Notes

  • The function signatures were derived by comparing the signatures across array libraries (tril, triu). The signatures across libraries are consistent, with the exception that PyTorch uses diagonal as the keyword argument, rather than k as used in NumPy et al. This PR chose k given its preponderance.
  • TF was the sole library without tril and triu APIs; however, these are included in the experimental numpy namespace.
  • The proposed APIs deviate from NumPy in allowing providing a batch of matrices (as discussed in gh-237). Supporting batches is part of a more general theme in this specification (see linear algebra) and follows prior art in the PyTorch API.
  • Both tril and triu were included in the list of potential APIs to standardize for v2 of the specification (see gh-187); however, these APIs were found to be needed in the current specification (see gh-226 and discussion gh-217).
  • Once merged, this should allow merging of gh-226.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kgryte. I see two issues here:

  • the k keyword is not named k in all libraries. E.g. PyTorch uses diagonal=0.
  • these aren't really creation functions - the input is already an array and this uses a subset of the data. So it looks to me like these don't quite belong in creation_functions.md nor should have a device keyword.

@kgryte
Copy link
Contributor Author

kgryte commented Aug 23, 2021

@rgommers NumPy classifies both tril and triu as creation functions. Presumably, this is because NumPy chooses to return a copy of an array for both functions, thus "creating" new arrays. I suppose an array library might choose to return a view, but I am not sure how likely this would be.

@rgommers
Copy link
Member

@rgommers NumPy classifies both tril and triu as creation functions. Presumably, this is because NumPy chooses to return a copy of an array for both functions, thus "creating" new arrays. I suppose an array library might choose to return a view, but I am not sure how likely this would be.

Hmm. Yes a copy is likely needed, because the view on an upper or lower triangular would be ragged. I'm still not sure that that means a device keyword needs to be added though. It seems like one would almost always want the data on the same device that the input array is on, and not on the default device.

@kgryte
Copy link
Contributor Author

kgryte commented Aug 30, 2021

Re: copy. Agreed. I think that, in most cases, a copy would be required; although, I could imagine, for certain values of k, where a sparse view could be used. In order to avoid confusion, we could be explicit in requiring that a copy be returned.

Re: device. We could change the default device behavior from None (the default device) to x.device, so that the device is inherited from the input array. I do think, however, that tril and triu should support the device keyword for consistency with the other array creation APIs.

My sense is that a similar argument for the default device being x.device could also apply to the *_like creation functions, where you'd want to also inherit the device of the input array; however, the current default behavior is to use the default device.

@rgommers
Copy link
Member

Re: copy. Agreed. I think that, in most cases, a copy would be required; although, I could imagine, for certain values of k, where a sparse view could be used. In order to avoid confusion, we could be explicit in requiring that a copy be returned.

We don't do that anywhere else I believe (unless there's an explicit copy= keyword), and something like y = triu(x)[0, :] could fairly easily be optimized to be a view by any library that has a JIT compiler. So I'd suggest to not do that.

Re: device. We could change the default device behavior from None (the default device) to x.device, so that the device is inherited from the input array. I do think, however, that tril and triu should support the device keyword for consistency with the other array creation APIs.

Hmm, isn't that a bit circular? I'm not quite seeing it as a creation function, and the most similar function (the one that also takes array input) in https://data-apis.org/array-api/latest/API_specification/creation_functions.html is meshgrid, which also does not have a device keyword. My preference would be for any function that takes array input to not have a device keyword.

My sense is that a similar argument for the default device being x.device could also apply to the *_like creation functions, where you'd want to also inherit the device of the input array; however, the current default behavior is to use the default device.

Did we do that on purpose? E.g. https://pytorch.org/docs/stable/generated/torch.ones_like.html uses x.device.

@kgryte
Copy link
Contributor Author

kgryte commented Sep 13, 2021

As discussed during the consortium meeting held 09-02-2021, updated this PR to remove the device keyword argument. If users wish to allocate an upper or lower triangular array on a different device, they can use the .to_device method specified in #171.

@kgryte
Copy link
Contributor Author

kgryte commented Sep 20, 2021

As review comments have now been addressed, will merge. If further refinements are necessary, we can do so in a follow-up PR.

@kgryte kgryte merged commit 5593253 into main Sep 20, 2021
@kgryte kgryte deleted the triu-tril branch September 20, 2021 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API extension Adds new functions or objects to the API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: add support for returning upper and lower triangles of an array
2 participants