
Improve description of device handling, add array.to_device #171

Merged
1 commit merged into data-apis:main on Sep 14, 2021

Conversation

rgommers
Member

Closes gh-157

@rgommers rgommers added the Narrative Content Narrative documentation content. label Apr 26, 2021
Contributor

@kgryte kgryte left a comment


LGTM.

@leofang
Contributor

leofang commented Apr 27, 2021

LGTM, but I guess we need someone from TensorFlow to give a green light? IIRC TF allows implicit transfer.

@rgommers
Member Author

> LGTM, but I guess we need someone from TensorFlow to give a green light? IIRC TF allows implicit transfer.

Right, this PR is meant to be just a clarification based on what we already discussed earlier. Quoting from #39 (comment):

> Also, I am not sure we should enforce constraints on where inputs and outputs can be placed for an operation. Such constraints can make it harder to write portable library code where you don't control the inputs and may have to start by copying all inputs to the same device. Tensorflow runtime is allowed to copy inputs to the correct device if needed.

That is a good question, should it be enforced or just recommended? Having device transfers be explicit is usually better (implicit transfers can make for hard to track down performance issues), but perhaps not always.

I think in the end this is an execution rather than a syntax/semantics question, so maybe the "with the convention ..." phrase in this PR should be replaced with a phrase that's more like "strong recommendation".

Cc @edloper, @agarwal-ashish
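A minimal sketch of the "explicit transfers, same-device execution" reading discussed above, using a hypothetical toy array type (none of these names are part of the standard): operations refuse mixed-device inputs rather than silently copying, so any transfer is visible in the calling code.

```python
# Hedged sketch, NOT the standard's implementation: ToyArray, add, and
# the device strings below are all hypothetical stand-ins.

class ToyArray:
    def __init__(self, value, device="cpu"):
        self.value = value
        self.device = device

    def to_device(self, device):
        # Explicit transfer; a real library would copy buffers here.
        if device == self.device:
            return self
        return ToyArray(self.value, device)


def add(x, y):
    # Under the "strong recommendation" reading, an operation does not
    # silently move inputs: mixed-device inputs raise instead.
    if x.device != y.device:
        raise ValueError(
            f"inputs on different devices: {x.device!r} vs {y.device!r}"
        )
    return ToyArray(x.value + y.value, x.device)


a = ToyArray(1.0, device="gpu:0")
b = ToyArray(2.0, device="cpu")

# Portable library code makes the transfer explicit and easy to spot:
out = add(a, b.to_device(a.device))
print(out.value, out.device)  # 3.0 gpu:0
```

Making the transfer a visible `to_device` call is what keeps performance issues trackable: a profiler hotspot maps to a line of code rather than to an implicit runtime copy.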

@agarwal-ashish

agarwal-ashish commented Apr 29, 2021

LGTM.

One nit is that the device property on operations likely specifies where the kernel runs, which is typically, but not necessarily, the same as the device where the outputs are placed.

@rgommers
Member Author

> One nit is that the device=None property on operations likely specifies where the kernel runs, which is typically but not necessarily, the same as the device where outputs are placed.

Thanks @token. `device=None` is a keyword for array creation functions, so if it's not specified, I think the device the array is placed on is "whatever the default device or placement strategy of the library is".
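The `device=None` semantics described here can be sketched as follows; `asarray` and `_DEFAULT_DEVICE` are illustrative stand-ins, not the standard's actual machinery, and a real library's "placement strategy" may be more elaborate than a single global default.

```python
# Hedged sketch of device=None for array creation functions:
# None means "use the library's default device / placement strategy".

_DEFAULT_DEVICE = "cpu"  # hypothetical library-wide default


def asarray(obj, device=None):
    # device=None resolves to whatever the library's default is.
    if device is None:
        device = _DEFAULT_DEVICE
    return (obj, device)  # a (data, device) pair stands in for an array


print(asarray([1, 2, 3]))           # ([1, 2, 3], 'cpu')
print(asarray([1, 2, 3], "gpu:0"))  # ([1, 2, 3], 'gpu:0')
```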

@agarwal-ashish

> One nit is that the device=None property on operations likely specifies where the kernel runs, which is typically but not necessarily, the same as the device where outputs are placed.

> Thanks @token. device=None is a keyword for array creation functions, so if it's not specified I think then the device that the array is placed on is "whatever the default device or placement strategy of the library is".

Sorry, I meant that even when the device is specified, a particular op may choose to place its output on a different device than where the kernel executes (e.g. copy, shape, RPC-related ops). If this property is restricted to array creation functions only, then the two devices will likely match.

@rgommers
Member Author

> If this property is restricted to array creation functions only then the two devices will likely match.

It is. Thanks for clarifying. Are the cases where an array creation call would still end up on a different device restricted to avoiding an out-of-memory error? If so, we can explicitly add that as a note.


- This standard chooses to add support for method 3 (local control), because it's the most explicit and granular, with its only downside being verbosity. A context manager may be added in the future - see {ref}`device-out-of-scope` for details.
+ This standard chooses to add support for method 3 (local control), with the convention that execution takes place on the same device where all argument arrays are allocated. The rationale for choosing method 3 is because it's the most explicit and granular, with its only downside being verbosity. A context manager may be added in the future - see {ref}`device-out-of-scope` for details.

Just to clarify: for libraries that follow this convention, it would be an error to perform an operation with tensors that are not all allocated on the same device?


- **out**: _<array>_

- an array with the same data and dtype, located on the specified device.
Contributor

@leofang leofang May 19, 2021


If `self` is already on `device`, do we expect a no-op (returning `self`) or a copy?

@rgommers
Member Author


Good point, we should specify that. I'd say no-op.
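The no-op behavior favored here can be sketched with a hypothetical toy array type (not the standard's implementation): `to_device` returns `self` unchanged when the array is already on the requested device, and a new array on the target device otherwise.

```python
# Hedged sketch of "no-op when already on device" semantics.
# ToyArray and the device strings are illustrative stand-ins.

class ToyArray:
    def __init__(self, value, device="cpu"):
        self.value = value
        self.device = device

    def to_device(self, device):
        if device == self.device:
            return self  # no-op: same object, no copy
        return ToyArray(self.value, device)  # copy onto target device


x = ToyArray([1, 2], device="cpu")
assert x.to_device("cpu") is x        # already on device: no-op
assert x.to_device("gpu:0") is not x  # transfer produces a new array
```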

@leofang
Contributor

leofang commented Sep 14, 2021

Let's get this PR in, as nobody objects and we don't want to drag this out indefinitely. We can continue the discussion and pick up any loose ends in the follow-up PR #259.

@leofang leofang merged commit e810a81 into data-apis:main Sep 14, 2021
@rgommers rgommers deleted the device-support-tweak branch September 14, 2021 16:14
Successfully merging this pull request may close these issues: Device support statement refinement.