RFC: Adding Pluggable Device For TensorFlow #262
Conversation
Thank you for the RFC! I think the links to the images are broken?
@penpornk Thanks for the review! I put relative image paths in the md file. They display normally in "view file" but appear broken in the md diff. It seems the images are linked to the master branch, where they don't exist yet. Do you have any suggestions? Thanks.
Thank you for writing the RFC!
@jzhoulon "View file" works for me. Sorry I didn't try that before asking! Now that I think about it, I don't think there is a good way to make the images display in the diff mode here, since the images are not in this repo yet. If you use absolute paths (URLs), you'll need to change them back to relative paths before the PR is merged. -- Not worth the hassle.
Thanks for the RFC. We're working on adding a new device for DirectML (#243), so this RFC, along with #257, is of great interest to us. I have a few high-level questions as follows:
@wchao1115 Great questions!
For the current TensorFlow stack, no. Would linking to the CPU-only TensorFlow package work for you (~110MB for 1.15, ~144MB for 2.2.0)? We would love it if you could share your techniques for getting the wheel size down to 65MB as well. Looping in @gunan and @annarev for more details.
Yes. It’s because we happen to have the StreamExecutor C API already in the works. The new TFRT/MLIR TensorFlow stack will have much more flexible support for third-party devices. Since the new stack is fundamentally different from the current stack, we can’t guarantee that any code added to the current stack will work with the new stack right away. So the focus for the current stack is to enable basic device integration, for people that need it now, while minimizing throw-away work. We think StreamExecutor C API (covering a small subset of StreamExecutor routines) and PluggableDevice could be sufficient. (Or we’d like to hear why they might not be enough sooner rather than later. -- Hence the RFCs.) “Stream” is just a keyword meant to represent an ordered sequence of kernels to run. Does DirectML have a kernel queue which would fit this definition? Would DirectML be able to use StreamExecutor C API and PluggableDevice? Is there anything you need that is missing? If so, we can look into adding the missing parts (and making the things you don’t need optional).
Do you mean C APIs for Device and DeviceFactory? Or do you mean different APIs for different devices?
Functionality can be extended by appending members to the structures in StreamExecutorInterface. TensorFlow will check for the existence of members before accessing them. Plugins just need to initialize struct_size using the provided macros so that TensorFlow knows which members are safe to access. @yisitu will give some examples later in a separate reply.
Thanks for the RFC. I have two questions regarding support for dataset prefetch_to_device in the case of PluggableDevice. When using prefetch_to_device, the dataset iterator is on the "accelerator" device and resorts to a remote call to the CPU, which does the data preparation. One of the obstacles is that the remote-call mechanism is guarded by ProcessFunctionLibraryRuntime::GetDeviceContext, which, as of now, is a bunch of hardcoded device type strings. Do you envision a way to allow PluggableDevice to support remote calls? Such features would be very welcome. Are these topics in scope for the PluggableDevice RFC?
Thanks for the comment! These are good questions.
TensorFlow seems to use a lot of hardcoded device strings for branching. Some are device names (front-end names), such as in ProcessFunctionLibraryRuntime::GetDeviceContext, and some are device types (back-end names), such as in ProcessMemoryTypes. I think we can add a new branch path, e.g., a utility function (IsPluggableDevice(device_type)) backed by a global object that contains every device type/name registered from plugins. @annarev do you have comments here? Thanks.
I'm not sure whether the Kernel and Op Implementation C API will support this. Maybe Google can expose IteratorGetNext as a utility function (like TF_NewKernelBuilder) or provide a special API to register an existing implementation with a new device type. @annarev
|
Hi @wchao1115, @penpornk! Please find more detailed examples here: https://github.com/tensorflow/community/pull/257/files#diff-8312b2e074fd41855aa8a03d13e92b91
There will be a public design review meeting for this RFC. Meeting participants are expected to read the RFC ahead of time as the meeting will focus on the remaining issues/questions.
@yisitu Thank you!
Thanks for the suggestion. I registered a group consisting of IteratorV2, MakeIterator, DeleteIterator, AnonymousIterator, AnonymousIteratorV2, IteratorGetNext, IteratorGetNextSync, IteratorGetNextAsOptional, IteratorToStringHandle, and IteratorFromStringHandleV2 on DEVICE_DEFAULT. This somehow places the part of the dataset that's designated for the CPU on the device, which is a bit surprising because I haven't changed the Dataset* registrations. In short, this doesn't seem to work out of the box for datasets, but I'll try to dig into it.
Updates based on recent changes and discussion on RFC PR tensorflow#262
@wchao1115 We have updated the versioning strategy. Changes include:
Could you please help take another look and raise any remaining concerns? Thank you! Please feel free to comment on the file directly on the StreamExecutor C API RFC PR's file changes review page. (Or we can discuss here as usual.)
I added a proposed implementation here: |
Thanks Anna, this looks good!
Questions we got offline (cc: @kulinseth, @wchao1115, @jzhoulon, @Jianhui-Li). Q1: What are the next steps?
Q2: How will the API be released? Will we have a branch where the API will be released in the experimental folder?
Q3: What is the high-level timeline?
Q4: Will this be in TF 2.4?
@wchao1115: Friendly ping. Have you had a chance to look at the updated versioning strategy yet? :)
cc: @jzhoulon @Jianhui-Li @wchao1115 @kulinseth I'd like to wrap up this RFC (i.e., merge this PR) to pave the way for the upcoming implementation PRs. Here are the changes since the last open design review meeting:
If anyone has anything else that needs to be discussed/resolved before we can move forward (i.e., fundamental issues), please explicitly say so here by this Thursday, 9/24 at 11:59pm PT. @wchao1115 I believe the versioning strategy is not a blocking issue. We can continue discussing and refining the strategy here throughout the experimental phase.
TensorFlow 2.4 branch cut has been announced. If PluggableDevice implementation makes it into TF by 10/21/20 at 5pm PT, it will be in TF 2.4. Otherwise, it won't.
@penpornk Sorry for the delay. The revised versioning doc looks fine for now. We can iterate more when we start implementing the V1 plug-in. A note on deprecation by the producer leaving an attribute zero-initialized: I'm not sure this is safe or can be done in a backward-compatible way, even when zero is defined as an invalid value for the attribute, because the consumer may reject the zero value and fail the call instead. If, in a later minor version, the producer sets the value to zero and assumes the attribute will be safely ignored by older plug-ins, the plug-in may fail. I believe it is only safe to do so when the attribute is also considered optional by the producer. A required attribute must not be left zero-initialized in a later version without a major version bump.
Thanks for clarifying these questions! There are no outstanding design items from our side. There will be a few iterations and adjustments as we get to the implementation PRs and start using the API.
@jzhoulon, @Jianhui-Li, it would be great if you could elaborate on this. We would like to know if you are also targeting TF 2.4. Also, it would be great if there are any implementation PRs we can try out, so we can adapt our implementation and provide feedback.
@kulinseth We are targeting initial PRs for TF 2.4 as a stretch goal and incremental PRs for TF 2.5. At the experimental stage, the initial version won't be able to run full models, but it will be useful for us to test various plugins and further improve the interface.
Thanks for the update. I was also hoping to iron out and test the interface in the experimental stage. It will become clearer after looking at the PR and testing with more plugins.
@wchao1115 @kulinseth Thank you for the quick confirmation! :) @ematejska @alextp I believe there is no outstanding issue now. Could we move forward?
Yes, let's merge this.
LGTM
Hi, @penpornk and others
Please let me know if there are other ways and if I am missing something.
Hi @kulinseth ,
Yes, that's the expected installation process.
Hi all, quick updates:
Hi guys, is there any demo, like a pseudo pluggable device, showing what needs to be done to add a device to TensorFlow? Thanks
Hi @kevint324,
Sounds great. Looking forward to it. |
Reiterating the question: is there a detailed mock pluggable device example or other documentation on this?
Please see here for a detailed tutorial and example of pluggable device development.
This RFC will be open for comment until Monday, July 20th, 2020.
Pluggable device for TensorFlow
Objective
Implement a pluggable device mechanism that allows running existing TensorFlow programs on a new device without requiring users to change their code. Users only need to install a plugin into a specified directory; the mechanism then discovers and plugs in the capabilities offered by the plugin.
This RFC is based on the Modular TensorFlow RFC, which aims to extend the TensorFlow design with plugin capabilities, such as adding support for a new device. The modular device interface is based on the StreamExecutor C API RFC.