Simplify bindings in HAL command buffers. #18154

Closed
9 tasks done
benvanik opened this issue Aug 8, 2024 · 0 comments · Fixed by #18366

benvanik commented Aug 8, 2024

In light of reusable/indirect command buffers and increasingly broad support for device addresses, I'm thinking about reworking descriptor sets and pipeline layouts in the HAL. The current design achieves two major goals: functioning on targets that do not support device addresses (only buffer handles) and compressing command buffer recording (by reusing descriptor sets across dispatches). We still need to support buffer handles in the API without exposing device addresses upwards (into the VM/etc.), and we still want to record command buffers efficiently. Reusable command buffers obviate the compression benefits for recording performance, as we only record them once, though it's still important to remain small on disk.

Problem

Reusable command buffers in real models have revealed two new facts: the same dispatches are used with both direct and indirect bindings for the same arguments, and in practice we are poorly reusing descriptor sets when parameters are not packed. I spent some time trying to figure out how to increase reuse and couldn't without boxing us into a corner with larger programs where we rely on deduplication of executable functions. Packed parameters (where we gather into discrete buffers while loading or pack ahead of time in the compiler) do tend to get good reuse, and that should be the common case; however, the actual savings depend on binding ordering across different dispatches. The original intent was that we'd have a descriptor set for frequently used buffers (constants, parameters, and transients) and a set for infrequently used buffers (usually function I/O) - we'd define the frequently used set once and reuse it for all dispatches and then define the infrequently used ones on demand. In practice this works well when things line up: a single descriptor update can be reused for 1000+ dispatches, reducing the resource tracking, validation, and call overheads 1000x. For this to work, all of those dispatches must use the same descriptor set layout (set 0 binding 3 is constants, binding 7 is the first chunk of parameters, etc.) at all dispatch sites in the whole program. The same was the intention with push constants: using the same constants (usually shape dimensions or buffer offsets) for multiple dispatches requires that all dispatches involved share a layout.

Doing this global analysis isn't difficult and it's mostly what happens in MaterializeInterfaces today. One issue that has shown up in practice is that individual dispatches, once deduplicated, end up with both direct and indirect bindings, and since these are handled differently throughout the stack we would have to duplicate entry points just to vary the ABI. Another issue is that the cost of padding the push constants and descriptor sets is very high: if a particular dispatch uses 1 binding and 1 push constant it may still need a layout with 29 declared bindings and 18 declared push constants and must ship those to the device during execution even when unused. Our HIP/CUDA/CPU implementations compress bindings to avoid sending sparsely used tables over to the device, but they rely on the descriptor set layout to do this and expect it to be authoritative. If we started using sparse descriptor sets with partially used slots we'd need a secondary target-specific layout indicating which bindings and constants were used.
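As a rough illustration of the binding compression described above, here is a minimal sketch. It is not IREE code: `sketch_compact_bindings` and the `used_mask` bitfield are hypothetical stand-ins for whatever the authoritative descriptor set layout tells a target about which slots are actually used.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of IREE): copies only the binding slots that
// are marked as used in `used_mask` (bit i set => slot i is used) into a dense
// array suitable for shipping to the device. Padding slots are skipped.
// Returns the number of entries written to `dense`.
static size_t sketch_compact_bindings(const void* const* sparse,
                                      size_t sparse_count, uint64_t used_mask,
                                      const void** dense,
                                      size_t dense_capacity) {
  size_t n = 0;
  for (size_t i = 0; i < sparse_count && i < 64; ++i) {
    if (!(used_mask & (UINT64_C(1) << i))) continue;  // unused padding slot
    if (n == dense_capacity) break;                    // no room left
    dense[n++] = sparse[i];
  }
  return n;
}
```

With a 29-slot layout where only bindings 3 and 7 are used, the dense table holds 2 entries instead of 29. The catch noted above is that the mask has to come from somewhere authoritative.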

Having two layout mechanisms (one generic one for recording and one target-specific one for execution) is not great. Neither is duplicating entry points to support different types of bindings used during recording and execution. And neither is globally assigning descriptor set+binding/push constant ordinals when entry points may be used in multiple locations in the program.

The critical aspect of the current mechanism to preserve is the separation of push constants from bindings, unlike CUDA-style void* memory that may contain device addresses and parameters interleaved. All of the descriptor set stuff is just for recording efficiency. Since we aren't reaping the benefits and likely won't be able to without making larger and worse tradeoffs, a simplification would be to pass push constants and bindings per dispatch.

Proposed Changes

Descriptor sets and pipeline layouts as concepts would be entirely removed from the HAL. As part of their executable definition, targets would ship either a full layout as today (Vulkan/WebGPU/etc.) or a minimal layout of just sizes/counts (CUDA/HIP/CPU), based on what they require. Command buffer recording - which now with reuse is less concerned with recording performance - would just take the hit of resource tracking and marshaling arguments as needed. API-wise, the command buffer would change to take the constants and bindings as arguments directly on dispatches:

IREE_API_EXPORT iree_status_t iree_hal_command_buffer_dispatch2(
    iree_hal_command_buffer_t* command_buffer,
    iree_hal_executable_t* executable, int32_t entry_point,
    uint32_t workgroups[3], iree_const_byte_span_t constants,
    iree_hal_buffer_ref_list_t bindings, iree_hal_dispatch_flags_t flags);
IREE_API_EXPORT iree_status_t iree_hal_command_buffer_dispatch_indirect2(
    iree_hal_command_buffer_t* command_buffer,
    iree_hal_executable_t* executable, int32_t entry_point,
    iree_hal_buffer_ref_t workgroups_ref, iree_const_byte_span_t constants,
    iree_hal_buffer_ref_list_t bindings, iree_hal_dispatch_flags_t flags);
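To make the shape of a per-dispatch call concrete, here is a minimal sketch of what recording could look like when constants and bindings travel with the dispatch. Every type and function here is a hypothetical `sketch_` stand-in loosely modeled on the parameter names in the declarations above; the field layouts and fixed limits are assumptions for illustration, not the real IREE definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-ins for the span/ref types named in the declarations.
typedef struct { const uint8_t* data; size_t data_length; } sketch_const_byte_span_t;
typedef struct { void* buffer; size_t offset; size_t length; } sketch_buffer_ref_t;
typedef struct { size_t count; const sketch_buffer_ref_t* values; } sketch_buffer_ref_list_t;

#define SKETCH_MAX_CONSTANT_BYTES 64  // arbitrary limit for the sketch
#define SKETCH_MAX_BINDINGS 8         // arbitrary limit for the sketch

// One recorded dispatch: everything it needs is captured inline, with no
// dependence on previously pushed descriptor sets or constants.
typedef struct {
  int32_t entry_point;
  uint32_t workgroups[3];
  uint8_t constants[SKETCH_MAX_CONSTANT_BYTES];
  size_t constant_length;
  sketch_buffer_ref_t bindings[SKETCH_MAX_BINDINGS];
  size_t binding_count;
} sketch_dispatch_record_t;

// Captures a dispatch into `out`. Returns 0 on success or -1 if the arguments
// exceed the fixed limits of this sketch.
static int sketch_record_dispatch(sketch_dispatch_record_t* out,
                                  int32_t entry_point,
                                  const uint32_t workgroups[3],
                                  sketch_const_byte_span_t constants,
                                  sketch_buffer_ref_list_t bindings) {
  if (constants.data_length > SKETCH_MAX_CONSTANT_BYTES) return -1;
  if (bindings.count > SKETCH_MAX_BINDINGS) return -1;
  out->entry_point = entry_point;
  memcpy(out->workgroups, workgroups, sizeof(out->workgroups));
  if (constants.data_length > 0) {
    memcpy(out->constants, constants.data, constants.data_length);
  }
  out->constant_length = constants.data_length;
  for (size_t i = 0; i < bindings.count; ++i) {
    out->bindings[i] = bindings.values[i];
  }
  out->binding_count = bindings.count;
  return 0;
}
```

The point of the sketch is that each record is self-describing, so two dispatch sites of the same entry point no longer need to agree on a shared layout.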

Each target would need to define the metadata it requires to translate the constants and bindings into its own ABI and perform the required validation during execution. The Vulkan/SPIR-V flatbuffer would gain an encoding of pipeline layouts for each entry point (a set per executable and then each entry point referencing an entry in that set), CPU would add fields to iree_hal_executable_dispatch_attrs_v0_t for constant and binding counts, etc.

The compiler would simplify the HAL interface ABI to only include the push constant count and binding count (no longer sets) and assign ordinals to each. All of the information needed to produce the combined dispatch calls is available on stream.cmd.dispatch during lowering, and conversion will be simplified by not having to issue the extra ops. Target backends would need to produce the updated metadata when serializing executables.

CPU

iree_hal_executable_dispatch_attrs_v0_t would get a field for the number of push constants (or maybe just size) and the number of bindings.
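A minimal sketch of how such counts could be used for validation follows. The struct is a hypothetical stand-in, not the real `iree_hal_executable_dispatch_attrs_v0_t`, and the field names, field widths, and the assumption that each constant is a 32-bit value are mine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the per-entry-point metadata proposed above.
typedef struct {
  uint16_t constant_count;  // assumed: number of 32-bit constants expected
  uint16_t binding_count;   // number of buffer bindings expected
} sketch_dispatch_attrs_t;

// Returns true if the arguments supplied with a dispatch match what the
// entry point declares, so a mismatch is caught before execution.
static bool sketch_verify_dispatch_args(const sketch_dispatch_attrs_t* attrs,
                                        size_t constant_bytes,
                                        size_t binding_count) {
  const size_t expected_bytes =
      (size_t)attrs->constant_count * sizeof(uint32_t);
  if (constant_bytes != expected_bytes) return false;
  if (binding_count != attrs->binding_count) return false;
  return true;
}
```

Storing a byte size instead of a count (the "or maybe just size" option) would only change the first comparison.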

CUDA/HIP

This would be a good chance to completely rewrite the ExecutableDef flatbuffers to support multiple PTX/HSACO blobs, a proper export table (instead of standalone tables), and now the additional information to verify constants and bindings.

Vulkan/WebGPU

The flatbuffers would gain their appropriate pipeline layout definitions and upon executable creation would create all of the runtime resources. Since we are supposed to be linking most (if not all) dispatch entry points into the same executable, this amortizes the cost of creating/retaining these runtime resources to the same level as it is today. The only case where we'd regress is when there are multiple executables that share layouts in the same compiled module, but the runtime implementation can cache and reuse them if desired.

Metal

We should evaluate argument buffers as part of the #17875 work - whatever is done there will decide if we want to use the Vulkan/WebGPU approach of bindings or CUDA/HIP approach of parameter blobs.

Interactions with Indirect Bindings

#17875 mentions some approaches to supporting indirect bindings in reusable command buffers on various targets as each will have its own way of doing so. Some targets may want to partition direct from indirect bindings while others will keep them interleaved. Since dispatches to the same entry point may have different sets of direct and indirect buffers targets may want to always treat everything as if it could be indirect, use the metadata provided on the HAL interface ops to opt in only a subset of bindings to indirect usage, or do nothing at all and rely on emulation forever.

I suspect we'll end up with constants/direct bindings in native mechanisms (push constants/descriptor sets/binding groups/kernel args/kernel param buffers) and indirect bindings in their own device memory. Basically: what's known at recording time goes into native mechanisms and what's known at submission time goes in our own data structures referenced from those. This is marginally more expensive (one extra pointer indirection at execution time) but given that it's uniform, caching should take care of it. Targets could also do things like push constants to specify binding table slots and then dereference a binding table or slice off and suballocate a parameter buffer with offsets into it. We'd probably have to evaluate per-target what's cheapest to submit and execute.
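The recording-time versus submission-time split can be sketched as below. This is a host-side illustration only, with hypothetical `sketch_` types; real targets would do the equivalent with device memory, and the convention that a NULL `buffer` means "look up `slot` in the binding table" is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical binding reference: either direct (buffer known at recording
// time) or indirect (a slot resolved against a table at submission time).
typedef struct {
  void* buffer;   // non-NULL => direct binding
  uint32_t slot;  // used only when `buffer` is NULL
  size_t offset;  // byte offset applied after resolution
} sketch_binding_ref_t;

// Hypothetical binding table provided when the command buffer is submitted.
typedef struct { size_t count; void* const* buffers; } sketch_binding_table_t;

// Resolves a reference to a final address. Indirect references cost one extra
// lookup. Returns NULL if the slot is out of range or unpopulated.
static void* sketch_resolve_binding(sketch_binding_ref_t ref,
                                    sketch_binding_table_t table) {
  void* base = ref.buffer;
  if (!base) {
    if (ref.slot >= table.count) return NULL;
    base = table.buffers[ref.slot];
    if (!base) return NULL;
  }
  return (uint8_t*)base + ref.offset;
}
```

Treating every binding as if it could be indirect corresponds to always taking the table path. Partitioning corresponds to keeping direct references out of the table entirely.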

Since the compiled binaries change behavior, we'd still be in a regime where the compiler would need to annotate the HAL interface ABI with which bindings may be indirect and possibly then have the runtime always pass direct bindings as indirect. This is no different than today; the only change is that now there aren't also descriptor sets to reason about. The best solution here is likely to export multiple entry points that handle the different modes and select appropriately during recording time.

Implementation

I've thought about ways to land things incrementally but it'd take quite a bit of work. Since we're still ok breaking things this is likely to happen in a branch that maintains compatibility at head until each target is converted. Once all targets are converted the branch can be merged as one larger breaking change. The upside is that there's a single breaking change and a single clean merge instead of piecemeal breakages or effectively copy/pasting entire HAL drivers to update the current dependencies on pipeline layouts and descriptor sets.

Happen on main with no breakages:

  • Add compiler flag --iree-hal-experimental-dispatch2 to emit the new ops
  • Add new vtable methods to iree_hal_command_buffer_t and a new executable create2 that doesn't take pipelines
  • Remove set from HAL interface ops (it's always 0 today) and adapt to the old ABI in HAL-to-VM

Happen on branch, merge is a breakage:

  • Add attributes to the CPU executable_library (compiler/runtime), unfortunately breaking
  • Rework CUDA/HIP flatbuffers completely as 2 (compiler/runtime)
  • Rework Vulkan/SPIR-V flatbuffer completely as 2 (compiler/runtime)
  • Add metadata to WGSL and Metal flatbuffers - maybe rework, it's been a while (compiler/runtime)
  • Other pending HAL changes
  • Rename 2 -> default during merge, bump versions

Any other changes to support indirect bindings?

NO: Now would be a good time to add the right metadata for indirect bindings as needed by targets, but until implementation starts it'll likely be difficult to know exactly what should be added. Part of the cleanup to the flatbuffers here will be adding the appropriate placeholders to make adding such metadata a non-breaking change in the future. Metadata added for binding layout should include flag bits to indicate, per binding, whether it is used indirectly.
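One way per-binding flag bits could look is sketched below. The names and the bit assignment are hypothetical; the only idea taken from the text is a flags word per binding with a bit for indirect usage, leaving the remaining bits free so later additions stay non-breaking.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical per-binding flags word; unassigned bits are reserved so that
// new metadata can be added later without changing the layout.
typedef uint32_t sketch_binding_flags_t;
enum {
  SKETCH_BINDING_FLAG_NONE = 0,
  SKETCH_BINDING_FLAG_INDIRECT = 1 << 0,  // binding may be provided indirectly
};

// Counts how many bindings of an entry point may be used indirectly, e.g. to
// size whatever table a target keeps for submission-time resolution.
static size_t sketch_count_indirect_bindings(
    const sketch_binding_flags_t* flags, size_t count) {
  size_t n = 0;
  for (size_t i = 0; i < count; ++i) {
    if (flags[i] & SKETCH_BINDING_FLAG_INDIRECT) ++n;
  }
  return n;
}
```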

@benvanik benvanik added the hal/api IREE's public C hardware abstraction layer API label Aug 8, 2024
@benvanik benvanik self-assigned this Aug 8, 2024
benvanik added a commit that referenced this issue Aug 10, 2024
These combine push constants and push descriptor sets into the dispatch
calls as in practice we have a near 1:1 relationship anyway. Pipeline
layouts are still used in HAL interfaces to allow the compiler to map
the information but are otherwise not used by the new ops.

The `--iree-hal-experimental-dispatch2` flag enables emitting the new ops
though no targets currently implement them. Since executables no longer
require pipeline layouts in this simplified model the
`--iree-hal-experimental-executable-create2` flag can be used to stop
passing them.

Progress on #18154.
benvanik added a commit that referenced this issue Aug 12, 2024
These combine push constants and push descriptor sets into the dispatch
calls as in practice we have a near 1:1 relationship anyway. Pipeline
layouts are still used in HAL interfaces to allow the compiler to map
the information but are otherwise not used by the new ops.

The `--iree-hal-experimental-dispatch2` flag enables emitting the new
ops. Since executables no longer require pipeline layouts in this
simplified model the `--iree-hal-experimental-executable-create2` flag
can be used to stop passing them; targets that support dispatch2 will
ignore them if provided. Future changes will start to add support on
targets for the simplified bindings and then remove the existing
pipeline layout-based binding model as a breaking ABI change.

Current target status:
* [x] Local/CPU: executable-create2 and executable-dispatch2 supported
(backward compat)
* [x] CUDA: executable-dispatch2 supported (backward compat)
* [x] HIP: executable-dispatch2 supported (backward compat)
* [x] Metal: executable-dispatch2 supported (backward compat)
* [x] Vulkan: executable-dispatch2 supported (backward compat)
* [x] WebGPU: executable-dispatch2 supported (backward compat)

Reworking the CUDA/HIP/Metal/Vulkan/WebGPU flatbuffers to support
executable-create2 will be done in a follow-up.

Progress on #18154.
benvanik added a commit that referenced this issue Aug 21, 2024
This does not yet rename the methods and is just stripping all of the
legacy ops and methods.

Progress on #18154.
benvanik added a commit that referenced this issue Aug 26, 2024
* Renamed `push_constants` to `constants` (as there is no longer a
  `push_constants` API)
* Dropped `#hal.descriptor_set.layout`
* Removed ordinal from `#hal.descriptor_set.binding` (as ordinals are
  now implicit)
* Renamed `#hal.descriptor_set.binding` to `#hal.pipeline.binding`
* Removed `set` from `hal.interface.binding.subspan`
* Removed `#hal.interface.binding` and the spooky action at a distance
  `hal.interface.binding` attr now that ordinals are implicit

Progress on #18154.
benvanik added a commit that referenced this issue Aug 26, 2024
The pass is running on HAL IR and now that descriptor sets are being
removed as part of #18154 needs to be rewritten. A new version would
operate on SPIR-V IR in order to replace push constants with special
binding loads instead.
banach-space added a commit to banach-space/iree that referenced this issue Aug 30, 2024
I've just landed an update for the affected test (see iree-org#18369), but
unfortunately forgot to re-base after the recent changes to HAL by Ben,
see iree-org#18366 and iree-org#18154.

This simply updates the test to align with the recent changes to HAL.