Changing HAL dialect syntax to express all types. #5239
Conversation
Force-pushed from c615b58 to a0c3f0d
iree/compiler/Dialect/HAL/Transforms/test/resolve_entry_point_ordinals.mlir
%cmd = hal.command_buffer.create device(%device : !hal.device)
                                 mode(OneShot)
                                 categories("Transfer|Dispatch") : !hal.command_buffer
Hm.. I'm wondering about the syntax (and printer/parser) of this op, and others like execution_barrier. Typically there would be commas or some other separator between arguments, instead of newlines, and some wrapper around the parameters like [ ], ( ), { }, or < >. This certainly reads easily for a human though. WDYT?
The newlines are optional (it just needs whitespace) - the syntax here matches linalg, which has things like:
linalg.generic #attrs
ins(%A, %B: memref<?x?xf32>, memref<?x?xf32>)
outs(%C: memref<?x?xf32>) {
^bb0(%a: f32, %b: f32, %c: f32):
%d = addf %a, %b : f32
linalg.yield %d : f32
}
The commas and simple separators are ambiguous when parsing the nested types (which is why linalg does this too). Future PRs will change some of these enums into type parameters instead, so the op will be %cmd = hal.command_buffer.create device(%device : !hal.device) : !hal.command_buffer<..., "OneShot", "Transfer|Dispatch"> (or something like that). Same with memory types and such for buffers - those will be on the return types instead of as op attributes.
SG as long as there is precedent and a good reason :)
for (char c : shortString) {
  switch (c) {
    case 'T':
      memoryType = memoryType | MemoryTypeBitfield::Transient;
      break;
    case 'h':
      memoryType = memoryType | MemoryTypeBitfield::HostVisible;
      break;
Ah, this answers my question about the single character comments, I think. Could still add more context to the .td though.
Yeah this is pulling some stuff from the future where buffer types use these - I'll continue refining them when we can produce IR that uses them and make some judgements about what makes sense. This first pass was done during my previous buffer allocation work when having 100k !hal.buffer<"HostVisible|DeviceLocal|DeviceVisible", "Constant|Transfer|Mapping", "Read|Write"> made my eyes bleed vs !hal.buffer<hD:CTM:RW>.
Ok cool. Still seems worth a comment in the .td, perhaps with just that example you included.
(I still think a comment or some more documentation on the short string versions would be useful, but that can be added as this gets refined)
%dev = hal.ex.shared_device : !hal.device
%allocator = hal.device.allocator %dev : !hal.allocator
%all_true = hal.buffer_view.const %allocator, "HostLocal|DeviceVisible", "All" : !hal.buffer_view = dense<1> : tensor<2x2xi32>
%device = hal.ex.shared_device : !hal.device |
(just commenting on this file to start a thread)
Lots of test files are updated here. I hope that wasn't too much trouble. Do you think some of these tests were brittle / checking too many things unrelated to their core purpose, or is that just the nature of changing the syntax for widely used ops?
Yeah these are all low level - ideally they'd be flow and above such that the HAL is an implementation detail. The tensorlist/strings modules are also touching the HAL directly for the same reasons.
Force-pushed from a0c3f0d to aa3ae09
LGTM when builds pass
//iree/compiler/Conversion/LinalgToNVVM/test:distribute_to_thread.mlir.test FAILED in 6.7s
RUNLINE: // RUN: iree-opt -pass-pipeline="hal.executable(hal.executable.target(iree-codegen-cuda-tile-and-distribute))" %s | IreeFileCheck %s
RUNNING TEST: iree-opt -pass-pipeline="hal.executable(hal.executable.target(iree-codegen-cuda-tile-and-distribute))" iree/compiler/Conversion/LinalgToNVVM/test/distribute_to_thread.mlir | IreeFileCheck iree/compiler/Conversion/LinalgToNVVM/test/distribute_to_thread.mlir
----------------
iree/compiler/Conversion/LinalgToNVVM/test/distribute_to_thread.mlir:5:3: error: 'hal.executable.entry_point' op attribute 'ordinal' failed to satisfy constraint: size_t
hal.executable.entry_point @add_dispatch_0 attributes {interface = @legacy_io, ordinal = 0 : i32, signature = (!flow.dispatch.tensor<readonly:1024xf32>, !flow.dispatch.tensor<readonly:1024xf32>, !flow.dispatch.tensor<writeonly:1024xf32>) -> ()}
^
ah darn, new tests :)
Force-pushed from aa3ae09 to 813f7c7
The previous HAL ops inferred the types of the values they were working on (such as !hal.buffer or !hal.device); this prevented the specialization of those types required for buffer analysis and static device feature detection.

The new syntax uses `op_name<%value : !hal.type>` on the op name, indicating that the op is templated on the given `%value`. Parameters are now mostly encoded in named parens like linalg to remove a lot of the parsing ambiguity that existed when they were comma separated.

Future changes for allocation will use a `!hal.buffer<device, type, access, etc>` and changes for device feature detection will use a `!hal.device<@id>`. Other types like `!hal.command_buffer` may also be specialized per-device.

There's some partially-updated enum support in here that will be getting improved in the follow-ups; the enums will move into the type specifiers and many of the enums used on ops will go away as well.
Force-pushed from 813f7c7 to e39526c
* 6bd5658 Merge google -> main (#5319)
* 2e5257d Merge branch 'main' into google-to-main
* 6936ee7 Patch VMLA performance by reserving vector size before pushing to it. (#5316)
* f2f0041 NFC: Cleanup ConcretizeTileAmongstWorkgroupsPass. (#5297)
* f96726a Add tests to run few other (smaller) models with Linalg on tensors path. (#5306)
* fd64070 Revert "Add wasm-micro-runtime submodule and get building with CMake." (#5312)
* ce0285f Continue pruning abseil usage: switch from absl::InlinedVector to std::vector...
* 71e24b6 Removing hal.buffer.fill and hal.buffer.copy. (#5307)
* 3c611d3 Add Mako benchmark config template file. (#5200)
* 4d1a394 Fix RFFT bugs in VMLA. (#5308)
* 0d55c95 Add configure_bazel.py step to TensorFlow getting started doc.
* 1386d2c Switch simple_embedding_test to include drivers explicitly. (#5304)
* 402550b Add StripAsserts pass and handle tf.Identity ops on tensor lists. (#5294)
* fbdb4ef Add new metrics to MobileNetV2 benchmarks. (#5301)
* 99c8eac Implementing Vulkan dispatch tracing. (#5287)
* 2681dff Insert clones prior to mutation and not where it originates. (#5292)
* aeafd9e Fix CUDA HAL bug and enable more execution tests (#5296)
* 2801780 [CUDA Codegen] Enable tiling and vectorization for MatMulOp (#5293)
* c61fefe Extend AffineMin canonicalization to support scf.parallel (#5289)
* e0ee3f3 Add directory for microbenchmarking (#5260)
* b8da32c Set wasm-export-name attributes on exported functions again. (#5286)
* e2a2f81 Canonicalize affine min before applying tile-and-vecotrize passes (#5285)
* 23861f7 [CUDA codegen] add vectorization infrastructure (#5278)
* 6f443c4 Drop deps on Abseil's core_headers, synchronization, macros. (#5275)
* e5b9e8a Actually run MobileNet with fake weights to check correctness (#5284)
* e56db9a Remove dead code in LinalgToSPIRV (#5281)
* 8863aa1 [NFC] Fix typos in variable names. (#5279)
* 9cd93ba Turn vectorization on by default for linalg on tensors path (#5280)
* 894dac6 Merge google -> main #5276
* b738162 Changing HAL dialect syntax to express all types. (#5239)
* 1ba4e88 Merge branch 'main' into google-to-main
* 531c73e Fix yml syntax (#5274)
* 494fe32 Bumping the tracy version to 0.7.7 (WIP). (#5272)
* 3616323 Disable Vulkan float16 tests on Pixel4 (#5273)
* ade7ff1 Disable running BERT on Vulkan (see Issue #5268) (#5269)
* 25ddc10 Add tracing to allocations made from VMA. (#5271)
* df454f4 Changing iree_vm_list_resize to grow by 2x. (#5270)
* bd9a113 Adding command buffer queue affinity. (#5265)
* de834ae Make status matcher print the message when it fails. (#5266)
* 10f5eaf Add f16 e2e tests for vulkan (#5257)
* 1bdc3a4 Actually make MobileBERT run in the test. (#5264)
* 2e05313 Add support for module almost_eq check for f16 type (#5261)

COPYBARA_INTEGRATE_REVIEW=#5321 from NatashaKnk:main-to-google 6bd5658
PiperOrigin-RevId: 366926967
The previous HAL ops inferred the types of the values they were working on (such as !hal.buffer or !hal.device); this prevented the specialization of those types required for buffer analysis and static device feature detection.

The new syntax uses `op_name<%value : !hal.type>` on the op name, indicating that the op is templated on the given `%value`. Parameters are now mostly encoded in named parens like linalg to remove a lot of the parsing ambiguity that existed when they were comma separated. There's a few types that still need some cleanup (particularly hal.buffer_view), however those need a bit more bake time with the other changes in order to figure out the right way to handle them, so they've been left unchanged here.

Future changes for allocation will use a `!hal.buffer<device, type, access, etc>` and changes for device feature detection will use a `!hal.device<@id>`. Other types like `!hal.command_buffer` may also be specialized per-device.

There's some partially-updated enum support in here that will be getting improved in the follow-ups; the enums will move into the type specifiers and many of the enums used on ops will go away as well.
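Taken together, the description implies op forms like the following sketch, assembled from examples that appear earlier in this conversation (the second form is the reviewers' speculative future spelling, not something this PR implements):

```
// Form in this PR: parameters in named parens, all types explicit.
%cmd = hal.command_buffer.create device(%device : !hal.device)
                                 mode(OneShot)
                                 categories("Transfer|Dispatch") : !hal.command_buffer

// Possible future form: enums folded into the result type's parameters.
%cmd2 = hal.command_buffer.create device(%device : !hal.device)
    : !hal.command_buffer<..., "OneShot", "Transfer|Dispatch">
```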