Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define executable C API and rework LLVM target to emit it #3580

Closed
benvanik opened this issue Oct 22, 2020 · 2 comments · Fixed by #5195
Closed

Define executable C API and rework LLVM target to emit it #3580

benvanik opened this issue Oct 22, 2020 · 2 comments · Fixed by #5195
Assignees
Labels
codegen Shared code generation infrastructure and dialects compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) hal/api IREE's public C hardware abstraction layer API hal/cpu Runtime Host/CPU-based HAL backend runtime Relating to the IREE runtime library

Comments

@benvanik
Copy link
Collaborator

With the threading changes I'll be unifying the host-based HAL backends around a common executable API. VMLA, dylib, llvmjit, and in the future emitc/statically linked executables will then all share this API to expose their entry points, exchange library functions, and query information.

The goal is to get down to near-zero bytes consumed that are not in rdata (in the AOT case) and remove the need for the wrapper flatbuffers in cases where they shouldn't need to exist like statically linked executables.

@benvanik benvanik added compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) runtime Relating to the IREE runtime library codegen Shared code generation infrastructure and dialects labels Oct 22, 2020
@benvanik benvanik added this to the 2020Q4 Core milestone Oct 22, 2020
@benvanik benvanik self-assigned this Oct 22, 2020
@benvanik
Copy link
Collaborator Author

Working version:

//===----------------------------------------------------------------------===//
// Versioning and interface querying
//===----------------------------------------------------------------------===//

// Known valid version values.
enum iree_hal_library_version_e {
  // iree_hal_library_v0_t is used as the API communication structure.
  IREE_HAL_LIBRARY_VERSION_0 = 0u,
};
typedef uint32_t iree_hal_library_version_t;

// The latest version of the library API; can be used to populate the
// iree_hal_library_header_t::version when building libraries.
#define IREE_HAL_LIBRARY_LATEST_VERSION IREE_HAL_LIBRARY_VERSION_0

// A header present at the top of all versions of the library API used by the
// runtime to ensure version compatibility.
typedef struct {
  // Version of the API this library was built with, which was likely the value
  // of IREE_HAL_LIBRARY_LATEST_VERSION.
  iree_hal_library_version_t version;

  // Name used for logging/diagnostics.
  const char* name;
} iree_hal_library_header_t;

// Exported function from dynamic libraries for querying library information.
// The provided |max_version| is the maximum version the caller supports;
// callees must return NULL if their lowest available version is greater
// than the max version supported by the caller.
typedef const iree_hal_library_header_t* (*iree_hal_library_query_fn_t)(
    iree_hal_library_version_t max_version);

// Function name exported from dynamic libraries (pass to dlsym).
#define IREE_HAL_LIBRARY_EXPORT_NAME "iree_hal_library_query"

//===----------------------------------------------------------------------===//
// IREE_HAL_LIBRARY_VERSION_0
//===----------------------------------------------------------------------===//

typedef void* iree_hal_library_binding_ptr_t;

// Read-only per-dispatch state passed to each tile in a dispatch.
typedef struct {
  const uint32_t* push_constants;
  const iree_hal_library_binding_ptr_t* bindings;
  uint32_t workgroup_size[3];
  uint32_t workgroup_count[3];
} iree_hal_library_dispatch_state_v0_t;

// Function signature of exported executable entry points.
// The same |state| is passed to all tiles in a dispatch, with workgroup_[xyz]
// varying per-tile up to the workgroup_count. Each tile represents
// workgroup_size invocations in the global grid.
typedef void (*iree_hal_library_dispatch_v0_t)(
    const iree_hal_library_dispatch_state_v0_t* state, uint32_t workgroup_x,
    uint32_t workgroup_y, uint32_t workgroup_z);

// Structure used for v0 library interfaces.
// The entire structure is designed to be read-only and able to live embedded in
// the binary .rdata section.
typedef struct {
  // Version/metadata header. Will have a version of IREE_HAL_LIBRARY_VERSION_0.
  iree_hal_library_header_t header;

  // The total number of entry points available in the library. Bounds all of
  // the tables below.
  size_t entry_point_count;

  // Table of export function entry points matching the ordinals defined during
  // library generation. The runtime will use this table to map the ordinals to
  // function pointers for execution.
  const iree_hal_library_dispatch_v0_t* entry_points;

  // Optional table of export function entry point names 1:1 with entry_points.
  // These names are only used for tracing/debugging and can be omitted to save
  // binary size.
  const char** entry_point_names;

  // Optional table of entry point tags that describe the entry point in a
  // human-readable format useful for verbose logging. The string values, when
  // present, may be attached to tracing/debugging events related to the entry
  // point.
  const char** entry_point_tags;
} iree_hal_library_v0_t;

this is still easy to build in LLVM IR, and collapses down to effectively just a function pointer table in release builds (the names/tags being optional):

static void dispatch_tile_a(
    const iree_hal_library_dispatch_state_v0_t* __attribute__((nonnull)) state,
    uint32_t workgroup_x, uint32_t workgroup_y, uint32_t workgroup_z) {
  // <do things here, do *not* access mutable globals>
  uint8_t* dst = ((uint8_t*)state->bindings[0]);
  const uint8_t* src = ((const uint8_t*)state->bindings[1]);
  dst[workgroup_x] = src[workgroup_x];
  dst[workgroup_y] = src[workgroup_y];
}

static void dispatch_tile_b(
    const iree_hal_library_dispatch_state_v0_t* __attribute__((nonnull)) state,
    uint32_t workgroup_x, uint32_t workgroup_y, uint32_t workgroup_z) {
  // <do things here, do *not* access mutable globals>
}

static const iree_hal_library_dispatch_v0_t entry_points[2] = {
  dispatch_tile_a, dispatch_tile_b,
};
static const char* entry_point_names[2] = {
  "dispatch_tile_a", "dispatch_tile_b",
};
const char* entry_point_tags[2] = {
  "matmul+div", "conv2d[512x512]",
};

static const iree_hal_library_v0_t library = {
  { IREE_HAL_LIBRARY_LATEST_VERSION, "lib_a" },
  2,
  entry_points,
  entry_point_names,  // optional
  entry_point_tags,   // optional
};

extern const iree_hal_library_v0_t* iree_hal_library_query(iree_hal_library_version_t max_version) {
  return max_version <= 0 ? &library : NULL;
}

I think this is a good enough placeholder for the first version with room to iterate (exposing different entry points based on CPU capabilities, etc).

@benvanik
Copy link
Collaborator Author

LLVM IR here for reference: https://godbolt.org/z/GvvhW8

benvanik added a commit that referenced this issue Feb 9, 2021
This will make it possible to version the ABI independent of the lowerings.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 9, 2021
This will make it possible to version the ABI independent of the lowerings.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 9, 2021
This switches the generated code to using the new executable library
signature while still routing all the outputs through the flatbuffers.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 9, 2021
This switches the generated code to using the new executable library
signature while still routing all the outputs through the flatbuffers.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 9, 2021
This switches the generated code to using the new executable library
signature while still routing all the outputs through the flatbuffers.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 9, 2021
This switches the generated code to using the new executable library
signature while still routing all the outputs through the flatbuffers.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 9, 2021
This switches the generated code to using the new executable library
signature while still routing all the outputs through the flatbuffers.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 11, 2021
This will make it possible to version the ABI independent of the lowerings.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 11, 2021
This switches the generated code to using the new executable library
signature while still routing all the outputs through the flatbuffers.

Progress on #3580.
benvanik added a commit that referenced this issue Feb 11, 2021
This switches the runtime and generated code to using the new executable
library signature while still routing all the outputs through the flatbuffers.
Future changes will start generating the library metadata structures.

Progress on #3580.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
codegen Shared code generation infrastructure and dialects compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) hal/api IREE's public C hardware abstraction layer API hal/cpu Runtime Host/CPU-based HAL backend runtime Relating to the IREE runtime library
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant