rocmPackages: extend ISA compatibility #298388

Merged
merged 15 commits into from
Apr 29, 2024
14 changes: 14 additions & 0 deletions pkgs/development/rocm-modules/6/clr/default.nix
@@ -111,6 +111,16 @@ in stdenv.mkDerivation (finalAttrs: {
url = "https://github.com/ROCm/clr/commit/77c581a3ebd47b5e2908973b70adea66891159ee.patch";
hash = "sha256-auBedbd7rghlKav7A9V6l64J7VmtE9GizIdi5gWj+fs=";
})
+ (fetchpatch {
+ name = "extend-hip-isa-compatibility-check.patch";
+ url = "https://salsa.debian.org/rocm-team/rocm-hipamd/-/raw/d6d20142c37e1dff820950b16ff8f0523241d935/debian/patches/0026-extend-hip-isa-compatibility-check.patch";
+ hash = "sha256-eG0ALZZQLRzD7zJueJFhi2emontmYy6xx8Rsm346nQI=";
+ })
+ (fetchpatch {
+ name = "improve-rocclr-isa-compatibility-check.patch";
Member

This commit says it was written by Cordell Bloor and signed off by you.

Where is this from?
Are we allowed to include this?


> Where is this from?

I wrote the patch for Debian.
https://salsa.debian.org/rocm-team/rocm-hipamd/-/blob/debian/5.7.1-3/debian/patches/0025-improve-rocclr-isa-compatibility-check.patch

> Are we allowed to include this?

Yes.

Contributor Author

Thank you @cgmb! @mschwaig, to prevent confusion, I've now switched the patch URLs to Debian wherever possible. The only ones not switched are those that don't apply cleanly and need manual fixes or a rebase.

Member

This looks good.

For the `rocclr isa compatibility check` patch, are you aware that the new version differs slightly from the old one?

Just asking to make sure this is intended.

1d0
< From 2783c57b0f225ad8bc553e2d244837d57d8375bc Mon Sep 17 00:00:00 2001
4c3
< Subject: [PATCH] improve rocclr isa compatibility check
---
> Subject: improve rocclr isa compatibility check
17,18d15
< 
< Signed-off-by: Gavin Zhao <[email protected]>
20,21c17,18
<  rocclr/device/device.cpp | 39 ++++++++++++++++++++++++++++++++++++---
<  1 file changed, 36 insertions(+), 3 deletions(-)
---
>  rocclr/device/device.cpp | 45 ++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 42 insertions(+), 3 deletions(-)
23,24d19
< diff --git a/rocclr/device/device.cpp b/rocclr/device/device.cpp
< index 0249f31d6..739bb027c 100644
27c22
< @@ -234,10 +234,43 @@ std::string Isa::isaName() const {
---
> @@ -232,10 +232,49 @@
49,51c44,47
< +        const std::array<uint32_t, 4> equivalent_gfx90x = { 0, 2, 9, 12 };
< +        if (Contains(equivalent_gfx90x, codeObjectIsa.versionStepping()) &&
< +            Contains(equivalent_gfx90x, agentIsa.versionStepping())) {
---
> +      const std::array<uint32_t, 4> gfx900_equivalent = { 0, 2, 9, 12 };
> +      const std::array<uint32_t, 5> gfx900_superset = { 0, 2, 6, 9, 12 };
> +      if (Contains(gfx900_equivalent, codeObjectIsa.versionStepping()) &&
> +          Contains(gfx900_superset, agentIsa.versionStepping())) {
55a52,55
> +        const std::array<uint32_t, 1> gfx1010_equivalent = { 0 };
> +        const std::array<uint32_t, 4> gfx1010_superset = { 0, 1, 2, 3 };
> +        if (Contains(gfx1010_equivalent, codeObjectIsa.versionStepping()) &&
> +            Contains(gfx1010_superset, agentIsa.versionStepping())) {
56a57
> +        }

Contributor Author

Yes, I'm aware. The new patch is simply more cautious about compatibility: it checks both the code object ISA and the host (agent) ISA. The difference is intended.
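The asymmetry in the newer patch can be sketched outside of C++. The following is a minimal illustration in Nix, not the actual implementation: it copies only the two gfx90x stepping lists from the diff excerpt above, assumes the guarded branch treats the pair as compatible (the excerpt does not show the branch body), and ignores the major/minor version checks that surround it in `device.cpp`. The names `gfx900Equivalent`, `gfx900Superset`, and `isGfx90xCompatible` are hypothetical.

```nix
# Illustrative sketch only; the real check is C++ in rocclr/device/device.cpp.
let
  # Steppings whose code objects are treated as gfx900-equivalent (from the diff).
  gfx900Equivalent = [ 0 2 9 12 ];
  # Agent steppings assumed able to run gfx900-equivalent code (from the diff).
  gfx900Superset = [ 0 2 6 9 12 ];

  # Hypothetical helper: the code object stepping must be in the equivalent
  # set, and the agent stepping must be in the superset.
  isGfx90xCompatible = codeObjectStepping: agentStepping:
    builtins.elem codeObjectStepping gfx900Equivalent
    && builtins.elem agentStepping gfx900Superset;
in
{
  gfx900CodeOnGfx906Agent = isGfx90xCompatible 0 6; # true
  gfx906CodeOnGfx900Agent = isGfx90xCompatible 6 0; # false
}
```

Under these assumptions, stepping 6 appears only on the agent side, so the relation is one-directional rather than symmetric as in the earlier version of the patch.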

+ url = "https://salsa.debian.org/rocm-team/rocm-hipamd/-/raw/d6d20142c37e1dff820950b16ff8f0523241d935/debian/patches/0025-improve-rocclr-isa-compatibility-check.patch";
+ hash = "sha256-8eowuRiOAdd9ucKv4Eg9FPU7c6367H3eP3fRAGfXc6Y=";
+ })
];

postPatch = ''
@@ -124,6 +134,10 @@ in stdenv.mkDerivation (finalAttrs: {

substituteInPlace hipamd/src/hip_embed_pch.sh \
--replace "\''$LLVM_DIR/bin/clang" "${clang}/bin/clang"

+ # https://lists.debian.org/debian-ai/2024/02/msg00178.html
+ substituteInPlace rocclr/utils/flags.hpp \
+ --replace-fail "HIP_USE_RUNTIME_UNBUNDLER, false" "HIP_USE_RUNTIME_UNBUNDLER, true"
'';

postInstall = ''
2 changes: 1 addition & 1 deletion pkgs/development/rocm-modules/6/default.nix
@@ -194,7 +194,7 @@ in rec {
};

rocblas = callPackage ./rocblas {
- inherit rocblas rocmUpdateScript rocm-cmake clr tensile;
+ inherit rocmUpdateScript rocm-cmake clr tensile;
inherit (llvm) openmp;
stdenv = llvm.rocmClangStdenv;
};
5 changes: 5 additions & 0 deletions pkgs/development/rocm-modules/6/miopen/default.nix
@@ -116,6 +116,11 @@ in stdenv.mkDerivation (finalAttrs: {
url = "https://github.com/ROCm/MIOpen/commit/3413d2daaeb44b7d6eadcc03033a5954a118491e.patch";
hash = "sha256-ST4snUcTmmSI1Ogx815KEX9GdMnmubsavDzXCGJkiKs=";
})
+ (fetchpatch {
+ name = "Extend-MIOpen-ISA-compatibility.patch";
+ url = "https://github.com/GZGavinZhao/MIOpen/commit/416088b534618bd669a765afce59cfc7197064c1.patch";
+ hash = "sha256-OwONCA68y8s2GqtQj+OtotXwUXQ5jM8tpeM92iaD4MU=";
+ })
];

outputs = [
4 changes: 3 additions & 1 deletion pkgs/development/rocm-modules/6/rccl/default.nix
@@ -65,7 +65,9 @@ stdenv.mkDerivation (finalAttrs: {

# Really strange behavior, `#!/usr/bin/env perl` should work...
substituteInPlace CMakeLists.txt \
- --replace "\''$ \''${hipify-perl_executable}" "${perl}/bin/perl ${hipify}/bin/hipify-perl"
+ --replace "\''$ \''${hipify-perl_executable}" "${perl}/bin/perl ${hipify}/bin/hipify-perl" \
+ --replace-warn "-parallel-jobs=12" "-parallel-jobs=1" \
+ --replace-warn "-parallel-jobs=16" "-parallel-jobs=1"
Comment on lines +69 to +70
Member

What is the reason for those changes?

Contributor Author

To respect NIX_BUILD_CORES. If not changed, the HIP compiler would launch 12 (or 16) threads for each build job to compile GPU architectures in parallel. This causes the actual core usage to be NIX_BUILD_CORES*12 or *16, which would likely cause memory issues. By setting them to 1 (which is the default), we restore the compiler's default behavior and therefore adhere to NIX_BUILD_CORES, preventing surprising CPU and memory usage when building.
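To make the multiplication concrete, here is a small worked example in Nix. The value `nixBuildCores = 8` is an assumed example and `threadsFor` is a hypothetical helper; the only figures taken from the discussion are the `-parallel-jobs` values 12, 16, and 1.

```nix
# Worked arithmetic only; nothing here is part of the rccl derivation.
let
  nixBuildCores = 8; # assumed example value of NIX_BUILD_CORES

  # Hypothetical helper: worst-case thread count when every build job
  # spawns `parallelJobs` compiler threads.
  threadsFor = parallelJobs: nixBuildCores * parallelJobs;
in
{
  withParallelJobs12 = threadsFor 12; # 96
  withParallelJobs16 = threadsFor 16; # 128
  withParallelJobs1 = threadsFor 1; # 8
}
```

With the upstream values, a machine configured for 8 cores could see on the order of 96 to 128 compiler threads; with `-parallel-jobs=1` the count stays at the configured 8.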

'';

postInstall = lib.optionalString buildTests ''
170 changes: 51 additions & 119 deletions pkgs/development/rocm-modules/6/rocblas/default.nix
Member

Please elaborate a bit to explain the changes in this file.

Why is it not necessary anymore to call the tensile build separately for each GPU generation and merge the results?
Is this related to the ISA compatibility changes or is it just a nice refactor?

Contributor Author

> Why is it not necessary anymore to call the tensile build separately for each GPU generation and merge the results?

  1. It shouldn't have been necessary in the first place.
  2. The way of building separately for each architecture breaks when sepArchitectures is false (as probably implied by the name). Building architectures together fixed it.
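With all architectures built in one derivation, narrowing the build to specific hardware becomes a plain `override`. The following is a sketch under assumptions: it presumes a nixpkgs revision that contains this PR, where `gpuTargets`, `tensileSepArch`, and `tensileLazyLib` are arguments of the rocblas expression as shown in the diff, and the target `gfx1030` is only an example.

```nix
# Sketch: build rocBLAS for a single GPU target instead of the default list.
# Assumes <nixpkgs> points at a revision that includes this change.
{ pkgs ? import <nixpkgs> { } }:

pkgs.rocmPackages.rocblas.override {
  # One list element per target works because the derivation joins the list
  # with ";" before passing it to AMDGPU_TARGETS.
  gpuTargets = [ "gfx1030" ];
  # These match the defaults introduced by this PR and are repeated here
  # only to make the combination explicit.
  tensileSepArch = false;
  tensileLazyLib = false;
}
```

A single-target build like this is also one way to keep the output small when the full multi-architecture result is too large.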

Member

I think the reason for this was that without it rocblas is slightly too large for the cache.

$ nix build .#rocmPackages.rocblas
$ du -h --apparent-size --max-depth 0 result/
3.1G	result/

This is something we could address via #305920 or something like e380b53, but I don't know if we need a solution for the size issue before merging this, since we're not making things worse than they are right now anyways.

I have it on my list to check the sizes of other outputs.

Contributor Author

I'd say merge this for now then fix it later, since rocblas doesn't even build on master right now. Then either #305920 will fix it or we can figure out how to do splitting.

Member

Isn't it broken on master because of cmake? Seems like we don't want to introduce a cache miss just because it's broken on master.

Member

@mschwaig Apr 24, 2024

You're right, it's fixed on master now.

If we merge this PR as is, rocblas will no longer be cached.

I think a better argument might be that not having composable_kernel in cache is a much bigger issue than not having rocblas in cache, since

  1. rocblas only takes 20 minutes to build, orders of magnitude less than composable_kernel.
  2. Whatever solution we come up with to cache composable_kernel, we can also use to cache rocblas.

Probably that argument is also invalid to some degree, because composable_kernel is not a runtime dependency of rocblas, and applications depend on rocblas, not composable_kernel.
So I think as long as rocblas is cached most users do not suffer from composable_kernel not being cached.

Is that right?
EDIT: it's more than 20 min

Member

I am now checking whether removing the gfx940 and gfx941 targets, which exist only as early engineering samples, from just rocblas would get us below 3 GB.

Member

Yes, this would get us down to 2.7GB: 89ab15f

@@ -1,7 +1,7 @@
- { rocblas
- , lib
+ { lib
, stdenv
, fetchFromGitHub
+ , fetchpatch
, rocmUpdateScript
, runCommand
, cmake
@@ -21,57 +21,26 @@
, buildBenchmarks ? false
, tensileLogic ? "asm_full"
, tensileCOVersion ? "default"
- , tensileSepArch ? true
- , tensileLazyLib ? true
+ # https://github.com/ROCm/Tensile/issues/1757
+ # Allows gfx101* users to use rocBLAS normally.
+ # Turn the below two values to `true` after the fix has been cherry-picked
+ # into a release. Just backporting that single fix is not enough because it
+ # depends on some previous commits.
+ , tensileSepArch ? false
+ , tensileLazyLib ? false
, tensileLibFormat ? "msgpack"
- , gpuTargets ? [ "all" ]
+ # `gfx940`, `gfx941` are not present in this list because they are early
+ # engineering samples, and all final MI300 hardware are `gfx942`:
+ # https://github.com/NixOS/nixpkgs/pull/298388#issuecomment-2032791130
+ #
+ # `gfx1012` is not present in this list because the ISA compatibility patches
+ # would force all `gfx101*` GPUs to run as `gfx1010`, so `gfx101*` GPUs will
+ # always try to use `gfx1010` code objects, hence building for `gfx1012` is
+ # useless: https://github.com/NixOS/nixpkgs/pull/298388#issuecomment-2076327152
+ , gpuTargets ? [ "gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx942;gfx1010;gfx1030;gfx1100;gfx1101;gfx1102" ]
}:

- let
- # NOTE: Update the default GPU targets on every update
- gfx80 = (rocblas.override {
- gpuTargets = [
- "gfx803"
- ];
- }).overrideAttrs { pname = "rocblas-tensile-gfx80"; };
-
- gfx90 = (rocblas.override {
- gpuTargets = [
- "gfx900"
- "gfx906:xnack-"
- "gfx908:xnack-"
- "gfx90a:xnack+"
- "gfx90a:xnack-"
- ];
- }).overrideAttrs { pname = "rocblas-tensile-gfx90"; };
-
- gfx94 = (rocblas.override {
- gpuTargets = [
- "gfx940"
- "gfx941"
- "gfx942"
- ];
- }).overrideAttrs { pname = "rocblas-tensile-gfx94"; };
-
- gfx10 = (rocblas.override {
- gpuTargets = [
- "gfx1010"
- "gfx1012"
- "gfx1030"
- ];
- }).overrideAttrs { pname = "rocblas-tensile-gfx10"; };
-
- gfx11 = (rocblas.override {
- gpuTargets = [
- "gfx1100"
- "gfx1101"
- "gfx1102"
- ];
- }).overrideAttrs { pname = "rocblas-tensile-gfx11"; };
-
- # Unfortunately, we have to do two full builds, otherwise we get overlapping _fallback.dat files
- fallbacks = rocblas.overrideAttrs { pname = "rocblas-tensile-fallbacks"; };
- in stdenv.mkDerivation (finalAttrs: {
+ stdenv.mkDerivation (finalAttrs: {
pname = "rocblas";
version = "6.0.2";

@@ -94,6 +63,8 @@ in stdenv.mkDerivation (finalAttrs: {
cmake
rocm-cmake
clr
+ ] ++ lib.optionals buildTensile [
+ tensile
];

buildInputs = [
@@ -114,80 +85,41 @@ in stdenv.mkDerivation (finalAttrs: {
];

cmakeFlags = [
- "-DCMAKE_C_COMPILER=hipcc"
- "-DCMAKE_CXX_COMPILER=hipcc"
- "-Dpython=python3"
- "-DAMDGPU_TARGETS=${lib.concatStringsSep ";" gpuTargets}"
- "-DBUILD_WITH_TENSILE=${if buildTensile then "ON" else "OFF"}"
- # Manually define CMAKE_INSTALL_<DIR>
- # See: https://github.com/NixOS/nixpkgs/pull/197838
- "-DCMAKE_INSTALL_BINDIR=bin"
- "-DCMAKE_INSTALL_LIBDIR=lib"
- "-DCMAKE_INSTALL_INCLUDEDIR=include"
+ (lib.cmakeFeature "CMAKE_C_COMPILER" "hipcc")
+ (lib.cmakeFeature "CMAKE_CXX_COMPILER" "hipcc")
+ (lib.cmakeFeature "python" "python3")
+ (lib.cmakeFeature "AMDGPU_TARGETS" (lib.concatStringsSep ";" gpuTargets))
+ (lib.cmakeBool "BUILD_WITH_TENSILE" buildTensile)
+ (lib.cmakeBool "ROCM_SYMLINK_LIBS" false)
+ (lib.cmakeFeature "ROCBLAS_TENSILE_LIBRARY_DIR" "lib/rocblas")
+ (lib.cmakeBool "BUILD_CLIENTS_TESTS" buildTests)
+ (lib.cmakeBool "BUILD_CLIENTS_BENCHMARKS" buildBenchmarks)
+ # rocblas header files are not installed unless we set this
+ (lib.cmakeFeature "CMAKE_INSTALL_INCLUDEDIR" "include")
] ++ lib.optionals buildTensile [
- "-DVIRTUALENV_HOME_DIR=/build/source/tensile"
- "-DTensile_TEST_LOCAL_PATH=/build/source/tensile"
- "-DTensile_ROOT=/build/source/tensile/${python3.sitePackages}/Tensile"
- "-DTensile_LOGIC=${tensileLogic}"
- "-DTensile_CODE_OBJECT_VERSION=${tensileCOVersion}"
- "-DTensile_SEPARATE_ARCHITECTURES=${if tensileSepArch then "ON" else "OFF"}"
- "-DTensile_LAZY_LIBRARY_LOADING=${if tensileLazyLib then "ON" else "OFF"}"
- "-DTensile_LIBRARY_FORMAT=${tensileLibFormat}"
- ] ++ lib.optionals buildTests [
- "-DBUILD_CLIENTS_TESTS=ON"
- ] ++ lib.optionals buildBenchmarks [
- "-DBUILD_CLIENTS_BENCHMARKS=ON"
+ (lib.cmakeBool "BUILD_WITH_PIP" false)
+ (lib.cmakeFeature "Tensile_LOGIC" tensileLogic)
+ (lib.cmakeFeature "Tensile_CODE_OBJECT_VERSION" tensileCOVersion)
+ (lib.cmakeBool "Tensile_SEPARATE_ARCHITECTURES" tensileSepArch)
+ (lib.cmakeBool "Tensile_LAZY_LIBRARY_LOADING" tensileLazyLib)
+ (lib.cmakeFeature "Tensile_LIBRARY_FORMAT" tensileLibFormat)
+ (lib.cmakeBool "Tensile_PRINT_DEBUG" true)
] ++ lib.optionals (buildTests || buildBenchmarks) [
- "-DCMAKE_CXX_FLAGS=-I${amd-blis}/include/blis"
+ (lib.cmakeFeature "CMAKE_CXX_FLAGS" "-I${amd-blis}/include/blis")
];

- postPatch = lib.optionalString (finalAttrs.pname != "rocblas") ''
- # Return early and install tensile files manually
- substituteInPlace library/src/CMakeLists.txt \
- --replace "set_target_properties( TensileHost PROPERTIES OUTPUT_NAME" "return()''\nset_target_properties( TensileHost PROPERTIES OUTPUT_NAME"
- '' + lib.optionalString (buildTensile && finalAttrs.pname == "rocblas") ''
- # Link the prebuilt Tensile files
- mkdir -p build/Tensile/library
-
- for path in ${gfx80} ${gfx90} ${gfx94} ${gfx10} ${gfx11} ${fallbacks}; do
- ln -s $path/lib/rocblas/library/* build/Tensile/library
- done
-
- unlink build/Tensile/library/TensileManifest.txt
- '' + lib.optionalString buildTensile ''
- # Tensile REALLY wants to write to the nix directory if we include it normally
- cp -a ${tensile} tensile
- chmod +w -R tensile
-
- # Rewrap Tensile
- substituteInPlace tensile/bin/{.t*,.T*,*} \
- --replace "${tensile}" "/build/source/tensile"
-
- substituteInPlace CMakeLists.txt \
- --replace "include(virtualenv)" "" \
- --replace "virtualenv_install(\''${Tensile_TEST_LOCAL_PATH})" ""
- '';
+ patches = [
+ (fetchpatch {
+ name = "Extend-rocBLAS-HIP-ISA-compatibility.patch";
+ url = "https://github.com/GZGavinZhao/rocBLAS/commit/89b75ff9cc731f71f370fad90517395e117b03bb.patch";
+ hash = "sha256-W/ohOOyNCcYYLOiQlPzsrTlNtCBdJpKVxO8s+4G7sjo=";
+ })
+ ];

- postInstall = lib.optionalString (finalAttrs.pname == "rocblas") ''
- ln -sf ${fallbacks}/lib/rocblas/library/TensileManifest.txt $out/lib/rocblas/library
- '' + lib.optionalString (finalAttrs.pname != "rocblas") ''
- mkdir -p $out/lib/rocblas/library
- rm -rf $out/share
- '' + lib.optionalString (finalAttrs.pname != "rocblas" && finalAttrs.pname != "rocblas-tensile-fallbacks") ''
- rm Tensile/library/{TensileManifest.txt,*_fallback.dat}
- mv Tensile/library/* $out/lib/rocblas/library
- '' + lib.optionalString (finalAttrs.pname == "rocblas-tensile-fallbacks") ''
- mv Tensile/library/{TensileManifest.txt,*_fallback.dat} $out/lib/rocblas/library
- '' + lib.optionalString buildTests ''
- mkdir -p $test/bin
- cp -a $out/bin/* $test/bin
- rm $test/bin/*-bench || true
- '' + lib.optionalString buildBenchmarks ''
- mkdir -p $benchmark/bin
- cp -a $out/bin/* $benchmark/bin
- rm $benchmark/bin/*-test || true
- '' + lib.optionalString (buildTests || buildBenchmarks ) ''
- rm -rf $out/bin
+ # Pass $NIX_BUILD_CORES to Tensile
+ postPatch = ''
+ substituteInPlace cmake/build-options.cmake \
+ --replace-fail 'Tensile_CPU_THREADS ""' 'Tensile_CPU_THREADS "$ENV{NIX_BUILD_CORES}"'
'';

passthru.updateScript = rocmUpdateScript {
10 changes: 10 additions & 0 deletions pkgs/development/rocm-modules/6/rocm-runtime/default.nix
@@ -1,6 +1,7 @@
{ lib
, stdenv
, fetchFromGitHub
+ , fetchpatch
, rocmUpdateScript
, pkg-config
, cmake
@@ -42,6 +43,15 @@ stdenv.mkDerivation (finalAttrs: {
libxml2
];

+ patches = [
+ (fetchpatch {
+ name = "extend-isa-compatibility-check.patch";
+ url = "https://salsa.debian.org/rocm-team/rocr-runtime/-/raw/076026d43bbee7f816b81fea72f984213a9ff961/debian/patches/0004-extend-isa-compatibility-check.patch";
+ hash = "sha256-cC030zVGS4kNXwaztv5cwfXfVwOldpLGV9iYgEfPEnY=";
+ stripLen = 1;
+ })
+ ];
];

postPatch = ''
patchShebangs image/blit_src/create_hsaco_ascii_file.sh
patchShebangs core/runtime/trap_handler/create_trap_handler_header.sh
9 changes: 9 additions & 0 deletions pkgs/development/rocm-modules/6/rocprim/default.nix
@@ -1,4 +1,5 @@
{ lib
+ , fetchpatch
, stdenv
, fetchFromGitHub
, rocmUpdateScript
@@ -31,6 +32,14 @@ stdenv.mkDerivation (finalAttrs: {
hash = "sha256-nWvq26qRPZ6Au1rc5cR74TKArcdUFg7O9djFi8SvMeM=";
};

+ patches = [
+ (fetchpatch {
+ name = "arch-conversion-marco.patch";
+ url = "https://salsa.debian.org/rocm-team/rocprim/-/raw/70c8aaee3cf545d92685f4ed9bf8f41e3d4d570c/debian/patches/arch-conversion-macro.patch";
+ hash = "sha256-oXdmbCArOB5bKE8ozDFrSh4opbO+c4VI6PNhljeUSms=";
+ })
+ ];

nativeBuildInputs = [
cmake
rocm-cmake
20 changes: 17 additions & 3 deletions pkgs/development/rocm-modules/6/tensile/default.nix
@@ -1,6 +1,7 @@
{ lib
, stdenv
, fetchFromGitHub
+ , fetchpatch
, rocmUpdateScript
, buildPythonPackage
, pytestCheckHook
@@ -34,6 +35,19 @@ buildPythonPackage rec {
joblib
];

+ patches = [
+ (fetchpatch {
+ name = "Extend-Tensile-HIP-ISA-compatibility.patch";
+ url = "https://github.com/GZGavinZhao/Tensile/commit/855cb15839849addb0816a6dde45772034a3e41f.patch";
+ hash = "sha256-d+fVf/vz+sxGqJ96vuxe0jRMgbC5K6j5FQ5SJ1e3Sl8=";
+ })
+ (fetchpatch {
+ name = "Don-t-copy-file-twice-in-copyStaticFiles.patch";
+ url = "https://github.com/GZGavinZhao/Tensile/commit/9e14d5a00a096bddac605910a0e4dfb4c35bb0d5.patch";
+ hash = "sha256-gOzjJyD1K056OFQ+hK5nbUeBhxLTIgQLoT+0K12SypI=";
+ })
+ ];

doCheck = false; # Too many errors, not sure how to set this up properly

nativeCheckInputs = [
@@ -42,9 +56,9 @@ buildPythonPackage rec {
rocminfo
];

- preCheck = ''
- export ROCM_PATH=${rocminfo}
- '';
+ env = {
+ ROCM_PATH = rocminfo;
+ };

pythonImportsCheck = [ "Tensile" ];
