{ai}[foss/2022a] DGL v1.1.3 w/ CUDA 11.7.0 #20092

Open
wants to merge 4 commits into
base: develop
Changes from 3 commits
143 changes: 143 additions & 0 deletions easybuild/easyconfigs/d/DGL/DGL-1.1.3-foss-2022a-CUDA-11.7.0.eb
@@ -0,0 +1,143 @@
# updated to version 1.1.3, based on DGL-0.9.1-foss-2021a-CUDA-11.3.1
# GKlib-METIS added as module as third-party approach does not build
# libxsmm set to 'off' so it is using the EasyBuild module

easyblock = 'CMakeMake'

name = 'DGL'
version = '1.1.3'
versionsuffix = '-CUDA-%(cudaver)s'

homepage = 'https://www.dgl.ai'
description = """DGL is an easy-to-use, high performance and scalable Python package for deep learning on graphs.
DGL is framework agnostic, meaning if a deep graph model is a component of an end-to-end application, the rest
of the logics can be implemented in any major frameworks, such as PyTorch, Apache MXNet or TensorFlow."""

toolchain = {'name': 'foss', 'version': '2022a'}

github_account = 'dmlc'
source_urls = [GITHUB_LOWER_SOURCE]
sources = [
    {
        'download_filename': 'v%(version)s.tar.gz',
        'filename': '%(namelower)s-%(version)s.tar.gz',
    },
    {
        'source_urls': ['https://github.com/KarypisLab/METIS/archive'],
        'download_filename': 'v5.2.1.tar.gz',
        'filename': 'metis-5.2.1.tar.gz',
        'extract_cmd': "tar -C %(namelower)s-%(version)s/third_party/METIS --strip-components=1 -xf %s",
Comment on lines +26 to +29
Contributor

Hm, as with the rest of the third-party components: we already have METIS, nanoflann and so on, and I don't think anything stops us from adding CCCL and the rest as well. So I'm not sure why these were kept as sources. @akesandgren, any comment?

Collaborator Author

Yes, I agree, we already have easyconfigs (ECs) for them. However, I could not get the build working with the external versions. It also appears that upstream pins them to specific commits. So in the end I fell back to this approach, but I am happy to get it working with the existing ECs if somebody can show me how to do that without unpicking everything.

Contributor

akesandgren, Apr 5, 2024

The METIS they use is patched in some incompatible way.
The EC I made does have nanoflann as a dependency.

And since they patch METIS, I didn't even consider using an external GKlib-METIS.

Collaborator Author

I could try and see if nanoflann still works. To be honest, at one point I decided to just get the build working so that it is reproducible, rather than spend too much time flogging what looked like a dead horse to me.

    },
    {
        'filename': 'pcg-cpp-428802d.tar.gz',
        'git_config': {
            'url': 'https://github.com/imneme',
            'repo_name': 'pcg-cpp',
            'commit': '428802d1a5634f96bcd0705fab379ff0113bcf13',
            'recursive': True,
        },
        'extract_cmd': "tar -C %(namelower)s-%(version)s/third_party/pcg --strip-components=1 -xf %s",
    },
    {
        'filename': 'tensorpipe-20230206.tar.gz',
        'git_config': {
            'url': 'https://github.com/pytorch',
            'repo_name': 'tensorpipe',
            'commit': '6042f1a4cbce8eef997f11ed0012de137b317361',
            'recursive': True,
        },
        'extract_cmd': "tar -C %(namelower)s-%(version)s/third_party/tensorpipe --strip-components=1 -xf %s",
    },
    {
        'filename': 'cccl-c4eda1a.tar.gz',
        'git_config': {
            'url': 'https://github.com/NVIDIA',
            'repo_name': 'cccl',
            'commit': 'c4eda1aea304c012270dbd10235e60eaf47bd06f',
            'recursive': True,
        },
        'extract_cmd': "tar -C %(namelower)s-%(version)s/third_party/cccl --strip-components=1 -xf %s",
    },
    {
        'filename': 'nanoflann-4c47ca2.tar.gz',
        'git_config': {
            'url': 'https://github.com/jlblancoc',
            'repo_name': 'nanoflann',
            'commit': '4c47ca200209550c5628c89803591f8a753c8181',
            'recursive': True,
        },
        'extract_cmd': "tar -C %(namelower)s-%(version)s/third_party/nanoflann --strip-components=1 -xf %s",
    },
]
patches = [
    '%(name)s-%(version)s_use_externals_instead_of_submodules.patch',
]
checksums = [
    {'dgl-1.1.3.tar.gz': 'c45021d77ff2b1fed814a8b91260671167fb4e42b7d5fab2d37faa74ae1dc5b4'},
    {'metis-5.2.1.tar.gz': '1a4665b2cd07edc2f734e30d7460afb19c1217c2547c2ac7bf6e1848d50aff7a'},
    {'DGL-1.1.3_use_externals_instead_of_submodules.patch':
     '89a89f8e540824ce483fbaf1750babf9d40826e40763a899d84c753d9ba18c20'},
]

builddependencies = [
    ('CMake', '3.24.3'),
    ('googletest', '1.11.0'),
]

dependencies = [
    ('Python', '3.10.4'),
    ('SciPy-bundle', '2022.05'),
    ('networkx', '2.8.4'),
    ('tqdm', '4.64.0'),
    ('DLPack', '0.8'),
    ('DMLC-Core', '0.5'),
    ('Parallel-Hashmap', '1.36'),
    ('CUDA', '11.7.0', '', SYSTEM),
    ('NCCL', '2.12.12', versionsuffix),
    ('PyTorch', '1.13.1', versionsuffix),
    ('libxsmm', '1.17'),
    ('GKlib-METIS', '5.1.1'),
]

_copts = [
    '-DBUILD_CPP_TEST=ON',
    '-DUSE_CUDA=ON',  # Must be "ON", as opposed to "1" or so, due to bad CMake code in DGL
    '-DUSE_LIBXSMM=OFF',
]
akesandgren marked this conversation as resolved.
configopts = ' '.join(_copts)

# Must not build shared libs, DGL uses internal versions of, among others, METIS
# but it doesn't install these internal libraries and simply assumes that everything is
# statically linked.
build_shared_libs = False

runtest = 'test'

exts_defaultclass = 'PythonPackage'
exts_default_options = {
    'easyblock': 'PythonPackage',
    'download_dep_fail': True,
    'use_pip': True,
    'sanity_pip_check': True,
    'runtest': True,
}

exts_list = [
    ('dgl', version, {
        'installopts': "--use-feature=in-tree-build ",
        'source_tmpl': '%(namelower)s-%(version)s.tar.gz',
        'start_dir': 'python',
        'checksums': ['c45021d77ff2b1fed814a8b91260671167fb4e42b7d5fab2d37faa74ae1dc5b4'],
    }),
]

sanity_check_paths = {
    'files': ['lib/libdgl.%s' % SHLIB_EXT],
    'dirs': ['lib/python%(pyshortver)s/site-packages'],
}

modextrapaths = {
    'PYTHONPATH': ['lib/python%(pyshortver)s/site-packages'],
}

moduleclass = 'ai'
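
Note on the `extract_cmd` entries in the easyconfig above: each one mixes named templates (`%(namelower)s`, `%(version)s`) with a bare `%s` for the tarball path. The sketch below shows one way such a string could be expanded in two steps so that each third-party tarball lands in its `third_party/` subdirectory. It is an illustration only, not EasyBuild's own code; the helper name `expand_extract_cmd` is hypothetical.

```python
import re


def expand_extract_cmd(template, name, version, source_path):
    """Expand an extract_cmd-style template (illustrative sketch only)."""
    # Step 1: double every '%' that does not start a named %(key)s template,
    # so the bare %s survives the named substitution below.
    protected = re.sub(r'%(?!\(\w+\)s)', '%%', template)
    cmd = protected % {'namelower': name.lower(), 'version': version}
    # Step 2: fill the remaining %s with the path of the source tarball.
    return cmd % source_path


template = "tar -C %(namelower)s-%(version)s/third_party/pcg --strip-components=1 -xf %s"
print(expand_extract_cmd(template, 'DGL', '1.1.3', 'pcg-cpp-428802d.tar.gz'))
```

With the values from this easyconfig, the pcg-cpp tarball would be unpacked into `dgl-1.1.3/third_party/pcg`, which means the main DGL source must already be unpacked when the command runs.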
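
Note on `checksums` in the easyconfig above: the list names three files, while `sources` and `patches` together name seven. A quick way to see which files have no checksum entry is sketched below. The file names are copied from the easyconfig; the helper `missing_checksums` is hypothetical. Whether the gap matters is an assumption to verify, for example it may depend on whether EasyBuild is run with `--enforce-checksums`.

```python
# File names as they appear in 'sources' and 'patches' of the easyconfig.
files = [
    'dgl-1.1.3.tar.gz',
    'metis-5.2.1.tar.gz',
    'pcg-cpp-428802d.tar.gz',
    'tensorpipe-20230206.tar.gz',
    'cccl-c4eda1a.tar.gz',
    'nanoflann-4c47ca2.tar.gz',
    'DGL-1.1.3_use_externals_instead_of_submodules.patch',
]

# File names that have an entry in 'checksums'.
checksummed = {
    'dgl-1.1.3.tar.gz',
    'metis-5.2.1.tar.gz',
    'DGL-1.1.3_use_externals_instead_of_submodules.patch',
}


def missing_checksums(all_files, with_checksum):
    """Return the files that have no checksum entry, in their original order."""
    return [name for name in all_files if name not in with_checksum]


for name in missing_checksums(files, checksummed):
    print(name)
```

The four files reported are exactly the tarballs created via `git_config`.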
@@ -0,0 +1,118 @@
Use external EasyBuild versions of some submodules
Author: Ake Sandgren <[email protected]>
Updated and modified to version 1.1.3
Author: J. Sassmannshausen (Imperial College London/UK)
diff --git a/dgl-1.1.3.orig/CMakeLists.txt b/dgl-1.1.3/CMakeLists.txt
index db39e48..58a1c90 100644
--- a/dgl-1.1.3.orig/CMakeLists.txt
+++ b/dgl-1.1.3/CMakeLists.txt
@@ -253,31 +253,31 @@ endif(USE_CUDA)

# include directories
target_include_directories(dgl PRIVATE "include")
-target_include_directories(dgl PRIVATE "third_party/dlpack/include")
-target_include_directories(dgl PRIVATE "third_party/dmlc-core/include")
-target_include_directories(dgl PRIVATE "third_party/phmap/")
+# target_include_directories(dgl PRIVATE "third_party/dlpack/include")
+# target_include_directories(dgl PRIVATE "third_party/dmlc-core/include")
+# target_include_directories(dgl PRIVATE "third_party/phmap/")
target_include_directories(dgl PRIVATE "third_party/METIS/include/")
target_include_directories(dgl PRIVATE "tensoradapter/include")
target_include_directories(dgl PRIVATE "third_party/nanoflann/include")
-target_include_directories(dgl PRIVATE "third_party/libxsmm/include")
+# target_include_directories(dgl PRIVATE "third_party/libxsmm/include")
target_include_directories(dgl PRIVATE "third_party/pcg/include")

# For serialization
if (USE_HDFS)
option(DMLC_HDFS_SHARED "dgl has to build with dynamic hdfs library" ON)
endif()
-add_subdirectory("third_party/dmlc-core")
+# add_subdirectory("third_party/dmlc-core")
list(APPEND DGL_LINKER_LIBS dmlc)
set(GOOGLE_TEST 0) # Turn off dmlc-core test

# Compile METIS
if(NOT MSVC)
- set(GKLIB_PATH "${CMAKE_CURRENT_SOURCE_DIR}/third_party/METIS/GKlib")
- include(${GKLIB_PATH}/GKlibSystem.cmake)
- include_directories(${GKLIB_PATH})
+ set(GKLIB_PATH ${EBROOTGKLIBMINMETIS})
+ # include(${GKLIB_PATH}/GKlibSystem.cmake)
+ include_directories("${GKLIB_PATH}/include")
include_directories("third_party/METIS/include/")
add_subdirectory("third_party/METIS/libmetis/")
- list(APPEND DGL_LINKER_LIBS metis)
+ list(APPEND DGL_LINKER_LIBS metis GKlib)
endif(NOT MSVC)

# Compile LIBXSMM
@@ -296,7 +296,8 @@ if((NOT MSVC) AND USE_LIBXSMM)
)
endif(REBUILD_LIBXSMM)
add_dependencies(dgl libxsmm)
- list(APPEND DGL_LINKER_LIBS -L${CMAKE_SOURCE_DIR}/third_party/libxsmm/lib/ xsmm.a)
+ list(APPEND DGL_LINKER_LIBS -L${CMAKE_SOURCE_DIR}/third_party/libxsmm/lib/ -lxsmm.a)
+ # list(APPEND DGL_LINKER_LIBS xsmm flexiblas)
endif((NOT MSVC) AND USE_LIBXSMM)

if(NOT MSVC)
@@ -397,12 +398,16 @@ install(TARGETS dgl DESTINATION lib${LIB_SUFFIX})
# Testing
if(BUILD_CPP_TEST)
message(STATUS "Build with unittest")
- add_subdirectory(./third_party/googletest)
+ # add_subdirectory(./third_party/googletest)
enable_testing()
include_directories(${gtest_SOURCE_DIR}/include ${gtest_SOURCE_DIR})
include_directories("include")
- include_directories("third_party/dlpack/include")
- include_directories("third_party/dmlc-core/include")
+ # include_directories("third_party/dlpack/include")
+ if (USE_AVX)
+ include_directories("third_party/xbyak")
+ endif(USE_AVX)
+
+ # include_directories("third_party/dmlc-core/include")
include_directories("third_party/phmap")
include_directories("third_party/libxsmm/include")
include_directories("third_party/pcg/include")
diff --git a/dgl-1.1.3.orig/include/dgl/zerocopy_serializer.h b/dgl-1.1.3/include/dgl/zerocopy_serializer.h
index 0ba962f..78781f6 100644
--- a/dgl-1.1.3.orig/include/dgl/zerocopy_serializer.h
+++ b/dgl-1.1.3/include/dgl/zerocopy_serializer.h
@@ -19,7 +19,7 @@
#include <utility>
#include <vector>

-#include "dmlc/logging.h"
+#include <dmlc/logging.h>

namespace dgl {

diff --git a/dgl-1.1.3.orig/src/graph/serialize/heterograph_serialize.cc b/dgl-1.1.3/src/graph/serialize/heterograph_serialize.cc
index 7872b93..79cc457 100644
--- a/dgl-1.1.3.orig/src/graph/serialize/heterograph_serialize.cc
+++ b/dgl-1.1.3/src/graph/serialize/heterograph_serialize.cc
@@ -50,7 +50,7 @@
#include "../heterograph.h"
#include "./dglstream.h"
#include "./graph_serialize.h"
-#include "dmlc/memory_io.h"
+#include <dmlc/memory_io.h>

namespace dgl {
namespace serialize {
diff --git a/dgl-1.1.3.orig/src/graph/serialize/zerocopy_serializer.cc b/dgl-1.1.3/src/graph/serialize/zerocopy_serializer.cc
index 0cec855..58fc981 100644
--- a/dgl-1.1.3.orig/src/graph/serialize/zerocopy_serializer.cc
+++ b/dgl-1.1.3/src/graph/serialize/zerocopy_serializer.cc
@@ -7,7 +7,7 @@
#include <dgl/zerocopy_serializer.h>

#include "dgl/runtime/ndarray.h"
-#include "dmlc/memory_io.h"
+#include <dmlc/memory_io.h>

namespace dgl {

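
Note on `EBROOTGKLIBMINMETIS` in the patch above: the name follows from the `GKlib-METIS` dependency, with `-` spelled out as `MIN`. A minimal sketch of that naming rule follows. The `-` mapping is what the patch itself implies; the `+` and `.` mappings are assumptions from memory of EasyBuild's behaviour, and the helper `ebroot_var` is hypothetical.

```python
def ebroot_var(module_name):
    """Return the EBROOT* variable name for a software name (sketch)."""
    # '-' -> 'min' matches EBROOTGKLIBMINMETIS in the patch;
    # '+' -> 'plus' and dropping '.' are assumptions, not confirmed here.
    for char, word in (('+', 'plus'), ('-', 'min'), ('.', '')):
        module_name = module_name.replace(char, word)
    return 'EBROOT' + module_name.upper()


print(ebroot_var('GKlib-METIS'))
```

One point that may be worth double-checking in the patch: in CMake, `${NAME}` reads a CMake variable, while an environment variable is read with `$ENV{NAME}`. Unless a CMake variable named `EBROOTGKLIBMINMETIS` is defined elsewhere, `set(GKLIB_PATH ${EBROOTGKLIBMINMETIS})` could end up empty, and the build would then rely on the include and library paths that the loaded module puts into the environment.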