-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REREVIEW] Build infrastructure to use RAFT #2125
Conversation
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Only tiny comments
python/setuputils.py
Outdated
|
||
print("-- Third party repositories have not been found so they " + | ||
"will be cloned. To avoid this set the environment " + | ||
"variable CUML_BUILD_PATH, containing the relative " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it have to be relative? Abs paths could be clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed all relative imports, I think everything being absolute is just better for consistency, what do you think?
return repo_path, repo_cloned | ||
|
||
|
||
def get_submodule_dependency(repo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I'm not too comfortable with parsing cmake files inside python setup.py. You did mention the reason for doing this cloning in python sometime ago, @dantegd and I seem to have forgotten it now! Can you please (re)help me remember the reason for doing this, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is for cloning RAFT if needed by Python (or any other repo), and having only one central place where RAFT is pinned. Otherwise we would need to have 2 places where the pinned commit of RAFT is defined, introducing potential conflicts.
Typically, RAFT will only be cloned by cmake and python uses that RAFT clone, or RAFT will be located in the location pointed by RAFT_PATH and both cmake and python use it from there. But if neither is true and a developer is only building the Python package, then python clones it itself (this was the same situation with treelite and other things that are/were needed by the python package)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To solve the need to parse cmake from setup.py and decouple things, we could create a config file that both cmake and setup.py read and is clearer/easier for devs to modify, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, none of the treelite/faiss/spdlog dependencies should have been exposed to the users who'd only want to build python code (but have libcuml++.so and cpp/include headers installed already)! So, clearly, this seems to be an issue with our C++ layer. I have filed issue #2138 to address this one.
|
||
# Only cloned if RAFT_PATH env variable is not defined | ||
|
||
if(DEFINED ENV{RAFT_PATH}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need to build binaries, then we will need to reopen design discussions and overall will have big implications. It is agreed right now with the cuGraph team to keep it header only, if that changes then this will need to change, and might introduce the need for a conda package distribution and other matters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But thinking about it, I don’t think that, if a binary is needed, it would need a massive change. Mainly having to modify the CMakeLists
in RAFT, then that gets included by libcugraph/libcuml and then (presumably) statically linking the libraft generated, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. How about doing something similar to what you've been doing to share RAFT's python code? In other words, we expose a RAFT_SRC_FILES
cmake variable and it is the responsibility of cuml/cugraph to include those during their build?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually there might be an even better/more sustainable way, since ExternalProject_Add
can be used to trigger build processes of local folders (as opposed to just downloading/git cloning). This would allow the cmake code to be the same for working with a local or cloned RAFT, as well as triggering the build from cuML's cmake, let me do the change to see how it looks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@teju85 now both paths are using ExternalProject_Add
so that if we need to eventually configure raft, or build raft (and statically link it or include it in the libcuml package) in the future, the build infra would support that.
Couldn't pull it from the if since the DOWNLOAD_COMMAND ""
parameter overrides the git cloning no matter what :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is atleast better than the previous one. Thanks Dante for looking into this..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM. We'll anyways be testing this follow thoroughly once we have raft handle created. Then we can make appropriate updates to this infra.
@JohnZed forgot to request a rereview here from you |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Tiny suggestions added but preapproving too and happy to see those suggestions in a future PR
python/setup.py
Outdated
if not CUDA_HOME: | ||
CUDA_HOME = ( | ||
if not cuda_home: | ||
cuda_home = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny thing - there is shutil.which for a more python native version of this
python/setuputils.py
Outdated
|
||
|
||
def get_environment_option(name): | ||
ENV_VARIABLE = os.environ.get(name, False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably shouldn’t be all caps
python/setuputils.py
Outdated
""" | ||
Function to use the python code in RAFT in package.raft | ||
|
||
- If RAFT symling already exists, don't change anything. Use setup.py clean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Symlink
python/setuputils.py
Outdated
- Otherwise it will look for RAFT in the libcuml build folder, | ||
located either in the default location ../cpp/build or in | ||
$CUML_BUILD_PATH. | ||
-Otherwise it will clone RAFT into _external_repositories. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing space
rerun tests |
No description provided.