-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert build system to Bazel. #2887
Comments
Will CMake script be retired or will both ways be kept? Currently, there are duplicated building for thirdparty libraries, e.g. zlib for Arrow and Parquet, etc. Avoiding duplicated building may improve the building time. |
Hi @robertnishihara ,
I ever referred to tensorflow project to find out cmake best practice. And draw a principle for thirdparty dependency: In maven, which is the cornerstone, it has maven center repository and local repository. It helps manage all downloaded jars and you can use them in your maven project. So when it comes to cmake, cmake should also be able to reuse the downloaded dependencies. I would suggest to use So ... for c++ dependency, I need a mechanism to support it. cmake is not enough. I did not package ray yet, because it is already packaged by python wheels. But for cmake package mechanism, which is based on gcc library links, I found in tensorflow, it uses static library instead of dynamic for thirdparty libraries. em... I have to admit it is quite reasonable. I have not used bazel yet, but I found it is easy to understand once you know cmake. And it has the ability to support other languages, which is good for python and java. Also It has better third-party dependency management. The key problem may be it is too new to cooperate with IDEs, which for me it is important to debug. And at last, if you'd like, we can keep cmake and bazel both solutions. In tensorflow, it has cmake in a standalone directory which contains the whole tensorflow project. |
@guoyuhong @chuxi We could potentially keep around cmake for IDE reasons, but I think it would be substantially less work to just retire cmake. @chuxi where is the TensorFlow directory that you're talking about? |
yes, transitive dependency problem. And I am considering to write some findXXX modules, which support to find a directory which contains all dependency. we define a cmake property: CMAKE_EXTERNAL_HOME, by default it is ${CMKAE_CURRENT_BINARY_DIR}/build/external. And then we write some findXXX.cmake modules based on the directory. Then in the ${CMAKE_EXTERNAL_HOME}, all dependencies are kept and the user could get benefit from it. And for user he can change it to his own directory. So for each build he can get the benefit from local pre build projects. |
|
|
Bazel supports Python: https://docs.bazel.build/versions/master/be/python.html and is backed by Google. |
I looked a bit into the currently available python support:
Overall it looks workable, though it would probably make sense to tackle Python build conversion incrementally since things aren't fully baked here. Re: buck, my impression is that does not have much adoption outside of facebook. |
We've got all C++ dependencies building internally at v0.5.2. I've opened a PR with a partial BUILD file that still needs the external stuff to be added. Totally fine to not merge that, but thought I'd send along so that you all have a starting point. |
Ideally the code would be restructured so that there's a BUILD file per directory and things are a bit more decoupled. But that would be a lot more work to disentangle stuff. The BUILD file in the PR was written so as to require no/minimal changes to the source given the cmake build. |
For the external dependencies, it looks like each would have a A A |
I had quite some issues in the past installing Bazel to compile some libraries on some supercomputing clusters (like Berkeley's Savio) because it depends on Java. And then you have this problem that to compile a simple thing which does not use Java, you have to get Java onto the system. Was not impressed. |
@pcmoritz, if we use Arrow binary artifacts instead of compiling Arrow, then the switch to Bazel will probably be much easier. Note
|
Can't remember if I had previously shared this, but in case it's relevant
to what you're thinking about, Bazel genrule
<https://docs.bazel.build/versions/master/be/general.html#genrule> can wrap
cmake commands.
…On Tue, Dec 4, 2018 at 11:15 AM Robert Nishihara ***@***.***> wrote:
@pcmoritz <https://github.com/pcmoritz>, if we use Arrow binary artifacts
instead of compiling Arrow, then the switch to Bazel will probably be much
easier.
Note
- This requires frequent (e.g., nightly) and persistent binary
artifacts from Arrow.
- This makes it harder to test changes to Arrow in Ray.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2887 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABEGW2QswoV-s3Tgs_to-JDjqNe705_Hks5u1snKgaJpZM4WqPnW>
.
|
To interface with cmake, see also https://github.com/bazelbuild/rules_foreign_cc (cc @irengrig). |
Thanks @rsepassi and @laurentlb! |
Might already know this, but just in case, I think Bazel wasn't added as a dependency yet on master, so this breaks compilation unless the user explicitly installs it before running |
Looks like some error also pops up during redis compilation:
|
Hmm, I haven't seen this one. What happens if you do |
Sorry for delay. Found something really weird. Using the system curl, it works just fine. The (my) anaconda curl is what is "broken". System: CentOS 7
With anaconda3:
It appears that E: ok, same problem / error message either way. Not knowing much about
|
Any idea what may be happening? Weird that none of the previous |
Hm, for me
|
Your composition works for me, too, so even more confused now... is that the actual command being run by the build system, though? |
@zbarry I've seen the same error that you mentioned in our CI (on Travis) nondeterministically. It's possible that we need to just retry that command if it fails, though it'd be nice to have a more reliable approach. |
Gotcha. Let me get a fresh repo + conda env going and see what happens |
Hmm. At least in my case, I don't think the issue is a Monte Carlo I changed the
which gave the exact command being executed. Maybe this is helpful(?). Here, you can see clearly that the command run is indeed
|
Hmm, I don't really see what is going wrong here. As a workaround, do you want to try replacing curl by wget and see if that fixes the problem? |
Ugh. I figured it out because |
So not including your environment variables is actually a feature of Bazel (it's supposed to create the same build environment on every machine, this also means having a well-defined set of environment variables). For use cases like this, you can use the following functionality I think: https://bazel.build/designs/2016/06/21/environment.html If this helps, contributing a patch that allows to pass arguments from setup.py into bazel would be welcome! |
Actually, I think putting the option into your |
Thanks. Will look into modding |
Ok. Good to go. Since that Bazel env config page is not super explicit about how to specify vars to pass through the build environment, for others that have the misfortune of living life behind a proxy, just plop these into
|
@zbarry thanks for the fix, @jliagouris just ran into the same problem and your suggestion worked. |
It seems like using Bazel (https://github.com/bazelbuild/bazel) would improve our build system in a lot of ways.
@rshin did an early version of this a long time ago in ray-project/ray-legacy#408. @ericl, @concretevitamin, @pcmoritz have all used Bazel I believe and are proponents. @rsepassi and @dbieber have recently worked on this and have a good sense of what is involved.
Any thoughts about this? Any drawbacks to be aware of?
cc @chuxi @guoyuhong @raulchen
The text was updated successfully, but these errors were encountered: