Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to compile with Clang15 #10363

Closed
wants to merge 1 commit into from

Conversation

czentgr
Copy link
Collaborator

@czentgr czentgr commented Jul 1, 2024

Clang needs lib atomic symbols to be resolved for a number of executables. It is added as needed.

/opt/rh/gcc-toolset-13/root/usr/lib/gcc/x86_64-redhat-linux/13/../../../../bin/ld: velox/exec/libvelox_exec.a(Task.cpp.o): in function `std::__atomic_base<unsigned int>::is_lock_free() const':
/opt/rh/gcc-toolset-13/root/usr/lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/atomic_base.h:460: undefined reference to `__atomic_is_lock_free'
/opt/rh/gcc-toolset-13/root/usr/lib/gcc/x86_64-redhat-linux/13/../../../../bin/ld: velox/exec/libvelox_exec.a(Task.cpp.o): in function `std::__atomic_base<long>::is_lock_free() const':
/opt/rh/gcc-toolset-13/root/usr/lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/atomic_base.h:460: undefined reference to `__atomic_is_lock_free'clang++: error: linker command failed with exit code 1 (use -v to see invocation)ninja: build stopped: subcommand failed.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 1, 2024
@czentgr
Copy link
Collaborator Author

czentgr commented Jul 1, 2024

@majetideepak FYI

Copy link

netlify bot commented Jul 1, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 773900d
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/66b62e3c920a2900085a6e9b

Makefile Outdated Show resolved Hide resolved
@czentgr czentgr force-pushed the cz_prepare_clang17 branch 3 times, most recently from e3908cf to 6ccb99e Compare July 2, 2024 18:09
@@ -4401,13 +4407,18 @@ struct DictColumnWriterTestCase {
context.initBuffer();

// complexVectorType will be nullptr if the vector is not complex.
/*
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove old code.

@czentgr
Copy link
Collaborator Author

czentgr commented Jul 2, 2024

Need to work on this a bit more. Some of the deps built don't pick up the change. For example arrow:

16:48:46  #12 35.16 [203/860] Performing configure step for 'arrow_ep'
16:48:46  #12 35.16 -- Building using CMake version: 3.28.3
16:48:46  #12 35.16 -- The C compiler identification is GNU 12.2.1
16:48:46  #12 35.16 -- The CXX compiler identification is GNU 12.2.1

I guess the variables do not get inherited into bundled build with their own projects. Probably needs a few more changes.

Dependencies (from the setup scripts are also built with gcc).

@czentgr czentgr marked this pull request as ready for review July 3, 2024 14:53
@czentgr czentgr changed the title [WIP] Changes to compile with Clang17 Changes to compile with Clang17 Jul 3, 2024
velox/type/DecimalUtil.h Outdated Show resolved Hide resolved
velox/type/DecimalUtil.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @czentgr . Changes look good to me.
We should add a CI test as well to ensure support is not broken.
@kgpai any thoughts on this?

@majetideepak
Copy link
Collaborator

I think we have to support building deps with clang17 as well.

@czentgr
Copy link
Collaborator Author

czentgr commented Jul 9, 2024

So recently Centos9 updated their clang from version 17 to 18 and it causes additional build breaks (around arrays that use a variable input size and possibly more). As a result I've installed clang15 (the only other easily available version from the repo) and built with that.
As a result of using clang15 I also found out how to make it available in ccache (not done by default after installing the package). Cmake doesn't seem to have a good way of changing the default compiler especially in external projects. Cmake picks up the CC and CXX environment variables - and this would apply to subprojects made when building.

Overall improvement of 99 TPCDS queries is about 8% compared to gcc12. Some queries are slightly slower and some are faster. Spread is 0.65 to 3.4 ratio. That is the worst query took almost twice as long (this one is a subsecond query) and best query improved by factor 3ish (19s down to 5.5s) .

@czentgr czentgr force-pushed the cz_prepare_clang17 branch 8 times, most recently from eb7ed01 to 0dc3366 Compare July 10, 2024 22:10
@majetideepak majetideepak changed the title Changes to compile with Clang17 Changes to compile with Clang15 Jul 10, 2024
@czentgr
Copy link
Collaborator Author

czentgr commented Jul 23, 2024

A re-run of the same code but with various combinations of prestissimo and dependency compilation yielded some interesting results running the 99 queries of TPCDS in sf10k.

We tested 3 combinations:

  1. Prestissimo built with clang 15, dependencies (from setup script) built with clang 15
  2. Prestissimo built with clang 15, dependencies (from setup script) built with gcc12
  3. Prestissimo built with gcc12, dependencies built with gcc12

The final result is that the option 2 performed best in terms of total time. Option 1 and 2 were almost similar. However, notably option 1 and 2 shaved off about 5 mins off q67 which is the longest running query from 40 mins (in option 3) to 35 mins. However, some queries ran slower with clang 15. There is no clear culprit. Some queries show increased time to read from S3 (table scan operator) and other times the hash probe had lower throughput.

@czentgr
Copy link
Collaborator Author

czentgr commented Jul 23, 2024

@kgpai If you get a chance, can you please take a look? If we want to try and run the new target in the CI I can add a job for that too.

@czentgr czentgr force-pushed the cz_prepare_clang17 branch 3 times, most recently from b025899 to bdb2091 Compare August 5, 2024 18:09
@czentgr czentgr force-pushed the cz_prepare_clang17 branch 2 times, most recently from baea409 to 7f0ddaf Compare August 5, 2024 22:59
# Note, this will use the guessed toolset. If both gcc and clang-15 are present it will pick gcc.
# There is no easy way to use clang++-15 if it doesn't exist as clang++.
# Therefore, in order to use clang-15, one needs to install it as an alternative, set it as the default, and then
# select the --with-toolset=clang for boostrap script and toolset=clang for the b2 script.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add --with-toolset=clang to bootstrap and toolset=clang for the b2 script when USE_CLANG is used.
The error then will force users to set up the clang++ correctly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the problem is then that this will fail on fresh install. This assumes a user has not had anything installed previously (fresh setup) and sets USE_CLANG. In that case clang-15 isn't symlinked to clang after installation (which is why I had set up the alternatives before). That is a problem. So the experience will be that this will fail.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default behavior will be to build boost using GCC correct? This is not ideal.
It's okay to error. If we document this in the README, users will understand how to overcome the error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case clang-15 isn't symlinked to clang after installation

Apologies, why is this the case ?

Copy link
Collaborator Author

@czentgr czentgr Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because the default clang version on Centos9 was Clang17 (when I started this) and is now Clang18. So just installing "clang" will result in the system default version installation symlinked to clang and clang++.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesnt boost check for the compiler specified via CXX env variables ? That is what they should be using and if thats not set, go about using latest compiler.

Copy link
Collaborator Author

@czentgr czentgr Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Unfortunately, it doesn't. CC/CXX are CMake environment variables. And the boost bootstrap script specifically tries to use clang++ to check for the toolset.

This is what it looks like:

++ export CC=/usr/bin/clang-15
++ CC=/usr/bin/clang-15
++ export CXX=/usr/bin/clang++-15
++ CXX=/usr/bin/clang++-15
++ [[ 1 -ne 0 ]]
++ for cmd in "$@"
++ run_and_time install_boost
++ install_boost
++ wget_and_untar https://github.com/boostorg/boost/releases/download/boost-1.84.0/boost-1.84.0.tar.gz boost
++ local URL=https://github.com/boostorg/boost/releases/download/boost-1.84.0/boost-1.84.0.tar.gz
++ local DIR=boost
++ mkdir -p boost
++ pushd boost
~/deps/boost ~/deps
++ curl -L https://github.com/boostorg/boost/releases/download/boost-1.84.0/boost-1.84.0.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  118M  100  118M    0     0  82.8M      0  0:00:01  0:00:01 --:--:--  124M
++ tar -xz --strip-components=1 -f boost.tar.gz
++ popd
~/deps
++ cd boost
++ ./bootstrap.sh --prefix=/usr/local --with-toolset=clang
Building B2 engine..

A C++11 capable compiler is required for building the B2 engine.
Toolset 'clang' does not appear to support C++11.

> clang++ -x c++ -std=c++11  check_cxx11.cpp
./tools/build/src/engine/build.sh: 120: clang++: not found
** Note, the C++11 capable compiler is _only_ required for building the B2
** engine. The B2 build system allows for using any C++ level and any other
** supported language and resource in your projects.


You can specify the toolset as the argument, i.e.:
    ./build.sh [options] gcc

Toolsets supported by this script are:
    acc, clang, como, gcc, intel-darwin, intel-linux, kcc, kylix, mipspro,
    pathscale, pgi, qcc, sun, sunpro, tru64cxx, vacpp

For any toolset you can override the path to the compiler with the '--cxx'
option. You can also use additional flags for the compiler with the
'--cxxflags' option.

A special toolset; cxx, is available which is used as a fallback when a more
specific toolset is not found and the cxx command is detected. The 'cxx'
toolset will use the '--cxx' and '--cxxflags' options, if present.

We would need to change how we build boost. Currently, the bootstrap script is used to call the build.sh script. Calling it directly allows one to override the path to the compiler.
But we need to generate some build related files first with the b2 script which doesn't allow changing the compiler path.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, found a solution to the problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @czentgr , is the solution what you described above ?

Copy link
Collaborator Author

@czentgr czentgr Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No but it was highlighting some customization. Only problem was that bootstrap output and b2 jam files didn't quite work in the desired way. I needed it to use clang 15 in the bootstrap - which I did. But the generated jam file required a separate definition for the "new" clang-15 toolset when all I wanted was to use the clang compiler of version 15 (which is actually used when providing the --with-toolset="clang-15" arg to the bootstrap script) because it has code to automatically build the right command.
And providing a custom user defined jam file with the compiler (from the b2 doc) also didn't help because it was using the generated version which loaded the compiler differently. So I ended up changing the generated jam file to actually load the known compiler in the specific version we wanted and that works. Code is this

      ./bootstrap.sh --prefix=/usr/local --with-toolset="clang-15"
      # Switch the compiler from the clang-15 toolset which doesn't exist (clang-15.jam) to
      # clang of version 15 when toolset clang-15 is used.
      # This reconciles the project-config.jam generation with what the b2 build system allows for customization.
      sed -i 's/using clang-15/using clang : 15/g' project-config.jam
      ${SUDO} ./b2 "-j$(nproc)" -d0 install threading=multi toolset=clang-15 --without-python

Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@czentgr this looks great! Can we update the README as well?

Copy link
Contributor

@kgpai kgpai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great - minor nits.


function install_clang15 {
VERSION=`cat /etc/os-release | grep VERSION_ID`
if [[ ! ${VERSION} =~ "22.04" && ! ${VERSION} =~ "24.04" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we had problems with clang on Ubuntu apart from these versions ? Maybe it would just be good to add this as a warning rather than forcing the user to not use clang.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that it is more complicated to get clang 15 onto the older Ubuntu 20.04. Perhaps we should just not support it anymore and only support the last two LTS versions? Because we officially say Ubuntu 20.04+ is supported I made this check - so really only affects users of Ubuntu 20.04.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still say it should be fine to spit out a warning ; we can assume folks know what they are doing if they choose to play around with compiler versions.

function install_clang15 {
VERSION=`cat /etc/os-release | grep VERSION_ID`
if [[ ! ${VERSION} =~ "22.04" && ! ${VERSION} =~ "24.04" ]]; then
echo "The clang configuration is for Ubuntu 22.04 and 22.04. Unset USE_CLANG and redo the setup."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: Ubuntu 22.04 and 24.04

@czentgr czentgr force-pushed the cz_prepare_clang17 branch 3 times, most recently from 17f8444 to 010a851 Compare August 7, 2024 02:53
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @czentgr! Great to see Clang support.
@kgpai Should we add/modify a CI job with clang enabled? Maybe the Ubuntu Linux build workflows/linux-build.yml:ubuntu-debug?

Copy link
Contributor

@kgpai kgpai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this excellent contribution @czentgr !

@kgpai
Copy link
Contributor

kgpai commented Aug 7, 2024

@kgpai Should we add/modify a CI job with clang enabled? Maybe the Ubuntu Linux build workflows/linux-build.yml:ubuntu-debug?

Yes that is required. Will be great if some one contribute to that.

@kgpai kgpai added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Aug 7, 2024
@czentgr
Copy link
Collaborator Author

czentgr commented Aug 8, 2024

@kgpai Should we add/modify a CI job with clang enabled? Maybe the Ubuntu Linux build workflows/linux-build.yml:ubuntu-debug?

Yes that is required. Will be great if some one contribute to that.

Yes, I can do it in another PR.

@facebook-github-bot
Copy link
Contributor

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@xiaoxmeng merged this pull request in cf283b7.

Copy link

Conbench analyzed the 1 benchmark run on commit cf283b73.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants