-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes to compile with Clang15 #10363
Conversation
@majetideepak FYI |
✅ Deploy Preview for meta-velox canceled.
|
e3908cf
to
6ccb99e
Compare
@@ -4401,13 +4407,18 @@ struct DictColumnWriterTestCase { | |||
context.initBuffer(); | |||
|
|||
// complexVectorType will be nullptr if the vector is not complex. | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove old code.
6ccb99e
to
db3ce35
Compare
Need to work on this a bit more. Some of the deps built don't pick up the change. For example arrow:
I guess the variables do not get inherited into bundled build with their own projects. Probably needs a few more changes. Dependencies (from the setup scripts are also built with gcc). |
db3ce35
to
9c16d37
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to support building deps with clang17 as well. |
So recently Centos9 updated their clang from version 17 to 18 and it causes additional build breaks (around arrays that use a variable input size and possibly more). As a result I've installed clang15 (the only other easily available version from the repo) and built with that. Overall improvement of 99 TPCDS queries is about 8% compared to gcc12. Some queries are slightly slower and some are faster. Spread is 0.65 to 3.4 ratio. That is the worst query took almost twice as long (this one is a subsecond query) and best query improved by factor 3ish (19s down to 5.5s) . |
eb7ed01
to
0dc3366
Compare
0dc3366
to
1f4f2ce
Compare
A re-run of the same code but with various combinations of prestissimo and dependency compilation yielded some interesting results running the 99 queries of TPCDS in sf10k. We tested 3 combinations:
The final result is that the option 2 performed best in terms of total time. Option 1 and 2 were almost similar. However, notably option 1 and 2 shaved off about 5 mins off q67 which is the longest running query from 40 mins (in option 3) to 35 mins. However, some queries ran slower with clang 15. There is no clear culprit. Some queries show increased time to read from S3 (table scan operator) and other times the hash probe had lower throughput. |
@kgpai If you get a chance, can you please take a look? If we want to try and run the new target in the CI I can add a job for that too. |
b025899
to
bdb2091
Compare
baea409
to
7f0ddaf
Compare
scripts/setup-centos9.sh
Outdated
# Note, this will use the guessed toolset. If both gcc and clang-15 are present it will pick gcc. | ||
# There is no easy way to use clang++-15 if it doesn't exist as clang++. | ||
# Therefore, in order to use clang-15, one needs to install it as an alternative, set it as the default, and then | ||
# select the --with-toolset=clang for boostrap script and toolset=clang for the b2 script. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add --with-toolset=clang
to bootstrap and toolset=clang
for the b2 script when USE_CLANG
is used.
The error then will force users to set up the clang++ correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the problem is then that this will fail on fresh install. This assumes a user has not had anything installed previously (fresh setup) and sets USE_CLANG. In that case clang-15 isn't symlinked to clang after installation (which is why I had set up the alternatives before). That is a problem. So the experience will be that this will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default behavior will be to build boost using GCC correct? This is not ideal.
It's okay to error. If we document this in the README, users will understand how to overcome the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case clang-15 isn't symlinked to clang after installation
Apologies, why is this the case ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because the default clang version on Centos9 was Clang17 (when I started this) and is now Clang18. So just installing "clang" will result in the system default version installation symlinked to clang and clang++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesnt boost check for the compiler specified via CXX env variables ? That is what they should be using and if thats not set, go about using latest compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Unfortunately, it doesn't. CC/CXX are CMake environment variables. And the boost bootstrap script specifically tries to use clang++ to check for the toolset.
This is what it looks like:
++ export CC=/usr/bin/clang-15
++ CC=/usr/bin/clang-15
++ export CXX=/usr/bin/clang++-15
++ CXX=/usr/bin/clang++-15
++ [[ 1 -ne 0 ]]
++ for cmd in "$@"
++ run_and_time install_boost
++ install_boost
++ wget_and_untar https://github.com/boostorg/boost/releases/download/boost-1.84.0/boost-1.84.0.tar.gz boost
++ local URL=https://github.com/boostorg/boost/releases/download/boost-1.84.0/boost-1.84.0.tar.gz
++ local DIR=boost
++ mkdir -p boost
++ pushd boost
~/deps/boost ~/deps
++ curl -L https://github.com/boostorg/boost/releases/download/boost-1.84.0/boost-1.84.0.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 118M 100 118M 0 0 82.8M 0 0:00:01 0:00:01 --:--:-- 124M
++ tar -xz --strip-components=1 -f boost.tar.gz
++ popd
~/deps
++ cd boost
++ ./bootstrap.sh --prefix=/usr/local --with-toolset=clang
Building B2 engine..
A C++11 capable compiler is required for building the B2 engine.
Toolset 'clang' does not appear to support C++11.
> clang++ -x c++ -std=c++11 check_cxx11.cpp
./tools/build/src/engine/build.sh: 120: clang++: not found
** Note, the C++11 capable compiler is _only_ required for building the B2
** engine. The B2 build system allows for using any C++ level and any other
** supported language and resource in your projects.
You can specify the toolset as the argument, i.e.:
./build.sh [options] gcc
Toolsets supported by this script are:
acc, clang, como, gcc, intel-darwin, intel-linux, kcc, kylix, mipspro,
pathscale, pgi, qcc, sun, sunpro, tru64cxx, vacpp
For any toolset you can override the path to the compiler with the '--cxx'
option. You can also use additional flags for the compiler with the
'--cxxflags' option.
A special toolset; cxx, is available which is used as a fallback when a more
specific toolset is not found and the cxx command is detected. The 'cxx'
toolset will use the '--cxx' and '--cxxflags' options, if present.
We would need to change how we build boost. Currently, the bootstrap script is used to call the build.sh
script. Calling it directly allows one to override the path to the compiler.
But we need to generate some build related files first with the b2 script which doesn't allow changing the compiler path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, found a solution to the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @czentgr , is the solution what you described above ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No but it was highlighting some customization. Only problem was that bootstrap output and b2 jam files didn't quite work in the desired way. I needed it to use clang 15 in the bootstrap - which I did. But the generated jam file required a separate definition for the "new" clang-15 toolset when all I wanted was to use the clang compiler of version 15 (which is actually used when providing the --with-toolset="clang-15"
arg to the bootstrap script) because it has code to automatically build the right command.
And providing a custom user defined jam file with the compiler (from the b2 doc) also didn't help because it was using the generated version which loaded the compiler differently. So I ended up changing the generated jam file to actually load the known compiler in the specific version we wanted and that works. Code is this
./bootstrap.sh --prefix=/usr/local --with-toolset="clang-15"
# Switch the compiler from the clang-15 toolset which doesn't exist (clang-15.jam) to
# clang of version 15 when toolset clang-15 is used.
# This reconciles the project-config.jam generation with what the b2 build system allows for customization.
sed -i 's/using clang-15/using clang : 15/g' project-config.jam
${SUDO} ./b2 "-j$(nproc)" -d0 install threading=multi toolset=clang-15 --without-python
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@czentgr this looks great! Can we update the README as well?
7f0ddaf
to
17861ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great - minor nits.
|
||
function install_clang15 { | ||
VERSION=`cat /etc/os-release | grep VERSION_ID` | ||
if [[ ! ${VERSION} =~ "22.04" && ! ${VERSION} =~ "24.04" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we had problems with clang on Ubuntu apart from these versions ? Maybe it would just be good to add this as a warning rather than forcing the user to not use clang.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that it is more complicated to get clang 15 onto the older Ubuntu 20.04. Perhaps we should just not support it anymore and only support the last two LTS versions? Because we officially say Ubuntu 20.04+ is supported I made this check - so really only affects users of Ubuntu 20.04.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still say it should be fine to spit out a warning ; we can assume folks know what they are doing if they choose to play around with compiler versions.
scripts/setup-ubuntu.sh
Outdated
function install_clang15 { | ||
VERSION=`cat /etc/os-release | grep VERSION_ID` | ||
if [[ ! ${VERSION} =~ "22.04" && ! ${VERSION} =~ "24.04" ]]; then | ||
echo "The clang configuration is for Ubuntu 22.04 and 22.04. Unset USE_CLANG and redo the setup." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: Ubuntu 22.04 and 24.04
17f8444
to
010a851
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this excellent contribution @czentgr !
Yes that is required. Will be great if some one contribute to that. |
Yes, I can do it in another PR. |
@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
010a851
to
773900d
Compare
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@xiaoxmeng merged this pull request in cf283b7. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Clang needs lib atomic symbols to be resolved for a number of executables. It is added as needed.