MDBF-793 - Retire MSAN clang-15 builder, upgrade to clang-20 #562
@grooverdan | 89ab27f to 852a451, 0c93f34 to 92566d9, d7eb898 to 4be5d7c
- replace debian 11 - clang 15 with debian 12 - clang 19
For msan.Dockerfile
- starting from clang 18 libunwind is added to ENABLE_RUNTIMES
- make the dockerfile work for both bookworm / bullseye editions (LLVM repository)
- doc Makefile is not present in bookworm for gmp
- bookworm has newer aclocal / automake -> cracklib2 fix
- clang 19 needs libclang-19-dev and libllvmlibc-19-dev installed
This is to aid consumption by developers and have a readily available MSAN container. Technically we don't need to purge from the image as it's a build stage, but we do so to keep the size a bit smaller.
Also use update-alternatives to provide clang/clang++ links. Clang has a new release every 6 months; keeping things such that a rebuild of the image is sufficient to re-deploy, rather than a master restart, will facilitate more frequent updates and a fixed builder URL.
Requested by Marko to make this a more realistic test of the codebase delivered to users. No -DWITH_DBUG_TRACE=OFF exists in RelWithDebInfo mode
make clang/clang++ alternates as soon as installed
Unit tests run all ok. WITH_SAFE_MALLOC=OFF is default in RelWithDebInfo mode.
Also convenient CFLAGS and MSAN environment variables.
to non-instrumented things. Otherwise these sorts of errors occur in testing:
LD_LIBRARY_PATH=/msan-libs/ /usr/bin/ctest
/usr/bin/ctest: symbol lookup error: /msan-libs/libgmp.so.10: undefined symbol: __msan_va_arg_overflow_size_tls
And the --force is for debian:11 compat where the files exist in the tarball.
Use cmake to install
73cd2d3 to 6703791
ci_build_images/rr.Dockerfile
Outdated
If I understand correctly, this is a Dockerfile that can be built alone, right?
It took me a while to figure out that you are actually using COPY --from=rr to transfer the binary to the MSAN container image.
Maybe it would be useful for the header of this Dockerfile to say:
- that it behaves like a stage in a multi-stage Dockerfile after the contents of the base Dockerfile (debian.Dockerfile) are added on top of it (order is important)
- that it can be built alone if needed
- that in the multi-stage setup it acts like a .fragment.Dockerfile, and sharing the RR binary is achieved using the COPY --from=stage syntax
- in the same multi-stage setup, the importance of FROM $BASE_IMAGE: I guess it's important that building RR and building MSAN happen on the same OS/VERSION
Yes, it can be built alone.
Multi-stage doesn't explicitly add on top. It just provides the artifacts for COPY --from=stage.
The requirements for the later stage are a bit less. Hopefully I've covered all of these in the updated header of the file.
util.BuilderConfig(
name="amd64-debug-msan-clang",
workernames=workers["x64-bbw-docker-debug-msan-clang"],
tags=["Debian", "clang", "msan", "debug"],
If the builder name doesn't have the version of clang in the name and if the tags do not give the user a clue, then the user's last resort is to study the logs.
It's useful to have an easy to spot place for telling the clang-version even if that means we will update the configuration once in a while.
In Grid view, builder name is the easiest indicator.
Suggestion:
amd64-msan-clang-20
amd64-msan-clang-20-debug
What are the benefits?
- the word debug at the end, as in other builders, also draws attention
- builder history wise, we keep apples and pears in separate baskets, e.g. clang-20 full run history in amd64-msan-clang-20, future clang-21 history in amd64-msan-clang-21
- transitioning: a future amd64-msan-clang-21 can run in parallel with amd64-msan-clang-20 until it is stable, and we can then discontinue the latter smoothly (just a protected branches switch)
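The naming scheme suggested above can be sketched as a small helper. This is only an illustration: the function name msan_builder_names and its arch parameter are hypothetical and are not taken from the patch.

```python
def msan_builder_names(clang_version: int, arch: str = "amd64") -> list[str]:
    """Return the non-debug and debug MSAN builder names for one clang version.

    Hypothetical helper: the version is part of the name, so each clang
    release keeps its own builder history.
    """
    base = f"{arch}-msan-clang-{clang_version}"
    return [base, f"{base}-debug"]


if __name__ == "__main__":
    # clang-20 and a future clang-21 get distinct names, so both sets of
    # builders can run in parallel during a transition.
    for version in (20, 21):
        print(msan_builder_names(version))
```

Because the version is encoded in the name, bumping clang means adding new builders rather than renaming existing ones, which matches the "separate baskets" argument.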
The convention build type/compiler/extra tests wasn't fully clear.
Ok, I accept these reasons.
Reminder: both Marko and I are almost of the opinion that msan-debug might be producing more timeouts for a minimal addition of coverage (on non-released work). The issue noted on Zulip means it currently doesn't even bootstrap on debug, so look at dropping this if it is too much trouble; the debug builder obviously won't be in protected branches.
Given that an update to clang-21 may still occur within the support cycle of debian-12, I've updated the image tag to include -20.
- image: debian:11
  platforms: linux/amd64
  branch: 10.11
  tag: debian11-msan-clang-16
I think you should highlight the approach to transitioning. I know it was discussed briefly, but please confirm if my understanding is right:
1. you are discontinuing the build of the debian-11-msan-clang-16 container image because it is incompatible with the patches in msan.fragment.Dockerfile
2. you don't touch the msan-clang-16 builder definition in master-docker-nonstandard because we want to keep it alive while we validate the new msan-clang-20 builder
If 1 & 2 are true, I think it's worth having a commit that clearly states that.
Maybe you already mentioned this, but it is very hard for me to go through 101 commits. On that note, if you can, please make the commit stack smaller: group the commits and make them highlight the important decisions of this patch. Otherwise it will be very hard for anyone to follow them.
4-5 commits, if possible, would be perfect.
1 & 2 are correct. I've included these and the other transition plan details in the MDBF.
I know the 101 commits (now more) are as horrendous as the development process was. It's not something I aimed to do or will aim to do again, and I concur with 4-5 being ideal.
I have tried to rebase this down and have failed so far. I'd like to think these commits are well documented and change a limited set of files minimally, so as to be unobtrusive. I'd like to think the terse commit messages still convey enough meaning to be searchable and useful for a direct read via git blame, without a commit-by-commit bisectable review consistent with most recent commits.
So can I please ask for forgiveness here, and that the large number of commits be accepted as is?
Sorry Daniel, if you want, I can help you group them.
My general idea is:
1 commit for msan.Dockerfile and .yml workflow
- this could be an introductory commit where you first highlight the transition to clang-20 and debian-12
1 commit for the instrumented libs shell script:
- this is a follow up of the first commit, the flow is natural and you basically show the implementation of
RUN ./msan.instrumentedlibs.sh
- I saw many commits that patch various instrumented libs in this file. These can be grouped under a single commit. You already gave precious details for each library in the file itself, so you can keep the short version in the commit message.
1 commit for all factory changes:
- there are many commits only affecting the factory
- in this commit you can also include changes to constants.py
- this is also a natural transition where you can highlight how the factory changes are coupled with msan.Dockerfile changes
- this is also the place to explain the Debug / RelWithDebInfo builders
1 commit for RR
- highlighting that this is a "human helper" and a few words about its implementation
Some commits are "fix that, and other", "revert", "typo" and so on. So the changes in these can be easily grouped under the above categories.
I apologize again for insisting but I hope you understand my reasoning.
canStartBuild=canStartBuild,
properties={"c_compiler": "clang", "cxx_compiler": "clang++", "build_type": "RelWithDebInfo"},
locks=getLocks,
factory=f_msan_build,
Take into account that the f_msan_build factory in master-docker-nonstandard-2 is no longer equal to the f_msan_build factory in master-docker-nonstandard (where the clang-16 builder is).
We should take extra care when transitioning and mention the plan. Things to note:
- a protected branches builder should have backup workers. It's OK that these 2 builders currently only have apexis; they are not protected. MSAN Clang-16, for example, has hz-bbw1 / 4 / 5 as worker nodes.
- at transitioning, the f_msan_build factory that clang-16 uses is no longer valid (right?) and we can move f_msan_build from your patch into common_factories.py and, of course, adjust the worker pool of these 2 builders when they become protected.
Maybe @cvicentiu has better ideas here?
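The backup-worker rule above could be checked with a few lines of Python. This is a rough sketch, not part of the patch: the function name, the minimum of two workers, and the builder names in the sample data are assumptions chosen for illustration; only the worker names come from the comment.

```python
def builders_lacking_backup(workers_by_builder, protected, minimum=2):
    """Return protected builders that have fewer than `minimum` workers."""
    return sorted(
        name
        for name in protected
        if len(workers_by_builder.get(name, [])) < minimum
    )


# Hypothetical snapshot: builder names and pool layout are illustrative only.
workers_by_builder = {
    "amd64-msan-clang-16": ["hz-bbw1", "hz-bbw4", "hz-bbw5"],
    "amd64-msan-clang-20": ["apexis"],
    "amd64-msan-clang-20-debug": ["apexis"],
}

# Before the switch only the clang-16 builder is protected: nothing is flagged.
print(builders_lacking_backup(workers_by_builder, {"amd64-msan-clang-16"}))
# After the switch the clang-20 builder would need a larger worker pool first.
print(builders_lacking_backup(workers_by_builder, {"amd64-msan-clang-20"}))
```

Running such a check before changing the protected branches list would flag a builder that still depends on a single worker.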
Looks reasonable; however, a transition plan is probably better placed in the MDBF task.
image: ${{ matrix.image }}
platforms: ${{ matrix.platforms }}
tag: ${{ matrix.tag }}
branch: ${{ matrix.branch }}
clang_version: ${{ matrix.clang_version }}
nogalera: ${{ matrix.nogalera }}
files: |
I don't think files: is needed; it adds extra complexity.
In bintars, for example, we just do COPY ci_build_images/scripts/* /scripts/ without any files: specification in the workflow.
And as @cvicentiu mentioned, for future us, as many detailed comments as you can for the library instrumentation script.
This is built with uses: bbw_build_container_template.yml, so the additional file complexity is needed to fit in with what the template expects.
I did include as much information as I know in the library instrumentation script.
# Marker to make it possible to build a dev msan builder
# from the nightly clang versions as they are in a differently
# name repo
ENV CLANG_DEV_VERSION=21
Is this just for local testing?
So when we move to clang 21, we just set CLANG_VERSION=21 and CLANG_DEV_VERSION to whatever is currently in the nightly builds repository?
If yes, it's worth adding a comment on how we handle upgrades.
Local testing, or if we ever want to use a dev version for BB testing.
CLANG_VERSION is the version you want. CLANG_DEV_VERSION is the version listed at https://apt.llvm.org/ under the "development" branch (which is in a state of flux as clang-20 is at the end of its rc cycle).
File comment updated.
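The relationship between the two variables can be illustrated with a short sketch. It assumes the apt.llvm.org layout in which released versions live in a suite named llvm-toolchain-<codename>-<N> and the development branch uses the unversioned llvm-toolchain-<codename>; the function names are hypothetical and the Dockerfile in the patch may handle this differently.

```python
def llvm_apt_suite(codename: str, clang_version: int, clang_dev_version: int) -> str:
    """Pick the apt.llvm.org suite for the requested clang version.

    Assumption: only the development branch uses the unversioned suite name;
    every released version carries a "-<N>" suffix.
    """
    suite = f"llvm-toolchain-{codename}"
    if clang_version != clang_dev_version:
        suite += f"-{clang_version}"
    return suite


def llvm_apt_line(codename: str, clang_version: int, clang_dev_version: int) -> str:
    """Build the sources.list entry for the selected suite."""
    suite = llvm_apt_suite(codename, clang_version, clang_dev_version)
    return f"deb http://apt.llvm.org/{codename}/ {suite} main"


if __name__ == "__main__":
    # Released clang-20 while 21 is the development branch.
    print(llvm_apt_line("bookworm", 20, 21))
    # Building a dev image directly from the nightly repository.
    print(llvm_apt_line("bookworm", 21, 21))
```

Under this assumption, an upgrade only needs CLANG_VERSION bumped, with CLANG_DEV_VERSION tracking whatever the development branch currently is.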
Adjust to the resources on apexis
Suggestion:
amd64-msan-clang-20
amd64-msan-clang-20-debug
What are the benefits?
* debug word at the end as in other builders, also draws attention
* builder history wise we keep apples and pears in separate baskets e.g. clang-20 full run history in amd64-msan-clang-20, future clang-21 history in amd64-msan-clang-21
* transitioning: a future amd64-msan-clang-21 can run in parallel with amd64-msan-clang-20 until it is stable and we can discontinue the latter smoothly (just a protected branches switch)
Part of MDBF-993.
last-N-failed: as per Vicențiu Ciorbaru's request, it only reports its status (not in branch protection); Vicențiu Ciorbaru reported frequent sporadic failures for it.
debian-11-msan: it will be replaced by a clang-20 builder in MariaDB#562, so there is no need to update GITHUB_STATUS_BUILDERS for it in this Pull Request.
pkg-config \
python3-pexpect \
unzip \
zlib1g-dev \
If this list of packages is exactly the same as the one installed, it could be kept in a variable; this would ease maintenance.
Replace MSAN CLANG-15 builder with CLANG 19 on Debian 12.