MDBF-695 - add srpm tests as in the old bb #720
base: dev
Conversation
Mark all OSes for which we re-build from the source RPM. has_srpm on Alma & Rocky can never be true if install_only=True (no autobake builders).
For each OS marked with has_srpm=True, define a builder list with extra information on arch, image, version and os. We don't have enough infrastructure available on s390x to afford an srpm test.
Add an extra step to the rpm factory for triggering the newly defined "s_srpms" scheduler. An srpm builder can be triggered if:
- the parent builder (autobake) has a corresponding srpm builder (same os / os-version / arch)
- the branch is a main branch or a release
- packages were generated

Later we will see that the srpm builder must download the rpm packages from ci.mariadb.org, re-build them from the source RPM, and then compare the CI packages against the produced ones.

Scheduler: each autobake build instance has its own instance of the "s_srpms" scheduler, a list of one item (the corresponding srpm builder).
The decision to define them in one of the two non-standard masters is based on the need for fine-grained control over worker allocation. It is crucial to use RedHat-based hosts so that RedHat containers are subscribed to the redhat.repo repositories. Worker and builder allocation is governed by the BUILDERS_SRPMS constant. While defining volumes for addWorker might seem repetitive, it is necessary: it is the only way to pass subscription data to the container in a Docker environment.

#FIXME items:
- PPC and ARM-based RedHat hosts still need to be subscribed.
- The f_srpm factory is in progress.

Note: black reports that master-docker-nonstandard/master.cfg does more than intended.
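For context, passing subscription data into a container on a subscribed RedHat host usually amounts to bind-mounting the entitlement and rhsm data. A hypothetical `docker run` equivalent of what the SRPM_CONTAINER_VOLUMES entries express might look like the sketch below; the exact host paths are an assumption, not taken from this PR:

```bash
# Hypothetical equivalent of the addWorker volume list: expose the host's
# subscription data so dnf inside a UBI container can reach redhat.repo repos.
docker run --rm -it \
    -v /etc/pki/entitlement:/etc/pki/entitlement:ro \
    -v /etc/rhsm:/etc/rhsm:ro \
    registry.access.redhat.com/ubi9/ubi \
    dnf repolist
```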
Force-pushed from ee139e4 to 3b49d0f
Minimal container images with:
- rpm tools installed
- extra repositories
- buildbot user
- buildbot-worker (pip or distro)

srpm.Dockerfile
- starts with an upgrade to keep the image up to date (from standard repositories)
- capable of building any image that is in the supported $PLATFORM_ID / $ID lists
- wget: needed to retrieve RPMS/SRPMS from CI
- which: needed to check the twistd version after installing from pip
- sudo: some OSes do not ship with sudo; needed by the buildbot user when building src.rpm deps
- perl-generators: the server cannot compile without it, the same workaround found in old-bb. Ideally, building the deps should pull in this package
- rpm-build / yum-utils: tools needed to re-build RPMs
- dumb-init: needed to launch the buildbot-worker process as PID 1 in the container

If you don't need buildbot-worker, just remove all code under #BUILDBOT-WORKER SETUP.

Notable exceptions:
- UBI (RHEL) needs the CRB repository, i.e. a valid RH subscription. Run UBI containers on subscribed RedHat hosts.

build-srpm.yml
- not using the yaml template because this is a more basic setup
- the same dev_ tags principle is used
- runner upgraded to ubuntu-24.04 to overcome the qemu-user-static ssl bug https://gitlab.com/qemu-project/qemu/-/issues/1471
- platforms: is based on os_info.yaml, without s390x and with only the OSes marked to rebuild from SRPM
Force-pushed from bfddb1d to 0f39082
The factory will fetch the script from the master and launch it.

About the script:
- compares the resulting RPMs vs CI (their deps via rpm -q --requires), as in old-bb
- downloads the rpms/srpms from CI; wget is used, hence the required parameters:
  - tarball_number and autobake_builder: inherited from the trigger (autobake)
  - ci_base_url: dev / prod, based on the runtime environment
  - jobs: controls the number of CPUs used for make via _smp_mflags
- rebuilding works the same on all platforms via rpmbuild --rebuild
- based on $PLATFORM_ID / $ID, some operations are platform specific:
  - enabling the CRB repository [only rhel]; should run on a subscribed RedHat host
  - installing build-time dependencies from the source RPM specs

If a platform is deprecated or added, the CASE lists should be updated.

Some "hacks" were preserved from old-bb:
- compat (all OSes)
- judy-devel (sles)

For reference please see the old-bb implementation: https://github.com/MariaDB/mariadb.org-tools/blob/198abd77ee023b45132ff2c9a9bdaac371a5a7e7/buildbot/maria-master.cfg#L1873
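For illustration only, here is a condensed bash sketch of that flow; it is not the actual rebuild-from-srpm.sh, and the exact URL layout on ci.mariadb.org and the rpmbuild output path are assumptions:

```bash
#!/bin/bash
# Condensed sketch of the download / rebuild / compare flow described above
# (not the real rebuild-from-srpm.sh). URL layout and output path are assumptions.
set -eu

ci_base_url=$1       # dev / prod CI base URL, e.g. https://ci.mariadb.org
tarball_number=$2    # inherited from the autobake trigger
autobake_builder=$3  # e.g. amd64-fedora-41-rpm-autobake
jobs=$4              # CPU count forwarded to the build via _smp_mflags

base_url="$ci_base_url/$tarball_number/$autobake_builder"

# Download the CI-built binary RPMs and the source RPM
wget -r -l1 -H -nd -A "*.rpm" "$base_url/rpms/" -P rpms/
wget -r -l1 -H -nd -A "*.src.rpm" "$base_url/srpms/" -P srpms/

# Rebuild from the source RPM; this part is identical on all RPM platforms
rpmbuild --rebuild --define "_smp_mflags -j$jobs" srpms/*.src.rpm

# Compare the dependencies of the CI packages against the rebuilt ones
for pkg in rpms/*.rpm; do
    name=$(basename "$pkg")
    diff <(rpm -qp --requires "$pkg" | sort) \
         <(rpm -qp --requires "$HOME"/rpmbuild/RPMS/*/"$name" | sort) ||
        echo "dependency mismatch for $name"
done
```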
Force-pushed from 4c4d8b2 to 66dd272
Force-pushed from 00e2776 to af2a95a
- pre-commit
- SRPM worker pool dictionary
- use only required volumes in SRPM_CONTAINER_VOLUMES
- fix tags for srpm builders (e.g. fedora instead of fedora-40, rhel instead of rhel-8, etc.)
Force-pushed from af2a95a to faec8b3
@cvicentiu To keep things in motion, we can also run the
Agree. Let's start with AMD64.
Started a test with the following while ci.mariadb.org is still slow:
[buildbot@684c29aea73b ~]$ /mnt/rebuild-from-srpm.sh https://ci.mariadb.org 54075 "amd64-fedora-41-rpm-autobake" 16
Downloading RPM files to 'rpms/' directory...
Optional: rebuild-from-srpm.sh could be committed with +x permissions.
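One way to do that (assuming the script lives in a Git checkout) is to record the executable bit in the index, for example:

```bash
# Store the executable bit in Git so the +x permission survives checkout
chmod +x rebuild-from-srpm.sh
git update-index --chmod=+x rebuild-from-srpm.sh
git commit -m "Mark rebuild-from-srpm.sh as executable"
```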
- name: Build image
  if: ${{ env.MAIN_BRANCH == 'false' }}
  run: |
The one-line version:
$ buildah bud --manifest bb-worker:fedora39-srpm --jobs 4 --platform=linux/amd64 --platform=linux/arm64/v8 --build-arg BASE_IMAGE=fedora:39 --build-arg INSTALL_WORKER_FROM_PIP= ci_build_images/srpm.Dockerfile
or
buildah bud --manifest bb-worker:fedora39-srpm --jobs 4 $(IFS=", "; for p in ${{ matrix.platforms }} ; do echo --platform=$p\ ; done ) --build-arg BASE_IMAGE=fedora:39 --build-arg INSTALL_WORKER_FROM_PIP=
- dockerfile: srpm.Dockerfile
  image: fedora:39
  tag: fedora39-srpm
  platforms: linux/amd64, linux/arm64/v8
Think we can drop Fedora 39 now.
Right, that's EOL since November 2024. Thanks!
Should we prepare a separate task to remove it altogether?
Edging towards "No".
If "yes" then "Yes".
Only if we're releasing from it again.
As we don't release this, thinking No.
finished and failed with:

Thought about that, yes, that's better. Thanks!
@grooverdan It is true that I didn't have the chance to test on all platforms; on some, g++ was installed as part of building the dependencies from the source RPM. How should we classify this problem? Is something wrong in the source RPM, or should I have installed gcc in the image?
Thanks,
@grooverdan Looking at what we have on the mirror:
https://mirror.mariadb.org/yum/10.11/fedora39-amd64/srpms/MariaDB-10.11.11-1.fc39.src.rpm
https://mirror.mariadb.org/yum/11.4/fedora39-amd64/srpms/MariaDB-11.4.5-1.fc39.src.rpm
Trying with an older tarball.
@grooverdan @cvicentiu Test case:
Results => gcc-c++ missing, ccache present instead.
Now:
Problem partially solved, we have gcc-c++.
Regarding the problem above, there are notable differences between OSes. In the first case (openSUSE):
But the compilers will indeed be g++ and gcc, as opposed to the second case (Fedora).
It's all about whether or not the ccache package creates the symlinks. A possible workaround is to pin the compiler using environment variables, like so (a hedged sketch follows after this comment):
Then, normally:
Which results in the correct deps:
And the compiler launcher is OK:
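As an illustration of that workaround (not a committed change), pinning the compilers via environment variables before the rebuild could look like the sketch below; whether the spec's cmake invocation honours CC/CXX in every case is an assumption to verify:

```bash
# Pin the compilers explicitly so the build does not depend on whether the
# ccache package installed cc/c++ symlinks (openSUSE does, Fedora does not).
export CC=gcc
export CXX=g++
rpmbuild --rebuild MariaDB-*.src.rpm
```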
Looking very good, nice job @RazvanLiviuVarzaru. See my comments though (with my maintenance hat on).
set -eu

if [ $# -ne 4 ]; then
    echo "Usage: $0 <ci_base_url> <tarball_number> <autobake_builder> <<no_of_jobs>>"
Please use the bash library (bash_lib.sh) and bb_log_* functions (also in the rest of the script).
mm, I need to copy it over alongside this script.
But I am worried more about sourcing a lot of settings around set -{x,e,u} and I can't predict the behavior in this script.
Any advice?
> mm, I need to copy it over alongside this script.

Yes, but that's the default behavior anyway for all the tests.
Default options are sane; the script should work with those (apart maybe from the unbound check, which can be bypassed momentarily, see line 137 in c309caf:
set +u
Then this is the way to go. I am not really afraid of unbound variables, because this srpm script won't grow to the point where we can't keep track of variable allocations.
I really like the logging functions under bash_lib, but I was a bit afraid to source such a big library of functions.
About the "default behavior":
I changed a bit the way we use the script. It's now pushed by the master to the worker, not the worker taking it from GitHub. I did this because I am not 100% on board with our current way of working, where updating a script in GitHub has immediate production impact.
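For reference, using the library could look roughly like the sketch below, assuming bash_lib.sh is shipped next to the script and defines bb_log_info / bb_log_err as in the other test scripts:

```bash
# Source the shared helpers; relax the unbound-variable check only while sourcing
set +u
# shellcheck source=./bash_lib.sh
. ./bash_lib.sh
set -u

if [ $# -ne 4 ]; then
    bb_log_err "Usage: $0 <ci_base_url> <tarball_number> <autobake_builder> <no_of_jobs>"
    exit 1
fi

bb_log_info "downloading RPM files to 'rpms/' directory"
```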
# Download RPMS / SRPMS from CI
echo "Downloading RPM files to 'rpms/' directory..."
wget -r -l1 -H -nd -A "*.rpm" "$base_url/rpms/" -P rpms/
Can't we use the mirror directly (like for the rpm_upgrade and rpm_install steps)? I spent quite some time trying to skip fetching rpms during those steps because, apart from testing that what we produce is alright, it also tests that our mirrors are working correctly.
As a reminder, the final goal is to have BB testing the mirrors (on ci.buildbot.org) as they will look when we automate the releases.
I only need to download the RPMs and do nothing with them other than compare.
Is it worth setting up the repository? I think this implies sourcing bash_lib as well and being extra careful with every << platform >>-specific way of setting up the repository.
Let's say this is done.
But then what's the approach for downloading the RPMs, not installing them?
I am not against the idea; I thought about it in my plan, but I felt that I was over-complicating my life, and we already officially test the mirrors properly on dedicated Install / Upgrade builders.
I am not sufficiently expert with RPM (and even less with SRPM) to judge.
So, let's put it this way: how are our users consuming srpms? If they download them to build rpms, then I am ok with your approach; if they just point to the correct repository (I believe like apt does with deb-src), then we might want to make sure that our repositories also work for that.
They would probably (I am not sure, but it makes sense) do it like this:
sudo dnf download --source <package-name>
or
yumdownloader --source <package-name>
Let me check what I can accomplish around dnf download.
I see DNF has a download-dir also (good).
Zypper (SUSE) has a download, but not a dir specification.
And these kinds of differences make everything painful.
But let's see.
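A rough sketch of what that dispatch could look like; the dnf branch follows the suggestion above (package name chosen for illustration), while the SUSE branch is left as a placeholder since zypper lacks an equivalent with a target-directory option:

```bash
# Download the source RPM the way a user would, dispatching on /etc/os-release.
. /etc/os-release
case "$ID" in
    fedora|rhel|almalinux|rocky)
        # 'dnf download' comes from dnf-plugins-core; --downloaddir sets the target
        dnf download --source --downloaddir=srpms/ MariaDB-server
        ;;
    *suse*|sles)
        echo "TODO: zypper has no source download with a directory option" >&2
        ;;
    *)
        echo "unsupported platform: $ID" >&2
        exit 1
        ;;
esac
```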
&& echo 'buildbot ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers;

## BUILDBOT-WORKER SETUP
# Install worker from pip
this is duplicated code from pip.Dockerfile, can't we use it instead?
I prefer a standalone Dockerfile that can be built locally without cat << this >> << that >>.
When we get rid of docker-latent-worker this will be simplified a lot, so I am not expecting to keep anything under ## BUILDBOT-WORKER SETUP for too long.
What do you think about this approach moby/moby#735 (comment)?
Guess you are referring to the make approach in moby/moby#735 (comment), right?
It sounds like a great idea to have something like a template engine / pre-processor capable of much more.
It's better than our cat < this > < that > acting like a stone-age pre-processor.
But I will keep it for later, when we get rid of DockerLatentWorker, because the vast majority of our Dockerfiles are just adding buildbot-worker and pip (which will be removed).
For the few exceptions that will remain after that, we can either think of a pre-processor (I really like the make approach) or use docker multi-stage builds where possible.
# Get buildbot.tac and use the right twistd
# some distributions install twistd3 instead of twistd \
RUN if ! which twistd >/dev/null; then \
same comment, duplicated code from ci_build_images/buildbot-worker.Dockerfile, can't we use it?
Same comment as for pip, with the following addition: I modified the code to give sudo access for buildbot on RPM-based systems, plus some workarounds near /etc/sudoers to make that happen.
If that's not strictly duplicated code, then I am ok with it.
&& chmod +r /home/buildbot/buildbot.tac

# install dumb-init to launch worker process as pid 1
RUN curl -sLo /usr/local/bin/dumb-init "https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_$(uname -m)" \
same comment, and I see that this is already duplicated elsewhere.
This is actually duplicated in every <<os>>.Dockerfile, except the ones that actually have the dumb-init package.
which makes upgrading dumb-init painful...
Won't be needed when we get rid of docker-latent-worker, which is far more painful :(
As per MDBF-695, rpm builders should do the srpm build step as the old buildbot does, to make sure source rpms are usable and that a build from a source rpm produces packages identical to the official ones.
Please follow the commits [asc] for implementation details.

Summary:
- BUILDERS_SRPMS for every OS / arch (no s390x) with has_srpm: True
- s_srpms scheduler triggered by autobake builders for release / main branches
- srpm builders (BUILDERS_SRPMS) defined in the non-standard master with a dedicated worker pool (should be RedHat hosts) #TODO
- f_srpm factory for fetching / running rebuild-from-srpm.sh
- srpm.Dockerfile able to build minimal container images for every OS
- GitHub workflow to handle the container image build and publish to QUAY/GHCR

How to test?
- run rebuild-from-srpm.sh, e.g. bash rebuild-from-srpm.sh https://ci.mariadb.org 54438 amd64-fedora-39-rpm-autobake 7

Mysteries of the day:
- deploy on buildbot.dev.mariadb.org and test the framework. I tested the images / script in isolation, but not integrated with buildbot.