Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI : Add linux-gcc11 build variants #5622

Merged
merged 3 commits into from
Jan 18, 2024

Conversation

murraystevenson
Copy link
Contributor

Putting this up a draft to test CI with the new Rocky 8 / GCC 11.2.1 build container and the dependencies produced from that container.

@murraystevenson
Copy link
Contributor Author

Test failures at -O3, but not at -O0 in the debug build. Tests also fail locally in release builds at -O3 and -O2, but not -O1 or -O0.

https://github.com/GafferHQ/gaffer/actions/runs/7482550578/job/20366351895?pr=5622

@johnhaddon
Copy link
Member

I can reproduce those test failures in an Alma 9 VM with GCC 11 as well (using the prebuild dependencies). I'm suspecting an ImageStats bug - will dig a bit deeper.

@johnhaddon
Copy link
Member

The problem is that the hash for the tile stats isn't changing when we change the input area, even though we do append that to the hash here : https://github.com/GafferHQ/gaffer/blob/main/src/GafferImage/ImageStats.cpp#L378. Smells like the optimiser is dropping something because it doesn't think it contributes to the hash. This rang a bell - I think it's the same thing I worked around here : ddd1f85, and indeed that "fix" works here too.

Really we should sort this out in MurmurHash itself though - I think we're probably hitting undefined behaviour because we treat min, max as an array here. But if that's the case, then I'm worried that the same trick for vectors or arrays of vectors isn't valid either. Need to do some reading...

@murraystevenson
Copy link
Contributor Author

Thanks for the digging! In order to get the tests passing I've worked around the Box2i hashing strangeness in e6e0964. Getting all the tests passing required the workaround in CollectImages and Offset in addition to ImageStats. There appear to be other hashes of Box2i in GafferImage, but they don't seem to be triggering this weirdness - at least not as far as the tests are concerned...

I've added a couple of fixups, one that improves the variant handling by adding a --variant argument to installDependencies.py, and another to remove linux-debug-gcc11 from the matrix to reduce the overall number of checks run now that the linux-gcc11 tests are passing.

Copy link
Member

@johnhaddon johnhaddon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Murray! I've made a couple of comments to consider regarding the installation script, but other than that this looks good to merge.

@murraystevenson
Copy link
Contributor Author

ImportError: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'OpenSSL 1.0.2k-fips 26 Jan 2017'.

pip installing our usual Sphinx 4.3.1 into the new 2.1.0 build container is pulling in newer dependency versions than those in the 2.0.0 build container as we don't have a versionlock list of pip installed package dependencies.

Specifically Sphinx is now pulling in urllib3 2.1.0, which no longer supports the system OpenSSL version on Centos 7. It looks like we can get around this by specifying urllib3==1.26.7 (the current version in build 2.0.0, or upgrade to a compatible 1.26.x patch release) when we install Sphinx to ensure an OpenSSL 1.0.2 compatible version of urllib3 is installed. Build 2.1.1 here we come...

The sphinxcontrib.applehelp extension used by this project needs at least Sphinx v5.0; it therefore cannot be built with this version.

Similar story here, in the few days between build 3.0.0a2 and 3.0.0a3 a new version of sphinxcontrib-applehelp was released that is no longer compatible with Sphinx 4.3.1 but is still installed via pip install sphinx==4.3.1. There aren't a huge number of pip package dependencies, so we could be a bit more verbose in the Dockerfile or if that gets out of hand then start looking at a requirements file workflow a bit like the yum versionlock one.

@johnhaddon
Copy link
Member

The sphinxcontrib.applehelp extension used by this project

I have no idea what we're using that for, since this is the description for it :

sphinxcontrib-applehelp is a sphinx extension which outputs Apple help books

Maybe there's some way we can just stopping it getting imported? Although locking the versions as you suggested sounds fine too...

Update to new build containers with `GAFFER_BUILD_ENVIRONMENT` environment variable. We can use this to download the appropriate dependencies package for each build environment.

We could potentially use `GAFFER_BUILD_ENVIRONMENT` to also define the build name in setBuildVars.py. This would need to be some combination of the platform, build type (nothing for a release build, "-debug" for a debug build), and the build environment.
This is only temporary while there is no `gafferDependencies-8.0.0a3-windows.zip` available.
We should sort this out in MurmurHash itself though - John thinks we're probably hitting undefined behaviour because we treat `min, max` as an array in https://github.com/ImageEngine/cortex/blob/85f9bca60a8be0e5aad71b98b9a1602702f72bbd/include/IECore/MurmurHash.inl#L360.

For reference, see a similar workaround when hashing a display window in ddd1f85.
@murraystevenson
Copy link
Contributor Author

The sphinxcontrib.applehelp extension used by this project

I have no idea what we're using that for...

Yeah, this confused me too and even though we don't require it, removing sphinxcontrib-applehelp prevents Sphinx from running at all. More of the sphinxcontrib- dependencies also presented the "needs Sphinx 5.0" error, so I locked down all of them to specific compatible versions in the build 3.x Dockerfile.

@murraystevenson murraystevenson marked this pull request as ready for review January 18, 2024 22:50
@murraystevenson murraystevenson merged commit 98120c5 into GafferHQ:main Jan 18, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants