-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cmake): Use FetchContent and ExternalProject for local deps #1744
feat(cmake): Use FetchContent and ExternalProject for local deps #1744
Conversation
tests/gtest/avifavmtest.cc
Outdated
@@ -53,9 +53,9 @@ TEST_P(AvmTest, EncodeDecode) { | |||
decoder->codecChoice = av1_codec; | |||
// AVIF_RESULT_NO_CODEC_AVAILABLE is expected because av1_codec is not | |||
// enabled or because we are trying to decode an AV2 file with an AV1 codec. | |||
ASSERT_EQ(avifDecoderReadMemory(decoder.get(), decoded.get(), encoded.data, | |||
ASSERT_NE(avifDecoderReadMemory(decoder.get(), decoded.get(), encoded.data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are broken in the main branch, but because of a typo in the github actions workflow it was never being executed. dav1d in particular was returning AVIF_RESULT_DECODE_COLOR_FAILED
, not AVIF_RESULT_NO_CODEC_AVAILABLE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should also be its own PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now in #1748
@@ -110,6 +100,8 @@ jobs: | |||
-DAVIF_BUILD_EXAMPLES=ON -DAVIF_BUILD_APPS=ON | |||
-DAVIF_BUILD_TESTS=ON -DAVIF_ENABLE_GTEST=ON -DAVIF_LOCAL_GTEST=ON | |||
-DAVIF_BUILD_GDK_PIXBUF=ON -DCMAKE_INSTALL_PREFIX=./install | |||
-DENABLE_AVX2=OFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small percentage of macOS github runners don't have avx2 support but advertise that they do, causing random "Illegal instruction: 4" errors in AOM tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should probably be its own PR right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for diagnosing this. These random errors were a mystery to me and a real pain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't come up with a way to conditionally set -DENABLE_AVX2
for macOS in aom.cmd in a way that's also compatible with sh and cmd.exe...short of creating a separate aom.sh file. So this might need to wait until aom is built as a subdirectory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to factor this out into a separate small PR? These "ILLEGAL" errors are really annoying. I tried creating a PR that just adds these two lines to the yml files but it seems that ENABLE_AVX2 is not recognized by cmake. I guess it's tied to other changes in this PR?
And a few more questions on this topic:
- This sounds like an issue with github runners that we're working around. Do you know if there's an issue open somewhere?
- Why is the CMAKE_OSX_DEPLOYMENT_TARGET flag needed?
- In practice, I'm fairly sure I've only ever seen the ci-unix-static one fail. I doubt it's just a fluke. Maybe we don't need the changes in the other two workflows?
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ENABLE_AVX2
flag is actually in AOM (see here). In this pull request, I am adding libaom with add_subdirectory
to incorporate it into the larger cmake project, which allows one to pass -DENABLE_AVX2=OFF
to libavif's cmake build config. Whereas on the libavif main branch it gets built in a separate step by running aom.cmd. I can't think of a way to pass arguments to that script that would work for windows and unix, and we presumably don't want to disable avx2 for everyone just to get around this issue.
As for your observation: I have seen this error in all of the workflows that use macos, but it does seem to be much more frequent for ci-unix-static. And there is a pattern I've noticed: the builds seem to fail for macOS 12.7 but not for 12.6.9 (you can see the version by expanding the "Operating system" output in the "Set up job" step). If you use tmate to open a shell and run sysctl machdep.cpu
, you'll see that a 12.6.9 runner always has AVX2 and a 12.7 runner does not. Anecdotally it seems like the ci-unix-static gets assigned the 12.7 worker more often. I can only speculate as to why that is.
I found an issue opened a year ago on the subject of AVX2 in macos runners: actions/runner-images#6845. The github employee who closed that issue references a comment on a separate issue actions/runner-images#5609 (comment), which reads:
Unfortunately, at this time out agents are not guaranteed to have AVX2 CPU instruction, and only the way to enable the feature for you on permanent basis is to use self-hosted agents(where it is possible to control the underlying hardware).
I don't know if it's worth raising a new issue, since it seems like the 12.6.9 runners have AVX2 pretty reliably, and the 12.7 runners represent a regression of sorts.
FWIW, AOM actually does cpu feature detection before it uses its own avx2 optimizations, but when building it will add the -mavx2
compile flag so long as the compiler supports it, and that enables avx2 optimizations in glibc. The failures in these tests all seem to be caused by a call to the round
function, which has a vpbroadcastq
instruction in the SIMD-optimized version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CMAKE_OSX_DEPLOYMENT_TARGET is a red herring. Before I figured out that the hardware was to blame, I had hoped that targeting a specific OS X version might resolve this apparent discrepancy between 12.6.9 and 12.7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the extra context, I understand the issue much better now.
Can we maybe use a sed
command to add -DENABLE_AVX2=OFF
to aom.cmd in the CI workflows, similar to what we used to do for shared libs here? 12e0666#diff-cf7515a55275bc6077268c78cd03a0f56422f2e21e35040f907cb7965aed4f14L45
It's not pretty but I'll take any fix at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I think that ought to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just make sure to do it before the actions/cache call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so here are a few comments:
- first, thank you, this is very welcome (the ext folder is definitely not CMake friendly and has grown too much). Using FetchContent will definitely clean our CMake, make it more future proof, more compatible with static libraries and cross compilation
- there are unrelated fixes in this PR, you should make other PRs to get those fixes in first
- the Local*.cmake files could be made more user friendly: have comments that explain what you should change if you want to tweak those (e.g. have libaom with specific flags), set a version number variable on top so that it is easy to update. I believe some variables are useless define (e.g. *_INCLUDE_DIRS). Maybe we should focus on having proper targets that we can then use in the libavif CMake.
- I believe we should first have PR with Local*.cmake for dependencies we do not want to tweak: definitely gtest, libxml2, zlibpng, jpeg, maybe libsharpyuv, libyuv, rav1e, dav1d. We can then focus on the ones we usually tweak as developers (libaom, gav1)
@@ -110,6 +100,8 @@ jobs: | |||
-DAVIF_BUILD_EXAMPLES=ON -DAVIF_BUILD_APPS=ON | |||
-DAVIF_BUILD_TESTS=ON -DAVIF_ENABLE_GTEST=ON -DAVIF_LOCAL_GTEST=ON | |||
-DAVIF_BUILD_GDK_PIXBUF=ON -DCMAKE_INSTALL_PREFIX=./install | |||
-DENABLE_AVX2=OFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should probably be its own PR right?
@@ -5,13 +5,19 @@ plugins { | |||
android { | |||
namespace 'org.aomedia.avif.android' | |||
compileSdk 30 | |||
ndkVersion "25.2.9519653" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should probably be its own PR
tests/gtest/avifavmtest.cc
Outdated
@@ -53,9 +53,9 @@ TEST_P(AvmTest, EncodeDecode) { | |||
decoder->codecChoice = av1_codec; | |||
// AVIF_RESULT_NO_CODEC_AVAILABLE is expected because av1_codec is not | |||
// enabled or because we are trying to decode an AV2 file with an AV1 codec. | |||
ASSERT_EQ(avifDecoderReadMemory(decoder.get(), decoded.get(), encoded.data, | |||
ASSERT_NE(avifDecoderReadMemory(decoder.get(), decoded.get(), encoded.data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should also be its own PR
cmake/Modules/LocalGoogletest.cmake
Outdated
add_library(GTest::gtest ALIAS gtest) | ||
add_library(GTest::gtest_main ALIAS gtest_main) | ||
|
||
set(GTEST_INCLUDE_DIRS ${googletest_SOURCE_DIR}/googletest/include) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All those variables are useless right? Because we use the targets GTest::gtest in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. I was aiming to provide the same variables as find_package but I agree that it's unnecessary.
cmake/Modules/LocalAom.cmake
Outdated
|
||
set_property(TARGET aom PROPERTY AVIF_LOCAL ON) | ||
|
||
set(AOM_INCLUDE_DIR "${libaom_SOURCE_DIR}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is also unused right? Because we use the target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I'm not using any include_dir variables, everything is based on interface include directories. Likewise for transitive library dependencies.
cmake/Modules/LocalDav1d.cmake
Outdated
endif() | ||
|
||
function(avif_build_local_dav1d) | ||
set(source_dir "${AVIF_SOURCE_DIR}/ext/dav1d") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this PR I opted to put the source dirs in the same place as they are now (the source ext directory), but the default behavior is for the source to also be in the build directory. The rules for determining the directories are in the cmake docs at the end of this section. Is there a preference for these directory options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say the only reason to keep it in ext would be for legacy. ext makes it clearer to:
- tweak build parameters (but now that will be in the Local*cmake files)
- patch the repos
If we modify the source of a FetchContent, do they get reset anyway by CMake? Or is the only way to patch the sources through the CMake file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few ways to prevent cmake from overwriting local changes to source repos. One option is to set FETCHCONTENT_SOURCE_DIR_<PackageName>
; it allows you to override the source directory, and FetchContent will use whatever is there instead of fetching it. You could also make DOWNLOAD_COMMAND
a no-op if(EXISTS)
, either by default or based on a user option. If we ended up going that route maybe we would only put SOURCE_DIR in ext if that flag is present.
From what I can tell, it's strongly recommended to keep fetchcontent sources outside of the source tree and to treat them as build artifacts, and to only make exceptions when the user has the intention of modifying sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so as long as it is documented, I think we do not care. My preference would be to check if there is anything in ext: if so, take it, otherwise FetchContent it in build.
A big use case we have is to find bugs in libaom/dav1d: we might need to go down those repos and modify them and/or use a specific tag.
4656174
to
a802fa3
Compare
a802fa3
to
dadea96
Compare
Author: Frankie Dintino in AOMediaCodec#1744 "A small percentage of macOS github runners don't have avx2 support but advertise that they do, causing random "Illegal instruction: 4" errors in AOM tests."
Author: Frankie Dintino in AOMediaCodec#1744 "A small percentage of macOS github runners don't have avx2 support but advertise that they do, causing random "Illegal instruction: 4" errors in AOM tests."
Author: Frankie Dintino in AOMediaCodec#1744 "A small percentage of macOS github runners don't have avx2 support but advertise that they do, causing random "Illegal instruction: 4" errors in AOM tests."
2487437
to
8de50ea
Compare
2cc9df6
to
a2cdce3
Compare
I guess some of the files in |
Yes, I think so. What are your thoughts on my question above about source_dir? #1744 (comment) |
3b59738
to
1f93498
Compare
ac139b2
to
fdf2e93
Compare
@vrabaud this PR is ready for another review. Sorry for the long gaps in my responses; I'm out on leave from work with a newborn at home, so my attention is a bit divided. I've reworked the LocalPackage.cmake modules so that they're compatible with the existing ext scripts: if the library artifact is found in the expected location, the behavior is unchanged from what it is now. I've also set it up so that FetchContent will use the source directory in |
554e4be
to
ec57d7b
Compare
endif() | ||
|
||
set(CONFIG_PIC 1 CACHE INTERNAL "") | ||
if(libyuv_FOUND) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that means libyuv needs to be built before that, it should be commented somewhere.
if(NOT EXISTS "${LIB_FILENAME}") | ||
message(FATAL_ERROR "libavif(AVIF_LIBSHARPYUV=LOCAL): ${LIB_FILENAME} is missing, bailing out") | ||
endif() | ||
if(EXISTS "${LIB_FILENAME}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All those checks for a static lib are for backward compatibility right? In case somebody has built those libs locally. We could decide that this is not supported which would result in less code right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly.
cmake/Modules/LocalRav1e.cmake
Outdated
message(CHECK_START "libavif(AVIF_CODEC_RAV1E=LOCAL): fetching and configuring rav1e") | ||
endif() | ||
|
||
find_program(CARGO_CINSTALL cargo-cinstall HINTS "$ENV{HOME}/.cargo/bin") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe those 5 lines can be moved to below with if(NOT TARGET cargo-cinstall)
CMakeLists.txt
Outdated
@@ -716,8 +733,8 @@ if(AVIF_BUILD_APPS) | |||
) | |||
endif() | |||
if(AVIF_ENABLE_EXPERIMENTAL_GAIN_MAP) | |||
include(LocalLibargparse) | |||
if(TARGET libargparse) | |||
if(AVIF_LOCAL_LIBARGPARSE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not do include(LocalLibargparse)
and assume AVIF_LOCAL_LIBARGPARSE
is ON
by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. I suppose it's implied by AVIF_ENABLE_EXPERIMENTAL_GAIN_MAP
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly ok, please rebase. And congrats on the new kid !!!
.github/workflows/ci-unix-static.yml
Outdated
- name: Setup cmake | ||
uses: jwlawson/actions-setup-cmake@d06b37b47cfd043ec794ffa3e40e0b6b5858a7ec # v1.14.2 | ||
with: | ||
# CMake version 3.17 is required to build libwebp (which libsharpyuv is part of) on macOS. | ||
cmake-version: '3.17.x' | ||
# CMake version 3.18 is required to build libxml2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to bump here, libxml2 is not built in this CI (it is not set as a CMake argument)
ec57d7b
to
39a3c5c
Compare
The CI Windows build is failing, but I don't think that was introduced by my changes. Judging by this output from a recently run build on the main branch, it looks like the avifgainmaputil test was just being skipped previously:
|
@vrabaud Any updates on this? I see there are some version bumps on the main branch that need to be incorporated here, but I've held off on including that in a rebase while awaiting further feedback. I'm happy to push that if it's helpful for review. |
@fdintino , sorry for the late review. I believe it is good as is. We decided to keep the legacy way of building things through ext scripts for now. We can remove that code later. Please sync the d'EPS to their latest version. Thx again for this tremendous work ! |
39a3c5c
to
a8b946d
Compare
a8b946d
to
500bfd8
Compare
Thx again ! Let's see how it goes with the core developers. We will also revisit all the static lib open bugs. I guess you can also close the other CMake related PR. |
It seems the CI workflows have been slow to complete since this PR have been merged. Example before/after this PR: |
I am enabling cache in |
I also have weird issues on the CI in my fork: https://github.com/y-guyon/libavif/actions/runs/8662720594/job/23755243272#step:15:47
|
This replaces the ad-hoc scripts in the ext folder with modules that use FetchContent and ExternalProject to fetch and build the dependencies as part of the overall project.
This is a fairly radical change, so I would not take offense if it is rejected out of hand. That said, I find the developer experience for this approach very appealing. I think #1683 and #1708 were maybe trying to solve the wrong problem: static linking is hard, and trying to solve for it generally (by, for instance, trying to account for all the variations of system static libraries) is maybe not worth the time and complexity required. Better to make the build process easier and more robust for library consumers.
There is a lot of duplicated code in these local build modules that could be refactored into functions and macros, but I held off on doing any of that work—in part because I've already invested a lot of time in an approach that might not get any traction, but also because I think the mechanics are clearer without additional abstraction.
I tested out this approach with my python library here. A lot of workarounds that would have to be implemented separately (supporting arm64 cross-builds, for instance, or incorporating @wantehchang's aom alpine patch) can instead be shared and maintained with libavif itself.
I feel pretty confident in the code here and have done a fair bit of testing, but I'm marking this as draft while I await general feedback. At the very least the build documentation would need to be updated before this was ready to merge.