Feature/rework dockerfile feedback #292

ruffsl · 2024-05-29T00:46:11Z

Some feedback on the Dockerfile changes.
Would be nice to decouple dependency declaration from compilation scripts.
Also, it seem that Cmake is going wild and touching the network interface for external resources at build time.

internet should not be required when compiling

as apt does not have a stable CLI and thus unsuitable for scripting - https://unix.stackexchange.com/a/590703/213124

and mark for readability

to avoid needless busting build cache of unrelated stages

to flush out all non-determinism given offline caching is impractical otherwise

For both docker and github actions. This should: - auto open a PR when new nvidia tags are available - as well as keep Github actions versions up to date

to robustly ignore randomly named IDE files and temp folders

ruffsl · 2024-05-29T00:49:41Z

Dockerfile


+# Copy rest of source tree
+COPY . .
 RUN --mount=type=bind,from=optix,target=${OptiX_INSTALL_DIR} \


Could we not require an internet connection at build time?

#14 [builder 3/4] RUN --mount=type=bind,from=optix,target=/optix ./setup.py #14 0.443 -- The C compiler identification is GNU 11.4.0 #14 0.474 -- The CXX compiler identification is GNU 11.4.0 #14 0.845 -- The CUDA compiler identification is NVIDIA 11.7.99 #14 0.851 -- Detecting C compiler ABI info #14 0.885 -- Detecting C compiler ABI info - done #14 0.888 -- Check for working C compiler: /usr/bin/cc - skipped #14 0.888 -- Detecting C compile features #14 0.888 -- Detecting C compile features - done #14 0.890 -- Detecting CXX compiler ABI info #14 0.939 -- Detecting CXX compiler ABI info - done #14 0.942 -- Check for working CXX compiler: /usr/bin/c++ - skipped #14 0.942 -- Detecting CXX compile features #14 0.943 -- Detecting CXX compile features - done #14 0.945 -- Detecting CUDA compiler ABI info #14 1.322 -- Detecting CUDA compiler ABI info - done #14 1.335 -- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped #14 1.336 -- Detecting CUDA compile features #14 1.336 -- Detecting CUDA compile features - done #14 1.505 [ 11%] Creating directories for 'spdlog-populate' #14 1.505 [ 22%] Performing download step (git clone) for 'spdlog-populate' #14 1.505 Cloning into 'spdlog-src'... #14 1.505 fatal: unable to access 'https://github.com/gabime/spdlog.git/': Could not resolve host: github.com #14 1.505 Cloning into 'spdlog-src'... #14 1.505 fatal: unable to access 'https://github.com/gabime/spdlog.git/': Could not resolve host: github.com #14 1.505 Cloning into 'spdlog-src'... #14 1.505 fatal: unable to access 'https://github.com/gabime/spdlog.git/': Could not resolve host: github.com #14 1.505 -- Had to git clone more than once: #14 1.505 3 times. #14 1.505 CMake Error at spdlog-subbuild/spdlog-populate-prefix/tmp/spdlog-populate-gitclone.cmake:31 (message): #14 1.505 Failed to clone repository: 'https://github.com/gabime/spdlog.git' #14 1.505 #14 1.505 #14 1.505 gmake[2]: *** [CMakeFiles/spdlog-populate.dir/build.make:102: spdlog-populate-prefix/src/spdlog-populate-stamp/spdlog-populate-download] Error 1 #14 1.505 gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/spdlog-populate.dir/all] Error 2 #14 1.505 gmake: *** [Makefile:91: all] Error 2 #14 1.505 #14 1.505 CMake Error at /usr/share/cmake-3.22/Modules/FetchContent.cmake:1087 (message): #14 1.505 Build step for spdlog failed: 2 #14 1.505 Call Stack (most recent call first): #14 1.505 /usr/share/cmake-3.22/Modules/FetchContent.cmake:1216:EVAL:2 (__FetchContent_directPopulate) #14 1.505 /usr/share/cmake-3.22/Modules/FetchContent.cmake:1216 (cmake_language) #14 1.505 /usr/share/cmake-3.22/Modules/FetchContent.cmake:1259 (FetchContent_Populate) #14 1.505 external/CMakeLists.txt:9 (FetchContent_MakeAvailable) #14 1.505 #14 1.505 #14 1.506 -- Configuring incomplete, errors occurred! #14 1.506 See also "/opt/rgl/build/CMakeFiles/CMakeOutput.log". #14 1.511 Found CUDA 11.7.99 #14 1.511 Executing command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF ' #14 1.511 Traceback (most recent call last): #14 1.511 File "/opt/rgl/./setup.py", line 361, in <module> #14 1.511 sys.exit(main()) #14 1.511 File "/opt/rgl/./setup.py", line 175, in main #14 1.512 run_subprocess_command(f"cmake -B {args.build_dir} -G {cfg.CMAKE_GENERATOR} {cmake_args}") #14 1.512 File "/opt/rgl/./setup.py", line 211, in run_subprocess_command #14 1.512 raise RuntimeError(f"Failed to execute command: '{command}'") #14 1.512 RuntimeError: Failed to execute command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF ' #14 ERROR: process "/bin/sh -c ./setup.py" did not complete successfully: exit code: 1 ------ > [builder 3/4] RUN --mount=type=bind,from=optix,target=/optix ./setup.py: 1.511 Found CUDA 11.7.99 1.511 Executing command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF ' 1.511 Traceback (most recent call last): 1.511 File "/opt/rgl/./setup.py", line 361, in <module> 1.511 sys.exit(main()) 1.511 File "/opt/rgl/./setup.py", line 175, in main 1.512 run_subprocess_command(f"cmake -B {args.build_dir} -G {cfg.CMAKE_GENERATOR} {cmake_args}") 1.512 File "/opt/rgl/./setup.py", line 211, in run_subprocess_command 1.512 raise RuntimeError(f"Failed to execute command: '{command}'") 1.512 RuntimeError: Failed to execute command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF ' ------ Dockerfile:48 -------------------- 47 | COPY . . 48 | >>> RUN --mount=type=bind,from=optix,target=${OptiX_INSTALL_DIR} \ 49 | >>> ./setup.py 50 | --------------------

This is a best practice for bolstering deterministic builds, that many other projects try to follow:

build: disable internet access during the build step autowarefoundation/autoware-github-actions#241

Yes, it is possible to configure a build environment that does not require an internet connection at build time. We can stop using FetchContent and clone dependent projects with python script at --install-deps option.
We see the benefits of this practice and would like to follow it as well.

Preferably, this arg could call a separate script file, that Docker could then COPY and invoke directly instead of this massive setup.py file, without the need to COPY any other miscellaneous sources that could otherwise change from tasks unrelated to dependency updates or dependency management.

ruffsl · 2024-05-29T00:54:19Z

Dockerfile

-FROM rgl-core AS build
+# install dependencies while caching apt downloads
+# RUN --mount=type=cache,sharing=locked,target=/var/cache/apt \
+#    ./setup.py --install-deps-only


This --install-deps-only arg obviously doesn't exist, but it would be nice if it did.

Even better would be splitting up the setup and compilation scripts, such that changes in the compilation script wouldn't ivalidate the build cache of consecutive layers that could install GB of dependencies.

Or- a package manifest such as a package.xml that one could point rosdep at to decouple the dependency lock files from turing complete setup scripts would be even cooler!

Good idea, thank you for the suggestion.
I will try to do it.

msz-rai · 2024-05-29T10:52:49Z

Thank you for your feedback. Your changes and suggestions are valuable and significantly enhance RGL’s Docker-based pipeline.

If you do not mind, we would like to merge your PR first and work on your suggestions on top of your changes.

ruffsl · 2024-05-29T13:08:52Z

If you do not mind, we would like to merge your PR first and work on your suggestions on top of your changes.

Sure, feel free to merge.

ruffsl · 2024-05-29T15:08:45Z

.dockerignore

+!/.clang-format
+!/.githooks
+!/CMakeLists.txt
+!/Dockerfile


@msz-rai , I forgot to remove this line. The Dockerfile can of course be ignored, so that changes it itself don't break the build cache after the COPY . . directive.

ruffsl added 13 commits May 28, 2024 19:09

Do not apt update at build time

1488dba

internet should not be required when compiling

Use apt-get instead of apt

dd09d16

as apt does not have a stable CLI and thus unsuitable for scripting - https://unix.stackexchange.com/a/590703/213124

Automate apt-get yes prompts for non-interactive scripting

001be1f

Use placeholder stage for dependabot automation

79f1d60

Rename stages uniquely for locatability

078fab6

and mark for readability

Move ARG to appropriate scope

29f2247

to avoid needless busting build cache of unrelated stages

Disable internet for build time

457bacb

to flush out all non-determinism given offline caching is impractical otherwise

Edit apt config for caching and update once

4bc2cc5

Bootstrap and cache tools for install scripts

d4f804f

Add missing implicit dependency

1c2a77e

Add pseudo code to install dependencies only in pepper stage

781ca15

Add dependabot config

7487744

For both docker and github actions. This should: - auto open a PR when new nvidia tags are available - as well as keep Github actions versions up to date

Simplify docker ignore config by inverting exclusion set

8c6a6b9

to robustly ignore randomly named IDE files and temp folders

ruffsl commented May 29, 2024

View reviewed changes

ruffsl marked this pull request as ready for review May 29, 2024 13:08

msz-rai merged commit 830130f into RobotecAI:feature/rework-dockerfile May 29, 2024

ruffsl deleted the feature/rework-dockerfile-feedback branch May 29, 2024 13:18

ruffsl commented May 29, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/rework dockerfile feedback #292

Feature/rework dockerfile feedback #292

ruffsl commented May 29, 2024 •

edited

Loading

ruffsl May 29, 2024

msz-rai May 29, 2024

ruffsl May 29, 2024

ruffsl May 29, 2024

msz-rai May 29, 2024

msz-rai commented May 29, 2024

ruffsl commented May 29, 2024

ruffsl May 29, 2024

Feature/rework dockerfile feedback #292

Feature/rework dockerfile feedback #292

Conversation

ruffsl commented May 29, 2024 • edited Loading

ruffsl May 29, 2024

Choose a reason for hiding this comment

msz-rai May 29, 2024

Choose a reason for hiding this comment

ruffsl May 29, 2024

Choose a reason for hiding this comment

ruffsl May 29, 2024

Choose a reason for hiding this comment

msz-rai May 29, 2024

Choose a reason for hiding this comment

msz-rai commented May 29, 2024

ruffsl commented May 29, 2024

ruffsl May 29, 2024

Choose a reason for hiding this comment

ruffsl commented May 29, 2024 •

edited

Loading