Feature/rework dockerfile feedback #292

Merged
33 changes: 17 additions & 16 deletions .dockerignore
@@ -1,17 +1,18 @@
.vs
.vscode
bin
[_][Bb]uild
Builds/*
Testing/*
.idea
cmake-build-debug
cmake-build-release
build
external/*
!external/CMakeLists.txt
# Ignore everything by default
*

README.md
CHANGELOG.md
CONTRIBUTING.md
LICENSE
# First-order allow exception for select directories
!/.clang-format
!/.githooks
!/CMakeLists.txt
!/Dockerfile
Contributor Author

@msz-rai, I forgot to remove this line. The Dockerfile can of course be ignored, so that changes to the Dockerfile itself don't break the build cache after the COPY . . directive.
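
To illustrate why dropping the `!/Dockerfile` exception helps, here is a minimal Python sketch of last-match-wins evaluation for top-level entries of the build context. It is a simplified model, not Docker's actual matcher (which follows Go's `filepath.Match` rules and also handles nested paths); the function name `is_ignored` and the shortened rule list are illustrative assumptions.

```python
from fnmatch import fnmatchcase


def is_ignored(entry, rules):
    """Return True if a top-level context entry would be excluded.

    Simplified model: the last rule that matches decides, and a leading
    '!' marks an exception (re-include). Nested paths are not handled.
    """
    ignored = False
    for rule in rules:
        rule = rule.strip()
        if not rule or rule.startswith("#"):
            continue  # skip blank lines and comments
        negated = rule.startswith("!")
        pattern = rule.lstrip("!").lstrip("/")
        if fnmatchcase(entry, pattern):
            ignored = not negated
    return ignored


# Shortened, illustrative allow-list without the `!/Dockerfile` exception.
rules = [
    "# Ignore everything by default",
    "*",
    "!/CMakeLists.txt",
    "!/setup.py",
    "!/src",
]

print(is_ignored("Dockerfile", rules))  # matched only by `*`
print(is_ignored("setup.py", rules))    # re-included by its exception
```

With the Dockerfile left out of the context, editing it no longer changes the files that `COPY . .` checksums, although a changed instruction still rebuilds the layers from that instruction onward.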

!/docs
!/extensions
!/extensions.repos
!external/CMakeLists.txt
!/include
!/ros2_standalone
!/setup.py
!/src
!/test
!/tools
14 changes: 14 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,14 @@
version: 2
updates:
- package-ecosystem: "docker"
directory: "/"
schedule:
interval: "daily"
commit-message:
prefix: "🐳 "
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "daily"
commit-message:
prefix: "🛠️ "
62 changes: 50 additions & 12 deletions Dockerfile
@@ -1,21 +1,59 @@
ARG BASE_IMAGE=nvidia/cuda:11.7.1-devel-ubuntu22.04
ARG BASE_IMAGE=base
# Stage from full image tag name for dependabot detection
FROM nvidia/cuda:11.7.1-devel-ubuntu22.04 as base

################################################################################
# MARK: prepper - prep rgl dependencies
################################################################################
FROM $BASE_IMAGE as prepper
ARG DEBIAN_FRONTEND=noninteractive

FROM ${BASE_IMAGE} as rgl-core
RUN apt update
RUN apt install -y \
git \
cmake \
python3
# Edit apt config for caching and update once
RUN mv /etc/apt/apt.conf.d/docker-clean /etc/apt/ && \
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
> /etc/apt/apt.conf.d/keep-cache && \
apt-get update

# Install bootstrap tools for install scripts
RUN --mount=type=cache,sharing=locked,target=/var/cache/apt \
apt-get install -y --no-install-recommends \
cmake \
git \
python3 \
sudo

# Set working directory using standard opt path
WORKDIR /opt/rgl

# Copy only dependencies definition files
COPY ./setup.py .

FROM rgl-core AS build
# install dependencies while caching apt downloads
# RUN --mount=type=cache,sharing=locked,target=/var/cache/apt \
# ./setup.py --install-deps-only
Contributor Author

This --install-deps-only arg obviously doesn't exist, but it would be nice if it did.

Even better would be splitting up the setup and compilation scripts, so that changes to the compilation script wouldn't invalidate the build cache of the layers that follow, which could install gigabytes of dependencies.

Or a package manifest, such as a package.xml that one could point rosdep at, would be even cooler: it would decouple the dependency lock files from Turing-complete setup scripts.

Collaborator

Good idea, thank you for the suggestion.
I will try to do it.
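
A minimal sketch of what such a flag could look like, assuming an argparse-based entry point. The flag `--install-deps-only` is the hypothetical one proposed above, and `install_deps`, `build`, and the step names are placeholders for illustration, not the real functions in setup.py.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical setup entry point")
    parser.add_argument(
        "--install-deps-only",
        action="store_true",
        help="install dependencies, then exit without configuring or building",
    )
    return parser.parse_args(argv)


def install_deps():
    # Placeholder: a real script would install packages and clone sources here.
    return ["install dependencies"]


def build():
    # Placeholder: a real script would run the CMake configure and build steps here.
    return ["configure", "build"]


def main(argv=None):
    args = parse_args(argv)
    steps = install_deps()
    if not args.install_deps_only:
        steps += build()
    return steps


print(main(["--install-deps-only"]))
print(main([]))
```

In a Dockerfile, the dependency-only invocation would sit in its own RUN layer before `COPY . .`, so that source edits leave the dependency layer cached.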


################################################################################
# MARK: builder - build rgl binaries
################################################################################
FROM prepper AS builder
ARG OptiX_INSTALL_DIR=/optix

WORKDIR /code
COPY . .
# Disable DNS lookups
RUN cat /etc/nsswitch.conf && \
sed -e 's#hosts:\(.*\)dns\(.*\)#hosts:\1\2#g' -i.bak /etc/nsswitch.conf && \
cat /etc/nsswitch.conf

# Copy rest of source tree
COPY . .
RUN --mount=type=bind,from=optix,target=${OptiX_INSTALL_DIR} \
Contributor Author

Could we not require an internet connection at build time?

#14 [builder 3/4] RUN --mount=type=bind,from=optix,target=/optix     ./setup.py
#14 0.443 -- The C compiler identification is GNU 11.4.0
#14 0.474 -- The CXX compiler identification is GNU 11.4.0
#14 0.845 -- The CUDA compiler identification is NVIDIA 11.7.99
#14 0.851 -- Detecting C compiler ABI info
#14 0.885 -- Detecting C compiler ABI info - done
#14 0.888 -- Check for working C compiler: /usr/bin/cc - skipped
#14 0.888 -- Detecting C compile features
#14 0.888 -- Detecting C compile features - done
#14 0.890 -- Detecting CXX compiler ABI info
#14 0.939 -- Detecting CXX compiler ABI info - done
#14 0.942 -- Check for working CXX compiler: /usr/bin/c++ - skipped
#14 0.942 -- Detecting CXX compile features
#14 0.943 -- Detecting CXX compile features - done
#14 0.945 -- Detecting CUDA compiler ABI info
#14 1.322 -- Detecting CUDA compiler ABI info - done
#14 1.335 -- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
#14 1.336 -- Detecting CUDA compile features
#14 1.336 -- Detecting CUDA compile features - done
#14 1.505 [ 11%] Creating directories for 'spdlog-populate'
#14 1.505 [ 22%] Performing download step (git clone) for 'spdlog-populate'
#14 1.505 Cloning into 'spdlog-src'...
#14 1.505 fatal: unable to access 'https://github.com/gabime/spdlog.git/': Could not resolve host: github.com
#14 1.505 Cloning into 'spdlog-src'...
#14 1.505 fatal: unable to access 'https://github.com/gabime/spdlog.git/': Could not resolve host: github.com
#14 1.505 Cloning into 'spdlog-src'...
#14 1.505 fatal: unable to access 'https://github.com/gabime/spdlog.git/': Could not resolve host: github.com
#14 1.505 -- Had to git clone more than once:
#14 1.505           3 times.
#14 1.505 CMake Error at spdlog-subbuild/spdlog-populate-prefix/tmp/spdlog-populate-gitclone.cmake:31 (message):
#14 1.505   Failed to clone repository: 'https://github.com/gabime/spdlog.git'
#14 1.505 
#14 1.505 
#14 1.505 gmake[2]: *** [CMakeFiles/spdlog-populate.dir/build.make:102: spdlog-populate-prefix/src/spdlog-populate-stamp/spdlog-populate-download] Error 1
#14 1.505 gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/spdlog-populate.dir/all] Error 2
#14 1.505 gmake: *** [Makefile:91: all] Error 2
#14 1.505 
#14 1.505 CMake Error at /usr/share/cmake-3.22/Modules/FetchContent.cmake:1087 (message):
#14 1.505   Build step for spdlog failed: 2
#14 1.505 Call Stack (most recent call first):
#14 1.505   /usr/share/cmake-3.22/Modules/FetchContent.cmake:1216:EVAL:2 (__FetchContent_directPopulate)
#14 1.505   /usr/share/cmake-3.22/Modules/FetchContent.cmake:1216 (cmake_language)
#14 1.505   /usr/share/cmake-3.22/Modules/FetchContent.cmake:1259 (FetchContent_Populate)
#14 1.505   external/CMakeLists.txt:9 (FetchContent_MakeAvailable)
#14 1.505 
#14 1.505 
#14 1.506 -- Configuring incomplete, errors occurred!
#14 1.506 See also "/opt/rgl/build/CMakeFiles/CMakeOutput.log".
#14 1.511 Found CUDA 11.7.99
#14 1.511 Executing command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF '
#14 1.511 Traceback (most recent call last):
#14 1.511   File "/opt/rgl/./setup.py", line 361, in <module>
#14 1.511     sys.exit(main())
#14 1.511   File "/opt/rgl/./setup.py", line 175, in main
#14 1.512     run_subprocess_command(f"cmake -B {args.build_dir} -G {cfg.CMAKE_GENERATOR} {cmake_args}")
#14 1.512   File "/opt/rgl/./setup.py", line 211, in run_subprocess_command
#14 1.512     raise RuntimeError(f"Failed to execute command: '{command}'")
#14 1.512 RuntimeError: Failed to execute command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF '
#14 ERROR: process "/bin/sh -c ./setup.py" did not complete successfully: exit code: 1
------
 > [builder 3/4] RUN --mount=type=bind,from=optix,target=/optix     ./setup.py:
1.511 Found CUDA 11.7.99
1.511 Executing command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF '
1.511 Traceback (most recent call last):
1.511   File "/opt/rgl/./setup.py", line 361, in <module>
1.511     sys.exit(main())
1.511   File "/opt/rgl/./setup.py", line 175, in main
1.512     run_subprocess_command(f"cmake -B {args.build_dir} -G {cfg.CMAKE_GENERATOR} {cmake_args}")
1.512   File "/opt/rgl/./setup.py", line 211, in run_subprocess_command
1.512     raise RuntimeError(f"Failed to execute command: '{command}'")
1.512 RuntimeError: Failed to execute command: 'cmake -B build -G 'Unix Makefiles' -DCMAKE_TOOLCHAIN_FILE= -DVCPKG_TARGET_TRIPLET= -DRGL_BUILD_PCL_EXTENSION=OFF -DRGL_BUILD_ROS2_EXTENSION=OFF -DRGL_BUILD_UDP_EXTENSION=OFF -DRGL_BUILD_SNOW_EXTENSION=OFF -DCMAKE_SHARED_LINKER_FLAGS="-Wl,-rpath=\$ORIGIN" -DRGL_BUILD_TAPED_TESTS=OFF '
------
Dockerfile:48
--------------------
  47 |     COPY . .
  48 | >>> RUN --mount=type=bind,from=optix,target=${OptiX_INSTALL_DIR} \
  49 | >>>     ./setup.py
  50 |     
--------------------

This is a best practice for bolstering deterministic builds that many other projects try to follow:

Collaborator

Yes, it is possible to configure a build environment that does not require an internet connection at build time. We can stop using FetchContent and instead clone the dependencies with the Python script, via an --install-deps option.
We see the benefits of this practice and would like to follow it as well.

Contributor Author

Preferably, this arg would call a separate script file that Docker could COPY and invoke directly, instead of this massive setup.py file. That would avoid having to COPY other miscellaneous sources, which can change for reasons unrelated to dependency updates or dependency management.
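
One possible shape for the offline half of this, sketched under assumptions: if a separate script has already cloned the dependencies into a local directory, the configure step can point CMake's FetchContent at those checkouts through its documented `FETCHCONTENT_SOURCE_DIR_<uppercaseName>` override, so no clone is attempted at build time. The helper name `offline_fetchcontent_args` and the `/opt/deps` path are hypothetical; `spdlog` is the dependency named in the failing log above.

```python
from pathlib import PurePosixPath


def offline_fetchcontent_args(deps_dir, names):
    """Build CMake cache arguments that redirect FetchContent to local checkouts.

    Assumes each dependency was cloned beforehand into <deps_dir>/<name>.
    """
    deps_dir = PurePosixPath(deps_dir)  # container paths are POSIX-style
    return [
        f"-DFETCHCONTENT_SOURCE_DIR_{name.upper()}={deps_dir / name}"
        for name in names
    ]


# Hypothetical layout: dependencies pre-cloned under /opt/deps.
print(offline_fetchcontent_args("/opt/deps", ["spdlog"]))
```

The returned arguments would be appended to the existing `cmake -B build ...` command line; whether this fits the project's other dependencies is not something the thread settles.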

./setup.py

FROM scratch AS export-binaries
COPY --from=build /code/build/libRobotecGPULidar.so /
# Restore DNS lookups
RUN mv /etc/nsswitch.conf.bak /etc/nsswitch.conf && \
cat /etc/nsswitch.conf

################################################################################
# MARK: exporter - export rgl binaries
################################################################################
FROM scratch AS exporter
COPY --from=builder /code/build/libRobotecGPULidar.so /
8 changes: 3 additions & 5 deletions setup.py
@@ -240,8 +240,7 @@ def install_pcl_deps(cfg):
if not os.path.isdir(cfg.VCPKG_DIR):
if on_linux() and not inside_docker(): # Inside docker already installed
print("Installing dependencies for vcpkg...")
run_system_command("sudo apt update")
run_system_command("sudo apt install git curl zip unzip tar freeglut3-dev libglew-dev libglfw3-dev")
run_system_command("sudo apt-get install -y git curl zip unzip tar freeglut3-dev libglew-dev libglfw3-dev")
run_subprocess_command(
f"git clone -b {cfg.VCPKG_TAG} --single-branch --depth 1 https://github.com/microsoft/vcpkg {cfg.VCPKG_DIR}")
# Bootstrap vcpkg
@@ -265,8 +264,7 @@ def install_ros2_deps(cfg):
if on_windows():
run_system_command("pip install colcon-common-extensions")
elif not inside_docker(): # Linux; Inside docker already installed
run_system_command("sudo apt update")
run_system_command("sudo apt install python3-colcon-common-extensions")
run_system_command("sudo apt-get install -y python3-colcon-common-extensions")
# Clone radar msgs
if not os.path.isdir(cfg.RADAR_MSGS_DIR):
run_subprocess_command(
@@ -286,7 +284,7 @@ def ensure_git_lfs_installed():
print("Installing git-lfs...")
run_subprocess_command(
"curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash")
run_subprocess_command("sudo apt install git-lfs")
run_subprocess_command("sudo apt-get install -y git-lfs")


def clone_taped_test_data_repo(cfg):