feat(dockerfile): Add pip caching for faster build #35026

Merged
merged 11 commits into from
Oct 31, 2023
20 changes: 14 additions & 6 deletions Dockerfile
@@ -553,9 +553,9 @@ function common::install_pip_version() {
echo "${COLOR_BLUE}Installing pip version ${AIRFLOW_PIP_VERSION}${COLOR_RESET}"
echo
if [[ ${AIRFLOW_PIP_VERSION} =~ .*https.* ]]; then
-pip install --disable-pip-version-check --no-cache-dir "pip @ ${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check --cache-dir $AIRFLOW_USER_HOME_DIR/.cache/pip "pip @ ${AIRFLOW_PIP_VERSION}"
else
-pip install --disable-pip-version-check --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
+pip install --disable-pip-version-check --cache-dir $AIRFLOW_USER_HOME_DIR/.cache/pip "pip==${AIRFLOW_PIP_VERSION}"
fi
mkdir -p "${HOME}/.local/bin"
}
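
The branch in `common::install_pip_version` above chooses between two requirement forms depending on whether `AIRFLOW_PIP_VERSION` contains `https`. A minimal sketch of that selection follows. The helper name `pip_requirement_spec` and the sample values are hypothetical and not part of the PR, and a POSIX `case` pattern stands in for the `[[ ... =~ .*https.* ]]` test used in the Dockerfile.

```shell
# Hypothetical helper, not part of the PR: prints the requirement string that
# would be handed to `pip install` for a given AIRFLOW_PIP_VERSION value.
pip_requirement_spec() {
    case "$1" in
        # Value contains "https": treat it as a direct reference (URL).
        *https*) printf 'pip @ %s\n' "$1" ;;
        # Anything else: treat it as a pinned version number.
        *)       printf 'pip==%s\n' "$1" ;;
    esac
}

# Sample values only (assumptions for illustration).
pip_requirement_spec "23.3.1"
pip_requirement_spec "git+https://github.com/pypa/pip.git@main"
```

Either string is then installed with the same `--cache-dir` option, so both branches share the pip cache.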
@@ -1123,7 +1123,7 @@ if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then
>&2 echo " the container starts, so it is only useful for testing and trying out"
>&2 echo " of adding dependencies."
>&2 echo
-pip install --root-user-action ignore --no-cache-dir ${_PIP_ADDITIONAL_REQUIREMENTS}
+pip install --root-user-action ignore --cache-dir $AIRFLOW_USER_HOME_DIR/.cache/pip ${_PIP_ADDITIONAL_REQUIREMENTS}
fi


@@ -1386,8 +1386,15 @@ WORKDIR ${AIRFLOW_HOME}
COPY --from=scripts install_from_docker_context_files.sh install_airflow.sh \
install_additional_dependencies.sh /scripts/docker/

+# Used to build a cache id that depends on the underlying architecture, which prevents reusing cached Python
+# packages built for a different architecture.
+ARG TARGETARCH
+# Changing this value changes the cache id, so the build starts from a new, empty cache.
+ARG PIP_CACHE_EPOCH="0"

# hadolint ignore=SC2086, SC2010
-RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
+RUN --mount=type=cache,id=$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=$AIRFLOW_USER_HOME_DIR/.cache/pip,uid=${AIRFLOW_UID} \
+if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
bash /scripts/docker/install_from_docker_context_files.sh; \
fi; \
if ! airflow version 2>/dev/null >/dev/null; then \
@@ -1405,8 +1412,9 @@ RUN if [[ ${INSTALL_PACKAGES_FROM_CONTEXT} == "true" ]]; then \
# In case there is a requirements.txt file in "docker-context-files" it will be installed
# during the build additionally to whatever has been installed so far. It is recommended that
# the requirements.txt contains only dependencies with == version specification
-RUN if [[ -f /docker-context-files/requirements.txt ]]; then \
-pip install --no-cache-dir --user -r /docker-context-files/requirements.txt; \
+RUN --mount=type=cache,id=$PYTHON_BASE_IMAGE-$AIRFLOW_PIP_VERSION-$TARGETARCH-$PIP_CACHE_EPOCH,target=$AIRFLOW_USER_HOME_DIR/.cache/pip,uid=${AIRFLOW_UID} \
+if [[ -f /docker-context-files/requirements.txt ]]; then \
+pip install --cache-dir $AIRFLOW_USER_HOME_DIR/.cache/pip --user -r /docker-context-files/requirements.txt; \
fi

##############################################################################################
3 changes: 3 additions & 0 deletions docs/docker-stack/build-arg-ref.rst
@@ -278,3 +278,6 @@ Docker context files.
| | | This allows to optimize iterations for |
| | | Image builds and speeds up CI builds. |
+------------------------------------------+------------------------------------------+------------------------------------------+
+| ``PIP_CACHE_EPOCH`` | ``"0"`` | Allows invalidating the pip cache by |
+| | | passing a new value. |
++------------------------------------------+------------------------------------------+------------------------------------------+
12 changes: 12 additions & 0 deletions docs/docker-stack/build.rst
@@ -972,3 +972,15 @@ The architecture of the images

You can read more details about the images - the context, their parameters and internal structure in the
`IMAGES.rst <https://github.com/apache/airflow/blob/main/IMAGES.rst>`_ document.


+Pip packages caching
+....................
+
+To enable faster iteration when building the image locally (especially if you are testing different combinations of
+Python packages), pip caching has been enabled. The cache id is based on four parameters:
+
+1. ``PYTHON_BASE_IMAGE``: avoids sharing the same cache across different Python versions and target operating systems
+2. ``AIRFLOW_PIP_VERSION``
+3. ``TARGETARCH``: avoids sharing architecture-specific cached packages
+4. ``PIP_CACHE_EPOCH``: allows changing the cache id by passing ``PIP_CACHE_EPOCH`` as a ``--build-arg``
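
The cache mount id in the Dockerfile hunks is written as the four values joined by hyphens. A minimal sketch of how that id changes with its inputs follows. The helper name `pip_cache_id` and all sample values are hypothetical, chosen only for illustration, and are not defaults taken from the PR.

```shell
# Hypothetical helper, not part of the PR: joins the four build arguments with
# hyphens, the same way the `id=` of the cache mount is written in the Dockerfile.
pip_cache_id() {
    # $1 PYTHON_BASE_IMAGE, $2 AIRFLOW_PIP_VERSION, $3 TARGETARCH,
    # $4 PIP_CACHE_EPOCH (falls back to "0", the documented default)
    printf '%s-%s-%s-%s\n' "$1" "$2" "$3" "${4:-0}"
}

# Sample values only (assumptions for illustration).
pip_cache_id "python:3.8-slim-bookworm" "23.3.1" "amd64"       # default epoch
pip_cache_id "python:3.8-slim-bookworm" "23.3.1" "amd64" "1"   # bumped epoch, different id

# Bumping the epoch at build time would look like this:
#   docker build . --build-arg PIP_CACHE_EPOCH="1"
```

Because the epoch is part of the id, passing a new value leaves the old cache in place but unused, so the next build starts with an empty pip cache. Cache mounts are a BuildKit feature, so the build has to run with BuildKit enabled.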
2 changes: 2 additions & 0 deletions docs/docker-stack/changelog.rst
@@ -68,6 +68,8 @@ Airflow 2.7

* Docker CLI version in the image is bumped to 24.0.6 version.

+* Pip caching has been enabled to speed up local builds of custom images.

* 2.7.0

* As of now, Python 3.7 is no longer supported by the Python community. Therefore, to use Airflow 2.7.0, you must ensure your Python version is