Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify breeze commands and flags for standalone shell commands #35862

Merged
merged 1 commit into from
Nov 26, 2023

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Nov 26, 2023

Breeze shell command has been foreseen eventually to also allow single commands in CI and pre-commit, and we are preparing it to be able to do so with all the bells and whistles. There are a few improvements needed to be able to do it smoothly.

There are a few things that were preventing using breeze shell as general-purpose commmand executor:

  • we used defailt docker-compose project for running compose runs (derived from docker-compose parent folder of the basy.yml compose file. If we want to run standalone commands, we need to run them in separate projects in order to allow to run them while breeze shell is already running

  • when runing command in pre-commit hook, there is no terminal available because this command is run deep under-the-hood in git internals and it is run as command detached from the terminal it runs in. For that automated terminal allocation strategy used by docker compose was not working

  • breeze prints a lot of diagnostics of what's going on when you run the entrypoint by default - including diagnostics information printed by docker compose and automated docker builds. This is polluting the output of automated scripts (especially in pre-commits)

  • breeze command checks by defalt if image needs to be upgraded and prints information about it and allows the user to trigger the upgrade by answering "yes" for 10 seconds. This add extra delay and unnecessary information when running the scripts in pre-commits and generally in unattended mode.

  • Breeze CI entrypoint uses additional environment initialization when entering the container - it waits until DBs are available and performs few other initializaiton checks (for example loading scripts from init.sh. All that is not needed and should be skipped when running standalone scripts.

  • Sometimes it's not obious what internal commands are run by breeze shell - and it's difficult to debug/diagnose it. We run CI commands with --verbose flags so those command should not be printed even if --verbose flag is used though as they are polluting the output.

  • By default some flags are persistently stored after used (python version, backend, backend version) and they are "cached" - they are automatically used next time you start breeze - to keep the environment you use for the last time. Those shoudl not be persisted when running standalone command and want to override Python version or backend only in the command you want to execute.

The PR implements the following changes to improve those aspects:

  • Allow specifying --project-name when running breeze, breze shell breeze start-airflow and breeze down commands. Default project is breeze now and user can use any project but predefined ones are also pre-commit (to be used in pre-commits) and legacy docker-compose if one would like to use breeze down to shut down some remaining containers from the legacy project.

  • Add --restart (aliased by --remove-orphans) flag that might be used when starting the command, which will also remove orphan containers (for example postgres container running while we are switching to sqlite).

  • Add --tty option ("auto", "disabled", "enabled") in breeze shell command. This flag allows to determine strategy of terminal allocation by the underlying docker-compose flags.

  • Add --quiet flag that silences the output of entrypoint diagnostics including the output of docker-compose and docker build when they are running as part of the command.

NOTE: We also need to upgrade min docker-compose and docker version as we are using features available only in recent versions of these that allow to suppress some output.

  • Add --skip-image-upgrade-check in breeze shell to avoid checking if upgrade is needed. However if the image is not present, the latest version will be upgraded anyway - this allows to run pre-commits even if breeze has not been used before or docker system prune -a was used. We had --skip-image-check before, but we rename it to explain what check we are talking about.

  • Add --skip-environment-initialization in breeze shell to skip all initialization happening for interactive breeze entering.

  • Add --verbose-commands flag in breeze shell. For a long time we had VERBOSE_COMMANDS env variable support doing it (and printing commands as they are executed) but adding it explicitly as flag makes it easier discoverable and usable.

  • Add SKIP_SAVING_CHOICES environment variable. When this variable is set, the choices you specify when runing breeze command are not stored in cache and only used for the particular command you execute.

Few other cleanups and minor fixes were also implemented as part of this change:

  • the click decorators and corresponding parameters in breeze/shell/start-airflow commands have been sorted to easier manage them (adding/updating the option became a real pain for those as they have a lot of them)

  • version_suffix_for_pypi default for CI image has been set up to dev0 to make automated upgrade after pressing Y use the same cache as docker ci-image build

  • added PYTHONWARNINGS to be forwardeable from host environment to container - which allows to control which warnings to display.

  • only print "Github Actions" group end when we are running inside GitHub Actions.

  • switch from distutls Versions to packaging Version - used to check pre-commit version (better Python 3.12 compatibility)

  • clarified that executor flag for "shell" command can accept more values than the one in "start-airflow"

  • when config options are not synchronized with "rich-click" help groups, proper location is displayed in error messsages (it asked to fix the issue in non-existing files).


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Nov 26, 2023

This is a logically-separated PR from #35830 as discussed in #35830 (review)

It's mostly revolving about making breeze shell command more usable as standalone "driver" to execute arbitrary commands. It's quite a bit smaller than #35830 and focuses only on making such execution possible - once this is complete, I think I will just rebase the #35830 and what will be left is actually switching to the new breeze shell command as execution driver by the few pre-commits we have already using breeze code in a twisted way.

I still left a few minor "loosely-related" fixes. I could probably extract out a few of those in separate PRs if you think it makes sense ("Few other cleanups and minor fixes were also implemented as part of this change" above). But it would inflate number of PRs and since it is just dev env, It's not going to be cherry-picked individually, so I'd rather lave them in - there are one-liner fixes that are anyhow done close to change already being implemented.

But if needed I can separate them.

@potiuk potiuk force-pushed the improve-shell-command-flags branch 3 times, most recently from c77b1af to e06cae0 Compare November 26, 2023 14:05
@potiuk
Copy link
Member Author

potiuk commented Nov 26, 2023

BTW. Even with those small fixes adde - it should be FAR easier to review than #35830

Breeze `shell` command has been foreseen eventually to also allow
single commands in CI and pre-commit, and we are preparing it to
be able to do so with all the bells and whistles. There are a
few improvements needed to be able to do it smoothly.

There are a few things that were preventing using `breeze shell` as
general-purpose commmand executor:

* we used defailt `docker-compose` project for running compose runs
  (derived from `docker-compose` parent folder of the `basy.yml` compose
  file. If we want to run standalone commands, we need to run them in
  separate projects in order to allow to run them while `breeze` shell
  is already running

* when runing command in pre-commit hook, there is no terminal available
  because this command is run deep under-the-hood in git internals and
  it is run as command detached from the terminal it runs in. For that
  automated terminal allocation strategy used by docker compose was
  not working

* breeze prints a lot of diagnostics of what's going on when you run
  the entrypoint by default - including diagnostics information
  printed by docker compose and automated docker builds. This is
  polluting the output of automated scripts (especially in pre-commits)

* breeze command checks by defalt if image needs to be upgraded and
  prints information about it and allows the user to trigger the
  upgrade by answering "yes" for 10 seconds. This add extra delay
  and unnecessary information when running the scripts in pre-commits
  and generally in unattended mode.

* Breeze CI entrypoint uses additional environment initialization
  when entering the container - it waits until DBs are available
  and performs few other initializaiton checks (for example
  loading scripts from `init.sh`. All that is not needed and should
  be skipped when running standalone scripts.

* Sometimes it's not obious what internal commands are run by breeze
  shell - and it's difficult to debug/diagnose it. We run CI commands
  with `--verbose` flags so those command should not be printed even
  if `--verbose` flag is used though as they are polluting the output.

* By default some flags are persistently stored after used (python
  version, backend, backend version) and they are "cached" - they
  are automatically used next time you start breeze - to keep the
  environment you use for the last time. Those shoudl not be
  persisted when running standalone command and want to override Python
  version or backend only in the command you want to execute.

The PR implements the following changes to improve those aspects:

* Allow specifying `--project-name` when running `breeze`, `breze shell`
  `breeze start-airflow` and `breeze down` commands. Default project is
  `breeze` now and user can use any project but predefined ones are
  also `pre-commit` (to be used in pre-commits) and legacy
  `docker-compose` if one would like to use `breeze down` to shut down
  some remaining containers from the legacy project.

* Add `--restart` (aliased by `--remove-orphans`) flag that might be
  used when starting the command, which will also remove orphan
  containers (for example postgres container running while we are
  switching to sqlite).

* Add `--tty` option ("auto", "disabled", "enabled") in `breeze shell`
  command. This flag allows to determine strategy of terminal allocation
  by the underlying docker-compose flags.

* Add `--quiet` flag that silences the output of entrypoint diagnostics
  including the output of docker-compose and docker build when they
  are running as part of the command.

NOTE: We also need to upgrade min docker-compose and docker version
as we are using features available only in recent versions of these
that allow to suppress some output.

* Add `--skip-image-upgrade-check` in `breeze shell` to avoid checking
  if upgrade is needed. However if the image is not present, the latest
  version will be upgraded anyway - this allows to run pre-commits even
  if breeze has not been used before or `docker system prune -a`
  was used. We had `--skip-image-check` before, but we rename it to
  explain what check we are talking about.

* Add `--skip-environment-initialization` in `breeze shell` to skip
  all initialization happening for interactive breeze entering.

* Add `--verbose-commands` flag in `breeze shell`. For a long time
  we had `VERBOSE_COMMANDS` env variable support doing it (and printing
  commands as they are executed) but adding it explicitly as flag makes
  it easier discoverable and usable.

* Add `SKIP_SAVING_CHOICES` environment variable. When this variable
  is set, the choices you specify when runing breeze command are
  not stored in cache and only used for the particular command you
  execute.

Few other cleanups and minor fixes  were also implemented as part of this
change:

* the click decorators and corresponding parameters in
  breeze/shell/start-airflow commands have been sorted to easier manage
  them (adding/updating the option became a real pain for those as they
  have a lot of them)

* version_suffix_for_pypi default for CI image has been set up to dev0
  to make automated upgrade after pressing `Y` use the same cache
  as `docker ci-image build`

* added PYTHONWARNINGS to be forwardeable from host environment to
  container - which allows to control which warnings to display.

* only print "Github Actions" group end when we are running inside
  GitHub Actions.

* switch from distutls Versions to packaging Version - used
  to check pre-commit version (better Python 3.12 compatibility)

* clarified that executor flag for "shell" command can accept more
  values than the one in "start-airflow"

* when config options are not synchronized with "rich-click" help
  groups, proper location is displayed in error messsages (it asked
  to fix the issue in non-existing files).
Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First I was fearing the size to review - until I realized that 50% was boilerplate code changes and 50% SVG image generated :-D LGTM!

@potiuk
Copy link
Member Author

potiuk commented Nov 26, 2023

First I was fearing the size to review - until I realized that 50% was boilerplate code changes and 50% SVG image generated :-D LGTM!

Precisely :).

@potiuk potiuk merged commit 4495de7 into apache:main Nov 26, 2023
@potiuk potiuk deleted the improve-shell-command-flags branch November 26, 2023 19:38
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 28, 2023
When pre-commits are run and breeze shell is opened they were
both attempting to use the same port forwarding. However pre-commits
do not ever need port forwarding because the forwarded ports are
only needed when you run breeze shell interactively.

This is a follow-up afte apache#35862.

This PR removes port forwarding when "pre-commit" project name is
used to run breeze shell command.
potiuk added a commit that referenced this pull request Nov 28, 2023
…5922)

When pre-commits are run and breeze shell is opened they were
both attempting to use the same port forwarding. However pre-commits
do not ever need port forwarding because the forwarded ports are
only needed when you run breeze shell interactively.

This is a follow-up afte #35862.

This PR removes port forwarding when "pre-commit" project name is
used to run breeze shell command.
potiuk added a commit to potiuk/airflow that referenced this pull request Dec 10, 2023
So far we executed commands in CI image in breeze in two ways:

* entering the shell (which runs docker-compose under the hood)
* running `docker run` with the CI image

This requires rather complex mapping of environment variables
between `docker-compose` and `docker`. Since recently (apache#35862)
we can use `shell` command to run commands in very similar way
as docker run (with docker-compose, without database and extra
components - just using the same `breeze shell` mechanisms.

This PR converts all the usages of docker run CI_IMAGE we had
and converts them to use modified `enter_shell` method that has
been moved to "docker_command_utils". This also simplified
passing arguments to the "enter_shell" command - no longer need
to filter out none parameters and **kwargs - all parameters are
passed explicitly.

This also allowed to remove some of the code (extracting args,
filtering_out_none) that are not used anymore.

The entypoint CI has been slightly refactored - to provide
a bit better structure and handle `--skip-environment-initialization`
better - we can now both set `--use-airflow-version` and
`--skip-environment-initialization` which was not possible
before.
potiuk added a commit that referenced this pull request Dec 10, 2023
So far we executed commands in CI image in breeze in two ways:

* entering the shell (which runs docker-compose under the hood)
* running `docker run` with the CI image

This requires rather complex mapping of environment variables
between `docker-compose` and `docker`. Since recently (#35862)
we can use `shell` command to run commands in very similar way
as docker run (with docker-compose, without database and extra
components - just using the same `breeze shell` mechanisms.

This PR converts all the usages of docker run CI_IMAGE we had
and converts them to use modified `enter_shell` method that has
been moved to "docker_command_utils". This also simplified
passing arguments to the "enter_shell" command - no longer need
to filter out none parameters and **kwargs - all parameters are
passed explicitly.

This also allowed to remove some of the code (extracting args,
filtering_out_none) that are not used anymore.

The entypoint CI has been slightly refactored - to provide
a bit better structure and handle `--skip-environment-initialization`
better - we can now both set `--use-airflow-version` and
`--skip-environment-initialization` which was not possible
before.
potiuk added a commit that referenced this pull request Dec 15, 2023
…5922)

When pre-commits are run and breeze shell is opened they were
both attempting to use the same port forwarding. However pre-commits
do not ever need port forwarding because the forwarded ports are
only needed when you run breeze shell interactively.

This is a follow-up afte #35862.

This PR removes port forwarding when "pre-commit" project name is
used to run breeze shell command.

(cherry picked from commit ad04d2a)
potiuk added a commit that referenced this pull request Dec 15, 2023
So far we executed commands in CI image in breeze in two ways:

* entering the shell (which runs docker-compose under the hood)
* running `docker run` with the CI image

This requires rather complex mapping of environment variables
between `docker-compose` and `docker`. Since recently (#35862)
we can use `shell` command to run commands in very similar way
as docker run (with docker-compose, without database and extra
components - just using the same `breeze shell` mechanisms.

This PR converts all the usages of docker run CI_IMAGE we had
and converts them to use modified `enter_shell` method that has
been moved to "docker_command_utils". This also simplified
passing arguments to the "enter_shell" command - no longer need
to filter out none parameters and **kwargs - all parameters are
passed explicitly.

This also allowed to remove some of the code (extracting args,
filtering_out_none) that are not used anymore.

The entypoint CI has been slightly refactored - to provide
a bit better structure and handle `--skip-environment-initialization`
better - we can now both set `--use-airflow-version` and
`--skip-environment-initialization` which was not possible
before.

(cherry picked from commit 2d15dbf)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants