Sync v2-7-stable with v2-7-test to release 2.7.1 #33826

Merged
merged 117 commits into from
Sep 4, 2023

Conversation

ephraimbuddy
Contributor

Time for 2.7.1rc1!

Note that some provider changes were cherry-picked to resolve conflicts in the code-simplification PRs.

potiuk and others added 30 commits August 18, 2023 15:33
Now that Airflow 2.7.0 is released, we can remove the exclusion that
we had for openlineage, which prevented it from being used as a
dependency of Airflow in CI.

(cherry picked from commit 008f233)
The label is provider:openlineage instead of API-53

(cherry picked from commit de17b93)
We already fixed all deprecation warnings for Pydantic 2 and we can
thus remove Pydantic 2 limitation. Even if we are waiting for other
dependencies (aws-sam-translator) it should be safe to remove the
limit - we will get Pydantic 2 when the new aws-sam-translator version
is released in a week or two (Pydantic 2 support has been added
last week in
aws/serverless-application-model#3282)

(cherry picked from commit 754a4ab)
* Add MySQL 8.1 to supported versions.

Anticipating Lazy Consensus to be reached we add 8.1 version
of MySQL to supported versions.

* Apply suggestions from code review

(cherry picked from commit 825f65f)
We used to have problems with `pip` backtracking when we relaxed
the open-telemetry dependencies too much. It turned out that the
backtracking was only happening on Python 3.8 and that it was
ultimately caused by a conflict over importlib_metadata between
Airflow and newer versions of opentelemetry (we had <5 for Python
3.8, they had >6 for all versions). The reason for limiting it in
Airflow was Celery, which was not working well with importlib_metadata 5.

Since Celery 5.3 solved the problems (released 6th of June) we can
now relax the importlib_metadata limit and set Celery to version >=
5.3.0, which nicely resolves the conflict, and there is no more
backtracking when trying to install newer versions of opentelemetry
for Python 3.8.

Fixes: #33577
(cherry picked from commit ae25a52)
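
The conflict described in the commit message above can be illustrated with a toy model. This is only a sketch: the candidate list uses made-up major version numbers, and the helper `versions_allowed` is hypothetical, not part of `pip` or Airflow. It shows that no version satisfies both limits at once, which is the situation that sends a resolver backtracking.

```python
def versions_allowed(candidates, constraints):
    """Return the candidates that satisfy every constraint predicate."""
    return [v for v in candidates if all(check(v) for check in constraints)]


def old_airflow_limit(version):
    # Airflow's former limit on Python 3.8: importlib_metadata < 5
    return version < 5


def opentelemetry_limit(version):
    # Newer opentelemetry: importlib_metadata > 6
    return version > 6


# Hypothetical major versions, for illustration only.
candidates = [4, 5, 6, 7]

# With both limits in place nothing can be installed.
before = versions_allowed(candidates, [old_airflow_limit, opentelemetry_limit])

# Once Airflow's limit is relaxed, only opentelemetry's limit applies.
after = versions_allowed(candidates, [opentelemetry_limit])
```

Here `before` is empty and `after` contains only the version above 6, so dropping the Airflow-side limit is what makes the set resolvable.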
* Improve detection of when breeze CI image needs rebuilding

Previously we used modification of a provider.yaml file as
a sign that the docker image needs rebuilding when starting the image.
However, modification of a provider.yaml file alone is not a sign
that the image needs rebuilding. The image needs rebuilding when
provider dependencies change, but there are many more reasons why
a provider.yaml file may change - especially recently, as provider.yaml
files contain much more information and dependencies are only part
of it. Provider.yaml files can also be modified by the release manager
when documentation is prepared, but none of the documentation
changes is a reason for rebuilding the image.

This PR optimizes the check for image building by introducing a
two-step process:

* first we check if provider.yaml files changed
* if they did, we regenerate provider dependencies by manually
  running the pre-commit script
* then provider_dependencies.json is used instead of all providers
  to determine if the image needs rebuilding

This has several nice side effects:

* the list of files that have been modified displayed to the
  user is potentially much smaller (no provider.yaml files)
* provider_dependencies.json is regenerated automatically when
  you run any breeze command, which means that you do not have
  to have pre-commit installed to regenerate it
* the notification "image needs rebuilding" will be printed less
  frequently to the user - only when it is really needed
* preparing provider documentation in CI will not trigger
  image rebuilding (which might occasionally fail in such a case,
  especially when we bring back a provider from long suspension,
  as happened in #33574)

* Update dev/breeze/src/airflow_breeze/commands/developer_commands.py

(cherry picked from commit ac0d5b3)
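
The idea in the commit message above, deciding on a rebuild from the dependency data alone, can be sketched as follows. This is an assumption-laden illustration: the message does not say how breeze compares files, so the SHA-256 digest, the stored digest file, and both function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def dependencies_digest(provider_deps: dict) -> str:
    """Hash only the dependency data, ignoring key order and formatting."""
    canonical = json.dumps(provider_deps, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def image_needs_rebuild(deps_file: Path, stored_digest_file: Path) -> bool:
    """Return True when the regenerated dependency file differs from the last build."""
    current = dependencies_digest(json.loads(deps_file.read_text()))
    if not stored_digest_file.exists():
        return True  # no record of a previous build
    return stored_digest_file.read_text().strip() != current
```

Because the digest is computed from the parsed JSON rather than the raw file, a change that only touches formatting (or anything outside the dependency file, such as documentation fields in provider.yaml) does not trigger a rebuild.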
Botocore has a very peculiar process of releasing a new version
every day, which means that it gives `pip` a hard time figuring out
a non-conflicting set of packages when we have
too low a minimum version set as a requirement.

Since we had > 1.24 that means that `pip` had to consider
more than 340 versions for botocore, but also for related
mypy packages and also a number of aiobotocore packages when
resolving eager-upgrade.

We limit all the relevant packages to 1.28 as minimum version
now, and we should continue doing that regularly in the future.

(cherry picked from commit 5f504e9)
In order to generate constraints, we also need to temporarily limit
the hive provider. There is a gap between when we added it
in airflow setup and when we can generate constraints for
the released providers from PyPI - we need to release the provider first,
similarly to what we have to do for yandex.

Therefore - until the upcoming hive provider is released (in 3 days) -
we need to exclude hive from being considered in Python 3.11 constraint
generation for providers from PyPI.

(cherry picked from commit 984ba22)
Redis 5, released last week, breaks celery. Celery is limiting it for
now and will resolve it later; we should similarly limit redis on
our side to protect users who will not upgrade to the celery version
that will be released shortly.

Fixes: #33744
(cherry picked from commit 3ba994d)
There were still some left-overs of EAGER_UPGRADE in PROD image
building. However, "eager upgrade" only makes sense for CI images.
PROD images, when being built, should use the eager-upgrade constraints as they
are produced in the CI image step.

This PR does the following:

* removes eager upgrade parameters from PROD image
* instead, prod image build has a new flag for installing
  the images: --use-constraints-for-context-packages which will
  automatically use constraints from "docker-context-files" if
  they are present there.
* modifies the CI workflows to upload constraints as artifacts
  and download them for PROD image build when "eager upgrade"
  has been used and directs it to use "source" constraints
* adds back support to "upgrade to newer dependencies" label
  that makes it easy to test "eager upgrade"

As a result, when the PROD image is built in CI:

* when a regular PR is run, it will use the latest github "source" constraints
* when an "eager upgrade" PR is run, it will use the eager-upgrade
  constraints that were generated during the CI build

(cherry picked from commit 2b1a194)
When we are building PROD image in CI for non main branch, we are
installing providers from PyPI rather than building them locally
from sources. Therefore we should use `PyPI` constraints for
such builds not the "source" constraints (they might differ).

This PR adds two steps:

* In the CI build, when we do not build providers we generate
  PyPI constraints in addition to source constraints
* In the PROD build we use the PyPI constraints in case we
  do not build providers locally

(cherry picked from commit f9276f0)
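
Taken together, the two commit messages above describe three sources of constraints for the PROD image build. A minimal sketch of that decision follows; the function name, the string labels, and the precedence given to "eager upgrade" when both conditions hold are assumptions, not the actual breeze or workflow code.

```python
def select_prod_constraints(*, eager_upgrade: bool, providers_built_from_sources: bool) -> str:
    """Pick which constraints a PROD image build in CI should use.

    Assumption: eager-upgrade constraints win when both conditions apply;
    the commit messages do not state the precedence.
    """
    if eager_upgrade:
        # Constraints generated during the CI image build and passed on as artifacts.
        return "eager-upgrade"
    if providers_built_from_sources:
        # Regular PR against main: providers are built locally from sources.
        return "source"
    # Non-main branch: providers are installed from PyPI.
    return "pypi"
```

For example, `select_prod_constraints(eager_upgrade=False, providers_built_from_sources=False)` models a build on a non-main branch and selects the PyPI constraints.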
…#32272)

* Fix rendering the mapped parameters in the mapped operator

Signed-off-by: Hussein Awala <hussein@awala.fr>

* add template_in_template arg to expand method to tell Airflow whether to resolve the xcom data or not

* fix dag serialization tests

* Revert "fix dag serialization tests"

This reverts commit 191351c.

* Revert "add template_in_template arg to expand method to tell Airflow whether to resolve the xcom data or not"

This reverts commit 14bd392.

* Fix ListOfDictsExpandInput resolve method

* remove _iter_parse_time_resolved_kwargs method

* remove unnecessary step

---------

Signed-off-by: Hussein Awala <hussein@awala.fr>
(cherry picked from commit d1e6a5c)
Update airflow/providers/apache/hive/CHANGELOG.rst

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
(cherry picked from commit 08188f8)
…ble schema before adding (#32731)

* Check if the index is in the table schema before adding

* add pre-condition assertion

* static checks

* Update test_models.py

* integrate upstream auth manager changes

(cherry picked from commit 2950fd7)
When a PR is referenced by other PRs, our dev tool for getting the correct
commit picks the latest commit when looking for the commit sha, but we should get the oldest.

(cherry picked from commit 5b104a9)
…3418)

This is a more complete fix to #33411. This is also a follow-up to the
earlier implementation in #33261 that addressed checking if PRs
are merged. This one applies the same pattern to finding the commit
but also improves it by checking if the (#NNNNNN) ends the subject
- so even if the PR is referenced in the same form elsewhere in the message, it will be
filtered out.

The previous "--reverse" quick fix in #33411 could cause problems in
case there were related PRs merged before the original PR (which is
quite possible when you have a series of PRs referring to each other).

(cherry picked from commit 3766ab0)
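
The "(#NNNNNN) ends the subject" check described above can be sketched with a regular expression. This is an illustration only: the function name and the `(sha, subject)` input shape are hypothetical and not taken from the dev tool.

```python
import re
from typing import Iterable, Optional, Tuple


def find_commit_for_pr(commits: Iterable[Tuple[str, str]], pr_number: int) -> Optional[str]:
    """Return the sha of the commit whose subject ends with "(#<pr_number>)".

    Commits that merely mention the PR elsewhere in the subject are skipped,
    so the order in which commits are listed no longer matters.
    """
    pattern = re.compile(rf"\(#{pr_number}\)$")
    for sha, subject in commits:
        if pattern.search(subject.rstrip()):
            return sha
    return None
```

With subjects such as `"Follow-up to (#33411) fix commit lookup (#33418)"` and `"Fix commit lookup (#33411)"`, a lookup for 33411 returns only the second commit, whichever of the two appears first.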
* Fix pydantic warning about `orm_mode` rename

Pydantic 2 renamed orm_mode to from_attributes. This was missed during the upgrade to Pydantic 2, and it produces excessive warnings about the rename.
This PR fixes it.

* Also rename from_orm to model_validate and use model_dump instead of dict

* Fix Pydantic 1.x compatibility

---------

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
(cherry picked from commit 75bb04b)
By going up to `parents[3]` we were going outside the repository root.
Luckily (or unluckily) the repo folder is also named `airflow`, so the
pattern `airflow/**/example_dags/example_*.py` still worked,
but `tests/system/providers/**/example_*.py` wasn't being used.

This discovered 2 new errors:

- `example_local_to_wasb.py` was trivial to fix

- `example_redis_publish.py` is more interesting: this one fails because
the `RedisPubSubSensor` constructor calls Redis.pubsub().subscribe(), which
just hangs, and DagBag fails with a timeout. For now I'm just deleting this
operator from the example.

(cherry picked from commit c048bd5)
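
The `parents[3]` problem from the commit message above is easy to reproduce with `pathlib`. The checkout location and test file path below are hypothetical; only the nesting depth matters for the illustration.

```python
from pathlib import PurePosixPath

# Hypothetical checkout location and test file; only the nesting depth matters.
test_file = PurePosixPath("/work/airflow/tests/always/test_example_dags.py")

repo_root = test_file.parents[2]  # /work/airflow
too_far = test_file.parents[3]    # /work, one level above the checkout

# Resolved from the wrong base, the first pattern still lands inside the
# checkout because the checkout directory itself is named "airflow" ...
core_base = too_far / "airflow"
# ... while the second pattern points next to the checkout, where nothing exists.
system_base = too_far / "tests" / "system" / "providers"


def is_inside(path: PurePosixPath, root: PurePosixPath) -> bool:
    """Return True if path is root itself or lies below it."""
    return path == root or root in path.parents
```

Here `is_inside(core_base, repo_root)` holds while `is_inside(system_base, repo_root)` does not, which matches the observation that one glob pattern kept working and the other silently matched nothing.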
eladkal and others added 14 commits September 1, 2023 16:16
* Raise variable not found if session returns empty

* Added detail to the exception for json response

* tests for patch api when variable doesn't exist

* Dropped fstring

* Unify variable not found message

---------

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
(cherry picked from commit 701c3b8)
Some of the recent sqlalchemy 2 changes used features that were
added in sqlalchemy 1.4.24 (session.scalar).

We need to bump the minimum version to avoid accidental problems
for people who upgrade without bumping sqlalchemy or using
constraints.

Fixes: #33887
(cherry picked from commit bfab7da)
)

* Clarify that DAG authors can also run code in DAG File Processor

Small addition to our security model - it was not entirely clear
that DAG authors can also execute code in DAG File Processor and
that DAG File Processor can be run in standalone mode effectively
physically separating machines where scheduler is run and where
the code modified by DAG authors gets parsed.

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
(cherry picked from commit 1dc6ba0)
With the latest change enabling Pydantic (#33956), some old dependencies
(aws-sam-translator) remained in the CI image from cached
installation and they are breaking `pip check` when refreshing
the image cache. This PR bumps EPOCH numbers so that the dependencies
are not installed from cache and we have a clean, new image with
just those dependencies we need.

(cherry picked from commit dd7cc87)
* fix(sensors): ensure that DateTimeSensorAsync, TimeDeltaSensorAsync, TimeSensorAsync respect soft_fail

* refactor(sensors): move the soft_fail checking logic from DateTimeSensorAsync, TimeDeltaSensorAsync, TimeSensorAsync to DateTimeTrigger

* test(triggers/temporal): add test case for DateTimeSensorAsync respects soft_fail

* fix(triggers/temporal): use the original error message with skipping postfix as message for AirflowSkipException

* Revert "fix(triggers/temporal): use the original error message with skipping postfix as message for AirflowSkipException"

This reverts commit a6d803303bf71a84e9e59e94d9c088e3120bedb5.

* Revert "test(triggers/temporal): add test case for DateTimeSensorAsync respects soft_fail"

This reverts commit 50e39e08a415685ace788ae728397a199c21e82b.

* Revert "refactor(sensors): move the soft_fail checking logic from DateTimeSensorAsync, TimeDeltaSensorAsync, TimeSensorAsync to DateTimeTrigger"

This reverts commit 985981a269cea68da719d6fd1c60bedd9a7e5225.

* Revert "fix(sensors): ensure that DateTimeSensorAsync, TimeDeltaSensorAsync, TimeSensorAsync respect soft_fail"

This reverts commit b2f2662ae1a11ea928aad57acd2892c763c2db25.

* fix(sensors): move core async sensor trigger initialization to __init__ if possible

(cherry picked from commit 9ce76e3)
…33926)

In #33403, we moved trigger initialization to __init__,
which causes a failure when a template variable is used.

(cherry picked from commit eaa6126)
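
Why building an object in `__init__` can clash with templated fields is sketched below. The classes and the `render` helper are toy stand-ins, not Airflow's sensors or its Jinja rendering; they only model the ordering: the constructor runs first, templated attributes are rendered later.

```python
class EagerSensor:
    """Builds its derived value in __init__, before templates are rendered."""

    template_fields = ("target_time",)

    def __init__(self, target_time: str):
        self.target_time = target_time
        # Built too early: captures the raw, unrendered template string.
        self.trigger_moment = f"trigger@{target_time}"


class LazySensor:
    """Builds its derived value at execution time, after rendering."""

    template_fields = ("target_time",)

    def __init__(self, target_time: str):
        self.target_time = target_time

    def execute(self) -> str:
        return f"trigger@{self.target_time}"


def render(task, context: dict) -> None:
    """Toy replacement for template rendering: substitutes only '{{ ds }}'."""
    for name in task.template_fields:
        raw = getattr(task, name)
        setattr(task, name, raw.replace("{{ ds }}", context["ds"]))


eager = EagerSensor("{{ ds }}")
lazy = LazySensor("{{ ds }}")
for task in (eager, lazy):
    render(task, {"ds": "2023-09-01"})
```

After rendering, `eager.trigger_moment` still holds the literal template text, while `lazy.execute()` sees the rendered date.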
* Suspend qubole provider

Qubole has been acquired and it seems that the maintainers have left the project:
https://github.com/qubole/qds-sdk-py#where-are-the-maintainers-
The package has been unmaintained for a long time and it's likely no one uses it.
Until someone steps up to maintain it, we suspend it.

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
(cherry picked from commit 1f0e673)
The `devel_only` extra puts together all dependencies that are
needed for the CI image and for running tests in a local virtualenv
for our various test cases - but they are not needed as dependencies
of particular providers. They were a little "bag of everything"
and they were hiding some dependencies that
were either unused or were actually provider dependencies already.

For example we had qds-sdk dependency there which was really the
qubole provider dependency and it held us back from removing
deprecated boto library from CI image (removed in #33889).

This PR organizes the dependencies a bit better:

* splits them into logical groups
* removes some unused dependencies
* moves "amazon" mypy dependency from providers to here

At a later stage we will move the provider ones into "[devel]" extras
of the providers as part of provider decoupling, but this
will require a few more changes in CI image building and some
documentation updates for developers. This is an intermediate step
to organize it better.

(cherry picked from commit b497234)
The only factor blocking migration to Pydantic 2 was
aws-sam-translator, which was a transitive dependency of
`moto[cloudformation]` via `cfn-lint`, and we do not really need
everything in that extra - it is used only for testing.

While aws-sam-translator is already preparing to release a Pydantic 2
compatible version, we do not want to wait - instead we replace the
cloudformation extra with openapi_spec_validator and jsonschema
needed by the cloudformation tests.

(cherry picked from commit 1cda0c3)
)

Many users have problems with it. This adds an easy way for them
to check it.

(cherry picked from commit 9702a14)
@ephraimbuddy ephraimbuddy marked this pull request as ready for review September 1, 2023 15:24
RELEASE_NOTES.rst Outdated Show resolved Hide resolved
Owen-CH-Leung and others added 2 commits September 2, 2023 06:49
@ephraimbuddy ephraimbuddy force-pushed the v2-7-test branch 2 times, most recently from bd75ee2 to 6cb5ef1 Compare September 2, 2023 07:41
potiuk and others added 4 commits September 3, 2023 17:01
There is a new database field introduced by Celery in 5.3.2 and
repeated in 5.3.3 which is not included in automated migrations,
so users upgrading celery might end up with a failing celery installation.

The issue is already reported and acknowledged, so it is likely
to be fixed in 5.3.4 - so excluding 5.3.2 and 5.3.3 is the best
approach.

(cherry picked from commit b6318ff)
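
A small sketch of the exclusion described above, reading the message as excluding the two releases that carry the new field (5.3.2 and 5.3.3) on top of the 5.3.0 minimum set earlier in this PR. The helper names are hypothetical, and the parser only handles plain `X.Y.Z` versions; a real check would use the packaging tooling.

```python
MINIMUM = (5, 3, 0)
# The two releases that introduced the field missing from automated migrations.
EXCLUDED = {(5, 3, 2), (5, 3, 3)}


def parse_version(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' version string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def celery_version_allowed(version: str) -> bool:
    """Return True if the version meets the minimum and is not excluded."""
    parsed = parse_version(version)
    return parsed >= MINIMUM and parsed not in EXCLUDED
```

Excluding specific releases rather than capping the upper bound means a fixed 5.3.4 is accepted as soon as it is published, without another change on the Airflow side.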
* move internal functions to methods -- no behavior change

* add setup constraint logic

* comments

* simplify

* simplify

* fix

* fix

* update tests

* static checks

* add constraint that setup tasks followed by ALL_SUCCESS rule

* add todo

* docs

* docs

* add test

* fix static check

(cherry picked from commit e75ceca)
@ephraimbuddy ephraimbuddy merged commit 891fae5 into v2-7-stable Sep 4, 2023
Labels
area:API Airflow's REST/HTTP API area:CLI area:dev-tools area:production-image Production image improvements and fixes