-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Separate non-db-tests (and others) to providers/non-providers/task_sdk #42632
Comments
@potiuk I am interested in this should I take this on? Might take bit time to workout this, this area new to me , but this will help me to understand more on the airflow ci/cd . :) |
When providers have been moved in apache#42505 broke passing parameters when provider tests were passed as extra args of "testing tests" command of breeze. Previously all tests were under "tests" folder and there was an exclusion to disable "All" tests when any test was passed as parameter. But after moving tests to "providers" this stopped working. Additional exclusion needs to be added for "providers/tests" and "providers/tests_sdk/". This PR also adds autocompletion for tests passed this way by setting the click type to Path for the extra args (but without the need for the Path to exist). Also during this check it turned out that "All" tests are not working in the intended way - but this should not impact our CI only local runs. Appropriate comment has been added and it's captured in apache#42632
When providers have been moved in apache#42505 broke passing parameters when provider tests were passed as extra args of "testing tests" command of breeze. Previously all tests were under "tests" folder and there was an exclusion to disable "All" tests when any test was passed as parameter. But after moving tests to "providers" this stopped working. Additional exclusion needs to be added for "providers/tests" and "providers/tests_sdk/". This PR also adds autocompletion for tests passed this way by setting the click type to Path for the extra args (but without the need for the Path to exist). Also during this check it turned out that "All" tests are not working in the intended way - but this should not impact our CI only local runs. Appropriate comment has been added and it's captured in apache#42632
It's going to be a little more complex it seems - we likely wil have to redesign a bit the way how tests are executed in CI with different test "groups" now so i will start a wider discussion on slack channel to discuss how to do it and involve others. |
When providers have been moved in #42505 broke passing parameters when provider tests were passed as extra args of "testing tests" command of breeze. Previously all tests were under "tests" folder and there was an exclusion to disable "All" tests when any test was passed as parameter. But after moving tests to "providers" this stopped working. Additional exclusion needs to be added for "providers/tests" and "providers/tests_sdk/". This PR also adds autocompletion for tests passed this way by setting the click type to Path for the extra args (but without the need for the Path to exist). Also during this check it turned out that "All" tests are not working in the intended way - but this should not impact our CI only local runs. Appropriate comment has been added and it's captured in #42632
Absolutely! I just began analyzing this yesterday to see which components can be grouped or not. I noticed some existing options, like skip-provider-tests, and now that we have the task SDK, we may need to support this for executing tests or providing skipping options as well. I completely agree that some redesign is necessary. |
Hey Here. I would like to start a discussion and propose some redesign of our CI - to address some of the issues mentioned here - and involve others so that we come up together with a better design and then we get it implemented together. First let me state the problems and what is wrong currently. And I will write a separte comment on what would be the first proposal to address it.
Also "system" tests are a separate type - but they are under "tests" in "airflow" and "provider' tests.
|
My proposal (high level) - it needs some detailed thinking and prototyping regarding the parallel execution of tests in CI is to follow the
This means that we will loose some of the parallelism benefifts when we run We can also gradually move tests to local venvs where they do not need database containers from docker compose and Pytthon / System dependencies that we currently standardise in the CI image. Also System tests should have a 'dry-run" mode - where we can run them as part of the CI so that we know that they ar e not broken. We are not able to run system tests for all external systems, but at least we should be able to have some "dry-run" system tests that we should be able to run with all PRs. So high level we have: NOWCI image bound tests:
System tests are not currently executed. And we can do it this way:
|
I made a few refinements to the initial proposal |
Yes, there should be no deps on a any providers, not even the new standard provider as all of those need to depend on the SDK. Given the ability of UV to create predictable venvs with the right versions installed, yeah I agree. Also if it's relevant I think every* test in the task SDK will be stateless and could be parallelized - i.e. they are all non-db style tests
|
Yeah and starting to have it in a freshly created venv without any DB or whatever deps for Task SDK is actually even GOOD because there we will not have even accidental dependencies that we missed. |
Thoughts on your problem statement:
Also system tests are run by the breeze testing tests command as of now which caused some confusion and issues recently. I think that's worth having a bullet point for in your explanation comment. Thoughts on proposal:
Again, should we make system tests a top level command as well ( Otherwise I like the plan! |
Agree. Will Update the plan accordingly. |
I updated it. I added something that was long time overdue - "dry run system tests". While we wil not be able to run all the system tests from Amazon/Google/Astronomer/Teradata/Microsoftt in the future, we should be able to have some "canonical" non-externally-bound system tests that we should run to "mimic" what Amazon/Google/Astronomer/Teradata/Microsoft in the future will be doing to make sure that at least "execution framework" works. |
I updated it following the discussion on slack - adding the stage where we get rid of all db and only use task sdk in all provider tests (stage 3) |
@potiuk Going to attempt this one stage 1: |
yeah that make sense. :) |
Sorry - only see it now (Hackathon) I thought about having completely separate
Then each of the tests will have it's own (specific for the test command) set of test types it can use - core tests will have all the "airflow sub-tests", providers will be able to use Doing it this way has the nice effect that you can easily parallelise them as needed following our paralllel framework but also when you enter breeze with I actually have a draft version of this in my local branch, already and I wanted to make a PR today - so if you have not started on it, I might want to set it up for review for you :D |
No worries :). super thats great if you have it already, please go ahead, have not yet started, actually was looking over the weekend, backporting and testcontainer things. Testcontainers is incredibly helpful for reducing the need for numerous Docker files and configurations. It’s straightforward to use (we can define some test kind of interfaces to extend or implement to any testconatainer in our tests) —I experimented with Kafka and Airflow, and while it’s not yet fully functional setup, I managed to get it to a point where it connects a Kafka Testcontainer and runs a DAG from my local setup. I think this is definitely worth using. will try this runnig with GHA to see how it performs :) |
cool! |
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: apache#42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: apache#42632
…43529) When providers have been moved in apache#42505 broke passing parameters when provider tests were passed as extra args of "testing tests" command of breeze. Previously all tests were under "tests" folder and there was an exclusion to disable "All" tests when any test was passed as parameter. But after moving tests to "providers" this stopped working. Additional exclusion needs to be added for "providers/tests" and "providers/tests_sdk/". This PR also adds autocompletion for tests passed this way by setting the click type to Path for the extra args (but without the need for the Path to exist). Also during this check it turned out that "All" tests are not working in the intended way - but this should not impact our CI only local runs. Appropriate comment has been added and it's captured in apache#42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: #42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: apache#42632
The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE. However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system. This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable. Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command. Related: apache#42632
Currently
non-db-tests
execute all test type specified via--test-types
in a single process using xdist (-n processors
) parallelism in order to make efficient use of parallelisation of the tests. This means that both "core" and "providers" types of tests are run in a singlepytest
command. For example:breeze testing non-db-tests \ --parallel-test-types "API Providers[google]"
Where PARALLELL_TEST_TYPES are produced by the "selective check" to determine which tests should be run.
However the #42505 splits out providers to a separate directory and we cannot run providers and non-provider tests any longer in the same process, because pytest has a limitation where test module names should not overlap (and tests/conftest.py is different for providers and non-providers, so they cannot be run in the same process as it will result in
One solution to that (probably best approach) is to introduce a required flag (
--providers/--no-providers
) fornon-db-tests
and filter out / core provider test folders when respective flag is used, and run two separate jobs in our CI - one for providers and one for core.That would require to have to separate jobs in
ci.yml
. "Non-db Core tests", "Non-db Providers tests" - each of them passing the flag down torun-unit-tests.yml
composite workflow.The text was updated successfully, but these errors were encountered: