Add helper functions for uploading target directory artifacts to remote cloud storages #1389
Hi @pankajkoti, it's amazing to see the progress here; we'll be able to close almost 10 tickets from our backlog. Some minor feedback while this is in draft mode
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1389      +/-   ##
==========================================
+ Coverage   96.21%   96.28%   +0.07%
==========================================
  Files          67       68       +1
  Lines        4071     4149      +78
==========================================
+ Hits         3917     3995      +78
  Misses        154      154
Thank you very much for addressing all the feedback, @pankajkoti!
This is probably a record for a Cosmos PR: closing seven tickets in one go. Really excited to share this with the community in 1.8.
… in ExecutionMode.VIRTUALENV (#1401)
Following up on PR #1389, in which we added helper functions to demonstrate how to use the callback functionality in `ExecutionMode.LOCAL`, I also tested this functionality with `ExecutionMode.VIRTUALENV`. The VirtualEnv mode operators inherit from the Local mode operators, and the callback functionality does seem to work well there. I was able to test this by adding the callback argument to the VirtualEnv example DAG we have and confirming that the files are uploaded to the remote store.
I have added the argument to the example DAG as a comment so that users can use it as a reference. I left it commented out so that we do not upload the files on every CI run, given that we already do this in the `example_operators.py` DAG.
The PR also updates the documentation to reflect that the callback functionality is supported in `ExecutionMode.VIRTUALENV` too.
closes: #1399
**New Features**
* Support customizing Airflow operator arguments per dbt node by @wornjs in #1339. [More information](https://astronomer.github.io/astronomer-cosmos/getting_started/custom-airflow-properties.html).
* Support uploading dbt artifacts to remote cloud storages via callback by @pankajkoti in #1389. [Read more](https://astronomer.github.io/astronomer-cosmos/configuration/callbacks.html).
* Add support to ``TestBehavior.BUILD`` by @tatiana in #1377. [Documentation](https://astronomer.github.io/astronomer-cosmos/configuration/testing-behavior.html).
* Add support for the "at" operator when using ``LoadMode.DBT_MANIFEST`` or ``CUSTOM`` by @benjy44 in #1372
* Add dbt clone operator by @pankajastro in #1326, as documented in [here](https://astronomer.github.io/astronomer-cosmos/getting_started/operators.html).
* Support rendering tasks with non-ASCII characters by @t0momi219 in #1278 [Read more](https://astronomer.github.io/astronomer-cosmos/configuration/task-display-name.html)
* Add warning callback on source freshness by @pankajastro in #1400 [Read more](https://astronomer.github.io/astronomer-cosmos/configuration/source-nodes-rendering.html#on-warning-callback-callback)
* Add Oracle Profile mapping by @slords and @pankajkoti in #1190 and #1404
* Emit telemetry to Scarf during DAG run by @tatiana in #1397
* Save tasks map as ``DbtToAirflowConverter`` property by @internetcoffeephone and @hheemskerk in #1362

**Bug Fixes**
* Fix the mock value of port in ``TrinoBaseProfileMapping`` to be an integer by @dwolfeu #1322
* Fix access to the ``dbt docs`` menu item outside of Astro cloud by @tatiana in #1312
* Add missing ``DbtSourceGcpCloudRunJobOperator`` in module ``cosmos.operators.gcp_cloud_run_job`` by @anai-s in #1290
* Support building ``DbtDag`` without setting paths in ``ProjectConfig`` by @tatiana in #1307
* Fix parsing dbt ls outputs that contain JSONs that are not dbt nodes by @tatiana in #1296
* Fix Snowflake Profile mapping when using AWS default region by @tatiana in #1406
* Fix dag rendering for taskflow + DbtTaskGroup combo by @pankajastro in #1360

**Enhancements**
* Improve dbt command execution logs to troubleshoot ``None`` values by @tatiana in #1392
* Add logging of stdout to dbt graph run_command by @KarolGongola in #1390
* Save tasks map as DbtToAirflowConverter property by @internetcoffeephone and @hheemskerk in #1362
* Support rendering build operator task-id with non-ASCII characters by @pankajastro in #1415

**Docs**
* Remove extra ` char from docs by @pankajastro in #1345
* Add limitation about copying target dir files to remote by @pankajkoti in #1305
* Generalise example from README by @ReadytoRocc in #1311
* Add security policy by @tatiana, @chaosmaw and @lzdanski in #1385
* Mention in documentation that the callback functionality is supported in ``ExecutionMode.VIRTUALENV`` by @pankajkoti in #1401

**Others**
* Restore Jaffle Shop so that ``basic_cosmos_dag`` works as documented by @tatiana in #1374
* Remove Pytest durations from tests scripts by @tatiana in #1383
* Remove typing-extensions as dependency by @pankajastro in #1381
* Pin dbt-databricks version to < 1.9 by @pankajastro in #1376
* Refactor ``dbt-sqlite`` tests to use ``dbt-postgres`` by @pankajastro in #1366
* Remove 'dbt-core<1.8.9' pin by @tatiana in #1371
* Remove dependency ``eval_type_backport`` by @tatiana in #1370
* Enable kubernetes tests for dbt>=1.8 by @pankajastro #1364
* CI Workaround: Pin dbt-core, Disable SQLite Tests, and Correctly Ignore Clone Test to Pass CI by @pankajastro in #1337
* Enable Azure task in the remote store manifest example DAG by @pankajkoti in #1333
* Enable GCP remote manifest task by @pankajastro in #1332
* Add exempt label option in GH action stale job by @pankajastro in #1328
* Add integration test for source node rendering by @pankajastro in #1327
* Fix vulnerability issue on docs dependency by @tatiana in #1313
* Add postgres pod status check for k8s tests in CI by @pankajkoti in #1320
* [CI] Reduce the amount taking to run tests in the CI from 5h to 11min by @tatiana in #1297
* Enable secret detection precommit check by @pankajastro in #1302
* Fix security vulnerability, by not pinning Airflow 2.10.0 by @tatiana in #1298
* Fix Netlify build timeouts by @tatiana in #1294
* Add stalebot to label/close stale PRs and issues by @tatiana in #1288
* Unpin dbt-databricks version by @pankajastro in #1409
* Fix source resource type tests by @pankajastro in #1405
* Increase performance tests models by @tatiana in #1403
* Drop running 1000 models in the CI by @pankajkoti in #1411
* Fix releasing package to PyPI by @tatiana in #1396
* Pre-commit hook updates in #1394, #1373, #1358, #1340, #1331, #1314, #1301

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh <pankaj.singh@astronomer.io>
Closes: #1193
---------
Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>
) This PR addresses an issue where the handling of the `callback` callable parameter was modified in PR #1389, which changed the way keyword arguments such as `context` are passed to the callable. The Docs operators, however, have their own implementation of the callback callable, which expects only a single parameter, `project_dir`. As a result, passing extra keyword arguments such as `context` causes a mismatch and results in the error described in #1420.
This PR adds the `kwargs` param to the `upload_to_cloud_storage` method of the Docs operators.
We have integration tests for these operators, but it looks like the CI does not have the required setup, so they are skipped. https://github.com/astronomer/astronomer-cosmos/blob/main/dev/dags/dbt_docs.py
related: #1420
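The mismatch described above can be reproduced in plain Python. The following is a minimal sketch, not the Cosmos implementation: `invoke_callback`, `strict_callback` and `tolerant_callback` are hypothetical names, and the way the caller forwards `context` is an assumption made only for illustration.

```python
from typing import Any, Callable


def invoke_callback(callback: Callable[..., Any], project_dir: str, **extra: Any) -> Any:
    """Hypothetical stand-in for an operator that calls a user callback.

    It forwards the project directory plus any extra keyword arguments.
    """
    return callback(project_dir, **extra)


def strict_callback(project_dir: str) -> str:
    # Accepts only project_dir, so any extra keyword argument raises TypeError.
    return f"handled {project_dir}"


def tolerant_callback(project_dir: str, **kwargs: Any) -> str:
    # **kwargs absorbs extra keyword arguments such as `context`.
    return f"handled {project_dir}"


if __name__ == "__main__":
    print(invoke_callback(tolerant_callback, "/tmp/project", context={}))
    try:
        invoke_callback(strict_callback, "/tmp/project", context={})
    except TypeError as exc:
        print(f"strict callback failed: {exc}")
```

Accepting `**kwargs` keeps a callback compatible if the caller starts forwarding additional keyword arguments later, at the cost of silently ignoring arguments it does not use.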
This PR introduces helper functions that can be passed as callable callbacks for Cosmos tasks to execute post-task execution. These helper functions enable the uploading of artifacts (from the project's target directory) to various cloud storage providers, including AWS S3, Google Cloud Storage (GCS), Azure WASB, and general remote object stores using Airflow’s ObjectStoragePath.
Key Changes
* Adds a `cosmos/io.py` module that includes the following helper functions:
  * `upload_artifacts_to_aws_s3`
  * `upload_artifacts_to_gcp_gs`
  * `upload_artifacts_to_azure_wasb`
  * `upload_artifacts_to_cloud_storage`
* … `remote_target_path` and `remote_target_path_conn_id`.
* These helper functions can be passed as the `callback` argument to `DbtDag` or to your `DAG` instance, as demonstrated in the example DAGs `dev/dags/cosmos_callback_dag.py` and `dev/dags/example_operators.py` respectively. You can also pass `callback_args` as shown in the example DAGs. These helper functions are only examples of how callback functions can be written and passed to your operators/DAGs to be executed after task completion. Using them as a reference, you can write and pass your own callback functions.

Limitations
* The callback functionality is currently supported only in `ExecutionMode.LOCAL`. We encourage the community to contribute by adding callback support for other execution modes as needed, using the implementation for `ExecutionMode.LOCAL` as a reference.

closes: #1350
closes: #976
closes: #867
closes: #801
closes: #1292
closes: #851
closes: #1351
related: #1293
related: #1349
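As a rough illustration of the custom callbacks mentioned in the description, here is a minimal sketch of a callback that collects everything under the project's `target` directory. It copies to a local directory instead of uploading to cloud storage so that it stays self-contained. The function name `copy_target_artifacts` and the `destination_dir` keyword are hypothetical, and the signature (project directory first, then keyword arguments) is an assumption based on the discussion in this thread.

```python
import shutil
from pathlib import Path
from typing import Any, List


def copy_target_artifacts(project_dir: str, **kwargs: Any) -> List[str]:
    """Copy every file under <project_dir>/target to a destination directory.

    Hypothetical example: a real callback would upload to S3, GCS, etc.
    `destination_dir` is an assumed keyword, e.g. supplied via callback_args.
    Other keyword arguments (such as a task context) are ignored.
    """
    destination = Path(kwargs["destination_dir"])
    target_dir = Path(project_dir) / "target"
    copied: List[str] = []
    for source in sorted(target_dir.rglob("*")):
        if not source.is_file():
            continue
        # Preserve the layout relative to the target directory.
        relative = source.relative_to(target_dir)
        dest_file = destination / relative
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, dest_file)
        copied.append(relative.as_posix())
    return copied
```

Under these assumptions, the function would be passed as the `callback` argument, with `callback_args` set to something like `{"destination_dir": "/some/path"}`. Swapping the `shutil.copy2` call for a storage client's upload call is the step where a real implementation would differ.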