[Feature] Store dbt artifacts (namely, manifest.json, catalog.json and run_results.json) after every dbt run. #1292

Closed
1 task
victormacaubas opened this issue Oct 30, 2024 · 8 comments · Fixed by #1389
Labels
dbt:run Primarily related to dbt run command or functionality enhancement New feature or request execution:callback Tasks related to callback when executing tasks
victormacaubas commented Oct 30, 2024

Description

Hi all,

I’ve noticed some similar requests, but I wanted to ask if it’s possible to persist dbt artifacts (namely, manifest.json, catalog.json and run_results.json) permanently after every run, regardless of whether we use TestBehavior.AFTER_ALL or TestBehavior.AFTER_EACH.

Use case/motivation

I’d like to be able to send these run results to both Metaplane and Atlan for data observability analytics.

Related issues

#1253
#801

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!
@victormacaubas victormacaubas added enhancement New feature or request triage-needed Items need to be reviewed / assigned to milestone labels Oct 30, 2024
@dosubot dosubot bot added the dbt:run Primarily related to dbt run command or functionality label Oct 30, 2024
@pankajkoti
Contributor

Hi @victormacaubas, thanks for requesting this feature. Would it help if we started uploading these files to a remote_target_path? We introduced this config in PR #1224, but at the moment we only upload the files in the target -> compiled directory to the remote_target_path, and only when using ExecutionMode.AIRFLOW_ASYNC. We just logged issue #1293 and would appreciate any comments, there or here, on how these files would help you and, if we start supporting them, which files matter most. It would also be nice if you had the time to help contribute support for one or two of these :)

@victormacaubas
Author

Hello @pankajkoti

It would be even better if these artifacts could be persisted in a remote_target_path, like S3! I’m currently running in ExecutionMode.LOCAL and using the parsing method with manifest.json (since my project is quite large, using dbt_ls was problematic). I need to send these results to Atlan and Metaplane for alerting and data observability, and storing them externally would be ideal.


This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 30, 2024
@darennathanielviki

Hello @pankajkoti, I would also like to propose having this feature for ExecutionMode.KUBERNETES. We would like to upload the run_results.json file to Datahub for data quality observability. On top of that, since dbt test is run on a per-model basis in Cosmos, it would be great to name the run_results file after the model, something like run_results_<model_name> perhaps?
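
To make the naming proposal above concrete, here is a minimal sketch of how a per-model file name could be derived and applied when copying run_results.json out of the target directory. Everything here is hypothetical and not part of Cosmos: the function names, the sanitizing rule, and the assumption that the caller already knows the model name and the dbt project directory.

```python
import re
import shutil
from pathlib import Path


def per_model_run_results_name(model_name: str) -> str:
    """Build a file name of the form run_results_<model_name>.json.

    Characters outside letters, digits, '_', '.' and '-' are replaced with '_'
    so the result is safe to use as a file name (an illustrative choice).
    """
    safe_name = re.sub(r"[^A-Za-z0-9_.-]", "_", model_name)
    return f"run_results_{safe_name}.json"


def copy_run_results_for_model(project_dir: str, model_name: str, destination_dir: str) -> Path:
    """Copy <project_dir>/target/run_results.json to a per-model file name."""
    source = Path(project_dir) / "target" / "run_results.json"
    destination = Path(destination_dir) / per_model_run_results_name(model_name)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    return destination
```

With this scheme, results for a model called stg_orders would land in run_results_stg_orders.json instead of overwriting a shared run_results.json.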

@github-actions github-actions bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 3, 2024
@tatiana tatiana added execution:callback Tasks related to callback when executing tasks and removed triage-needed Items need to be reviewed / assigned to milestone labels Dec 12, 2024
@tatiana tatiana added this to the Cosmos 1.8.0 milestone Dec 12, 2024
@tatiana
Collaborator

tatiana commented Dec 12, 2024

@victormacaubas @darennathanielviki, to give visibility, we're actively working on this task, and we should release it by Cosmos 1.8. Since different flavours of this feature have been requested, we've created a label to track them:
https://github.com/astronomer/astronomer-cosmos/issues?q=is%3Aissue%20state%3Aopen%20label%3Aexecution%3Acallback

Once this is out there, we'd love feedback. Thanks for your patience.

@victormacaubas
Author

@tatiana thank you for your work on this! excited to see it live

@pankajkoti
Contributor

Hi @victormacaubas, we recently merged PR #1389, which introduces minor changes to the existing callback functionality and will be included in the upcoming Cosmos 1.8.0 release.

To allow users to try out these changes ahead of the official release, we have prepared an alpha release. You can install it using the following link: astronomer-cosmos 1.8.0a3. PR #1389 also provides examples showcasing how to use this callback functionality.

For additional guidance, refer to the documentation on leveraging callbacks: Callback Configuration. The helper functions demonstrated in the examples can be found here: cosmos/io.py. However, you are not limited to these; you can create your own custom callback functions using these examples as a reference and pass them via the callback argument.

We would greatly appreciate any feedback you have after testing this alpha release!
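
As an illustration of the custom callback idea described above, here is a minimal sketch that copies the three artifacts named in this issue from the project's target directory to another location. It uses only the standard library and is not the helper code from cosmos/io.py. It assumes the callback receives the dbt project directory as its first argument and may receive extra keyword arguments; the function name and the destination_dir parameter are hypothetical.

```python
from __future__ import annotations

import shutil
from pathlib import Path
from typing import Any

# The three artifacts requested in this issue.
ARTIFACT_NAMES = ("manifest.json", "catalog.json", "run_results.json")


def persist_dbt_artifacts(project_dir: str, destination_dir: str, **kwargs: Any) -> list[str]:
    """Copy dbt artifacts from <project_dir>/target into destination_dir.

    Assumption: the caller passes the dbt project directory first; any other
    keyword arguments are accepted and ignored.
    Returns the names of the artifacts that were found and copied.
    """
    target_dir = Path(project_dir) / "target"
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)

    copied: list[str] = []
    for name in ARTIFACT_NAMES:
        source = target_dir / name
        # Not every dbt command writes every artifact, so skip missing files.
        if source.is_file():
            shutil.copy2(source, destination / name)
            copied.append(name)
    return copied
```

A function like this would be passed via the callback argument mentioned above; how additional arguments such as destination_dir are supplied is covered in the Callback Configuration documentation. Swapping the local copy for an upload to S3 or another store would keep the same shape.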

@victormacaubas
Author

Howdy! That's great, I'll look into it and see if we can test this out next sprint =) thank you all

4 participants