[Feature] "Full Refresh" on_schema_change strategy #10838

Open
3 tasks done
ddm-kyasinski opened this issue Oct 9, 2024 · 0 comments
Labels
enhancement New feature or request triage

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Requesting a new on_schema_change strategy for incremental models: "full_refresh". With this strategy, if a schema change is detected, the model runs as if the --full-refresh flag had been passed to the dbt run command.
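To make the requested behavior concrete, here is a minimal Python sketch of the decision it implies. This is not dbt's implementation: the function names, the dict-based column representation, and the "full_refresh" value itself are assumptions made for illustration. Only the four existing on_schema_change values (ignore, fail, append_new_columns, sync_all_columns) come from dbt.

```python
# Illustrative sketch only; none of these names are part of dbt's API.
EXISTING_STRATEGIES = {"ignore", "fail", "append_new_columns", "sync_all_columns"}
PROPOSED_STRATEGY = "full_refresh"  # the value this issue asks for; hypothetical


def schema_changed(existing_columns, new_columns):
    """Return True if column names or types differ.

    Both arguments map column name -> data type (an assumed representation).
    """
    return dict(existing_columns) != dict(new_columns)


def should_rebuild(strategy, existing_columns, new_columns, full_refresh_flag=False):
    """Decide whether an incremental model should be rebuilt from scratch."""
    if strategy not in EXISTING_STRATEGIES | {PROPOSED_STRATEGY}:
        raise ValueError(f"unknown on_schema_change strategy: {strategy!r}")
    if full_refresh_flag:
        # The user passed --full-refresh explicitly.
        return True
    if existing_columns is None:
        # The target table does not exist yet, so there is nothing to increment.
        return True
    # Proposed behavior: a detected schema change acts like --full-refresh.
    return strategy == PROPOSED_STRATEGY and schema_changed(existing_columns, new_columns)


if __name__ == "__main__":
    old = {"id": "integer", "payload": "json"}
    new = {"id": "integer", "payload": "json", "status": "text"}
    print(should_rebuild("full_refresh", old, new))
    print(should_rebuild("append_new_columns", old, new))
```

Under this sketch, a model whose columns are unchanged still runs incrementally, so only models with a real schema change pay the cost of a rebuild.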

Describe alternatives you've considered

  • Overriding the should_full_refresh and is_incremental macros in our project (hacky and prone to breaking when dbt Core is updated)
  • Modifying our CI/CD process to automatically full refresh all modified models (not desirable, since it would cause unnecessary refreshes of models where this use case doesn't apply)

Who will this benefit?

Anyone using incremental models who doesn't want to manually backfill new or modified columns in a model.

In our use case, we're building a model that stores the latest revisions of large JSON data structures (around 4 million records). A downstream incremental model then runs a series of json_extracts against the latest revision of each document to provide meaningful datapoints to our end users. However, new datapoints that may already exist in the JSON are requested all the time. When we add a new extract, we have two options: manually trigger a full refresh of that specific downstream model to populate existing rows, or give up on an incremental model and run all extracts on every DAG invocation. The second option quickly increases our overhead, because the number of extracts done on every run keeps growing.

Are you interested in contributing this feature?

Unsure whether I'm able to; I would need to talk to my superiors.

Anything else?

Conversation in the dbt Slack: https://getdbt.slack.com/archives/C50NEBJGG/p1728498654785809

ddm-kyasinski added the enhancement and triage labels on Oct 9, 2024
ddm-kyasinski changed the title from [Feature] <title>"Full Refresh" on_schema_change strategy </title> to [Feature] "Full Refresh" on_schema_change strategy on Oct 9, 2024