Remove trivial nodes before building subdag #7194
Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA. In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR. CLA has not been signed by users: @ttusing
Believe this is in error! Filled it out a few minutes before opening PR, might be a timing issue.
Overall this looks like an exciting improvement. We're thinking through whether it is likely to work well for all of the graph topologies that come up in practice, but I'm pretty optimistic about that.
I think this is strictly better than random selection in all cases, especially with switching to the product of
@iknox-fa Looks like the issues we discussed have been addressed here. What do you think about this PR?
@iknox-fa @jtcohen6 (cc: @leahwicz) I'd like to approve this PR and get it into the RC for 1.5, and what follows is my case for doing that. What we know:
Based on my general algorithmic/performance experience, I'd guess the following:
Since we will have time to test the performance in the wild during the RC interval, I think it is worth a 5% risk of some performance regressions if it lets us dramatically improve the experience of many users in the 95% case. Even if there are performance regressions, I believe we could probably address them by rolling forward instead of back, but rolling back is easy.
@peterallenwebb Thanks for laying out the rationale - I'm down to give this a shot, and merge for inclusion in v1.5.0-rc1.
@peterallenwebb thanks as well for the analysis. I feel good about getting this into the RC based on that, and monitoring for any perf regressions from different DAG shapes.
* remove trivial nodes before building subdag
* add changie
* Update graph.py remove comment
* further optimize by sorting node search by degree
* change degree to product of in and out degree
Resolves #7195
Description
This implements the "smart trimming" suggested in (#7195).
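For readers who want the idea without opening graph.py, here is a minimal, self-contained sketch of the approach as described by the commits in this PR. It is not the code in the diff. It assumes that a "trivial" node is an unselected node with no incoming or no outgoing edges (so dropping it never requires adding a bridging edge), and that the remaining unselected nodes are visited in ascending order of in-degree times out-degree. The function name `build_subdag` and the plain-dict graph representation are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def build_subdag(
    nodes: Iterable[Node], edges: Iterable[Edge], selected: Iterable[Node]
) -> Set[Edge]:
    """Return edges between selected nodes that preserve the original
    DAG's ordering constraints (illustrative sketch, not dbt's code)."""
    keep = set(selected)
    preds: Dict[Node, Set[Node]] = {n: set() for n in nodes}
    succs: Dict[Node, Set[Node]] = {n: set() for n in nodes}
    for u, v in edges:
        succs[u].add(v)
        preds[v].add(u)

    def drop(n: Node) -> None:
        # Detach n from its neighbours and forget it.
        for p in preds[n]:
            succs[p].discard(n)
        for s in succs[n]:
            preds[s].discard(n)
        del preds[n], succs[n]

    # Step 1 (assumed meaning of "trivial"): repeatedly drop unselected
    # nodes with no predecessors or no successors. No bridging edges needed.
    worklist = [n for n in preds if n not in keep]
    while worklist:
        n = worklist.pop()
        if n not in preds or (preds[n] and succs[n]):
            continue  # already dropped, or not trivial (yet)
        neighbours = preds[n] | succs[n]
        drop(n)
        # Dropping n may have made an unselected neighbour trivial.
        worklist.extend(m for m in neighbours if m not in keep)

    # Step 2: remove the remaining unselected nodes, cheapest first.
    # Removing a node adds up to in_degree * out_degree bridging edges.
    remaining = sorted(
        (n for n in preds if n not in keep),
        key=lambda n: len(preds[n]) * len(succs[n]),
    )
    for n in remaining:
        ps, ss = set(preds[n]), set(succs[n])
        drop(n)
        for p in ps:
            for s in ss:
                succs[p].add(s)
                preds[s].add(p)

    return {(u, v) for u, vs in succs.items() for v in vs}
```

The ordering in step 2 is computed once, up front; whether the real implementation re-sorts as degrees change is not something this sketch tries to mirror.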
Local testing on my project:
DBT 1.4.4: Build time: 2 minutes 45 seconds
Version 1.5.0-b4: Build time: 2 minutes 18 seconds
This branch: Build time: 10 seconds
Checklist
changie new to create a changelog entry