-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cheeky performance improvement on big DAGs #6694
Cheeky performance improvement on big DAGs #6694
Conversation
8d24180
to
9f96087
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Perfect, plan looks good @dbeatty10! Thanks, let me know if you need anything else from me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's goooo!
I will merge once it finishes its additional round of CI checks.
resolves #6697
Description
Small optimization on manifest parsing benefitting large DAGs.
Swapping the conditions of the
if
statement as shown in 9f96087 short-circuits the relatively expensive_already_known
method call. In a test DAG provided by my client, with 8468 models and 17103 tests, this significantly reduced calls to_already_known
and reduced runtime in my test environment. See profiling screenshot of adbt compile
command:I don't know precisely what this will translate to in real-world results, but it's safe to say (1) there are fewer calls to a relatively expensive method, and (2) the logic is the same. My client reports any
dbt build
command takes 15 minutes startup time -- I hope afterwards the startup time will be between 11-12 minutes.Checklist
changie new
to create a changelog entry