Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Minor] Short circuit ApplyFunctionRewrites if there are no function rewrites #11765

Merged
merged 2 commits into from
Aug 2, 2024

Conversation

gruuya
Copy link
Contributor

@gruuya gruuya commented Aug 1, 2024

Which issue does this PR close?

Relates to #9373 and #9375.

Rationale for this change

I'm dealing with a situation where we have deeply nested plans, which we want to execute and stream the data into storage (Parquet/Delta), and we're hitting the stack overflow problem observed in the aforementioned issues.

Since this is on a write path we don't really need the analyzer/optimizer rules (I think), which are a part of the problem due to tree node recursion that takes place there. This is not an issue, since those can easily be opted out of via with_analyzer_rules/with_optimizer_rules.

However, the tightest bottleneck as per lldb is actually ApplyFunctionRewrites, which can't be opted out of, even though after #11155 it has no rewrite rules by default.

What changes are included in this PR?

Make ApplyFunctionRewrites simply bail out of the plan transformation/rewrite if it has no rules to apply (the default case presently).

Are these changes tested?

I wanted to add a test that checks for reference equity of the in/out plans but then recalled AnalyzerRule::analyze takes ownership of it.

So the only test is that I see a higher stack overflow threshold with this change.

Are there any user-facing changes?

None.

@github-actions github-actions bot added the optimizer Optimizer rules label Aug 1, 2024
@jayzhan211
Copy link
Contributor

let expr_to_function: Arc<dyn AnalyzerRule + Send + Sync> =
Arc::new(ApplyFunctionRewrites::new(self.function_rewrites.clone()));
let rules = std::iter::once(&expr_to_function).chain(self.rules.iter());

Is it better to opt it out here?

@gruuya
Copy link
Contributor Author

gruuya commented Aug 2, 2024

let expr_to_function: Arc<dyn AnalyzerRule + Send + Sync> =
Arc::new(ApplyFunctionRewrites::new(self.function_rewrites.clone()));
let rules = std::iter::once(&expr_to_function).chain(self.rules.iter());

Is it better to opt it out here?

Makes sense to me, pushed the update.

@gruuya gruuya force-pushed the short-circuit-fn-rewrite branch from 8fa74b9 to 81bb6d7 Compare August 2, 2024 05:57
@gruuya gruuya force-pushed the short-circuit-fn-rewrite branch from 81bb6d7 to 82daf94 Compare August 2, 2024 06:59
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Aug 2, 2024
Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jayzhan211
Copy link
Contributor

Thanks @gruuya

@jayzhan211 jayzhan211 merged commit df4e6cc into apache:main Aug 2, 2024
24 checks passed
@gruuya gruuya deleted the short-circuit-fn-rewrite branch August 2, 2024 09:56
@alamb
Copy link
Contributor

alamb commented Aug 5, 2024

However, the tightest bottleneck as per lldb is actually ApplyFunctionRewrites, which can't be opted out of, even though after #11155 it has no rewrite rules by default.

Maybe we should simply remove this API -- people can write their own rules directly 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants