[Minor] Short circuit `ApplyFunctionRewrites` if there are no function rewrites #11765

gruuya · 2024-08-01T15:41:17Z

Which issue does this PR close?

Relates to #9373 and #9375.

Rationale for this change

I'm dealing with a situation where we have deeply nested plans, which we want to execute and stream the data into storage (Parquet/Delta), and we're hitting the stack overflow problem observed in the aforementioned issues.

Since this is on a write path we don't really need the analyzer/optimizer rules (I think), and they are part of the problem because of the tree node recursion that takes place there. That part is easy to work around, since those rules can be opted out of via `with_analyzer_rules`/`with_optimizer_rules`.

However, the tightest bottleneck as per lldb is actually `ApplyFunctionRewrites`, which can't be opted out of, even though after #11155 it has no rewrite rules by default.

What changes are included in this PR?

Make `ApplyFunctionRewrites` simply bail out of the plan transformation/rewrite if it has no rules to apply (the default case presently).
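To illustrate the idea, here is a minimal, self-contained sketch of that short circuit. It does not use DataFusion's real types: `Plan`, `Rewrite`, `ApplyRewritesSketch`, and the `visited` counter are hypothetical stand-ins, chosen only to show that an empty rewrite list means the recursive traversal never starts.

```rust
// Hypothetical stand-in for a nested logical plan (not DataFusion's LogicalPlan).
#[derive(Debug, PartialEq)]
enum Plan {
    Leaf(i64),
    Node(Box<Plan>),
}

// Hypothetical stand-in for a function rewrite.
type Rewrite = fn(i64) -> i64;

// Hypothetical stand-in for the rule discussed in this PR.
struct ApplyRewritesSketch {
    rewrites: Vec<Rewrite>,
}

impl ApplyRewritesSketch {
    // Takes the plan by value and hands it back, so the early return costs nothing.
    fn analyze(&self, plan: Plan, visited: &mut usize) -> Plan {
        if self.rewrites.is_empty() {
            // Short circuit: nothing to apply, so skip the recursive walk entirely.
            return plan;
        }
        self.rewrite(plan, visited)
    }

    // Recursive walk; this is the part that consumes stack on deeply nested plans.
    fn rewrite(&self, plan: Plan, visited: &mut usize) -> Plan {
        *visited += 1;
        match plan {
            Plan::Leaf(v) => Plan::Leaf(self.rewrites.iter().fold(v, |acc, f| f(acc))),
            Plan::Node(child) => Plan::Node(Box::new(self.rewrite(*child, visited))),
        }
    }
}

// Builds a plan with `depth` wrapper nodes around a single leaf.
fn nested(depth: usize) -> Plan {
    let mut plan = Plan::Leaf(1);
    for _ in 0..depth {
        plan = Plan::Node(Box::new(plan));
    }
    plan
}

// Example rewrite used to exercise the non-empty path.
fn double(v: i64) -> i64 {
    v * 2
}
```

With an empty `rewrites` list, `analyze` returns the input untouched and `visited` stays at zero. With `double` registered, every node is visited once and the leaf value is doubled.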

Are these changes tested?

I wanted to add a test that checks for reference equality of the in/out plans, but then recalled that `AnalyzerRule::analyze` takes ownership of the input plan.

So the only test is that I see a higher stack overflow threshold with this change.

Are there any user-facing changes?

None.

jayzhan211 · 2024-08-02T00:45:59Z

datafusion/datafusion/optimizer/src/analyzer/mod.rs

Lines 139 to 141 in f044bc8

    
    let expr_to_function: Arc<dyn AnalyzerRule + Send + Sync> =
        Arc::new(ApplyFunctionRewrites::new(self.function_rewrites.clone()));
    let rules = std::iter::once(&expr_to_function).chain(self.rules.iter());

Is it better to opt it out here?

gruuya · 2024-08-02T05:51:22Z

> Is it better to opt it out here?

Makes sense to me, pushed the update.
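For readers following along, the suggestion amounts to deciding in the analyzer whether the function-rewrite rule joins the rule chain at all. The following is only a sketch under stated assumptions: `Rule`, `ApplyFunctionRewritesSketch`, `NamedRule`, `AnalyzerSketch`, and `effective_rules` are hypothetical stand-ins, and the merged change may be structured differently.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the analyzer rule trait.
trait Rule {
    fn name(&self) -> &str;
}

// Hypothetical stand-in for the function-rewrite rule.
#[allow(dead_code)]
struct ApplyFunctionRewritesSketch {
    rewrites: Vec<String>,
}

impl Rule for ApplyFunctionRewritesSketch {
    fn name(&self) -> &str {
        "apply_function_rewrites"
    }
}

// Hypothetical rule that only carries a name, used to fill the rule list.
struct NamedRule(&'static str);

impl Rule for NamedRule {
    fn name(&self) -> &str {
        self.0
    }
}

type SharedRule = Arc<dyn Rule + Send + Sync>;

// Hypothetical stand-in for the analyzer holding both lists.
struct AnalyzerSketch {
    function_rewrites: Vec<String>,
    rules: Vec<SharedRule>,
}

impl AnalyzerSketch {
    // Prepends the function-rewrite rule only when there is something to rewrite.
    fn effective_rules(&self) -> Vec<SharedRule> {
        let mut out: Vec<SharedRule> = Vec::with_capacity(self.rules.len() + 1);
        if !self.function_rewrites.is_empty() {
            out.push(Arc::new(ApplyFunctionRewritesSketch {
                rewrites: self.function_rewrites.clone(),
            }));
        }
        out.extend(self.rules.iter().cloned());
        out
    }
}
```

Compared with an early return inside the rule, this keeps the rule itself unchanged and avoids constructing it when the rewrite list is empty.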

jayzhan211

👍

jayzhan211 · 2024-08-02T09:55:45Z

Thanks @gruuya

alamb · 2024-08-05T18:01:09Z

> However, the tightest bottleneck as per lldb is actually `ApplyFunctionRewrites`, which can't be opted out of, even though after #11155 it has no rewrite rules by default.

Maybe we should simply remove this API -- people can write their own rules directly 🤔

Short circuit ApplyFunctionRewrites if there are no function rewrites

026a784

github-actions bot added the optimizer Optimizer rules label Aug 1, 2024

gruuya force-pushed the short-circuit-fn-rewrite branch from 8fa74b9 to 81bb6d7 Compare August 2, 2024 05:57

Short circuit ApplyFunctionRewrites in the Analyzer itself

82daf94

gruuya force-pushed the short-circuit-fn-rewrite branch from 81bb6d7 to 82daf94 Compare August 2, 2024 06:59

github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Aug 2, 2024

jayzhan211 approved these changes Aug 2, 2024


jayzhan211 merged commit df4e6cc into apache:main Aug 2, 2024
24 checks passed

gruuya deleted the short-circuit-fn-rewrite branch August 2, 2024 09:56
