[FEA] Create a new Expression execution framework #7260

revans2 · 2022-12-05T22:05:39Z

Is your feature request related to a problem? Please describe.
After #7258 is at least partly done, and we have a few initial expressions from #7259 we should start to work on a framework to execute these expressions. This will likely involve a lot of back and forth with #7259 to refine things. But it is there to at least get a starting point.

The ideal for a framework like this would be to reduce the total memory usage, but that is kind of a hard problem to solve, so for the time being it will just be to keep track of the memory usage and avoid running out of memory. We may find ways to optimize memory usage in the future. The execution API from #7258 should be used. We should have a config to turn this off if we run into issues. It should dedupe expressions like tiered project does. I hope that the tiered project is stable enough that we can just have the config decide if we want to do a regular tiered project or use the memory aware version. In the memory aware version instead of executing an expression tree all at once, it should split it up. Anything under a Stateful modifying expression, including those expressions, will need to be executed as an entire group for now. For expressions that have not been updated yet, they will need to run as a part of a tree, but we can probably rewrite them with new children that are bound to specific inputs to be able to isolate them. For all other expressions we will want to try and keep everything under budget.

This would mean that we subtract the input batch size from the current budget. Starting at the leaf expressions and working towards the root we should ask each expression if they can predict up front what the size will be, and if so ask them for that size (based on the number of input rows). I think in the short term it would be best to find a subtree that can execute within the remaining budget and run it, then repeat. But if you just want to make things work we can treat them just like predictable expressions. For expressions that can only predict what they need, we need to execute these in isolation. We would need to pass their children to them and ask them to either execute it within the budget or return a response that indicates that they could not. If we see that they cannot run within budget, then we will split all pending results in half (based on the number of rows), making one half of it spillable, and the other half we will keep for execution. If we are down to one row, then we need to fail because we cannot spit it. If we think this is going to be really rare, we could also throw away all of the intermediate results and just split the input batch and redo all of the computation. I don’t like this as a long term solution, but it could be a good way to get started quickly.

If we don't do the long term solution, then we should file follow on issues, or have a good reason why we are doing it differently.

revans2 · 2023-03-31T16:31:00Z

This is probably done enough that we can close it. Not 100% of what we originally envisioned, but it is good enough to cover 99% of the use cases for expressions.

revans2 added feature request New feature or request ? - Needs Triage Need team to review and classify labels Dec 5, 2022

This was referenced Dec 5, 2022

[FEA] Avoid going over memory and size limitations in GpuExpression evaluation #7257

Open

[FEA] Memory Budget Execution for Stateful Modifying Expressions #7261

Open

sameerz added the reliability Features to improve reliability or bugs that severly impact the reliability of the plugin label Dec 6, 2022

mattahrens removed the ? - Needs Triage Need team to review and classify label Dec 8, 2022

mattahrens assigned revans2 Feb 24, 2023

mattahrens closed this as completed Mar 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEA] Create a new Expression execution framework #7260

[FEA] Create a new Expression execution framework #7260

revans2 commented Dec 5, 2022

revans2 commented Mar 31, 2023

[FEA] Create a new Expression execution framework #7260

[FEA] Create a new Expression execution framework #7260

Comments

revans2 commented Dec 5, 2022

revans2 commented Mar 31, 2023