perf(swordfish): Parallel expression evaluation (#3593)
Addresses: #3389. More generally, this PR optimizes projections with many expressions, particularly memory-intensive expressions like UDFs.

**Problem:** Currently, swordfish parallelizes projections across morsels, with 1 CPU per morsel. However, if each projection has many memory-intensive expressions, memory usage can inflate massively because many materialized morsels live in memory at once.

**Proposed solution:** Instead, we can parallelize the expressions within the projection (but only the expressions that require compute). This way, we still get good CPU utilization, but we keep fewer materialized morsels in memory.

In the linked issue above, we see that a 128-CPU machine will parallelize morsels across the cores, each running multiple UDFs, resulting in "317GB allocations and duration 351 secs". This PR reduces that to 7.8GB peak memory and a runtime of 66 seconds.

<img width="1187" alt="Screenshot 2024-12-17 at 3 54 06 PM" src="https://github.com/user-attachments/assets/88f4ad49-a1d3-4659-b49f-8364214ee146" />

**Notes:**
- Found a bug with the loole channels where an `async` send to a `sync` receive was not respecting capacity constraints, and was allowing sends even though the receive had not happened. Moved over to https://github.com/fereidani/kanal, which worked much better.

Todos for next time:
- We should also be able to parallelize evaluation within a single expression, since it is a tree. We can calculate the max width of the tree and set that as the max number of parallel tasks.

---------

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-50-162.us-west-2.compute.internal>
Co-authored-by: Colin Ho <colinho@Colins-MBP.localdomain>
Co-authored-by: Colin Ho <colinho@Colins-MacBook-Pro.local>
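
To make the proposed solution concrete, here is a minimal sketch of evaluating the expressions of one projection in parallel over a single morsel. It is not the code from this PR: `Expr`, `requires_compute`, `eval_projection`, `identity`, and `double` are hypothetical names, a morsel is reduced to a slice of `i64`, and plain scoped threads stand in for whatever task runtime swordfish actually uses.

```rust
use std::thread;

// Hypothetical stand-in for a projection expression: a flag saying whether it
// needs real compute (e.g. a UDF) and the function that produces its column.
pub struct Expr {
    pub requires_compute: bool,
    pub eval: fn(&[i64]) -> Vec<i64>,
}

// Either an already-computed column or a handle to a running task.
enum Pending<'scope> {
    Done(Vec<i64>),
    Running(thread::ScopedJoinHandle<'scope, Vec<i64>>),
}

// Evaluate every expression against one morsel. Expressions that require
// compute each get their own scoped thread; cheap ones (think column
// references) run inline on the calling thread. Output order matches `exprs`.
pub fn eval_projection(morsel: &[i64], exprs: &[Expr]) -> Vec<Vec<i64>> {
    thread::scope(|s| {
        // First pass: start compute-heavy work, finish cheap work immediately.
        let pending: Vec<Pending<'_>> = exprs
            .iter()
            .map(|e| {
                if e.requires_compute {
                    Pending::Running(s.spawn(move || (e.eval)(morsel)))
                } else {
                    Pending::Done((e.eval)(morsel))
                }
            })
            .collect();

        // Second pass: join the spawned tasks, preserving expression order.
        pending
            .into_iter()
            .map(|p| match p {
                Pending::Done(col) => col,
                Pending::Running(handle) => {
                    handle.join().expect("expression task panicked")
                }
            })
            .collect()
    })
}

// Example expressions (hypothetical): a cheap passthrough and a "UDF".
pub fn identity(morsel: &[i64]) -> Vec<i64> {
    morsel.to_vec()
}

pub fn double(morsel: &[i64]) -> Vec<i64> {
    morsel.iter().map(|x| x * 2).collect()
}
```

Calling `eval_projection(&morsel, &exprs)` with one `identity` expression and one `double` expression marked `requires_compute: true` returns the two columns in expression order. The sketch spawns one thread per compute expression and leaves out any cap on concurrent tasks, so it only illustrates the shape of the idea: one morsel is materialized while its expressions share the available cores.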
1 parent 246e3e9, commit 1ae9605. Showing 16 changed files with 281 additions and 135 deletions.