Commit
feat: Sequentially materialize left and right sides during hash join (#3735)

Run the left and right children of a hash join sequentially instead of in lock-step. This prevents their shuffles from executing in parallel, reducing the risk of spilling. Additionally, emit finalized hash join steps as they are ready, then return the outputs of the hash join in partition order.

### Results (TPCH SF 1000, 4 x i8g.4xlarge):

#### After:

Daft Q1 took 29.05 seconds
Daft Q2 took 28.42 seconds
Daft Q3 took 42.57 seconds
Daft Q4 took 19.79 seconds
Daft Q5 took 141.80 seconds
Daft Q6 took 11.00 seconds
Daft Q7 took 66.04 seconds
Daft Q8 took 128.28 seconds
Daft Q9 took 254.26 seconds
Daft Q10 took 43.72 seconds
Total time: 12m 45s
Spilled 1882432 MiB

#### Before:

Q1 took 31.05 seconds
Q2 took 24.95 seconds
Q3 took 50.91 seconds
Q4 took 24.11 seconds
Q5 took 177.07 seconds
Q6 took 11.17 seconds
Q7 took 75.97 seconds
Q8 took 150.76 seconds
Q9 took 263.51 seconds
Q10 took 59.37 seconds
Total time: 14m 29s
Spilled 2200948 MiB

**318,516 MiB spillage difference, 14.4% decrease**
**1 minute and 44 seconds difference, 12% decrease**

---------

Co-authored-by: Colin Ho <colinho@Colins-MBP.localdomain>
Co-authored-by: Colin Ho <colinho@Colins-MacBook-Pro.local>
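
The scheduling change described in the commit message can be illustrated with a small, self-contained sketch. This is not the code from the changed file: `materialize`, `hash_join_partition`, `sequential_hash_join`, and the `completion_order` parameter are hypothetical names used only for illustration, and child plans are modeled as plain Python iterators of partitions. The sketch drains the left child fully before touching the right child, lets per-partition join steps finish in an arbitrary order, and still yields results in partition order.

```python
from typing import Dict, Iterator, List, Tuple

Row = Tuple[int, str]          # (join key, value); hypothetical row shape
Partition = List[Row]
JoinedRow = Tuple[int, str, str]


def materialize(child: Iterator[Partition], name: str, log: List[str]) -> List[Partition]:
    """Drain one child completely, recording the order partitions are produced."""
    parts: List[Partition] = []
    for i, part in enumerate(child):
        log.append(f"{name}[{i}]")
        parts.append(part)
    return parts


def hash_join_partition(left: Partition, right: Partition) -> List[JoinedRow]:
    """Inner hash join of one pair of co-partitioned inputs."""
    table: Dict[int, List[str]] = {}
    for key, value in left:
        table.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, rv in right for lv in table.get(key, [])]


def sequential_hash_join(
    left_child: Iterator[Partition],
    right_child: Iterator[Partition],
    completion_order: List[int],
    log: List[str],
) -> Iterator[List[JoinedRow]]:
    # Sequential, not lock-step: the right child only starts once the left
    # child is fully materialized, so the two sides never run concurrently.
    left_parts = materialize(left_child, "left", log)
    right_parts = materialize(right_child, "right", log)
    assert len(left_parts) == len(right_parts), "inputs must be co-partitioned"

    # completion_order simulates join steps finishing out of order
    # (an assumption standing in for a real scheduler).
    finished: Dict[int, List[JoinedRow]] = {}
    next_to_emit = 0
    for idx in completion_order:
        finished[idx] = hash_join_partition(left_parts[idx], right_parts[idx])
        log.append(f"join[{idx}] done")
        # Release buffered results only when the next partition in order is ready.
        while next_to_emit in finished:
            yield finished.pop(next_to_emit)
            next_to_emit += 1


if __name__ == "__main__":
    log: List[str] = []
    left = iter([[(1, "a")], [(2, "b")]])
    right = iter([[(1, "x")], [(2, "y")]])
    for part in sequential_hash_join(left, right, completion_order=[1, 0], log=log):
        print(part)
    print(log)
```

In this toy model the log shows every `left[...]` entry before any `right[...]` entry, which is the property the commit relies on to keep the two shuffles from overlapping. Buffering finished partitions until their turn trades a little memory for a deterministic output order.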
1 parent 00ef3bf, commit 246e3e9
Showing 1 changed file with 98 additions and 61 deletions.