Use the new loadIntoBitSet API to speed up dense conjunctions. #14080
Conversation
Now that loading doc IDs into a bit set is much more efficient thanks to auto-vectorization, it has become tempting to evaluate dense conjunctions by and-ing bit sets.
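A minimal sketch of this idea, under stated assumptions: the window size (4096 here), the class and method names (`DenseConjunctionSketch`, `matchWindow`, `loadWindow`), and the naive per-doc loading loop are all illustrative; the actual scorer would materialize matches through the new `loadIntoBitSet` API rather than the stand-in loop below.

```java
import java.io.IOException;
import java.util.List;

import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.FixedBitSet;

class DenseConjunctionSketch {

  static final int WINDOW_SIZE = 4096; // illustrative window size

  private final FixedBitSet windowMatches = new FixedBitSet(WINDOW_SIZE);
  private final FixedBitSet clauseWindowMatches = new FixedBitSet(WINDOW_SIZE);

  /** Computes matches of all clauses within [windowMin, windowMin + WINDOW_SIZE). */
  FixedBitSet matchWindow(List<DocIdSetIterator> clauses, int windowMin) throws IOException {
    int windowMax = windowMin + WINDOW_SIZE;

    // Materialize the first clause's matches for this window.
    loadWindow(clauses.get(0), windowMin, windowMax, windowMatches);

    // AND the remaining clauses in: 64 docs get intersected per long-word operation.
    for (int i = 1; i < clauses.size(); ++i) {
      loadWindow(clauses.get(i), windowMin, windowMax, clauseWindowMatches);
      windowMatches.and(clauseWindowMatches);
    }

    // Set bits are doc IDs relative to windowMin; a real bulk scorer would now replay
    // them through the collector.
    return windowMatches;
  }

  // Naive stand-in for the loadIntoBitSet API: set one bit per matching doc in the window.
  private static void loadWindow(
      DocIdSetIterator it, int windowMin, int windowMax, FixedBitSet bits) throws IOException {
    bits.clear(0, WINDOW_SIZE);
    int doc = it.docID() < windowMin ? it.advance(windowMin) : it.docID();
    for (; doc < windowMax; doc = it.nextDoc()) {
      bits.set(doc - windowMin);
    }
  }
}
```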
wikibigall on my AMD Ryzen 9 3900X:
On my Apple M3:
I also ran all queries from https://tantivy-search.github.io/bench/ and this change was often a big speedup (up to multiple times) and sometimes a small slowdown (< 10%).
I made the heuristic more conservative, results now look like this on my M3 after a few iterations:
private final FixedBitSet windowMatches = new FixedBitSet(WINDOW_SIZE);
private final FixedBitSet clauseWindowMatches = new FixedBitSet(WINDOW_SIZE);
private final DocIdStreamView docIdStreamView = new DocIdStreamView();
One concern I would have is memory, but it doesn't seem like this would use any more memory than our other bulk scorers, as it's restricted to WINDOW_SIZE.
Indeed, it only requires 2*WINDOW_SIZE bits = 1kB.
|| subs.get(Occur.SHOULD).size() <= 1
|| minShouldMatch > 1) {
|| minShouldMatch != 1) {
I don't understand the switch to allow minShouldMatch == 0. I would have thought this meant they aren't applied at all in the skipping logic?
Sorry, it's unrelated to this PR. This method is only called when minShouldMatch >= 1, so minShouldMatch > 1 and minShouldMatch != 1 are equivalent, but I liked that != 1 is more defensive.
Scorer filterScorer;
if (filters.size() == 1) {
filterScorer = filters.iterator().next();
if (scoreMode == ScoreMode.TOP_SCORES) {
Ah, it's possible now to have scoreMode.needsScores() == false. I suppose checking for TOP_SCORES or scoreMode.needsScores() is the same due to the check on line 307. It just took me a bit to connect the two.
You got it right, I'll add a comment.
/**
 * BulkScorer implementation of {@link ConjunctionScorer} that is specialized for dense clauses.
 * Whenever sensible, it intersects clauses by tran
Note to self: this comment got truncated, it seems.
This is great! It makes me wonder if I should try reviving a dense posting encoding I had played around with a while ago, where very-high-frequency terms would be encoded in the index using a bitset. If we had that, we could use those directly. Testing this is tricky because we don't really have such terms in luceneutil, but I believe they occur "in nature" - for example with enumerated values that have a small number of terms.
Nightly benchmarks picked up this change, the bump is pretty cool. :) https://benchmarks.mikemccandless.com/CountAndHighHigh.html I pushed an annotation.
Yes, this sounds like it could be interesting indeed! I wonder if we should make the decision on a per-block basis rather than for the whole postings, in order to benefit postings that are only dense on some specific ranges of the doc ID space (which can happen when using index sorting or recursive graph bisection). Another idea that crossed my mind would consist of storing doc IDs as deltas from the first doc ID in the block in a short[] to further take advantage of SIMD (when applicable).
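A tiny sketch of that last idea, purely illustrative (the class and method names are made up, and this is not Lucene's actual postings format): within a block, each doc ID is stored as an unsigned 16-bit delta from the block's first doc ID, which keeps the data compact and leaves a simple decode loop that is amenable to auto-vectorization.

```java
import java.util.Arrays;

class BlockDeltaSketch {

  /** Encodes a sorted block of doc IDs as 16-bit deltas from the block's first doc ID. */
  static short[] encode(int[] docIds) {
    int base = docIds[0];
    short[] deltas = new short[docIds.length];
    for (int i = 0; i < docIds.length; ++i) {
      int delta = docIds[i] - base;
      assert delta <= 0xFFFF : "block may not span more than 2^16 doc IDs";
      deltas[i] = (short) delta;
    }
    return deltas;
  }

  /** Decodes the block back into absolute doc IDs with a simple, vectorization-friendly loop. */
  static int[] decode(int base, short[] deltas) {
    int[] docIds = new int[deltas.length];
    for (int i = 0; i < deltas.length; ++i) {
      docIds[i] = base + (deltas[i] & 0xFFFF); // read the delta as unsigned
    }
    return docIds;
  }

  public static void main(String[] args) {
    int[] block = {100_000, 100_003, 100_004, 165_000};
    short[] deltas = encode(block);
    System.out.println(Arrays.toString(decode(100_000, deltas)));
    // prints [100000, 100003, 100004, 165000]
  }
}
```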
Wouldn't stop words qualify? E.g. I see that "the", "of" and "not" appear in 77%, 78% and 28% of documents respectively.
Agreed... I had previously started out with a global decision and then eventually had a per-block decision. I saw some speedups in artificial tests, but struggled to get wins in more typical workloads. I think there was some overhead from reading the "type" of a block and then branching to a specialized implementation? Not sure, but there has been a lot of change in that area now, maybe it will work out better.

Right, stop words! I had previously created some new field (Month), but conjunctions including stop words would probably be a good test.
Bit sets can be faster at advancing and more storage-efficient on dense blocks of postings. This is not a new idea, @mkhludnev proposed something similar a long time ago #6116. @msokolov recently brought up (#14080) that such an encoding has become especially appealing with the introduction of the `DocIdSetIterator#loadIntoBitSet` API, and the fact that non-scoring disjunctions and dense conjunctions now take advantage of it. Indeed, if postings are stored in a bit set, `#loadIntoBitSet` would just need to OR the postings bits into the bits that are used as an intermediate representation of matches of the query.
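A minimal sketch of what that could look like, assuming a hypothetical block format in which a dense block of postings is stored directly as a bit set (the class and method names below are made up, and block/window alignment is assumed for simplicity): loading such a block into the query's intermediate bit set reduces to a plain OR.

```java
import org.apache.lucene.util.FixedBitSet;

class BitSetPostingsBlockSketch {

  private final int firstDocId;        // first doc ID covered by this block
  private final FixedBitSet blockBits; // bit i set => doc (firstDocId + i) contains the term

  BitSetPostingsBlockSketch(int firstDocId, FixedBitSet blockBits) {
    this.firstDocId = firstDocId;
    this.blockBits = blockBits;
  }

  /**
   * ORs this block's matches into {@code windowMatches}, whose bit 0 corresponds to
   * doc ID {@code windowMin}. Assumes the block and the window are aligned.
   */
  void loadIntoBitSet(int windowMin, FixedBitSet windowMatches) {
    assert firstDocId == windowMin : "sketch assumes aligned blocks and windows";
    windowMatches.or(blockBits);
  }
}
```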