-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize sort on long field #48804
Optimize sort on long field #48804
Conversation
Optimize sort on numeric long and date fields, when the system property `es.search.long_sort_optimized` is true.
Skip sort optimization if the index has 50% or more data with the same value. When index has a lot of docs with the same value, sort optimization doesn't make sense, as DistanceFeatureQuery will produce same scores for these docs, and Lucene will use the second sort to tie-break. This could be slower than usual sorting.
…4021) This change pre-sort the index reader leaves (segment) prior to search when the primary sort is a numeric field eligible to the distance feature optimization. It also adds a tie breaker on `_doc` to the rewritten sort in order to bypass the fact that leaves will be collected in a random order. I ran this patch on the http_logs benchmark and the results are very promising: ``` | 50th percentile latency | desc_sort_timestamp | 220.706 | 136544 | 136324 | ms | | 90th percentile latency | desc_sort_timestamp | 244.847 | 162084 | 161839 | ms | | 99th percentile latency | desc_sort_timestamp | 316.627 | 172005 | 171688 | ms | | 100th percentile latency | desc_sort_timestamp | 335.306 | 173325 | 172989 | ms | | 50th percentile service time | desc_sort_timestamp | 218.369 | 1968.11 | 1749.74 | ms | | 90th percentile service time | desc_sort_timestamp | 244.182 | 2447.2 | 2203.02 | ms | | 99th percentile service time | desc_sort_timestamp | 313.176 | 2950.85 | 2637.67 | ms | | 100th percentile service time | desc_sort_timestamp | 332.924 | 2959.38 | 2626.45 | ms | | error rate | desc_sort_timestamp | 0 | 0 | 0 | % | | Min Throughput | asc_sort_timestamp | 0.801824 | 0.800855 | -0.00097 | ops/s | | Median Throughput | asc_sort_timestamp | 0.802595 | 0.801104 | -0.00149 | ops/s | | Max Throughput | asc_sort_timestamp | 0.803282 | 0.801351 | -0.00193 | ops/s | | 50th percentile latency | asc_sort_timestamp | 220.761 | 824.098 | 603.336 | ms | | 90th percentile latency | asc_sort_timestamp | 251.741 | 853.984 | 602.243 | ms | | 99th percentile latency | asc_sort_timestamp | 368.761 | 893.943 | 525.182 | ms | | 100th percentile latency | asc_sort_timestamp | 431.042 | 908.85 | 477.808 | ms | | 50th percentile service time | asc_sort_timestamp | 218.547 | 820.757 | 602.211 | ms | | 90th percentile service time | asc_sort_timestamp | 249.578 | 849.886 | 600.308 | ms | | 99th percentile service time | asc_sort_timestamp | 366.317 | 888.894 | 522.577 | ms | | 100th percentile service time | asc_sort_timestamp | 430.952 | 908.401 | 477.45 | ms | | error rate | asc_sort_timestamp | 0 | 0 | 0 | % | ``` So roughly 10x faster for the descending sort and 2-3x faster in the ascending case. Note that I indexed the http_logs with a single client in order to simulate real time-based indices where document are indexed in their timestamp order. Relates #37043
As we don't use cancellableCollector anymore, it should be removed from the expected docs response.
When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we can not use a single collector for sorted segments. Thus for such a case, we use collectorManager, where for every segment a dedicated collector will be created.
Last results comparing performance of sort Master without optimizationbefore force_merge
after force_merge (without number of segments)
after force_merge to 1 segment
Long sort optimizationbefore force_merge
after force_merge (without number of segments)
after force_merge to 1 segment
Seems that with merging the performance of ascending sort degrades, but still 90th percentile is 3 times better comparing with master. Performance of descending sort is 2 times faster than master |
Pinging @elastic/es-search (:Search/Search) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wonderful.
I'd need to do another round of review to double check whether there are corner cases that we are not covering but this looks great to me. This code would likely be a bit hard to maintain but I believe that the benefits are worth it.
} else { | ||
queryCollector = QueryCollectorContext.createQueryCollector(collectors); | ||
shouldRescore = searchWithCollector(searchContext, searcher, query, collectors, hasFilterCollector, timeoutSet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Looks like the code on master was not indented properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, sorry then I got it backwards
|
||
private static Query tryRewriteLongSort(SearchContext searchContext, IndexReader reader, | ||
Query query, boolean hasFilterCollector) throws IOException { | ||
if (searchContext.searchAfter() != null) return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe leave a TODO about this one, it'd be nice to be able to handle it in a followup
if (SortField.FIELD_DOC.equals(sField) == false) return null; | ||
} else { | ||
if (searchContext.mapperService().fullName(sFieldName) == null) return null; // could be _script field that uses _score | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we use sortField.needsScores to cover scripted fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like using sortField.needsScores
doesn't work for script fields, as needsScore
checks for type == Type.SCORE
, while for script fields type=Type.CUSTOM
.
For example, in this case we want to avoid running optimization:
"sort": [
{"my_long": "desc"},
{
"_script": {
"script": {
"source": "_score"
}
}
}
]
The type for second sortfield will be CUSTOM
. And to avoid running optimization in this case, we need to check that all sort fields are mapped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the last commit, I have added TODO item to think how we can cover _script sort that doesn't use _score.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left one comment that needs to be addressed but the change looks good to me.
// Add a tiebreak on _doc in order to be able to search | ||
// the leaves in any order. This is needed since we reorder | ||
// the leaves based on the minimum/maxim value in each segment. | ||
newSortFields[newSortFields.length-1] = SortField.FIELD_DOC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed anymore since we use the shared collector manager ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Jim, indeed shared collector manager with tie on scores compares by docID . I will correct this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jimczi addressed in the last commit
if (SortField.FIELD_DOC.equals(sField) == false) return null; | ||
} else { | ||
if (searchContext.mapperService().fullName(sFieldName) == null) return null; // could be _script field that uses _score | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
With the latest commit 5870fd7 I have just ran benchmarks on Master without optimization
Optimization branch
For the field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some minor nits but the change looks good to me. Thanks @mayya-sharipova
try { | ||
intersectScorerAndBitSet(scorer, liveDocsBitSet, leafCollector, | ||
checkCancelled == null ? () -> { | ||
} : checkCancelled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: correct indentation
@@ -75,6 +94,8 @@ | |||
*/ | |||
public class QueryPhase implements SearchPhase { | |||
private static final Logger LOGGER = LogManager.getLogger(QueryPhase.class); | |||
public static final boolean SYS_PROP_LONG_SORT_OPTIMIZED = | |||
Booleans.parseBoolean(System.getProperty("es.search.long_sort_optimized", "true")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe change the name to es.search.rewrite_sort
? Can you also add a comment/todo that we should remove this property in 8 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jimczi Do we need to have any user-faced documentation for this? or at least documentation for the support team (e.g. in case sort is misbehaving, disable the optimization)?
// we use collectorManager during sort optimization | ||
// for the sort optimization, we have already checked that there are no other collectors, no filters, | ||
// no search after, no scroll, no collapse, no track scores | ||
// this means we can use TopFieldCollector directly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be replaced with a multiline comment ?
@jimczi Does this PR target 7.6? Looks like ES 7.6 still was not upgraded to the Lucene version where shared collector manager was introduced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do a more in-depth review. I found one thing that I'd like us to optimize a bit, but otherwise it looks good to me.
collectors.addFirst(topDocsFactory); | ||
|
||
final Collector queryCollector; | ||
if ( searchContext.getProfilers() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ( searchContext.getProfilers() != null) { | |
if (searchContext.getProfilers() != null) { |
long minValue = LongPoint.decodeDimension(pointValues.getMinPackedValue(), 0); | ||
long maxValue = LongPoint.decodeDimension(pointValues.getMaxPackedValue(), 0); | ||
while (minValue < maxValue) { | ||
long avgValue = Math.floorDiv(minValue, 2) + Math.floorDiv(maxValue, 2); // to avoid overflow first divide each value by 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think avg
is confusing here since it is the mid point, not an average
while (minValue < maxValue) { | ||
long avgValue = Math.floorDiv(minValue, 2) + Math.floorDiv(maxValue, 2); // to avoid overflow first divide each value by 2 | ||
long countLeft = estimatePointCount(pointValues, minValue, avgValue); | ||
long countRight = estimatePointCount(pointValues, avgValue + 1, maxValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder that this method might be quite expensive by calling estimatePointCount in a loop, can we break the for loop as soon as countLeft + countRight <= globalDocCount/2 (which might require to rename this method to something like boolean hasValueWithCountGreaterThan(PointValues values, long threshold)
or something along those lines).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading 5870fd7 again I think I misread your commit, sorry @mayya-sharipova ;). I thought of it as if min and max was never updated. The logic looks good to me now, sorry for the confusion.
return (globalMedianCount >= globalDocCount/2); | ||
} | ||
|
||
static long estimateMedianValue(PointValues pointValues) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the record I think this method is doing the right thing, but we are not computing the median here since calls to estimatePointCount below use the updated values of minValue/maxValue instead of the global min/max.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I completely understood your comment. Median item is such a value that half of values are smaller than it and half is greater than it. With every loop iteration we choosing a part with more counts, and by this converging to a median value. No? You see it differently?
PointValues pointValues = lrc.reader().getPointValues(field); | ||
if (pointValues == null) continue; | ||
int docCount = pointValues.getDocCount(); | ||
if (docCount <= 512) { // skipping small segments as estimateMedianCount doesn't work well on them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (docCount <= 512) { // skipping small segments as estimateMedianCount doesn't work well on them | |
if (docCount <= 512) { // skipping small segments as estimateDocCount doesn't work well on them |
docsNoDupl += docCount; | ||
} | ||
} | ||
return (docsDupl > docsNoDupl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The per-segment logic looks good to me but I'm less sure about how information is merged across multiple segments. Maybe this is something we can look into improving later.
@elasticmachine run elasticsearch-ci/bwc |
@elasticmachine run elasticsearch-ci/default-distro |
Use shared TopFieldCollector manager for sort optimization. This collector manager is able to exchange minimum competitive score between collectors
5cef7ec
to
bf8e17a
Compare
@elasticmachine run elasticsearch-ci/packaging-sample-matrix |
This reverts commit 79d9b36.
This is a follow up of elastic#48804 where we rewrite numeric sort to use the DistanceFeatureQuery. This change adds another optimization if the query is a `match_all` that instead of using a distance feature query will simply extract the documents directly from the indexed point and early terminate as soon as enough docs have been collected. This optimization has a constant cost so it can be considerably faster than the other optimization since it only needs to visit the BKD-tree of a field and can early terminate as soon as it collected the number of requested hits. Note that this optimization can only work when the query is a match_all and the numeric sort order is not reversed. The pr is in WIP state, it needs more tests and some cleanup but I wanted to open it early in order to discuss whether we should pursue this path or not.
Don't run long sort optimization when index is already sorted on the same field as the sort query parameter. Relates to elastic#37043, follow up for elastic#48804
This rewrites long sort as a `DistanceFeatureQuery`, which can efficiently skip non-competitive blocks and segments of documents. Depending on the dataset, the speedups can be 2 - 10 times. The optimization can be disabled with setting the system property `es.search.rewrite_sort` to `false`. Optimization is skipped when an index has 50% or more data with the same value. Optimization is done through: 1. Rewriting sort as `DistanceFeatureQuery` which can efficiently skip non-competitive blocks and segments of documents. 2. Sorting segments according to the primary numeric sort field(#44021) This allows to skip non-competitive segments. 3. Using collector manager. When we optimize sort, we sort segments by their min/max value. As a collector expects to have segments in order, we can not use a single collector for sorted segments. We use collectorManager, where for every segment a dedicated collector will be created. 4. Using Lucene's shared TopFieldCollector manager This collector manager is able to exchange minimum competitive score between collectors, which allows us to efficiently skip the whole segments that don't contain competitive scores. 5. When index is force merged to a single segment, #48533 interleaving old and new segments allows for this optimization as well, as blocks with non-competitive docs can be skipped. Backport for #48804 Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
Don't run long sort optimization when index is already sorted on the same field as the sort query parameter. Relates to elastic#37043, follow up for elastic#48804
Rewrite sort on long field (number or date) to Lucene DistanceFeatureQuery.
This allows to skip non-competitive hits leading to speedups on sort.