Dynamic mapping updates are unboundedly parallel #50670
Comments
Pinging @elastic/es-distributed (:Distributed/CRUD) |
We discussed a couple of options in Slack:
|
I believe we are running into a very similar issue after an upgrade from 6.8.0 to 7.5.1. In 6.8.0 we would see a couple of minutes where the cluster would apply the new mappings and then recover, while in 7.5.1 we see the number of pending mapping changes reach 30,000+. This quickly results in the master node becoming unresponsive and data nodes repeatedly leaving and rejoining the cluster because they cannot contact the master node in a timely manner. Since our upgrade, each night at 12 AM (when new date-based indexes are created) we have had to restart all of our master nodes simultaneously to bring the cluster back to a healthy state. Is there any timeline for a fix for this issue, or a recommended workaround? |
@SpencerLN that sounds like this issue indeed. As a general rule, dynamic mappings should be used sparingly in production since they cause indexing to bottleneck on the master. It's much more efficient to use an index template to set up most of the mappings when the index is created, and this is particularly important at the kind of scale that would result in tens of thousands of pending tasks. Dynamic mappings are more appropriate for handling an occasional unexpected field. |
We discussed various options for solving this issue in the distributed sync (and combinations thereof):
To get a fix out quickly, I will look at reintroducing the blocking behavior in a first step. |
Ensures that there are not too many concurrent dynamic mapping updates going out from the data nodes to the master. Closes #50670
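As a rough illustration of what bounding the number of concurrent updates can look like, here is a minimal sketch of a throttle for asynchronous requests. It is not the code from the fix: the class `MappingUpdateThrottle`, its methods, and the limit passed to the constructor are all hypothetical names chosen for this example. Requests beyond the limit wait in a queue and are started as earlier ones complete, so no thread is blocked while waiting.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/**
 * Hypothetical sketch (not Elasticsearch code): caps the number of
 * in-flight asynchronous requests; excess requests wait in a queue.
 */
class MappingUpdateThrottle {
    private final int maxInFlight;
    private int inFlight = 0;
    private final Queue<Consumer<Runnable>> pending = new ArrayDeque<>();

    MappingUpdateThrottle(int maxInFlight) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException("maxInFlight must be >= 1");
        }
        this.maxInFlight = maxInFlight;
    }

    /**
     * Submits a request. The request is handed a completion callback that
     * it must run exactly once, when the master has responded.
     */
    void submit(Consumer<Runnable> request) {
        synchronized (this) {
            if (inFlight >= maxInFlight) {
                pending.add(request); // over the limit: wait for a free slot
                return;
            }
            inFlight++;
        }
        request.accept(this::onCompleted);
    }

    /** Called when a request finishes; starts the next queued request, if any. */
    private void onCompleted() {
        Consumer<Runnable> next;
        synchronized (this) {
            next = pending.poll();
            if (next == null) {
                inFlight--; // nothing waiting: release the slot
                return;
            }
            // The freed slot is handed straight to the next request,
            // so inFlight stays unchanged.
        }
        next.accept(this::onCompleted);
    }

    synchronized int inFlight() {
        return inFlight;
    }

    synchronized int queued() {
        return pending.size();
    }
}
```

With a limit of 2, submitting five requests starts two immediately and queues three; each completion then starts exactly one queued request. Queueing rather than blocking keeps the calling thread free, at the cost of holding the waiting requests in memory.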
@ywelsch seems like the code isn't part of any release (not 7.6 or above) |
Indeed, it looks like I missed the backport to the 7.6 branch, and it also missed the 7.5.2 release (it's on the 7.5 branch, but landed just after that release), probably because it was backported around the time the new branches were cut. It was backported to 7.x (i.e. the future 7.7.0), so it will be released as part of that. I will adapt the labels on the PR. |
Before 7.2.0 dynamic mapping updates would block a `write` thread waiting for the master to acknowledge the new mapping. In #39793 we moved to an asynchronous model, freeing up the `write` thread to carry on with other indexing tasks.

One feature of the pre-7.2.0 blocking approach was that the number of `write` threads is limited and this limits the number of parallel dynamic mapping updates pending on the master. The asynchronous model has no such limit and may send a very large number of dynamic mapping updates in a short time since the shard bulks are processed much faster. Furthermore, many indexing operations may require the same mapping update but since they are now generated much more quickly it may be that many of the mapping updates sent to the master are duplicates.

Related discussion thread.
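To make the duplicate problem concrete, here is a minimal sketch of how identical pending updates could in principle be coalesced on the sending side, so that only the first one goes to the master and later callers simply wait for its acknowledgement. This is an illustration only, not something this issue states was implemented: the class `MappingUpdateCoalescer`, its methods, and the string key identifying an update are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch (not Elasticsearch code): coalesces identical
 * pending updates so that only one request per key is sent.
 */
class MappingUpdateCoalescer {
    // Callbacks waiting for each pending update, keyed by an assumed
    // identifier such as "index/field".
    private final Map<String, List<Runnable>> waiters = new HashMap<>();

    /**
     * Registers interest in an update. Returns true if the caller should
     * send the request to the master, false if an identical request is
     * already pending and the caller only needs to wait.
     */
    synchronized boolean register(String updateKey, Runnable onAcked) {
        List<Runnable> list = waiters.get(updateKey);
        boolean first = (list == null);
        if (first) {
            list = new ArrayList<>();
            waiters.put(updateKey, list);
        }
        list.add(onAcked);
        return first;
    }

    /** Notifies every caller waiting on the given update, then forgets it. */
    void acknowledged(String updateKey) {
        List<Runnable> list;
        synchronized (this) {
            list = waiters.remove(updateKey);
        }
        if (list != null) {
            list.forEach(Runnable::run); // run callbacks outside the lock
        }
    }
}
```

If two indexing operations register the same key, only the first call returns true, and a single acknowledgement then completes both.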