Currently, each attempt at contending a semaphore (or lock) decrements the query wait time by the time elapsed since the acquire attempt started.
The overall timeout check for the acquire call is also performed against the query timeout.
When there are many attempts at contending (e.g. you have many processes running `consul lock`), this can cause the timeout to be reduced by far too much, far too quickly.

With some debug logging in place, I ran 8 `consul lock` processes and then killed the sleep command after a short period of time:
You can see that after less than 2 minutes we've reduced the wait time by over 12 minutes.
The acquire attempt shouldn't time out until the originally specified timeout has been reached,
and the query wait time should be the original timeout minus the elapsed time, so that the query won't block beyond the overall timeout.
With these changes:
Writing tests for this is tricky as you need to have multiple clients contending at once to trigger it.
Fixes #4003 #3262 #2399