Fix: add backoff to driver.go alt-da put request fail logic #11534
Description
When the put request to the da-server fails, the batch submitter puts the frames back into their respective channel builders and returns nil (instead of err). This was probably added to allow the driver to retry sending the blob to the server immediately, without having to wait until the next poll. However, this is bad behavior: clients should typically use linear or exponential backoff when retrying requests, which this code doesn't do.
Solution
Returning the err makes the for loop return, so the driver waits for the next poll before retrying to send blobs to the da-server.
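To illustrate the control flow, here is a minimal, self-contained sketch. It is not the actual driver.go code: the names queue, publishOne, publishAll and failingServer are hypothetical stand-ins, and the real batcher's types and signatures differ. The sketch assumes one call to publishAll corresponds to one poll, and shows that returning the error after requeueing the frame limits a failing da-server to one put attempt per poll.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// queue is a hypothetical stand-in for the frames held by the channel builders.
type queue struct{ frames []string }

// failingServer is a hypothetical da-server stub that rejects every put
// and counts how many attempts it received.
type failingServer struct{ attempts int }

func (s *failingServer) put(string) error {
	s.attempts++
	return errors.New("500 from da-server")
}

// publishOne sends the next frame via put. On failure it puts the frame
// back and returns the error. Returning nil here instead would let the
// caller's loop retry immediately.
func (q *queue) publishOne(put func(string) error) error {
	if len(q.frames) == 0 {
		return io.EOF // nothing left to send
	}
	frame := q.frames[0]
	q.frames = q.frames[1:]
	if err := put(frame); err != nil {
		q.frames = append([]string{frame}, q.frames...) // requeue at the front
		return fmt.Errorf("put failed: %w", err)
	}
	return nil
}

// publishAll models a single poll: it keeps publishing until the queue is
// empty or publishOne reports an error, then returns to wait for the next poll.
func (q *queue) publishAll(put func(string) error) error {
	for {
		if err := q.publishOne(put); err != nil {
			if errors.Is(err, io.EOF) {
				return nil
			}
			return err
		}
	}
}

func main() {
	q := &queue{frames: []string{"frame-0", "frame-1"}}
	s := &failingServer{}
	err := q.publishAll(s.put)
	fmt.Println("attempts:", s.attempts, "queued:", len(q.frames), "err:", err)
}
```

Note that this only spaces retries by the poll interval. It is not a linear or exponential backoff, which would need a growing delay between attempts.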
Tests
Have not added any tests. This is a fairly simple change.
It might have unintended consequences though, so I'd be happy to know if there are any tests I should run.
Additional context
Here is my op-batcher sending a gazillion (500 returning) requests in a very short amount of time to the eigenda-proxy (our da-server):
Metadata