Add rate throttling logic to PlatformAPI #103
Conversation
By default, when you get a rate-limit event while using the platform-api gem, the method call will fail. Since there's no feedback built into the client, another call might be attempted (for example, if the API is being hit via a background worker). This is bad for Heroku because we are now seeing a stream of requests that will never complete (because of the rate limit), and it is bad for the end user because they get a flurry of errors that are unexpected and unhandled.

This PR builds on top of the project https://github.com/schneems/rate-limit-gcra-client-demo to automatically find a value that the client can sleep for so that it can maximize throughput while minimizing the number of requests that are rate limited.

At a high level, when the client sees a 429 response from the server, it will sleep and then retry the request. If it gets another 429, the sleep amount will be multiplicatively increased until it can make a successful request. As the client is able to make successful requests, the amount of sleep time is reduced by a subtractive amount based on the current number of requests allowed (as reported by the server), the amount of time since the last rate-limit event, and its current value. This logic somewhat mirrors TCP "slow start" behavior (though in reverse).

In simulations, over time we end up seeing 2-3% of requests rate limited.

![](https://github.com/schneems/rate-limit-gcra-client-demo/blob/master/chart.png)

> Graph from the README of the simulation repo

This PR has been built on the work of other changes added to heroics:

- interagent/heroics#95
- interagent/heroics#96

## Discussion

In addition to the implementation, one last unknown is what the default logging behavior should be. While rate throttling by default provides a good experience, we need to give users feedback letting them know that it is happening.

This PR was paired with @lolaodelola
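To make the backoff behavior concrete, here is a minimal sketch of the loop described above. This is illustrative only, not the code in this PR: the `MULTIPLIER` value, the decrease formula, and the reliance on the `RateLimit-Remaining` header are assumptions.

```ruby
# Illustrative sketch of the throttle loop; not the gem's actual code.
MULTIPLIER = 1.2 # hypothetical growth factor for repeated 429s

def call_with_rate_throttle(sleep_for = 0.0)
  loop do
    response = yield # expected to respond to #status and #headers

    if response.status == 429
      # Rate limited: sleep, then multiplicatively increase the next sleep.
      sleep(sleep_for)
      sleep_for = (sleep_for + rand) * MULTIPLIER
    else
      # Success: decay the sleep time a little, proportional to the request
      # budget the server reports, so throughput recovers over time.
      remaining = response.headers.fetch("RateLimit-Remaining", "0").to_f
      sleep_for = [sleep_for - sleep_for / (remaining + 1.0), 0.0].max
      return response
    end
  end
end
```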
This commit adds logging when a client is rate throttled. It emits a log to stdout by default, and there is a configurable logging block made available so that developers can push to a custom endpoint such as Honeycomb. Co-authored-by: Lola Odelola <damzcodes@gmail.com>
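For illustration, the logging hook could be wired up like this; the setter name `rate_throttle_log`, the payload fields, and the `MyMetrics` sink are hypothetical stand-ins, not the gem's confirmed interface.

```ruby
# Hypothetical configuration; the real hook name and payload may differ.
PlatformAPI.rate_throttle_log = ->(info) do
  # Instead of the default stdout log, forward throttle events to a
  # custom sink, e.g. a structured-events service like Honeycomb.
  MyMetrics.emit(event: "rate_throttled", sleep_for: info.sleep_for)
end
```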
Instead of maintaining a single-purpose rate throttle client wrapper, I ported the code over to a gem, `rate_throttle_client`, that has its own set of tests and metrics-generating code. This will make it easier to maintain, and other libraries can use the code now as well. I'm now also testing the configuration of the rate-throttling logic.
In a Slack conversation:

> To mitigate this concern, I'm making the behavior "off" by default. The user can either disable the rate-throttling logic by following the steps outlined in the warning and README, or upgrade to version 3.0.0 to opt in to the new behavior.
We can release version 2.3.0, which has a warning/deprecation, at the same time as version 3.0.0 of this gem, which sets a default rate throttle strategy. The warning will look like this the first time developers make an API request:

```
[Warning] Starting in PlatformAPI version 3+, requests will include rate throttling logic
to opt-out of this behavior set: `PlatformAPI.rate_throttle = RateThrottleClient::Null.new`
to silence this warning and opt-in to this logic, upgrade to PlatformAPI version 3+
```

If someone sets their own strategy, then they won't get this message.

If that sounds good, once this change is accepted I'll make another PR to rev the version and update the behavior. Once that is approved we can release 2.3.0, then merge in the behavior change, and release 3.0.0 shortly after.
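For example, the opt-out described in the warning would look like this in application code (the `require` lines are my assumption; the setter itself is quoted from the warning text above):

```ruby
require 'platform-api'
require 'rate_throttle_client'

# Disable throttling (and silence the 2.3.0 deprecation warning) by
# setting the Null strategy named in the warning text.
PlatformAPI.rate_throttle = RateThrottleClient::Null.new
```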
I have one suggestion for organizing some "helper" methods for specs. But it's not a blocker, so 👍 from me.
Going to merge this in and prep the 3.0 release PR. I'm currently aiming to release next week.
As a follow-up from #103, this PR bumps the major platform-api version and enables rate throttling by default.
The follow-up PR is #104
This is a continuation of #96, but is done from within the same repo so the tests can run.
By default, when you get a rate-limit event while using the platform-api gem, the method call will fail. Since there's no feedback built into the client, another call might be attempted (for example, if the API is being hit via a background worker). This is bad for Heroku because we are now seeing a stream of requests that will never complete (because of the rate limit), and it is bad for the end user because they get a flurry of errors that are unexpected and unhandled.

This PR builds on top of the project https://github.com/zombocom/rate_throttle_client, which was built out of the research from https://github.com/schneems/rate-limit-gcra-client-demo, to automatically find a value that the client can sleep for so that it can maximize throughput while minimizing the number of requests that are rate limited.

At a high level, when the client sees a 429 response from the server, it will sleep and then retry the request. If it gets another 429, the sleep amount will be multiplicatively increased until it can make a successful request.

As the client is able to make successful requests, the amount of sleep time is reduced by a subtractive amount based on the current number of requests allowed (as reported by the server), the amount of time since the last rate-limit event, and its current value.

In simulations, over time we end up seeing 2-3% of requests rate limited.

This PR has been built on the work of other changes added to heroics and platform-api.
This PR was paired with @lolaodelola
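With throttling built into the client, no call-site changes should be needed. A sketch of the expected usage, assuming a standard platform-api setup (the token lookup is illustrative):

```ruby
require 'platform-api'

# Connect as usual; if the API returns a 429, the client now sleeps
# and retries internally instead of raising immediately.
client = PlatformAPI.connect_oauth(ENV['HEROKU_OAUTH_TOKEN'])
client.app.list # throttled transparently under rate limiting
```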