-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
impl(bigtable): add a generic RateLimiter #13060
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #13060 +/- ##
========================================
Coverage 92.99% 92.99%
========================================
Files 2128 2131 +3
Lines 185258 185383 +125
========================================
+ Hits 172272 172397 +125
Misses 12986 12986
☔ View full report in Codecov by Sentry. |
// increase with time. | ||
using Clock = ::google::cloud::internal::SteadyClock; | ||
|
||
explicit RateLimiter(std::shared_ptr<Clock> clock, double rate, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why shared_ptr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was copying
std::shared_ptr<Clock> clock = std::make_shared<Clock>()) |
I think part of the reason is that our FakeClock
has a std::mutex
in it.
mutable std::mutex mu_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue there is the way the Session
constructor defaults clock
. You don't have that problem, and so could take Clock* clock
with a lifetime caveat. Tests should all have a single clock that lasts the lifetime of the test, and non-tests should have a single, global-extent clock, period. No need for shared ownership in either case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm I think I would rather avoid the caveats and leave it as a shared_ptr
.
|
||
using Clock = RateLimiter::Clock; | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/**/*/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
I know /**
is a doxygen / javadoc thing. I find it simpler to write comments the same way for public and private APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Because it is better to be intentional than to act by default. And, it is better to make things easier to read than to write. But, your choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we use /**
more often than /*
for internal documentation. I am not going to change these.
it is better to make things easier to read than to write
Agreed. Don't think it applies here.
/** | ||
* A threadsafe interface for rate limiting. | ||
* | ||
* The caller tells the `RateLimiter` how many permits it wants to acquire. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would start with something like:
The caller needs to acquire a "permit" to perform the operation under rate limits. This class limits the number of
permits issued per period of time, effectively limiting the operation rate.
The caller may acquire more than one permit at a time, if it needs to perform a burst of the operation under rate limit.
More permits become available as time passes, with some maximum to limit the size of bursts.
I think "token" or "credit" is a more common name than "permit", but is not like any of the three is universally accepted.
You may want to quote: https://en.wikipedia.org/wiki/Flow_control_(data)#Open-loop_flow_control or whatever ATM networks called their flow control mechanism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would start with something like:
Thanks, it is what I was trying to say, but clearer.
or whatever ATM networks called their flow control mechanism.
Why do you know these things?
/** | ||
* Set the period. | ||
* | ||
* Note that the current next_ has already been calculated. This new rate will | ||
* not apply to it. The new rate will apply to every `acquire()` after next. | ||
*/ | ||
void set_period(Clock::duration period) { | ||
if (period < Clock::duration::zero()) { | ||
GCP_LOG(FATAL) << "RateLimiter period must be >= 0."; | ||
} | ||
std::lock_guard<std::mutex> lk(mu_); | ||
period_ = period; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a dangerous function. Do you really need it? If not, consider it removing it.
Consider a different name, maybe set_future_period()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need it?
Yeah, I really need it. I need to be able to change the rate on the fly as the server tells me to speed up / slow down.
Consider a different name
meh
@@ -0,0 +1,197 @@ | |||
// Copyright 2023 Google Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Inc.LLC/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in the three files.
// The request can go through immediately. But first, we need to update the | ||
// time the next permit can be given out. | ||
|
||
// Update the stored permits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments have periods, others don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by only having one comment.
RateLimiter::Clock::duration RateLimiter::acquire(std::int64_t permits) { | ||
auto const now = clock_->Now(); | ||
std::lock_guard<std::mutex> lk(mu_); | ||
if (next_ <= now) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (next_ <= now) { | |
if (next_ > now) { | |
auto wait = next_ - now; | |
next_ += permits * period_; | |
return wait; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moot now, but if (next > now)
was the simpler of the branches. Thanks.
* operation. For example, instead of acquiring one permit per request, you | ||
* might choose to acquire one permit per repeated field in a request. | ||
*/ | ||
Clock::duration acquire(std::int64_t permits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should return the actual number of permits granted and the sleep time? Right now it assumes the caller is using the class correctly, without opportunity for early wake ups from a sleep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should return the actual number of permits granted and the sleep time?
I do not know what you mean. This class always grants all of the permits it is asked for.
I think you are saying: return how many of the requested permits can be granted immediately?
I don't want to do that. If the caller can split up the work, they should split up the calls to acquire()
.
Right now it assumes the caller is using the class correctly
Yeah.
stored_permits_(max_stored_permits), | ||
max_stored_permits_(max_stored_permits) { | ||
if (period_ < Clock::duration::zero()) { | ||
GCP_LOG(FATAL) << "RateLimiter period must be > 0."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/>/>=/
[You might also consider just using the absolute value.]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go with absolute value.
/** | ||
* A threadsafe interface for rate limiting. | ||
* | ||
* The caller tells the `RateLimiter` how many permits it wants to acquire. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now consider s/permits/cycles/ (which I like) or s/permits/tokens/ (which may be more familiar).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are going with "tokens".
|
||
using Clock = RateLimiter::Clock; | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Because it is better to be intentional than to act by default. And, it is better to make things easier to read than to write. But, your choice.
* Note that the current next_ has already been calculated. This new rate will | ||
* not apply to it. The new rate will apply to every `acquire()` after next. | ||
*/ | ||
void set_period(Clock::duration period) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not accept the same period
types as the constructor? (Or, alternatively, why not restrict the constructor type to Clock::duration
? And if you do that, move this implementation to the .cc
file.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks.
why not restrict the constructor type to
Clock::duration
?
- I'd rather
duration_cast
than make any assumptions aboutClock::duration
- I'd rather do the
duration_cast
in this class, than have the caller do it.
RateLimiter::Clock::duration RateLimiter::acquire(std::int64_t permits) { | ||
auto const now = clock_->Now(); | ||
std::lock_guard<std::mutex> lk(mu_); | ||
if (next_ <= now) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems strange that these cases are so different, when you'd like to think that next_ == now
could go either way and have the same effect.
Consider accounting for stored_permits_
within next_
(e.g., (stored_permits_ == 1
, next_ == t
) => (next_ == t - period_
). Then, perhaps, all these distinctions disappear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am frankly embarrassed by how much simpler the code is. Thanks for the help.
// increase with time. | ||
using Clock = ::google::cloud::internal::SteadyClock; | ||
|
||
explicit RateLimiter(std::shared_ptr<Clock> clock, double rate, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue there is the way the Session
constructor defaults clock
. You don't have that problem, and so could take Clock* clock
with a lifetime caveat. Tests should all have a single clock that lasts the lifetime of the test, and non-tests should have a single, global-extent clock, period. No need for shared ownership in either case.
next_(clock_->Now() - max_stored_tokens * period_), | ||
max_stored_tokens_(max_stored_tokens) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I entered this comment before, but I don't see it now, so I must have done something wrong. Sorry if it shows up twice.
It looks like you could do some strength reduction by storing max_stored_tokens * period
instead of max_stored_tokens_
by itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Changed the interface to accept the "smoothing interval". This seems nicer than storing some volume that must be updated every time we change the rate. (I am open to other names).
Changed the implementation to start with an empty bank. (This made the tests nicer. I don't think it's a big deal either way).
56531f9
to
094d05b
Compare
Part of the work for #12959
Add a generic interface for rate limiting. Put it in
bigtable
, because that is the only place it is needed for now. We can always move it to common if the need arises.We do not include any Bigtable specific logic (e.g. how to update the rate given a response) in this interface. We will add another class for that later.
This change is