-
Notifications
You must be signed in to change notification settings - Fork 107
Implementing ElementCount and ElementBytes Triggers #734
Conversation
@igorbernstein2 Please have a look. Also Please let me know If we should split this in two part (like separate PR for copied classes eg. |
gax/src/main/java/com/google/api/gax/batching/v2/BatcherImpl.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/BatcherImpl.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/BatchingThreshold.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/BatchingSettings.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/BatchingSettings.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/BatchingSettings.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/NumericThreshold.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/BatchingSettings.java
Outdated
Show resolved
Hide resolved
gax/src/test/java/com/google/api/gax/batching/v2/BatcherImplTest.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/v2/BatchingThreshold.java
Outdated
Show resolved
Hide resolved
Codecov Report
@@ Coverage Diff @@
## master #734 +/- ##
============================================
+ Coverage 77.66% 77.86% +0.19%
- Complexity 1091 1107 +16
============================================
Files 197 199 +2
Lines 4778 4829 +51
Branches 371 379 +8
============================================
+ Hits 3711 3760 +49
Misses 898 898
- Partials 169 171 +2
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #734 +/- ##
============================================
+ Coverage 77.66% 77.72% +0.05%
- Complexity 1091 1094 +3
============================================
Files 197 198 +1
Lines 4778 4799 +21
Branches 371 377 +6
============================================
+ Hits 3711 3730 +19
Misses 898 898
- Partials 169 171 +2
Continue to review full report at Codecov.
|
* Represents the batching settings to use for an API method that is capable of batching. | ||
* | ||
* <p>Each batching client will have to define their own set of default values for these thresholds, | ||
* which would be safest behavior for their jobs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to be in indecisive here, but I think that the defaults should be Long.MAX_VALUE. I think will serve better to capture a user's intention when they write Batching.newBuilder().setMaxElements(100).build()
. Specifically when user doesn't set a maximum value, I think they intuition is that there is no max. And now that manual flushing exists and is hooked into close(), I think it's safer to have these defaults.
Also in follow up PRs, I think we should:
- introduce a default timed flush trigger
- add a warning when batchers are garbage collected w/o being closed similar to grpc's channels:
https://github.com/grpc/grpc-java/blob/48ca4527c14a95914f9cb7f58ec72997cb96899a/core/src/main/java/io/grpc/internal/ManagedChannelImpl.java#L1262
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please update the docs & tests to always close the batcher
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the reference. Would sure look into it and raise follow up PR accordingly.
Also, Please have a look at this PR, I have addressed these feedback in the latest commit.
LGTM, @vam-google, can you take a pass? |
/** Get a new builder. */ | ||
public static Builder newBuilder() { | ||
return new AutoValue_BatchingSettings.Builder() | ||
.setElementCountThreshold(Long.MAX_VALUE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long.MAX_VALUE
is effectively Never. Is it what we want here? More importantly, why do we use long for element count threshold? Do we really intend to support more than 2 billion elements (we most probably will get out of memory long before reaching 2 billion anyways)? Long for bytes threshold is probably Ok, but why for elements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that element bytesize flushing should be disabled by default. Totally fine to downgrade to an int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vam-google, @igorbernstein2 Thanks! for the review. Have updated element count to be int
type with the default of Integer.MAX_VALUE
.
public static Builder newBuilder() { | ||
return new AutoValue_BatchingSettings.Builder() | ||
.setElementCountThreshold(Long.MAX_VALUE) | ||
.setRequestByteThreshold(Long.MAX_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use some reasonably big value here (and in element count) instead of MAX_VALUE
? People most probably will go with default values first (yes, everything is documented, but people read documentaiton only after something broke..). The default values can result in OutOfMemoryException
quite easily. And even if not, it may be confusing for people that their requests do not get sent for some reason (because it just keeps accumulating everything).
Another option is to always enforce setting those values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 2 audiences here:
- client developers
- end users
The client developers will configure defaults per service in their StubSettings. Each service will have different profiles. When looking at the code:
BatchingSettings.newBuilder().setMaxElementCount(100).build();
I would be very surprised if it flushed a single large element. I would want it to be spelled out in the StubSetting defaults as
BatchingSettings.newBuilder().setElementCountThreshold(100).setRequestByteThreshold(1024).build();
The other audience is the end users, the expectations with them is that they should use toBuilder() on the client developer provided instance to tweak stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@igorbernstein2 Who are the client developers? Do you mean the manual layer on top of this? If yes, then most of the generated clients don't have that wrapper, so the there is simply no toBuilder()
provided for them by the client developer.
This can also be done in gapic-generator level (generator will be the repo, where the "default" value can be found), ut it seems like an unnecessary complication (why to split in in two places and make more confusing, when it can be just here and documented so people can easily find the default value). Also, currently MAX_VALUE is the default value, and if it actually (in the manual wrapper or gapic-generator) is always set to something different by default, then it is even more confusing (we end up in a situation, when we have a default value, which is by default overridden by another default value).
In general it looks like this value should not have a default value at all, and build()
on the builder should fail if there is not explicit value provided (which i think is perfectly fine, and probably the best solution here, enforcing setting values, which are hard/impossible to have default values for)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with no default
Added two threshold in the `BatcherImpl` which will trigger automatic batching upon breach of any of these. - ElementCount: Number of elements queued up till now. - ElementBytes: Size of the accumulated elements. - Threshold can be set using `BatchingSettings`. - Javadoc fixes to remove v1 reference - Removed unnecessary abstraction and made thresholds values as required Updated BatchingSettings default value to Long.MAX_VALUE
also, added Javadoc at BatcherImpl XOR.
1d16842
to
d632987
Compare
Preconditions.checkState( | ||
settings.getElementCountThreshold() > 0, "elementCountThreshold must be positive"); | ||
Preconditions.checkState( | ||
settings.getRequestByteThreshold() > 0, "requestByteThreshold must be positive"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think allowing 0 is ok here to signify that batching is disabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have updated the BatchingSettings
to accept 0 and added a line in Javadoc for the same.
Now BatcherImpl.Batch#hasAnyThresholdReached
checks the disability.
|
||
boolean hasAnyThresholdReached() { | ||
return (elementThreshold != 0 && elementThreshold <= elementCounter) | ||
|| (bytesThreshold != 0 && bytesThreshold <= byteCounter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want it to immediately flush if threshold is zero, please remove the !=0
clauses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry, I assumed disable meaning disabling the threshold(though that can be done by setting a very high value).
Just a QQ: If the user sets any threshold to 0 then all other thresholds will be ignored. Is this what we want to add here... right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with few comments
This is in continuation of follow up PR for new Batching API(#716)
What This PR contains
Added two new thresholds in the
BatcherImpl.java
which will trigger automatic batch processing upon breach of any one of this threshold.A threshold would be added in BatcherImpl via
v2.BatchingSettings.java
and It will be calculated with the help ofBatchingThreshold.java
,v2.ElementCounter.java
&v2.NumericThreshold.java
.Note: Once this PR is merged, I would raise another followup PR with DelayThreshold/maxDelay.