Consider offering a full jitter backoff policy #8755

dbolduc · 2022-04-14T20:10:57Z

Here is a write-up on full jitter. Apparently the Java clients do this.

With a pure exponential backoff policy, we might have backoff ranges like:
req1: (.5, 1)
req2: (1, 2)
req3: (2, 4)
....

With full jitter, these backoff ranges would look like:
req1: (0, 1)
req2: (0, 2)
req3: (0, 4)
....

(make up your own units)
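As a rough illustration of the two sets of ranges above, here is a small Python sketch. The function names, the base of 1 unit, and the doubling factor are assumptions chosen to reproduce the numbers in this comment; they are not taken from the library.

```python
def half_to_full_range(attempt, base=1.0, factor=2.0):
    """Range for the policy described first: (upper / 2, upper).

    `attempt` counts from 1. `base` and `factor` are illustrative assumptions.
    """
    upper = base * factor ** (attempt - 1)
    return (upper / 2, upper)


def full_jitter_range(attempt, base=1.0, factor=2.0):
    """Range with full jitter: same upper bound, lower bound fixed at 0."""
    upper = base * factor ** (attempt - 1)
    return (0.0, upper)


if __name__ == "__main__":
    for attempt in (1, 2, 3):
        print(f"req{attempt}:", half_to_full_range(attempt), full_jitter_range(attempt))
```

The only difference between the two is the lower bound: the upper bound grows identically in both cases.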

coryan · 2022-04-15T20:03:13Z

For some services (storage comes to mind), a minimum backoff is recommended.

coryan · 2022-09-08T18:32:30Z

@dbolduc is going to research some more and make a recommendation.

dbolduc · 2022-09-12T22:20:25Z

This is the code to run the simulation in the article: https://github.com/aws-samples/aws-arch-backoff-simulator/blob/master/src/backoff_simulator.py

I added our backoff algorithm:

34a35,39
> class ExpoBackoffCloudCpp(Backoff):
>     def backoff(self, n):
>         v = self.expo(n)
>         return random.uniform(v/2, v)
>

And a MinJitter (i.e. full jitter, but it doesn't start at 0):

39a45,49
> class ExpoBackoffMinJitter(Backoff):
>     def backoff(self, n):
>         v = self.expo(n)
>         return random.uniform(self.base, v)
>
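For readers who do not have the simulator checked out, the following is a self-contained sketch of the three strategies side by side. The `Backoff` base class here is a stand-in written for this example (its `base`, `cap`, and `expo()` are assumptions about the simulator's shape), and `mean_backoff` is a hypothetical helper. It does not reproduce the simulation or its results; it only samples delays per attempt.

```python
import random


class Backoff:
    """Stand-in base class (assumed shape, not copied from the simulator)."""

    def __init__(self, base, cap):
        self.base = base
        self.cap = cap

    def expo(self, n):
        # Exponential growth from `base`, capped at `cap`.
        return min(self.cap, self.base * 2 ** n)


class ExpoBackoffFullJitter(Backoff):
    def backoff(self, n):
        # Full jitter: anywhere between 0 and the exponential bound.
        return random.uniform(0, self.expo(n))


class ExpoBackoffCloudCpp(Backoff):
    def backoff(self, n):
        # The strategy added in this comment: upper half of the range only.
        v = self.expo(n)
        return random.uniform(v / 2, v)


class ExpoBackoffMinJitter(Backoff):
    def backoff(self, n):
        # Like full jitter, but the lower bound is `base` instead of 0.
        return random.uniform(self.base, self.expo(n))


def mean_backoff(policy, n, samples=10_000):
    """Hypothetical helper: average sampled delay for attempt `n`."""
    return sum(policy.backoff(n) for _ in range(samples)) / samples


if __name__ == "__main__":
    random.seed(0)
    for cls in (ExpoBackoffFullJitter, ExpoBackoffCloudCpp, ExpoBackoffMinJitter):
        policy = cls(base=1, cap=64)
        print(cls.__name__, [round(mean_backoff(policy, n), 2) for n in range(5)])
```

Since the mean of a uniform draw on (v/2, v) is 0.75v while the mean on (0, v) is 0.5v, the per-attempt delay of the `ExpoBackoffCloudCpp` variant is larger on average for the same bound.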

Results:

To summarize (with these exact settings, in this exact model), min jitter is indistinguishable from full jitter. Our strategy is indistinguishable from the jitter strategies in terms of total calls, but takes more time to complete the work.

I will phone a friend before making a recommendation...

dbolduc · 2023-03-22T22:19:38Z

Seems like other client library languages do full jitter only. Given the supposed, slight performance benefit, I think we should implement min-jitter (which is like full jitter, but slightly more general).

I think we should modify the implementation of the existing ExponentialBackoffPolicy, with minimal behavior changes for the current constructor.

I think our default backoff policy should use full jitter.

Background

Our API accepts three parameters:

- minimum delay
- maximum delay (note that this is an overall maximum)
- scaling factor

google-cloud-cpp/google/cloud/internal/backoff_policy.h

Lines 125 to 128 in 9737c4b

    template <typename Rep1, typename Period1, typename Rep2, typename Period2>
    ExponentialBackoffPolicy(std::chrono::duration<Rep1, Period1> initial_delay,
                             std::chrono::duration<Rep2, Period2> maximum_delay,
                             double scaling)

Aside: I don't agree with the range it sets. I think we should multiply by scaling, not 2. Whatever.

google-cloud-cpp/google/cloud/internal/backoff_policy.h

Line 131 in 9737c4b

current_delay_range_(2 * initial_delay_),

Min-jitter requires four parameters:

- minimum delay (0 in the case of full jitter)
- initial delay upper bound
- maximum delay
- scaling factor
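To make the four parameters concrete, here is a minimal Python sketch of a min-jitter policy. It is not the library's C++ implementation: the class name, the method name `on_completion`, the `rng` argument, and the argument validation are all assumptions made for illustration.

```python
import random


class MinJitterBackoff:
    """Hypothetical sketch of min-jitter; not the google-cloud-cpp code."""

    def __init__(self, minimum_delay, initial_delay_upper_bound,
                 maximum_delay, scaling, rng=None):
        # Validation rules below are assumptions for this sketch.
        if scaling <= 1.0:
            raise ValueError("scaling must be > 1.0")
        if not 0 <= minimum_delay <= initial_delay_upper_bound <= maximum_delay:
            raise ValueError("expected 0 <= minimum <= initial upper bound <= maximum")
        self.minimum_delay = minimum_delay
        self.maximum_delay = maximum_delay
        self.scaling = scaling
        self.upper_bound = initial_delay_upper_bound
        self.rng = rng or random.Random()

    def on_completion(self):
        """Return the next delay, then grow the upper bound for the next call."""
        delay = self.rng.uniform(self.minimum_delay, self.upper_bound)
        # The upper bound scales geometrically but never exceeds the maximum.
        self.upper_bound = min(self.upper_bound * self.scaling, self.maximum_delay)
        return delay


if __name__ == "__main__":
    # minimum_delay=0 gives full jitter.
    policy = MinJitterBackoff(0.0, 1.0, 4.0, 2.0, rng=random.Random(42))
    for _ in range(4):
        print(round(policy.on_completion(), 3), "next upper bound:", policy.upper_bound)
```

Setting `minimum_delay` to 0 yields full jitter, while a positive value covers services that recommend a minimum backoff.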

Design / Work:

I would break up the work into two PRs:

1. Implement min-jitter

- Add a new constructor and a minimum_delay_ member.
  - I'd probably rename s/current_delay_range_/current_range_upper_bound_/, too.
- Map the parameters for the existing ctor to the members. (Requires thinking!)
- Update the clone() and OnCompletion() implementations.
- Add/update tests.
- Update documentation.

2. Update library defaults

- Tell the generator to use full jitter:
  https://github.com/googleapis/google-cloud-cpp/blob/main/generator/internal/option_defaults_generator.cc#L156-L157
- Run the generator.

We want the RetryPolicyOption to use a similar policy with a min of ms(0).

We want the value of the PollingPolicyOption to be unchanged, so we should set it manually instead of using the value of the RetryPolicyOption.

"Running the generator" means:

ci/cloudbuild/build.sh -t generate-libraries-pr

It may also be useful to generate only the "golden" files (instead of 100 libraries). This just speeds up development cycles.

env GENERATE_GOLDEN_ONLY=1 ci/cloudbuild/build.sh -t generate-libraries-pr

- Use () around min and max to avoid macro expansion
- Add a test for floating point numbers
- Clarify the naming of current_delay_range_ by adding two parameters (one for the start and one for the end). In the long term we can remove this since we want to implement min jitter in issue googleapis#8755. Then the current_delay_start_ will always equal the initial_delay_.

coryan · 2023-06-09T13:43:17Z

Can we close this?

alevenberg · 2023-06-09T13:54:31Z

Yes

dbolduc added the "type: feature request" label Apr 14, 2022

coryan assigned dbolduc Feb 15, 2023

alevenberg mentioned this issue May 8, 2023

Exponential Backoff ranges are incorrect #11107

Closed

dbolduc mentioned this issue May 10, 2023

fix: correct exponential backoff ranges #11529

Merged

dbolduc assigned alevenberg and unassigned dbolduc May 11, 2023

alevenberg mentioned this issue May 18, 2023

feat: add new constructor for exponential backoff policy #11650

Merged

alevenberg mentioned this issue May 30, 2023

feat: use full jitter exp backoff policy in the generator #11748

Merged

alevenberg closed this as completed Jun 9, 2023

dbolduc mentioned this issue Oct 18, 2023

Backoff timing verification can flake googleapis/cloud-bigtable-clients-test#115

Closed
