Exponential Backoff ranges are incorrect #11107

dbolduc · 2023-03-25T17:56:57Z

We have a scaling factor, but we multiply the initial range by 2 instead.

google-cloud-cpp/google/cloud/internal/backoff_policy.h

Line 131 in 9737c4b

current_delay_range_(2 * initial_delay_),

Moreover, we do not check whether we have hit the maximum until after generating the first backoff.

google-cloud-cpp/google/cloud/internal/backoff_policy.cc

Lines 45 to 55 in 9737c4b

    
    std::uniform_int_distribution<microseconds::rep> rng_distribution(
        current_delay_range_.count() / 2, current_delay_range_.count());
    // Randomized sleep period because it is possible that after some time all
    // client have same sleep period if we use only exponential backoff policy.
    auto delay = microseconds(rng_distribution(*generator_));
    current_delay_range_ = microseconds(static_cast<microseconds::rep>(
        static_cast<double>(current_delay_range_.count()) * scaling_));
    if (current_delay_range_ >= maximum_delay_) {
      current_delay_range_ = maximum_delay_;
    }
    return duration_cast<milliseconds>(delay);

This leads to kooky behavior. Let's say I have an ExponentialBackoffPolicy(10s, 11s, 1.1).

The first range will be: [10s, 20s], which does not respect the maximum.

All subsequent ranges will be: [5.5s, 11s], which does not respect the minimum.

Lol.
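
To make the arithmetic concrete, here is a minimal sketch that replays only the range bookkeeping from the snippet above, without the random draw. `BuggyRanges` and `Range` are hypothetical names used for illustration; they are not part of google-cloud-cpp.

```cpp
#include <cassert>
#include <chrono>
#include <utility>
#include <vector>

using std::chrono::microseconds;

// One backoff range: the delay would be drawn uniformly from [first, second].
using Range = std::pair<microseconds, microseconds>;

// Hypothetical helper (not library code): mirrors how the range evolves in
// the snippet quoted above, call after call.
std::vector<Range> BuggyRanges(microseconds initial, microseconds maximum,
                               double scaling, int calls) {
  assert(scaling > 1.0);
  // The constructor starts at 2 * initial_delay and ignores `scaling`.
  auto current = 2 * initial;
  std::vector<Range> ranges;
  for (int i = 0; i != calls; ++i) {
    // The range is used before it is ever compared against the maximum.
    ranges.emplace_back(current / 2, current);
    current = microseconds(static_cast<microseconds::rep>(
        static_cast<double>(current.count()) * scaling));
    if (current >= maximum) current = maximum;
  }
  return ranges;
}
```

Calling `BuggyRanges(10s, 11s, 1.1, 3)` (with the `std::chrono` literals) gives [10s, 20s] for the first call and [5.5s, 11s] for the next two, matching the ranges above.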


alevenberg · 2023-05-08T17:54:23Z

To further recap what Darren described:
Current behavior
ExponentialBackoffPolicy(10s, 11s, 1.1)
First call: [10s, 20s]
Second call: [5.5s, 11s]
Third call: [5.5s, 11s]

Intended behavior
ExponentialBackoffPolicy(10s, 11s, 1.1)
First call: [10s, 11s]
Second call: [10s, 11s]
Third call: [10s, 11s]

A more general case:
Current behavior
ExponentialBackoffPolicy(1s, 11s, 2)
First call: [1s, 2s]
Second call: [2s, 4s]
Third call: [4s, 8s]
Fourth call: [5.5s, 11s]
Fifth call: [5.5s, 11s]

Intended behavior
ExponentialBackoffPolicy(1s, 11s, 2)
First call: [1s, 2s]
Second call: [1s, 4s]
Third call: [1s, 8s]
Fourth call: [1s, 11s]
Fifth call: [1s, 11s]

#8755 covers the discussion behind the initial implementation decision. I will move forward with implementing min-jitter.

dbolduc · 2023-05-08T18:09:48Z

> I will move forward with implementing min-jitter.

errrr..... I do not recommend this. I think you should fix one bug at a time, starting with this bug. Yes #8755 is juicier, but you will benefit from starting on a smaller problem.

I would say the intended behavior we want for the "more general case" (as far as this bug is concerned) is the current behavior:

ExponentialBackoffPolicy(1s, 11s, 2) gives...

call #	backoff range
1	[1s, 2s]
2	[2s, 4s]
3	[4s, 8s]
4	[5.5s, 11s]
N	[5.5s, 11s]
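
One way to get the ranges in this table, and [10s, 11s] for the (10s, 11s, 1.1) case, is sketched below. This is only an illustration under my own assumptions, not necessarily what the fix in #11529 does: the upper bound grows by the scaling factor and is capped before it is used, and the lower bound is the upper bound divided by the scaling factor, but never below the initial delay. `ClampedRanges` and `Range` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <utility>
#include <vector>

using std::chrono::microseconds;

// One backoff range: the delay would be drawn uniformly from [first, second].
using Range = std::pair<microseconds, microseconds>;

// Hypothetical helper (not library code): computes the range for each call.
std::vector<Range> ClampedRanges(microseconds initial, microseconds maximum,
                                 double scaling, int calls) {
  assert(scaling > 1.0 && initial <= maximum);
  auto scaled = [](microseconds d, double factor) {
    return microseconds(static_cast<microseconds::rep>(
        static_cast<double>(d.count()) * factor));
  };
  // Use the scaling factor (not a hard-coded 2) and cap the upper bound
  // before the first range is handed out.
  auto upper = std::min(scaled(initial, scaling), maximum);
  std::vector<Range> ranges;
  for (int i = 0; i != calls; ++i) {
    // Assumption: the lower bound never drops below the initial delay.
    auto lower = std::max(initial, scaled(upper, 1.0 / scaling));
    ranges.emplace_back(lower, upper);
    upper = std::min(scaled(upper, scaling), maximum);
  }
  return ranges;
}
```

With (1s, 11s, 2) this produces [1s, 2s], [2s, 4s], [4s, 8s], then [5.5s, 11s] from the fourth call on. With (10s, 11s, 1.1) every call gets [10s, 11s]. Clamping the lower bound with `std::max` also hides any one-microsecond rounding error from the floating-point division.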

dbolduc added the "type: bug" and "priority: p2" labels Mar 25, 2023

alevenberg self-assigned this May 8, 2023

alevenberg linked a pull request May 8, 2023 that will close this issue

fix: correct exponential backoff ranges #11529

Merged

alevenberg mentioned this issue May 9, 2023

fix: correct exponential backoff ranges #11529

Merged

alevenberg closed this as completed in #11529 May 11, 2023
