Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix token bucket script #12

Merged
merged 3 commits into from
Mar 6, 2024
Merged

Conversation

fredrikgyll
Copy link
Contributor

@fredrikgyll fredrikgyll commented Mar 6, 2024

The limiter, as written, will force all consumers to sleep on average time_between_slots seconds, regardless of all other parameters and number of workers.

Problem

The first time the script is called, it sets the next available slot to "now + time_between_slots" due to the line slot = now + time_between_slots. This immediately schedules the next "available" token a full time_between_slots into the future, regardless of the amount of tokens that exist.

Then, when the script is called again, if tokens have been used or the time has passed, it still returns the incorrectly advanced "next slot" time, continuously enforcing a delay of, at a minimum, time_between_slots into the future, again ignoring the actual amount of tokens available.

Solution

The updated script correctly calculates the amount of slots that have passed since the last token was added and refills tokens accordingly. It also advanses slots according to slots passed and only tells the consumer to wait when the bucket is empty.

Demo

Before: Adding 3 items
from django_redis import get_redis_connection
from limiters import SyncTokenBucket

# Allow 30 requests per second
limiter = SyncTokenBucket(
    name='test2',
    refill_frequency=1,
    refill_amount=30,
    capacity=30,
    max_sleep=30,
    connection=get_redis_connection(),
)

# these prints should run without any delay. There are 30 tokens in the bucket
for i in range(3):
    with limiter:
        print(i)

0
[03/06/24 11:42:23] INFO     limiters.token_bucket [None] - Sleeping 0.999407 seconds    token_bucket.py:57
                             (test2)
1
[03/06/24 11:42:24] INFO     limiters.token_bucket [None] - Sleeping 0.994734 seconds    token_bucket.py:57
                             (test2)
2
[03/06/24 11:42:25] INFO     limiters.token_bucket [None] - Sleeping 0.998275 seconds    token_bucket.py:57
                             (test2)
3
After: Adding 60 items
from django_redis import get_redis_connection
from limiters import SyncTokenBucket

# Allow 30 requests per second
limiter = SyncTokenBucket(
    name='test2',
    refill_frequency=1,
    refill_amount=30,
    capacity=30,
    max_sleep=30,
    connection=get_redis_connection(),
)

# these prints should run with *one* delay. There are 30 tokens in the bucket.
for i in range(60):
    with limiter:
        print(i)

[03/06/24 18:54:49] INFO     limiters.token_bucket [None] - Sleeping 0.019292 seconds (test2)                                                                      token_bucket.py:57
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
                    INFO     limiters.token_bucket [None] - Sleeping 0.988805 seconds (test2)                                                                      token_bucket.py:57
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
Simulating a worker pool
# token_bucket_test.py
import time
from multiprocessing.pool import Pool
from random import random

from django_redis import get_redis_connection
from limiters import SyncTokenBucket

limiter = SyncTokenBucket(
    name='test1',
    refill_frequency=1,
    refill_amount=10,
    capacity=10,
    max_sleep=30,
    connection=get_redis_connection(),
)


def f(x):
    with limiter as s:
        time.sleep(random())  # some blocking work, max 1 sec
        return s


if __name__ == '__main__':
    t0 = time.time()
    r = 50
    with Pool(8) as p:
        total_sleep = sum(p.imap(f, range(r)))
    t1 = time.time()
    d = t1 - t0
    print(f'{r} tokens took {d:.2f} seconds = {r/d:.2f} tokens/s')
    print(f'Average wait: {total_sleep/r:.2f} s')

Before:

python token_bucket_test.py
50 tokens took 12.47 seconds = 4.01 tokens/s
Average wait: 1.15 s

After:

python token_bucket_test.py
50 tokens took 5.66 seconds = 8.83 tokens/s
Average wait: 0.13 s

@fredrikgyll fredrikgyll requested a review from klette March 6, 2024 12:14
@fredrikgyll fredrikgyll self-assigned this Mar 6, 2024
sourcery-ai[bot]

This comment was marked as spam.

@fredrikgyll fredrikgyll force-pushed the fredrikgy/fix-tokan-bucket-script branch from 7b360a7 to 492e51e Compare March 6, 2024 12:37
@fredrikgyll fredrikgyll force-pushed the fredrikgy/fix-tokan-bucket-script branch from 492e51e to a4fe739 Compare March 6, 2024 12:59
@fredrikgyll fredrikgyll requested review from magnudae, larsjr, rix1 and kmkr March 6, 2024 13:03
Copy link
Member

@klette klette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@fredrikgyll fredrikgyll merged commit 3a420e0 into main Mar 6, 2024
4 checks passed
@fredrikgyll fredrikgyll deleted the fredrikgy/fix-tokan-bucket-script branch March 6, 2024 20:50
@larsjr
Copy link
Contributor

larsjr commented Mar 7, 2024

Thanks for fixing this! And for adding a great PR description with demo 👏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants