Add draft of throttled copy #258
base: master
Conversation
Having this feature is a good idea; it could be quite useful. But I think this could be implemented a bit more generically. How about:
It'd be nice if we could make that more generic than just the copy function and expand it to all, though it might be hard to specify this on the command line since there could be several stores involved in a single command. Perhaps we could just expect this to be defined in the store config file (https://github.com/folbricht/desync?tab=readme-ov-file#configuration). Thoughts?
No problem. I'll obviously defer to your expertise in the matter. Edit: Are you good with me limiting this to the
You don't need to change the
Hope that helps, let me know if not and I can whip up a template on a branch which you can then complete.
ratelimitstore.go (Outdated)
ctx, cancel := context.WithTimeout(context.Background(), s.options.timeout)
defer cancel()

if s.options.immediateOrFail {
Do you actually need immediateOrFail? Seems unexpected to cause a failure because the rate-limit was exceeded.
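For reference, the difference comes down to something like the following sketch (assuming s.limiter is a golang.org/x/time/rate Limiter and RateLimitedExceeded is the error value from this draft; not code from the branch):

// immediateOrFail: give up right away if no token is currently available.
if !s.limiter.Allow() {
    return RateLimitedExceeded
}

// Alternative: block until a token is available (or ctx is cancelled).
if err := s.limiter.Wait(ctx); err != nil {
    return err
}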
I could go either way. I guess I'd lean towards removal.
Related:
I'm not sure what the overall failure handling should be. I guess a client normally wants to wait, but I don't like naked waits with no ability to time out. Perhaps the timeout should default to a large value.
We could have two paths, one where we use a context with a timeout and one without. But that's another code path.
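Purely as a sketch, those two paths would look roughly like this (not code from the branch):

// Path A: bounded wait using a context with a timeout.
ctx, cancel := context.WithTimeout(context.Background(), s.options.timeout)
defer cancel()
if err := s.limiter.Wait(ctx); err != nil {
    return err
}

// Path B: unbounded wait, only interrupted if the limiter itself fails.
if err := s.limiter.Wait(context.Background()); err != nil {
    return err
}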
It'd certainly be nice to have a context on all these functions, but that's a major refactoring of the library that should be done in a separate PR.
The wait really depends on the rate-limit here, and I don't think it should hang forever. Adding a timeout seems overly complex and I'd remove that. It might be possible to calculate some upper limit based on the rate-limit used and the max chunk size.
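One rough way to derive such an upper bound, purely as a sketch (bytesPerSecond and maxChunkSize are illustrative values, not existing options):

// In the worst case a single max-size chunk needs maxChunkSize/bytesPerSecond
// seconds worth of tokens; pad that with a safety factor.
secs := float64(maxChunkSize) / bytesPerSecond
maxWait := time.Duration(secs * float64(time.Second))
ctx, cancel := context.WithTimeout(context.Background(), 2*maxWait)
defer cancel()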
To confirm my understanding:
- Remove the timeout from the context.
- Allow the ops to wait indefinitely, with no timeout.
I'm ultimately only concerned with throttling cache operations against my local store, so I think that's fine for me. However, I haven't looked at the serving code much; is there a chance that this affects HTTP handling?
Correct, no custom timeout. This can be implemented some day when contexts are added to all store operations.
This shouldn't affect HTTP stores at all, since we're not really limiting the data rate, but the speed at which whole chunks can be read/stored. So if the rate-limiter allows one chunk to be accessed, that chunk will be transferred at max speed, which means the HTTP timeouts apply as normal. Chunk downloading is limited to the rate-limit on average, not for individual chunks.
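In other words, the limiter only gates admission of a chunk; a sketch of a configuration that behaves this way (names are illustrative, not from the PR):

// Average rate in bytes per second, with a burst large enough that a whole
// max-size chunk can be admitted at once and then transfer at full speed.
limiter := rate.NewLimiter(rate.Limit(bytesPerSecond), maxChunkSize)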
I added a test with the HTTPHandler. There is a way to reproduce an error:
goroutine 125 [select]:
golang.org/x/time/rate.(*Limiter).wait(0xc0001df360, {0xe27338, 0x13ceb80}, 0x1, {0x0?, 0x0?, 0x139e0c0?}, 0xd56f30)
/home/bstrausser/go/pkg/mod/golang.org/x/time@v0.5.0/rate/rate.go:285 +0x3e5
golang.org/x/time/rate.(*Limiter).WaitN(0x79e845e27a40b76?, {0xe27338, 0x13ceb80}, 0x269063b6f56504d5?)
/home/bstrausser/go/pkg/mod/golang.org/x/time@v0.5.0/rate/rate.go:248 +0x50
github.com/folbricht/desync.RateLimitedStore.HasChunk({{0xe27dc0, 0xc0003f2b00}, 0xc0001df360, {0x3fc999999999999a, 0x1}}, {0x76, 0xb, 0xa4, 0x27, 0x5e, ...})
/home/bstrausser/Git/desync/ratelimitstore.go:59 +0x7e
github.com/folbricht/desync.HTTPHandler.head({{{0xceccfb, 0x5}, 0x1, {0x0, 0x0}}, {0xe273e0, 0xc00040cd50}, 0x0, {0xc0003f0990, 0x1, ...}, ...}, ...)
/home/bstrausser/Git/desync/httphandler.go:77 +0x4e
github.com/folbricht/desync.HTTPHandler.ServeHTTP({{{0xceccfb, 0x5}, 0x1, {0x0, 0x0}}, {0xe273e0, 0xc00040cd50}, 0x0, {0xc0003f0990, 0x1, ...}, ...}, ...)
/home/bstrausser/Git/desync/httphandler.go:47 +0x266
net/http.serverHandler.ServeHTTP({0xc001bda720?}, {0xe25808?, 0xc00061c2a0?}, 0x6?)
/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc0006a0c60, {0xe275d8, 0xc00040d1a0})
/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 9
/usr/local/go/src/net/http/server.go:3086 +0x5cb
That's just part of the stacktrace. Is there a repro on your branch? What test can I run to see it?
When I went back to look, I couldn't repro.
ratelimitstore.go (Outdated)

func (s RateLimitedLocalStore) GetChunk(id ChunkID) (*Chunk, error) {

    return s.wrappedStore.GetChunk(id)
If you wanted to limit reads as well (and I think we should), it would make sense to first call GetChunk() on the underlying store, then wait after we have the data. That isn't super-precise in terms of limiting since there could be bursts of maxChunkSize * n at startup, but over time it should even out.
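A sketch of that ordering, assuming the fields used elsewhere in this PR (wrappedStore, limiter) and desync's Chunk.Data(); details may differ from the branch:

func (s RateLimitedStore) GetChunk(id ChunkID) (*Chunk, error) {
    // Fetch first, then charge the limiter for the actual size of the data.
    chunk, err := s.wrappedStore.GetChunk(id)
    if err != nil {
        return nil, err
    }
    b, err := chunk.Data()
    if err != nil {
        return nil, err
    }
    // No custom timeout, per the discussion above.
    if err := s.limiter.WaitN(context.Background(), len(b)); err != nil {
        return nil, err
    }
    return chunk, nil
}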
Agreed. I'll invert the order and add in support for reads.
ratelimitstore.go (Outdated)

    immediateOrFail bool
}

type RateLimitedLocalStore struct {
Why the Local in the name? This could limit any kind of store.
Correct. I'll fix. I think I was originally wrapping a local store only.
Add throttle on reads.

func (s RateLimitedStore) StoreChunk(chunk *Chunk) error {

    // This isn't ideal because what I'm really interested in is the size over the wire.
I think you can actually achieve this by using a limit of "bytes per second" (must be equal to or larger than the max chunk size). Then here in StoreChunk() you can:
b, err := chunk.Data()
if err != nil {
    return err
}
err = s.limiter.WaitN(ctx, len(b))
if err != nil {
    return RateLimitedExceeded
}
return s.wrappedStore.StoreChunk(chunk)
It's a little more interesting in GetChunk() since you don't know how large the chunk is. One way would be to WaitN(ctx, <avg-chunk-size>) before calling the wrapped store, or WaitN(ctx, <size of data>) after actually pulling the chunk from the wrapped store. The latter can cause spikes early on when concurrency is used, but should average out to the rate you set over time.
As for HasChunk(), I'm not sure you really need to limit it, it's very small. You can just use 1 like you have now if you want to limit it too; you could do that before calling the wrappedStore.
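Putting that together, a rough sketch of those two methods (avgChunkSize is an assumed constant, and HasChunk returning (bool, error) is assumed from desync's Store interface; not code from the branch):

// Charge a nominal cost of 1 for existence checks.
func (s RateLimitedStore) HasChunk(id ChunkID) (bool, error) {
    if err := s.limiter.WaitN(context.Background(), 1); err != nil {
        return false, err
    }
    return s.wrappedStore.HasChunk(id)
}

// Read variant that charges an average chunk's worth of tokens up front.
func (s RateLimitedStore) GetChunk(id ChunkID) (*Chunk, error) {
    if err := s.limiter.WaitN(context.Background(), avgChunkSize); err != nil {
        return nil, err
    }
    return s.wrappedStore.GetChunk(id)
}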
Adds the ability to do throttled copy operations related to filling the cache.
Problem
Basically I'm trying to replicate what curl offers - https://everything.curl.dev/usingcurl/transfers/rate-limiting
We had discussed before doing this on the store/server side. I'm not sure that works for our case.
- We have many devices, 500-1500 per facility (warehouse).
- Each device is generally doing work, so we need to dribble in updates in the background.
- Devices are wifi-aware, so if they are in a better part of the facility they can stream faster.
- The store wouldn't have knowledge of the IP-to-device mapping. The store wouldn't know if the device is busy or in a good wifi area.
Approach
I think I took a pretty standard approach for rate-limiting workers. It would have been better if I had been able to get the size of the chunks. Right now, waiting a fixed period essentially just limits the number of chunks per second, which gets you roughly (average chunk size) bytes per second. At least this way there is no coordination needed between workers to sum up bytes.
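For what it's worth, that per-chunk throttle amounts to something like the following sketch (chunksPerSecond, chunkIDs, and copyChunk are illustrative; this is not the code in the PR):

// All copy workers draw from the same token bucket, so no byte counting or
// coordination between workers is needed.
limiter := rate.NewLimiter(rate.Limit(chunksPerSecond), 1)

for _, id := range chunkIDs {
    // One token per chunk: throughput averages out to roughly
    // chunksPerSecond * averageChunkSize bytes per second.
    if err := limiter.Wait(ctx); err != nil {
        return err
    }
    if err := copyChunk(id); err != nil {
        return err
    }
}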