
Add prefix cache aware routing #641

Merged — 11 commits merged into main from add-prefix-cache on Feb 10, 2025
Conversation

varungup90
Copy link
Collaborator

No description provided.

@varungup90 varungup90 changed the title Add prefix cache aware routing WIP: Add prefix cache aware routing Feb 7, 2025
@varungup90 varungup90 changed the title WIP: Add prefix cache aware routing Add prefix cache aware routing Feb 7, 2025
// Hash each fixed-size block of unmatched tokens independently.
if end > len(unMatchedTokens) {
	end = len(unMatchedTokens)
}
chunk := unMatchedTokens[i:end]
prefixHash := xxhash.Sum64(IntArrayToByteArray(chunk))
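For context, the helper might look roughly like the sketch below. This is a guess at its shape, not the PR's actual implementation: tokens are encoded as fixed-width little-endian bytes so a block can be fed to a byte-oriented hash such as `xxhash.Sum64`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// IntArrayToByteArray (hypothetical shape): encode each token ID into
// 8 little-endian bytes so the block hashes deterministically.
func IntArrayToByteArray(tokens []int) []byte {
	buf := make([]byte, 8*len(tokens))
	for i, t := range tokens {
		binary.LittleEndian.PutUint64(buf[i*8:], uint64(t))
	}
	return buf
}

func main() {
	// 3 tokens -> 24 bytes.
	fmt.Println(len(IntArrayToByteArray([]int{5, 6, 7})))
}
```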
Collaborator

Here, it just considers the current block? Just to confirm: this is not 100% the same as vLLM's solution, right? In vLLM, the first block's hash is part of the second block's hash input.

Collaborator Author

Yes, we hash only the current block. There is no linked-list-style chaining like vLLM has; for our use case we do not need that behavior.
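The two schemes being compared can be sketched as follows. This is illustrative only: the block size, the FNV hash (standing in for xxhash), and the encoding are assumptions, not the PR's actual code. The key difference is that chained (vLLM-style) hashes make block k's hash depend on block k-1's hash, while independent hashes do not.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

const blockSize = 4 // illustrative, not the PR's value

// hashBlock hashes one block of tokens; seed lets the chained variant
// feed in the previous block's hash (0 means no chaining).
func hashBlock(seed uint64, tokens []int) uint64 {
	h := fnv.New64a() // fnv stands in for xxhash here
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], seed)
	h.Write(buf[:])
	for _, t := range tokens {
		binary.LittleEndian.PutUint64(buf[:], uint64(t))
		h.Write(buf[:])
	}
	return h.Sum64()
}

// independentHashes: this PR's approach — each block hashed on its own.
func independentHashes(tokens []int) []uint64 {
	var out []uint64
	for i := 0; i < len(tokens); i += blockSize {
		end := i + blockSize
		if end > len(tokens) {
			end = len(tokens)
		}
		out = append(out, hashBlock(0, tokens[i:end]))
	}
	return out
}

// chainedHashes: vLLM-style — block k's hash covers the whole prefix.
func chainedHashes(tokens []int) []uint64 {
	var out []uint64
	prev := uint64(0)
	for i := 0; i < len(tokens); i += blockSize {
		end := i + blockSize
		if end > len(tokens) {
			end = len(tokens)
		}
		h := hashBlock(prev, tokens[i:end])
		out = append(out, h)
		prev = h
	}
	return out
}

func main() {
	a := []int{1, 2, 3, 4, 5, 6, 7, 8}
	b := []int{9, 9, 9, 9, 5, 6, 7, 8} // same 2nd block, different 1st
	fmt.Println(independentHashes(a)[1] == independentHashes(b)[1]) // true
	fmt.Println(chainedHashes(a)[1] == chainedHashes(b)[1])        // false
}
```

With independent hashes, two prompts that share a middle block still hit on it even if their earlier blocks differ; with chained hashes, a block hash only matches when the entire prefix matches.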

Collaborator

In that case, we need to do up to O(n) hash computations, where n is the number of blocks?

Collaborator Author

@varungup90 varungup90 Feb 9, 2025

Yes. If you are suggesting that the linked-list behavior could avoid the O(n) computations, it would not: the total computation stays the same.

Collaborator

It's a little bit different. With the linked-list (chained) hashes you can do optimizations such as binary search; this way, we can only do O(n).
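The optimization being referred to can be sketched like this (an illustration of the idea, not code from this PR or from vLLM). Because a chained hash at block k covers the entire prefix up to k, "block k hits the cache" implies every earlier block also hits, so hit/miss is monotone over k and the longest matching prefix can be found by binary search instead of a linear scan:

```go
package main

import (
	"fmt"
	"sort"
)

// longestPrefixBlocks counts how many leading blocks of a request hit
// the cache. With chained hashes the hit/miss predicate is monotone in
// k, so sort.Search finds the first miss with O(log n) cache lookups.
// (With independent per-block hashes this monotonicity does not hold,
// so a full O(n) scan is needed.)
func longestPrefixBlocks(chained []uint64, cache map[uint64]bool) int {
	// sort.Search returns the smallest k where the predicate is true,
	// i.e. the first missing block; that index equals the hit count.
	return sort.Search(len(chained), func(k int) bool {
		return !cache[chained[k]]
	})
}

func main() {
	// Hypothetical chained hashes for a 5-block prompt; first 3 cached.
	chained := []uint64{11, 22, 33, 44, 55}
	cache := map[uint64]bool{11: true, 22: true, 33: true}
	fmt.Println(longestPrefixBlocks(chained, cache)) // 3
}
```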

Collaborator Author

Let me look into this, but for our use case we need to evaluate all blocks to ensure a 50%+ hit rate.

Collaborator

@gaocegege gaocegege left a comment

Some nits.

Collaborator

@Jeffwan Jeffwan left a comment

Let's move a little bit faster and track those TODOs in separate issues.

@varungup90 varungup90 merged commit f40c973 into main Feb 10, 2025
9 of 10 checks passed
@varungup90 varungup90 deleted the add-prefix-cache branch February 10, 2025 22:09