Cache docs: update #32929
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
docs/source/en/kv_cache.md (outdated)

```python
>>> tokenizer = AutoTokenizer.from_pretrained(model_id)

>>> INITIAL_PROMPT = "You are a helpful assistant. "
>>> prompt_cache = DynamicCache()
```
I would rather promote the use of `StaticCache` here, but both are fine!
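For context, a minimal sketch of what the `StaticCache` variant could look like; the model id and cache length here are illustrative assumptions, not taken from this PR:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StaticCache

model_id = "meta-llama/Llama-2-7b-chat-hf"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Pre-allocate a fixed-size cache instead of letting it grow dynamically.
prompt_cache = StaticCache(
    config=model.config, max_batch_size=1, max_cache_len=1024, device="cuda", dtype=torch.bfloat16
)

inputs = tokenizer("You are a helpful assistant. ", return_tensors="pt").to("cuda")
with torch.no_grad():
    prompt_cache = model(**inputs, past_key_values=prompt_cache).past_key_values
```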
Oh, actually I recently found this doesn't work: `copy.deepcopy` fails even after making it a torch Module. @gante confirmed that it's expected to fail, so I'll remove this section.
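The failure being described can be reproduced outside of any cache class. A minimal sketch (variable names are illustrative): `copy.deepcopy` only supports leaf tensors, and key/values produced inside the model's forward pass are non-leaf.

```python
import copy
import torch

w = torch.ones(2, 2, requires_grad=True)  # leaf tensor
y = w * 2                                 # non-leaf: the result of an autograd op

copy.deepcopy(w)  # works
copy.deepcopy(y)  # RuntimeError: only graph-leaf Tensors support the deepcopy protocol
```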
yeah we need to fix the copying issue (and add a test!)
Done. The reason was that the model forward should be run without grad; otherwise the key/values are non-leaf tensors. Fixed the example and verified it runs.
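A sketch of the fixed flow: run the prompt forward pass under `torch.no_grad()` so the cached key/values are leaf tensors and the cache stays copyable. The model id and prompts below are illustrative:

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

model_id = "meta-llama/Llama-2-7b-chat-hf"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

INITIAL_PROMPT = "You are a helpful assistant. "
inputs = tokenizer(INITIAL_PROMPT, return_tensors="pt").to("cuda")

# Fill the cache once, without grad.
prompt_cache = DynamicCache()
with torch.no_grad():
    prompt_cache = model(**inputs, past_key_values=prompt_cache).past_key_values

# Reuse a copy of the prefilled cache for each follow-up prompt.
new_inputs = tokenizer(INITIAL_PROMPT + "Help me write a blog post about travelling.", return_tensors="pt").to("cuda")
outputs = model.generate(**new_inputs, past_key_values=copy.deepcopy(prompt_cache), max_new_tokens=20)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```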
Let me know if you have any comments, otherwise I will merge.
On cache reuse (copying a cache object): #33297. The problem is a bit more complex than `no_grad` on some cache classes :P
Mmm, does that mean the current `deepcopy` doesn't copy all tensors from the list when we use dynamic cache?
Dynamic cache should be okay with `no_grad` at the moment 🤗 Other caches, however, have objects that can't be copied (e.g. in the offloaded caches, the CUDA stream can't be copied). I'm writing a PR that lifts the `no_grad` requirement and handles the other corner cases. And, more importantly, adds tests!
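A minimal sketch of that offloaded-cache corner case; the class below is a hypothetical stand-in, not the real offloaded cache, but it shows why `copy.deepcopy` fails on any object holding a CUDA stream:

```python
import copy
import torch

class HoldsStream:
    """Hypothetical stand-in for a cache that keeps a CUDA stream for prefetching."""
    def __init__(self):
        self.stream = torch.cuda.Stream()  # requires a CUDA device

copy.deepcopy(HoldsStream())  # TypeError: cannot pickle 'torch.cuda.Stream' object
```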
LGTM, thank you for addressing these issues 💪
Force-pushed from b44473d to d4fce37
Added a slow test for cache copying
* some changes
* more updates
* fix cache copy
* nits
* nits
* add tests
What does this PR do?
Updates the docs with feedback from the community, makes some points clearer, and fixes the docstring. I will run the doctest locally and see if we can trigger it via CI here.
Fixes #32919