Cache: don't show warning in forward passes when `past_key_values` is None
#33541
past_key_values = DynamicCache.from_legacy_cache(past_key_values)
logger.warning_once(
    "We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and "
    "will be removed in v4.47. Please use an appropriate `Cache` class "
bumped the deprecation to v4.47, as some key models like T5 are still missing
next_cache = next_decoder_cache if use_cache else None
if return_legacy_cache:
    next_cache = next_cache.to_legacy_cache()
copy/paste from llama
(on some models, this pattern was slightly different)
    "Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
)
return_legacy_cache = False
if use_cache and not isinstance(past_key_values, Cache):
Note: `not self.training` was removed.
If we are training and we pass `past_key_values` as a tuple of tuples, we definitely want to see the warning, because the code will break in the near future.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for fixing, this is much better than checking for `self.training`
Thanks Joao!
logger.warning_once(
    "We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and "
    "will be removed in v4.47. Please use an appropriate `Cache` class "
    "(https://huggingface.co/docs/transformers/internal/generation_utils#transformers.Cache)"
(nit: not really related to the PR, but to the link, which was already here before)
Linking to the `Cache` class is cool, but you have to scroll down a bit to see an example. Would it be possible to link to a migration doc/example showing how previously written code that passes past key values as a tuple of tuples can be adapted before being sent to the model?
The more copy-pastable the example, the less friction there will be here.
@LysandreJik good point!
I've added a tiny section to our cache docs about the legacy cache and how to convert it to/from the new format, with an example (cc @zucchini-nlp). This warning now points to that section in the docs.
(will merge after confirming the docs with the doc builder)
EDIT: for some reason, the doc builder is not updating its contents, despite the doc job being successful 🤔 I'm going to merge and double-check the merged results
EDIT2: it worked :) https://huggingface.co/docs/transformers/main/en/kv_cache#legacy-cache-format
What does this PR do?
Because of the transition from tuple of tuples to `Cache` instances, we were throwing a warning when converting `past_key_values` to the new cache format in the forward passes.
One of those situations was when `use_cache=True` and `past_key_values is None`... but there is nothing to convert there. In fact, most of the time, the user didn't even specify the argument (see test script below). Moreover, after the transition is complete, we want to keep the default `past_key_values=None` argument.
As such, this PR removes the warning when `past_key_values=None`.
Fixes #33489
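For readers skimming the thread, the intended behavior can be sketched roughly as below. This is a minimal, self-contained illustration and not the library code: `Cache` and `DynamicCache` here are toy stand-ins for the real classes, `prepare_cache` is a hypothetical helper, the exact branch layout is an assumption pieced together from the diff fragments in this PR, and the `warned` flag exists only to make the behavior easy to check.

```python
import logging

logger = logging.getLogger(__name__)


class Cache:
    """Toy stand-in for the library's `Cache` base class (assumption)."""


class DynamicCache(Cache):
    """Toy stand-in that only stores per-layer (key, value) pairs."""

    def __init__(self, layers=None):
        self.layers = list(layers) if layers is not None else []

    @classmethod
    def from_legacy_cache(cls, past_key_values):
        # The legacy format is a tuple of per-layer (key, value) tuples.
        return cls(past_key_values)


def prepare_cache(past_key_values, use_cache=True):
    """Hypothetical helper: returns (cache, return_legacy_cache, warned)."""
    return_legacy_cache = False
    warned = False
    if use_cache and not isinstance(past_key_values, Cache):
        return_legacy_cache = True
        if past_key_values is None:
            # Nothing to convert: start from an empty cache and stay silent.
            past_key_values = DynamicCache()
        else:
            # Legacy tuple of tuples: convert, then warn about the deprecation.
            past_key_values = DynamicCache.from_legacy_cache(past_key_values)
            logger.warning("Passing `past_key_values` as a tuple of tuples is deprecated.")
            warned = True
    return past_key_values, return_legacy_cache, warned
```

With this layout, the default call (`past_key_values=None`) never warns, while an explicit tuple of tuples still does, whether or not the model is training.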
Test script:
Before:
Now: no warning :)