
Improve greedy search memory usage #32895

Merged

gante merged 1 commit into huggingface:main from the enhance_greedy_search_memory branch on Aug 22, 2024

Conversation

@regisss (Contributor) commented Aug 20, 2024

What does this PR do?

When doing greedy search, inputs go through _expand_inputs_for_generation, where they are expanded with torch.repeat_interleave. Since the expand size is always 1 in the case of greedy search, torch.repeat_interleave does not modify the inputs. However, it does increase memory usage, because the input to torch.repeat_interleave is cloned.

Here is a code snippet to check this behaviour:

import torch

# 1000 * 1000 * 1000 float32 elements * 4 bytes = 4 GB on the GPU
a = torch.ones(1000, 1000, 1000, device="cuda")
print(torch.cuda.max_memory_allocated())

# Even though expand_size == 1 leaves the tensor unchanged,
# repeat_interleave still materializes a full copy, so peak memory doubles
expand_size = 1
a = a.repeat_interleave(expand_size, dim=0)
print(torch.cuda.max_memory_allocated())

which prints

4000000000
8000000000

As the output shows, peak allocated memory doubles from 4 GB to 8 GB: the tensor is cloned even though the result is identical to the input. Thus, if the expand size is 1, we can return the model inputs before calling torch.repeat_interleave. That's the change introduced in this PR.
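
For illustration, here is a minimal sketch of that early-return guard. This is a simplification, not the exact diff: the real _expand_inputs_for_generation in transformers also handles encoder-decoder models and nested inputs, and the function name and signature below are simplified.

import torch

def expand_inputs_for_generation(input_ids, expand_size=1, **model_kwargs):
    # Early return: with expand_size == 1, repeat_interleave would return
    # an identical tensor while still allocating a full copy of its input.
    if expand_size == 1:
        return input_ids, model_kwargs

    # Otherwise, expand input_ids and any tensor kwargs along the batch dim.
    input_ids = input_ids.repeat_interleave(expand_size, dim=0)
    for key, value in model_kwargs.items():
        if isinstance(value, torch.Tensor):
            model_kwargs[key] = value.repeat_interleave(expand_size, dim=0)
    return input_ids, model_kwargs

With such a guard in place, greedy search (expand size 1) never reaches torch.repeat_interleave, so peak memory stays at the size of the original inputs instead of doubling.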

More context in this Slack thread: https://huggingface.slack.com/archives/C01N44FJDHT/p1723827436938589

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts (Collaborator) commented:

cc @gante @zucchini-nlp

@regisss (Contributor, Author) commented Aug 20, 2024

CI failed, but it doesn't seem to be related to this PR.

@regisss regisss marked this pull request as ready for review August 20, 2024 09:45
@regisss regisss requested a review from gante August 20, 2024 09:45
@zucchini-nlp (Member) left a comment

Wow, interesting finding! Thanks for handling

@gante (Member) left a comment

Thank you for the fix 🙏

(CI is failing for reasons known to us; will take care of rebasing and merging when the root cause is fixed. cc @amyeroberts)

@amyeroberts (Collaborator) left a comment

Thanks for adding this improvement!

@gante gante force-pushed the enhance_greedy_search_memory branch from c0169a1 to 77d8384 Compare August 22, 2024 12:08
@gante gante force-pushed the enhance_greedy_search_memory branch from 77d8384 to 23f74a1 Compare August 22, 2024 13:18
@gante gante merged commit 99d67f1 into huggingface:main Aug 22, 2024
21 checks passed
@regisss regisss deleted the enhance_greedy_search_memory branch August 22, 2024 16:37
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024
Do not call torch.repeat_interleave if expand_size is 1