
Update modeling_mamba2.py, fix pad size #32599

Merged (5 commits into huggingface:main, Sep 20, 2024)

Conversation

@klae01 (Contributor) commented Aug 11, 2024

What does this PR do?

This PR changes how pad_size is computed. Previously, pad_size = self.chunk_size - (seq_len % self.chunk_size), which yields chunk_size (a full extra chunk of padding) whenever seq_len is already a multiple of chunk_size. The new formula, pad_size = (self.chunk_size - seq_len % self.chunk_size) % self.chunk_size, takes the result modulo chunk_size so that pad_size is 0 in that case. This removes unnecessary padding and the associated computational overhead.
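A minimal standalone sketch of the two formulas side by side (function names are illustrative; the real code computes this inline in modeling_mamba2.py):

```python
def pad_size_old(seq_len: int, chunk_size: int) -> int:
    # Old formula: returns chunk_size instead of 0 when seq_len is already
    # a multiple of chunk_size, adding a full unnecessary chunk of padding.
    return chunk_size - (seq_len % chunk_size)

def pad_size_new(seq_len: int, chunk_size: int) -> int:
    # New formula: the trailing "% chunk_size" maps that case back to 0.
    return (chunk_size - seq_len % chunk_size) % chunk_size

print(pad_size_old(512, 256))  # 256 (one wasted chunk)
print(pad_size_new(512, 256))  # 0
print(pad_size_new(500, 256))  # 12
```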

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@amyeroberts (Collaborator) left a comment


Hi @klae01 - thanks for opening this PR!

Could you expand a bit more on what this PR is addressing, ideally linking to a related GitHub issue or providing a code snippet that reproduces the problem?

In the previous definition

pad_size = self.chunk_size - (seq_len % self.chunk_size)

pad_size will always satisfy 1 <= pad_size <= chunk_size; in particular, it is chunk_size rather than 0 when seq_len divides evenly.

I'm guessing the change is really so the padding is 0 if the sequence can be divided perfectly by chunk_size?

@vasqu (Contributor) commented Aug 14, 2024

@amyeroberts Original author who created this in mamba2 (for transformers). Yup, it's to reduce the padding size to 0 if we have seq_len % chunk_size == 0. The way it is now, we have an unnecessary padding of size chunk_size so fixing this definitely saves compute 😄.

It was originally meant to pad seq_len to a multiple of chunk_size (due to the algorithm working in a fixed chunk len fashion). So pad_size is the remaining length to get to a multiple of it.
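The padding described here can be sketched in plain Python (lists stand in for tensors; function names are invented for illustration): the sequence is right-padded to the next multiple of chunk_size so it divides cleanly into fixed-size chunks.

```python
# Right-pad a token sequence to the next multiple of chunk_size, then
# split it into fixed-size chunks, mirroring what the chunked scan needs.
def pad_to_chunk_multiple(tokens, chunk_size, pad_value=0):
    pad_size = (chunk_size - len(tokens) % chunk_size) % chunk_size
    return tokens + [pad_value] * pad_size

def split_chunks(tokens, chunk_size):
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

padded = pad_to_chunk_multiple([1, 2, 3, 4, 5], chunk_size=4)
print(padded)                   # [1, 2, 3, 4, 5, 0, 0, 0]
print(split_chunks(padded, 4))  # [[1, 2, 3, 4], [5, 0, 0, 0]]
```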

LGTM cc @molbap for Mamba2

@amyeroberts (Collaborator)

@vasqu OK great, thanks for confirming and clarifying!

@klae01 Could you:

  • Update the PR description to reflect this
  • Rebase on main to include upstream changes? This should resolve the failing tests

@vasqu (Contributor) commented Aug 15, 2024

We should wait for rebase until #32694 is merged or something similar. There's currently a bug in cached generation with input_embeds.

@klae01 (Contributor, Author) commented Aug 15, 2024

Hi @vasqu, @amyeroberts,

Thank you for the feedback and guidance. I noticed that the force push I made may have been a bit premature, especially considering the advice to wait until PR #32694 (or a similar PR) is merged to avoid potential issues with cached generation.

To align with the suggestion, I’ll hold off on any further rebasing or force pushes until PR #32694 is merged. Once that’s done, I’ll rebase my branch on the updated main and ensure everything is in order.

Please let me know if there’s anything else I should do in the meantime. I appreciate your time and help with reviewing this PR!

Thanks!

@vasqu (Contributor) commented Aug 16, 2024

@klae01 Thanks for bearing with me. You can only wait for now I guess but I'll ping you again when the other PR is merged and then this should be a quick rebase and go 👀

@vasqu (Contributor) commented Aug 19, 2024

@klae01 The patch has been merged 😄 thanks for waiting!

@klae01 (Contributor, Author) commented Aug 20, 2024

@amyeroberts I've rebased my branch now that PR #32694 is merged and force-pushed the updates. Everything should be up-to-date. Let me know if anything else is needed!

@vasqu (Contributor) commented Aug 23, 2024

@klae01 Could you push an empty commit with the message [run_slow] mamba2. That should trigger the CI.

cc @amyeroberts for slow run

@vasqu (Contributor) commented Aug 23, 2024

Ah and maybe rebase before, we removed a flag in #32686

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts (Collaborator) left a comment


Change looks good to me - thanks!

Just looking into how we can enable the tests cc @ydshieh

@ydshieh (Collaborator) commented Aug 28, 2024

Change looks good to me - thanks!

Just looking into how we can enable the tests cc @ydshieh

Hi, I'm lacking a bit of context. Are you still waiting on my input? If so, could you elaborate a bit more on the situation to resolve?

@amyeroberts (Collaborator)

@ydshieh Apologies for not providing context! This was one of the related issues with tests failing because of access to a gated model. I'll trigger a re-run of the slow tests, as I think this was resolved (by you - thank you!) yesterday

@ydshieh (Collaborator) commented Aug 28, 2024

OK! I also sent you a Slack DM regarding CI bot :-)

@vasqu (Contributor) commented Sep 16, 2024

Any updates on this PR?

gentle ping @amyeroberts

@amyeroberts (Collaborator)

@vasqu Apologies for the delay. The runner should now have access to run the gated mistralai model tests. I re-triggered the tests, but they still failed, I'm guessing because retriggering reuses the same settings, i.e. tokens. @vasqu Could you push another [run-slow] commit to trigger these? It should all pass, and then we can merge.

@vasqu (Contributor) commented Sep 17, 2024

No worries, I'm not the author so can't commit.

@klae01 Could you push an empty commit again, i.e. [run-slow] mamba2. This time it should clear the CI 🤞

@klae01 (Contributor, Author) commented Sep 17, 2024

Hello, @amyeroberts

I've added the [run-slow] mamba2 commit as requested, and I see that all CI tests have passed. Please let me know if there's anything else I should address.

Thank you!

@klae01 (Contributor, Author) commented Sep 18, 2024

Hello @amyeroberts,

I've added the [run-slow] mamba2 commit to re-trigger the tests, but it seems the CI tests are failing due to permission issues. Checking the logs, the tests are failing because they cannot access the gated repository:

Cannot access gated repo for url https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1/resolve/refs%2Fpr%2F9/config.json.
Access to model mistralai/Mamba-Codestral-7B-v0.1 is restricted. You must have access to it and be authenticated to access it. Please log in.

Could you please advise on how to resolve this issue? Thank you!

@amyeroberts (Collaborator)

@klae01 Could you add the @require_read_token decorator to the tests which are failing? This should properly propagate the required access token.
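For readers unfamiliar with the decorator: the real one lives in transformers.testing_utils and also propagates the token to Hub downloads. The stripped-down stand-in below (an editor's sketch, not the actual implementation) only demonstrates the gating idea: skip the test unless a token is available in the environment.

```python
import os
import unittest

# Simplified stand-in for transformers' require_read_token: skip the test
# unless a Hugging Face token is present. (The real decorator additionally
# makes the token available to hub calls inside the test.)
def require_read_token_sketch(test_fn):
    has_token = bool(os.environ.get("HF_TOKEN"))
    return unittest.skipUnless(has_token, "requires a HF read token")(test_fn)

class GatedModelTests(unittest.TestCase):
    @require_read_token_sketch
    def test_gated_model_loads(self):
        # A real test would load the gated checkpoint here.
        self.assertTrue(True)
```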

@amyeroberts (Collaborator)

@klae01 It's a bit annoying, but anything after [run-slow] will be interpreted as a model name (so in this case it's looking for the models "add" and "@require_read_token").

Could you push an empty commit with [run_slow] mamba2?
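An editor's sketch of the pitfall described here (assumed behaviour inferred from the comment, not the actual CI parsing code): everything after the marker is treated as a comma- or space-separated list of model names.

```python
# Assumed sketch of how the CI reads the commit subject: text after the
# [run-slow]/[run_slow] marker is split into model names, which is why
# "[run-slow] Add @require_read_token decorator ..." asked CI for models
# named "add" and "@require_read_token".
def parse_run_slow(subject: str):
    for marker in ("[run-slow]", "[run_slow]"):
        if subject.startswith(marker):
            rest = subject[len(marker):]
            return [m.strip().lower() for m in rest.replace(",", " ").split()]
    return []

print(parse_run_slow("[run_slow] mamba2"))  # ['mamba2']
print(parse_run_slow("[run-slow] Add @require_read_token decorator"))
# ['add', '@require_read_token', 'decorator']
```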

klae01 and others added 2 commits September 20, 2024 10:09
Fix pad_size calculation to ensure it's less than self.chunk_size
@klae01 (Contributor, Author) commented Sep 20, 2024

Hello, @amyeroberts.

I’ve just pushed a commit that rebases my branch with the latest changes from the main branch. I expect that these updates should properly resolve the issue with mamba2.

By the way, could you let me know how long the run-slow tests typically take to execute? Also, where can I find documentation on commit conventions (or guidelines) for transformers?

@ydshieh (Collaborator) commented Sep 20, 2024

Hi @klae01

We don't have a public doc for that yet, but we're planning to add a comment like the one below to PRs:


Before merging this pull request, slow tests CI should be triggered. To enable this:

  • Add the run-slow label to the PR
  • When your PR is ready for merge and all reviewers' comments have been addressed, push an empty commit with the command [run-slow] followed by a comma separated list of all the models to be tested, i.e. [run_slow] model_to_test_1, model_to_test_2
  • A transformers maintainer will then approve the workflow to start the tests
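The empty-commit step can be done as follows (sketch: demonstrated in a throwaway repo so it runs anywhere; on a real PR you would run only the commit, with your own identity already configured, followed by a push on your checked-out branch):

```shell
# Trigger the slow-test CI: an empty commit whose subject carries the
# [run_slow] marker plus the model(s) to test.
demo=$(mktemp -d) && cd "$demo" && git init -q
git -c user.email=you@example.com -c user.name=you \
    commit --allow-empty -m "[run_slow] mamba2"
git log -1 --format=%s   # [run_slow] mamba2
# On the real PR branch, finish with: git push
```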

@ydshieh (Collaborator) commented Sep 20, 2024

Let's ignore the multi-gpu job (it's an infra issue)

@ydshieh (Collaborator) commented Sep 20, 2024

@amyeroberts

It's a bit annoying, but anything after [run-slow] will be interpreted as a model name

I can change to make it less annoying, like [run-slow] .... models=[model_1, model_2] .... if that is more flexible

@amyeroberts (Collaborator)

I can change to make it less annoying, like [run-slow] .... models=[model_1, model_2] .... if that is more flexible

@ydshieh I think it's hard to find a good solution tbh -- most of the time the commit to trigger the tests are empty as they're requested at the end of PR review, so just [run-slow] model_1, model_2, ... etc. is easier. Happy to go for [run-slow] .... models=[model_1, model_2] though if you think it's better as it's more explicit

@klae01 On top of what @ydshieh has said above, regarding your question ("could you let me know how long the run-slow tests typically take to execute?"): it depends on a few things:

  • When they're actually triggered. A HF employee has to approve the workflow, so there's going to be time between your commit and when it "starts". It's done this way for security reasons
  • Availability of compute - we sometimes have to wait for runners to become available
  • The tests themselves. Some models have lots of slow tests, or the model being tested is very large, both of which will increase the completion time.

You can see in the last single-gpu run for mamba2 that running the tests themselves took about 26s, but initializing the container for testing took 4m 27s.

@amyeroberts (Collaborator)

All passing. Thanks for your patience and the fix @klae01!

@amyeroberts amyeroberts merged commit ec1424c into huggingface:main Sep 20, 2024
16 of 17 checks passed
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024
* Update modeling_mamba2.py

Fix pad_size calculation to ensure it's less than self.chunk_size

* [run_slow] mamba2

* [run-slow] mamba2

* [run-slow] Add @require_read_token decorator to failing tests for token propagation

* [run_slow] mamba2