This PR enables handling of loss keyword arguments in the Mistral
forward() method. Specifically, if `num_items_in_batch` is passed,
its value is used to normalize the loss correctly.
This relates to the Gradient Accumulation fix (huggingface#34191)
Fixes huggingface#34575
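As a minimal sketch of the normalization this PR is after, following the pattern of the gradient-accumulation fix linked above (the helper name and exact signature here are illustrative, not the actual diff):

```python
import torch.nn.functional as F

def causal_lm_loss(logits, labels, num_items_in_batch=None):
    # Shift so that each position predicts the next token.
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()

    # Sum instead of mean so the normalizer can cover the full
    # gradient-accumulation batch rather than just this micro-batch.
    reduction = "sum" if num_items_in_batch is not None else "mean"
    loss = F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,
        reduction=reduction,
    )
    if num_items_in_batch is not None:
        loss = loss / num_items_in_batch
    return loss
```

Summing and then dividing by a batch-wide token count keeps the per-token loss scale consistent across accumulation steps, which a per-micro-batch mean does not.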
And be careful here not to break multi-GPU setups: with Llama 3.1 I usually get an error when dividing loss / num_items_in_batch, because the two tensors tend to end up on different GPUs.
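One way to guard against that device mismatch, assuming `num_items_in_batch` may arrive as a tensor (a sketch, not the committed fix):

```python
import torch

def normalize_loss(loss, num_items_in_batch):
    # Guard for multi-GPU / model-parallel runs: the summed loss and the
    # token count can land on different devices, so align them first.
    if num_items_in_batch is None:
        return loss
    if torch.is_tensor(num_items_in_batch):
        num_items_in_batch = num_items_in_batch.to(loss.device)
    return loss / num_items_in_batch
```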
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Who can help?
No response
Information

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
When calling the forward method on the NeMo Mistral model, the following exception occurs:
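A minimal sketch of the kind of call that exercises this path, assuming the Hugging Face Mistral checkpoint (the model name, shapes, and count value are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
input_ids = torch.randint(0, model.config.vocab_size, (2, 16))

# Passing the gradient-accumulation normalizer through forward() is the
# call that fails before this fix.
out = model(input_ids=input_ids, labels=input_ids, num_items_in_batch=32)
print(out.loss)
```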
Expected behavior
The forward() method should use `num_items_in_batch` for the loss calculation.