Handle num_items_in_batch in Mistral's forward
This PR enables handling loss keyword arguments in the Mistral `forward()` method. Specifically, if `num_items_in_batch` is passed, its value is used to normalize the loss correctly. This relates to the Gradient Accumulation fix (#34191).

Fixes #34575
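To illustrate why the normalization matters, here is a minimal pure-Python sketch of the arithmetic, not the actual Mistral or Transformers code. The helper names `token_nll` and `micro_batch_loss` are hypothetical, and the sketch assumes every token counts toward the loss (no padding or ignored labels). It compares averaging per-micro-batch means against dividing each micro-batch's summed loss by the total token count passed as `num_items_in_batch`.

```python
import math


def token_nll(prob):
    """Negative log-likelihood of one target token given its predicted probability."""
    return -math.log(prob)


def micro_batch_loss(target_probs, num_items_in_batch=None):
    """Hypothetical loss helper for one micro-batch.

    Without num_items_in_batch, return the mean over this micro-batch's tokens.
    With it, return the summed loss divided by the token count of the whole
    accumulated batch, so micro-batch losses can simply be added together.
    """
    total = sum(token_nll(p) for p in target_probs)
    if num_items_in_batch is None:
        return total / len(target_probs)
    return total / num_items_in_batch


# Two micro-batches with different token counts (3 tokens and 1 token).
micro_batches = [[0.5, 0.25, 0.125], [0.5]]
num_items = sum(len(mb) for mb in micro_batches)

# Reference: the loss if all tokens were processed as one large batch.
full_batch = micro_batch_loss([p for mb in micro_batches for p in mb])

# Naive accumulation: average the per-micro-batch means.
naive = sum(micro_batch_loss(mb) for mb in micro_batches) / len(micro_batches)

# Accumulation with num_items_in_batch: sum the globally normalized losses.
fixed = sum(micro_batch_loss(mb, num_items) for mb in micro_batches)

print(f"full batch: {full_batch:.4f}, naive: {naive:.4f}, fixed: {fixed:.4f}")
```

In this sketch the naive average over-weights the short micro-batch, while the version that receives `num_items_in_batch` matches the full-batch loss.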