Missing timestamp offset using Whisper with pipeline and sequential decoding #34210
Comments
cc @eustlb, since you're working on other Whisper fixes :)
Any fix?
@ylacombe can you see a quick fix here? If it's more of a pipeline issue and you don't have any ideas, let me know and I can take it.
Hey @dintifla and @dineshveguru, thanks for your message. cc @eustlb, this seems linked to #34537 but it's not exactly the same issue. Any idea why it happens?
Hey @ylacombe, as far as I have analyzed, it is because
hence, the calculation in your linked PR works differently.
Indeed @dintifla, you're right on the cause. I'll dig into this tomorrow.
Turns out it's not because The error you observed actually happens because To fix this, we have to be careful about how we set the time offset. @eustlb, you've worked a lot on this in #34537. Could you give a TL;DR on the different cases we can observe? Notably, it seems that we're facing the case:
where
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
This issue still persists (v4.47.1). Please re-open it.
Fixed in #35750 that will be merged ASAP! Thanks a lot for raising this issue, and thanks a lot for your patience 🤗
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
transformers version: 4.45.2
Who can help?
@Rocketknight1 @gante @ylacombe
Reproduction
pip install transformers==4.45.2
Set up a Whisper pipeline using chunk_length_s=0 (which is sequential long-form decoding according to the model card, at least for large-v3) and return_timestamps=True
Transcribe an audio file longer than 30s
See that the timestamps restart at 0.0s after 30s
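The steps above can be sketched roughly as follows. This is not the original reproduction script from the issue; the model id `openai/whisper-large-v3`, the function names `reproduce` and `find_timestamp_resets`, and the audio path are assumptions made for illustration. The helper only flags places where a segment's start time jumps backwards, which is the symptom described here.

```python
from typing import Any, Dict, List


def find_timestamp_resets(chunks: List[Dict[str, Any]]) -> List[int]:
    """Return indices of chunks whose start time is earlier than the previous start.

    `chunks` is expected to look like the "chunks" list returned by the ASR
    pipeline with return_timestamps=True: [{"text": ..., "timestamp": (start, end)}].
    """
    resets = []
    prev_start = None
    for i, chunk in enumerate(chunks):
        start, _end = chunk["timestamp"]
        if start is None:
            continue  # skip segments without a start time
        if prev_start is not None and start < prev_start:
            resets.append(i)
        prev_start = start
    return resets


def reproduce(audio_path: str, model_id: str = "openai/whisper-large-v3") -> List[int]:
    """Hypothetical helper: transcribe with sequential decoding and report resets."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,        # assumption: any Whisper checkpoint should do
        chunk_length_s=0,      # sequential long-form decoding, as in the report
    )
    result = asr(audio_path, return_timestamps=True)
    return find_timestamp_resets(result["chunks"])
```

Calling `reproduce("audio_longer_than_30s.wav")` (a placeholder file name) on an affected version should return a non-empty list if the timestamps restart partway through the audio.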
Expected behavior
The timestamps should also be correct when the audio is longer than 30s (as they are when the chunked algorithm is used):
The output is from the above script using chunked=True
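To illustrate the expected behavior, here is a minimal sketch of the idea of a time offset, assuming each decoding window reports timestamps relative to its own start. It is not the fix merged in transformers; `to_absolute` and its input layout are hypothetical and only show the arithmetic a reader would expect the output to reflect.

```python
from typing import Dict, List, Tuple

Segment = Tuple[float, float, str]  # (relative_start_s, relative_end_s, text)


def to_absolute(windows: List[Tuple[float, List[Segment]]]) -> List[Dict]:
    """Shift window-relative timestamps by the start time of their window.

    `windows` is a hypothetical structure: a list of
    (window_start_s, segments) pairs, one per decoding window.
    """
    chunks = []
    for window_start, segments in windows:
        for rel_start, rel_end, text in segments:
            chunks.append(
                {
                    "text": text,
                    # absolute time = where the window began + time inside the window
                    "timestamp": (window_start + rel_start, window_start + rel_end),
                }
            )
    return chunks
```

With this offset applied, a segment that starts at 0.0s inside a window beginning at 28.0s is reported at 28.0s rather than at 0.0s, so the timestamps keep increasing across the 30s boundary.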