Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ctc_loss: fix backward when 2d target tensor is larger than max_target_length #20971

Closed
wants to merge 6 commits into from

Conversation

t-vi
Copy link
Collaborator

@t-vi t-vi commented May 26, 2019

Previously, we didn't work when 2d target tensors had extra columns at the end. Now we just ignore those.
Also fix the confusion in the doc example regarding the number of classes.

Thank you, @ypw-rich for the report with reproducing example.

Fixes: #20522

@pytorchbot pytorchbot added module: autograd Related to torch.autograd, and the autograd engine in general module: cuda Related to torch.cuda, and CUDA support in general module: nn Related to torch.nn module: operators labels May 26, 2019
@soumith
Copy link
Member

soumith commented May 26, 2019

@pytorchbot rebase this please

@soumith
Copy link
Member

soumith commented May 26, 2019

@pytorchbot merge this please

@t-vi
Copy link
Collaborator Author

t-vi commented May 27, 2019

The windows failure looks real. :/

@t-vi
Copy link
Collaborator Author

t-vi commented May 27, 2019

I'm missing something, but how would gradcheck work for non-deterministic backwards?

https://github.com/pytorch/pytorch/blob/master/torch/autograd/gradcheck.py#L288

@t-vi
Copy link
Collaborator Author

t-vi commented May 27, 2019

So the test is flaky but not necessarily more flaky than before. Nonetheless fixing the flakiness after #20980 will be good.

facebook-github-bot pushed a commit that referenced this pull request May 29, 2019
…20980)

Summary:
gradcheck currently includes a determinism check (although only trying twice and seeing if results match).
This can lead to flaky tests, e.g. in #20971, but also #13818.
This adds nondet_tol for both gradcheck and gradgradcheck. It does not change / reenable any tests yet.
Pull Request resolved: #20980

Differential Revision: D15530129

Pulled By: soumith

fbshipit-source-id: 04d7f85b5b59cd62867820c74b064ba14f4fa7f8
@t-vi t-vi changed the title fix backward when 2d target tensor is larger than max_target_length ctc_loss: fix backward when 2d target tensor is larger than max_target_length May 29, 2019
@t-vi
Copy link
Collaborator Author

t-vi commented May 29, 2019

@pytorchbot rebase this please

@t-vi
Copy link
Collaborator Author

t-vi commented May 29, 2019

I think this is good to merge: The only change after approval is adding the nondet_tol to the gradcheck in the test. The CI seems is OK (i.e. the win failure is unrelated, the three pending checks are just waiting on GitHub's bookkeeping).

@t-vi
Copy link
Collaborator Author

t-vi commented May 29, 2019

@pytorchbot merge this please

@pytorchbot pytorchbot added the merge-this-please Was marked for merge with @pytorchbot merge this please label May 29, 2019
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request May 29, 2019
…t_length (#20971)

Summary:
Previously, we didn't work when 2d target tensors had extra columns at the end. Now we just ignore those.
Also fix the confusion in the doc example regarding the number of classes.

Thank you, ypw-rich for the report with reproducing example.

Fixes: #20522
Pull Request resolved: pytorch/pytorch#20971

Differential Revision: D15535481

Pulled By: ezyang

fbshipit-source-id: 397e44e20165fc4fa2547bee9390d4c0b688df93
@facebook-github-bot
Copy link
Contributor

@ezyang merged this pull request in aa42742.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-this-please Was marked for merge with @pytorchbot merge this please Merged module: autograd Related to torch.autograd, and the autograd engine in general module: cuda Related to torch.cuda, and CUDA support in general module: nn Related to torch.nn open source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nn.CTCLoss RuntimeError on GPU
6 participants