
Update Pytorch Lightning to stable release version #130

Closed
wants to merge 1 commit into from


SeanNaren commented Oct 31, 2020

As discussed in #128, moving lightning to a stable release version should bring stability for more users.

I tried to run the tests, but it seems there are hardcoded paths within them. Would it be possible to verify that this PL version works with the tests?

ibeltagy (Collaborator) commented Nov 3, 2020

Thank you, @SeanNaren, for the PR. I see that v1.0.4 doesn't solve the TPU checkpointing issue Lightning-AI/pytorch-lightning#2700 yet. It also seems that Lightning-AI/pytorch-lightning#2407 has been opened and closed multiple times, so I don't really know if it was fixed or not.

> would it be possible to verify that this PL version works with the tests?

I don't think so. These are difficult issues to test using unit tests. I was hoping I could rely on PL's own unit tests, but it seems they don't have enough coverage (e.g. resolved issues being reopened) and that they are usually broken. Do you have better solutions other than manually trying everything?

SeanNaren (Author) commented

hey thanks :) Didn't know those two issues were around.

> I don't think so. These are difficult issues to test using unit tests
> Do you have better solutions other than manually trying everything?

I guess so, I've been there! You do have tests in this folder though:

https://github.com/allenai/longformer/blob/master/tests/test_integration.py

Unfortunately, PL's test coverage won't extend to custom code built on top of it, such as models, so I was hoping that your unit tests worked! PL has a large set of tests for all environments, including TPUs.

Lightning-AI/pytorch-lightning#2407 is certainly fixed, and there hasn't been activity on it since 1.0.0 (because it's not an issue).

Regarding Lightning-AI/pytorch-lightning#2700, this has been a long-standing issue, I admit! I think it is being worked on in Lightning-AI/pytorch-lightning#4309. I'll ping the author to get an update here.

Overall, I think your experience with PL has been primarily before the 1.0 release. 1.0 is more stable and will provide users with a lot of fixes for problems you've probably not run into yet on your older version...

If you'd prefer to wait until the checkpointing issue is fixed and the fix is integrated into a minor release, I can update the branch when this happens. Or if you prefer to stay at 0.8, that's also fine, just let me know :)

ibeltagy closed this Dec 2, 2020
ibeltagy deleted the branch allenai:encoderdecoder December 2, 2020 06:45