pytorch v2.6.0 #326
Conversation
…nda-forge-pinning 2025.01.18.07.29.32
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR. I do have some suggestions for making it better though... For recipe/meta.yaml:
This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/13297069508. Examine the logs at this URL for more detail.
This looks better than expected so far. Still have to double-check the dependency changes. Happy if someone could do that (even if it's just noting which bounds changed relative to the current recipe).
Aarch builds fail with
Don't rely on PKG_BUILDNUM resolving this correctly; that is either racy, or implicitly depends on a separate render pass after setting build.number.
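For context, here is a minimal, hypothetical sketch of the pattern being warned about (not the actual recipe; the dependency name, version, and layout are illustrative assumptions). The point is that pinning a sibling output's build string via the PKG_BUILDNUM variable only stays correct if that variable resolves against the final build.number, whereas deriving both from the same Jinja variable cannot drift.

```yaml
# Hypothetical meta.yaml fragment -- illustrative only, not the
# feedstock's actual contents.

{% set version = "2.6.0" %}
{% set build = 0 %}          # single source of truth for the build number

build:
  number: {{ build }}

requirements:
  run:
    # Fragile pattern the review comment warns against: relies on the
    # PKG_BUILDNUM environment variable resolving to the final
    # build.number, which is racy or needs a separate render pass.
    #- libtorch {{ version }} *_{{ PKG_BUILDNUM }}

    # More robust: reuse the Jinja variable that also sets build.number,
    # so the run-time pin cannot get out of sync with it.
    - libtorch {{ version }} *_{{ build }}
```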
Sigh, since when is conda-build applying patches through
Force-pushed from bd0bec7 to 022f063.
otherwise conda breaks:

```
conda_build.exceptions.RecipeError: Mismatching hashes in recipe. Exact pins in dependencies that contribute to the hash often cause this. Can you change one or more exact pins to version bound constraints? Involved packages were:
Mismatching package: libtorch (id cpu_generic_habf3c96_0); dep: libtorch 2.6.0.rc7 *0; consumer package: pytorch
```
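The error message itself points at the fix: replace the exact pin with a version-bound constraint so it no longer feeds an exact build string into the hash computation. A hedged sketch of what that change could look like (the dependency string is taken from the error output above; the real recipe lines may differ):

```yaml
# Illustrative sketch only -- the actual recipe may express this differently.
requirements:
  run:
    # Exact pin that contributes to the output hash and can then mismatch
    # between the libtorch and pytorch outputs:
    #- libtorch 2.6.0.rc7 *0

    # Looser version-bound constraint, as suggested by the RecipeError:
    - libtorch 2.6.0.*
```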
on osx-64
skip?
Yeah, this minor accuracy violation indeed sounds skippable, but I've deprioritised this PR until we get the windows builds for 2.5 fixed (and ideally your #318 merged as well).
ok, good to know
Worth pointing out that in 6 days' time, PyPI will have an up-to-date pytorch package whereas conda won't. Will have a look at that other PR.
Are you talking about RCs, or are we not looking at the same index? 2.6.0 GA hasn't been published AFAICT. Or are you saying that 2.6.0 will be released in 6 days? In any case, this is no reason to rush. We didn't have windows packages for years, and I'm more concerned about fixing them than about lagging behind the PyPI release a bit (and we've often lagged for months in the past; this has gotten much better with the open-gpu server, but it still happens; 2.5.0 was released Oct 18th last year, and we had first builds on Nov 3rd).
yes
100%
I think https://github.com/conda-forge/triton-feedstock might need to be updated as part of this. Some info on which commit to use here: pytorch/pytorch#145120 (comment) |
I had a look over and can't see anything I disagree with.
Agreed that the tests and also the patches are getting quite extensive. Suggest that we work to reduce that in future PRs, perhaps.
I don't think it's that bad TBH. We're skipping 35 tests (out of 8000+), across all architectures, and the vast majority of those skips are specific to one or two variants in our matrix. That's roughly 0.1-0.2% of tests getting skipped on a given platform.
💯 I've started an effort to reduce the skips (see #353). Reducing the patches would need a dedicated effort on the part of the respective authors to upstream their fix; I'll try upstreaming the … Overall, both the patches and the skips aim to be the minimal set necessary to successfully build and test pytorch. I'd love to see them reduced as fast as possible, but for now I don't have a better alternative.
CI will finish in about 7h - could I get some approvals or reviews before then? 🙏 Or at least a written "ok to proceed"? 🙃
For now, I haven't reviewed in depth enough to be able to give an "approval", I think - I could do that tomorrow if it's needed.
Co-Authored-By: H. Vetinari <h.vetinari@gmx.com>
Suggested-By: Michał Górny <mgorny@gentoo.org>
Oh boy, never mind. Something is very wrong on linux+CUDA:
It seems that many of them (56) are due to OOMs.
There's another failure that looks like it could have been due to the OOMs, but surprisingly, it fails in exactly the same way on MKL/openblas, and with exactly the same (catastrophic) accuracy violation:
stacktrace
A lot of the skips look like they're there to cope with CI limitations, basically
Sure, some portion of those is unavoidable, but basically all skips point at an issue that should be fixed in our packaging, or ideally upstream (in the test framework and/or the actual implementation). I kinda prefer having that transparency (and the reminder that there's more work to be done) to simply removing the test modules in question and declaring it a done deal. For example, the torchinductor tests you added in #318 were painful to get passing, but they did show that the …
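For readers following along: the skips under discussion are typically expressed directly in the recipe rather than by deleting test modules. The sketch below is purely hypothetical; the test names, selectors, and path are made up to illustrate the shape of such a skip list and do not reflect the feedstock's actual entries.

```yaml
# Hypothetical sketch of a per-variant skip list in a conda recipe's
# test section -- names and selectors are illustrative only.

{% set tests_to_skip = "_not_a_real_test" %}
# Each skip is gated on the variant(s) it affects and ideally carries a
# pointer to the upstream issue it tracks.
{% set tests_to_skip = tests_to_skip + " or test_oom_prone_case" %}       # [cuda_compiler_version != "None"]
{% set tests_to_skip = tests_to_skip + " or test_osx_accuracy_case" %}    # [osx]

test:
  commands:
    - python -m pytest -v -k "not ({{ tests_to_skip }})" test/test_torch.py
```

Kept in this form, each skipped test stays visible (and greppable) in the recipe, which is the transparency argument being made above.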
Since the linux64+openblas+CUDA job had already passed for f14feb5 (with no essential changes since then), the only thing left to do is to remove the skip at pytorch-cpu-feedstock/recipe/meta.yaml, lines 502 to 503 in be20390.
As such, I'm not planning to do a new push here (will shift the skip when merging). I'm planning to merge this in ~9-12 hours unless there are other comments @conda-forge/pytorch-cpu; I don't doubt that we'll keep iterating, so even post-merge comments won't take long to get picked up.
🎉 Thanks a lot for the huge amount of work on this over the last weeks, @h-vetinari. This is really great.
Build the release candidates
Linux CI cancelled until builds for #322 are live