Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove e3nn version pin #555

Open
Andrew-S-Rosen opened this issue Aug 17, 2024 · 17 comments
Open

Remove e3nn version pin #555

Andrew-S-Rosen opened this issue Aug 17, 2024 · 17 comments
Labels
enhancement New feature or request

Comments

@Andrew-S-Rosen
Copy link
Contributor

This is a feature request to ultimately remove the version pin of e3nn==0.4.4 in setup.cfg to allow for more flexible versions (if possible).

@ilyes319
Copy link
Contributor

ilyes319 commented Aug 22, 2024

I tested it, and we seem to support the latest e3nn version. This will not be an easy decision, as it could weaken code interoperability, especially given how poorly TorchScript behaves. In which context is this problematic for you?

@Andrew-S-Rosen
Copy link
Contributor Author

Andrew-S-Rosen commented Aug 22, 2024

In my personal opinion, interoperability should be dependent on the individual packages being used. So, if mace supports all versions of e3nn but Package B doesn't, then Package B should be the one with a specific restriction on e3nn. But I understand that reality is a bit messier than this.

For me, it is not a huge issue. I package several machine learning workflows in my code quacc, which includes but is not limited to MACE. I allow users to pip install quacc[ml] to install the ML dependencies, but because mace is pinned to a specific version of e3nn, and some other libraries require e3nn>0.4.4 (I unfortunately don't remember which), they couldn't be installed together even though they are potentially compatible.

Edit: It was @zulissimeta who originally reported this to me.

@corinwagen
Copy link

Agreed, this would be really nice. e3nn==0.4.4 is now almost three years old, and newer packages like fairchem require e3nn>=0.5.0.

@ilyes319
Copy link
Contributor

It is unlikely we will support it in the short term because it breaks backward compatibility. You are welcome to install 0.5 or later, the package will still work.

@Andrew-S-Rosen
Copy link
Contributor Author

Andrew-S-Rosen commented Oct 19, 2024

Ah, that's unfortunate. I don't know that I agree with the philosophy here, but I can understand the motivation. In what way does it break backwards compatability? Do energies change with the newer version of e3nn? If so, shouldn't mace want to use this presumably more accurate model architecture? If it is incompatible, a major version bump with an appropriate CHANGELOG would hopefully address this.

It is not practical to install 0.5 or later from a package management standpoint. I can't put e3nn>=0.5 in pyproject.toml, for instance, because that would conflict with mace if it is also in there.

@gabor1
Copy link
Collaborator

gabor1 commented Oct 19, 2024

Can someone explain to me the issue? Surely all packages should allow the broadest possible (tested) set of versions to maximise flexibility within the constraint of correctness

@gabor1
Copy link
Collaborator

gabor1 commented Oct 19, 2024

This does not preclude defining a more restricted subset of packages for vanilla installs whose goal is to simplify the installation process and minimise package conflicts. But this goal should not be conflated with the goal of allowing flexibility of someone wants it

@ilyes319
Copy link
Contributor

e3nn changed the phase convention of the CG making the code not backward compatible and bunch of tests failing. There are ways to solve this, but it will be in the next release.

@Andrew-S-Rosen
Copy link
Contributor Author

Ah okay, thanks for the clarity! I had assumed based on your prior comment that everything was still passing. Of course, failing tests should be a clear sign not to change the version yet.

@ilyes319
Copy link
Contributor

The plan is to release 0.3.7 when all the bugs are fixed and then bump the version to 0.4.0 and update e3nn version to the latest, updating the tests. I would say a 2 months timeline.

@corinwagen
Copy link

Thanks for the update; makes total sense!

@ilyes319 ilyes319 added the enhancement New feature or request label Oct 30, 2024
@bkmi
Copy link

bkmi commented Dec 3, 2024

I'm curious if there is progress on this. I imagine it's a rather large change actually. All models would either have to be retrained, or the basis transformation would have to be applied regularly throughout the network.

Any thoughts?

@vsimkus
Copy link

vsimkus commented Feb 14, 2025

I'll add that because of the dependency on e3nn==0.4.4 mace currently cannot be used with torch>=2.6.0. The cause is this line in e3nn==0.4.4, which breaks with torch>=2.6.0 due to a vulnerability patch that breaks backwards compatibility of torch.load. This is addressed in e3nn>=0.5.0.

@ilyes319
Copy link
Contributor

I have added a patch in 3.10 and mace perfectly works with 2.6. Did you actually get an error from the latest main?

@vsimkus
Copy link

vsimkus commented Feb 17, 2025

Yes, it fails when trying to import MACECalculator due to from e3nn import o3.

In [2]: import torch
   ...: print(torch.__version__)
   ...: import e3nn
   ...: print(e3nn.__version__)
   ...: import mace
   ...: print(mace.__version__)
   ...: from mace.calculators import MACECalculator
2.6.0
0.4.4
0.3.10
---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
....
File ~/miniforge3/envs/core/lib/python3.11/site-packages/mace/calculators/mace.py:17
     15 from ase.calculators.calculator import Calculator, all_changes
     16 from ase.stress import full_3x3_to_voigt_6_stress
---> 17 from e3nn import o3
     19 from mace import data
     20 from mace.cli.convert_e3nn_cueq import run as run_e3nn_to_cueq
....
UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.
....

@ilyes319
Copy link
Contributor

that should work, it is in the CI. Can you try to train?

@vsimkus
Copy link

vsimkus commented Feb 18, 2025

I have added a patch in 3.10 and mace perfectly works with 2.6. Did you actually get an error from the latest main?

Ah yes, sorry your previous comment confused me. It works on main but the patch is not in 0.3.10. Thanks!

Looking forward to 0.4.0 where this is properly addressed allowing newer versions of e3nn!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants