-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introducing the Lux model for parameters and state #12
Conversation
Thanks for the PR. For the main points it looks good! I did not look the all inner workings of the Also, I would like this to become a breaking release (v0.2). |
@tjjarvinen Thank you for your feedback. Waiting on these bug corrections then. |
This is now blocking for me to proceed with the new ace models. Would be great to get it reviewed and merged. |
You can do review it too! We need couple of them anyways. I will try to go trough the macro. But I presume it is in good shape. The macro does need more documentation on how it works. But we can do that in the future. |
Of course I will as soon as somebody has done a first pass please. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking this long, but there are a lot of points to check with this.
I am ok with the idea and marked some points on the code that could be looked onto.
But then I have also two bigger issues. One onto which I would like to have some comments from others too. And other that is just implementing tests.
1 Lux interface
We add state and parameters and in the return structure we give out state.
For us this is logical. We return internal state (electronic structure) and we can then calculate some additional features out of it. Parameters like Lennard-Jones terms are not returned.
This is a contradiction to Lux where state is the one you applie learning onto - we don't want to learn internal state, but instead we want to learn parameters.
So, what I am saying here is that the Lux model does not apply straight to our case and then the question is how do we deal with this?
2 Tests
If we are defining the new interface to return state
and the function calls to allow calls with 5 positional arguments, then we need to have tests for these also.
|
||
# Low-level call, implemented by the user. | ||
AtomsCalculators.calculate(AtomsCalculators.Energy(), hydrogen, LowLevelCalculator(nothing), | ||
state = 1.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this is not a correct call. state = 1.0
is now considered as a keyword and not the argument state.
julia> AtomsCalculators.calculate(AtomsCalculators.Energy(), hydrogen, LowLevelCalculator(1), 1, state=2)
(energy = 0.0 Eₕ, state = nothing)
#but
julia> AtomsCalculators.calculate(AtomsCalculators.Energy(), hydrogen, LowLevelCalculator(1), 1, 2)
(energy = 0.0 Eₕ, state = 2)
I think the state could in principle contain part of the output of the model, provided it is invalidated on the next call, or when the model is called with a new input. I actually quite like this usage of at1 = # some structure
at2 = # another structure
v1, st = virial(at1, calc, ps, st)
# st.virial, st.energy, st.forces are all computed during this call.
E1 = energy(at1, calc, ps, st)
# no new computation to be done, just return st.energy
# (same with forces)
E2 = energy(at2, calc, ps, st)
# new input structure at2, so the content of st must be invalidated. To be clear, I'm not advocating for this yet. I am just thinking out loud -- is this the intention? And if yes, I don't really see how it contradicts the Lux model?
I don't understand this statement. In My understanding of Lux is that you do output, st = model(input, ps, st) you learn parameters |
Re tests, I agree of course. |
The issue here is most likely what the terms mean. Like if one implementation uses state for learning and other does not. Will this lead to some issue? I am concerned that this leads to some problems in the future. In any case we should have more documentation that defines on what to do with these two ( |
And one thing I forgot from the review. We need functions |
I agree with that. I already have those in ACEpotentials and could switch over to extending methods here. |
Exactly, and this is the intention in this PR. Just to clarify (inspired from Why we wrote Lux:
Hope this clarifies the intent. |
@tjjarvinen and @cortner Thans for the feedback! Concerning the I can write some more documentation and examples to clarify the situation. |
I thought we had agreed to not do that. But I'm totally open to revisiting this. The reason we felt we can have a mixed model is that users never want to see An alternative approach that was also discussed and which might be preferrable is to provide a simple wrapper struct StatefulCalculator
calc
ps
st
end I am not able to anticipate all potential problems with either approaches. I immagine both have advantages and disadvantages. |
We also have the high level interface that needs to have parameters and state in the calculator structure. So, we need to have the function calls to extract the state and parameters too. If we go for the pure Lux formulation then we need to separate the calculators to low and high levels, which is not good. |
I think at a minimum there is no harm - for now - in allowing calculators to store ps and st. Provided the low level interface strictly follows Lux rules. If we learn otherwise we can revisit this discussion. |
While calculators contain |
In my memory we also advocated for a mixed model, where high-level is convenient and low-level is lux-like. This of course has a little bit the risk of suggesting some wrong expectations about how the mechanics works, so we have to be strict in code examples and document properly. I would suggest for example that the low-level calls ignore internal state and parameters and that the high-level calls do not take state and parameters at all, but require you to externally take care of updating the state yourself. I.e. I advocate a |
I agree with the above comments about the state and parameters issue.
So my implementation suggestion is the following:
If you agree with those points, I'll begin working on an implementation. |
We should also make sure that low level calculators work with high level calls and force it in the test set. For usability point it is very important that all low level calculators work with high level calls. |
Agreed. Do you think that what is currently being done in It seems to me that these tests cover both low-level and high-level calls for energy, forces and virial. |
Can somebody give me a brief summary what is still needed to finish this PR please? |
@@ -32,6 +57,8 @@ Each of the individual calls have two common inputs: `AtomsBase.AbstractSystem` | |||
- First input is either `Energy()`, `Forces()` or `Virial()` | |||
- Second is `AtomsBase.AbstractSystem` compatible structure | |||
- Third is `calculator` structure | |||
- Fourth is `parameters` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flag parameters and state and optional?
I didn't see anything in the documentation or the PR about providing calls of the form frc, vir = calculate( ( Forces(), Virial(), ), sys, calc, ...) Is this not intended to be supported? From the user's point of view the most common calls will be
|
It is also not entirely clear to me how to make the frc = # allocate
frc, vir = calculate!( (forces = frc, virial = nothing), sys, calc, ...)
E, frc, vir = calculate!( (energy = nothing, forces = frc, virial = nothing), sys, calc, ...) To be clear, I'm not happy with the If I missed something when skimming the PR, please point it out. Was the |
Calculate interface does not have these at this moment. It has always been discussed as what to do, but not implemented. You can call res = AtomsCalculators.energy_forces(sys, calc; kwords...)
res.energy # energy
res.forces # forces
# note there can be something else too - this is allowed by the interface
res = AtomsCalculators.energy_forces_virial(sys, calc; kwords...)
res.energy # energy
res.forces # forces
res.virial # virial
# note there can be something else too - this is allowed by the interface Also, the order might not be There are no |
One more comment - I am getting the impression that we are giving calc developers the option of implementing either interface. My first instinct is that this is not helpful. It will create confusion on which one to implement. I think there should be a very clear default, and anybody not following that proceeds at their own risk. Ideally that default is relatively simple. My personal preference is that this should be a non-allocating low-level interface involving params and states and that the everything else gets derived from that automagically. (e.g. through macros) |
The way I wrote the call, the order would become unambiguous so one could in principle enforce it. It may or may not be a good idea but it is possible.
Yes, I'm aware. I'm just listing what are typical use-cases. For most use-cases this might not be needed since energy normally doesn't cost anything. But what about a calculator that has no energy? E.g. force mixing. You could return |
The issue with only having low level (lux call) that is implemented, is that developers then need to understand lux and that will just give new people impression that the interface is just for esoteric persons. Basically what they mostly want is just an easy way to create a custom calculator and get going, and not to bother with lux stuff. So, the whole point here is that you can do it easy, and if you need the extra features the low level interface offers, then you can put the extra effort to implement it. |
You are probably right. But the documentation needs to be crystal clear about it. |
I think I've been derailing the discussion here a little bit and I apologize for this. I'm just concerned that with the slow speed of development and review I don't want another major revision of the interface immediately after merging the current one. Having thought about this again a bit longer, could there be an alternative way to make calculators non-allocating by allowing the caller to pass in pre-allocated forces as a keyword argument? It would look something like that. # Implementation
function forces( sys, calc, ...; frc = zero_forces(sys, calc), kwargs...)
# ...
return frc
end
# allocating usage
frc = forces(sys, calc)
# non-allocating usage
frc = # allocate by whatever means, e.g. bumper
forces(sys, calc; frc = frc) I think this would have the huge advantage of requiring only one interface and we can get rid of I haven't thought through what the best option could be for the |
There are problems with using keywords like that. First, the keyword Currently the intended interface usage is that keywords are used to give control commands to individual calculators. Calculators can ignore these, if they wish. Arguments on the other hand are intended to be reacted by every calculator. Second issue is that, are there any Julia programs that mutate keywords? Or is it safe from language point? suggestion for the issueThat being said, could you simply do function myforces(sys, calc, ...; frc = zero_forces(sys, calc), kwargs...)
return forces!(frc, sys, calc; kwargs...)
end You could do that in your own package without need to change AtomsCalculators interface itself. We could even have this call in AtomsUtilityCalculators with the current version of AtomsCalculators. calculator interfaceFor calculator interface I suggest doing mutable struct Forces{T}
f::T
# add builder that makes sure f is either nothing or valid force array
end
# allocating call
F_allocating = Forces(nothing) # or simply F()
calculate(F_allocating, sys, calc, ...; kwargs...)
# non allocating call
F_non_allocating = Forces( zero_forces(sys, calc) )
calculate(F_non_allocating, sys, calc, ...; kwargs...) This would need an changes in calculator implementation and we would need to change the macro etc. |
You could also write a macro that would do something like @forces sys calc f_pre_allocated (args...) (kwargs...) |
Part of my proposal was to avoid having too many interface functions. if it were just about First, I don't see the problem of requiring all calculators to accept the But secondly, I think one could even do this without that requirement. Once can simply make it clear in the interface that there is no guarantee that the returned forces are the same array as was passed in and therefore to require usage as follows # Implementation
function forces( sys, calc, ...; frc = zero_forces(sys, calc), kwargs...)
# ...
return frc
end
# allocating usage
frc = forces(sys, calc)
# non-allocating usage
frc_alc = # allocate by whatever means, e.g. bumper
frc = forces(sys, calc; frc = frc_alc) and analogous usage of I actually think this doesn't even require any change to the interface, only documentation and can be entirely optional. To be clear I think it would be kind of a corner-case usage where the allocation of the force vector creates a measurable slow-down. But I have encountered such cases and would therefore like to have that option if possible without creating too much additional complexity. |
Yes, you can implement that with the current implementation. That is completely valid, but optional. |
It covers it well with the current interface. But the new calculator interface returns state and that has to be taken into account. Also, some kind of tests that it is possible to have state and parameters, as arguments need to be implemented too. |
Ok, then I'll include those tests. |
To summarize the discussion up to now:
While I agree that 3. is important, I feel it is a separate issue, and that it can be somehow adressed by using states. |
@cortner I am available on coming Wednesday or Thursday for a call. |
I'll look at my schedule and get back to you on Zulip. Regarding "most" - you are thinking of electronic structure. In the empirical and ML potentials world this is not normally done. |
parameters through `potential_energy`, `forces`, `forces!` and `virial` and low-level calls | ||
with user-specifiable parameters through `calculate`. | ||
|
||
The low-level `calculate` interface follows the [Lux](https://lux.csail.mit.edu/stable/) model for parameters and state. This means that calculators are **immutable** structs that are passed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calculators will NOT be immutable. But they need to ACT as immutable. This is a subtle but important difference.
@CedricTravelletti -- following up from our discussion ... to make it easier for me to contribute to this PR (especially docs), please create a new |
@cortner Ready for branch transfer. I will add yesterday's meeting notes to the new PR discussion. |
This introduces state inside AtomsCalculators by follwing the Lux model.
This PR is a followup to the Zulip discussion following the April 10th 2024 CESMIX and JuliaMolSim community meeting. It also builds on the discussion in #11 and adresses the same needs.
Philosophy
The basic idea behing this PR is to separate high-level and low-level calls. High-level calls are defined as the ones to
potential_energy
,forces
,forces!
andvirial
, whereas the low-level ones are the ones tocalculate
.High-level calls preserve backward compatiblity, while low level calls now follow the Lux model, by using immutable calculators, accepting parameters and state arguments and returning state.
Detailed documentation is bundled with the PR. We refer to it for details.