-
Notifications
You must be signed in to change notification settings - Fork 689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Flat and differential runtime config files #6693
Conversation
Replace the JSON runtime config file per version with a base `parameters.txt` file plus differntial files per version. The `RuntimeConfig` remains untouched. But it is now loaded from the flat list of parameters, instead of through serde derived code. The direct benefit of this is better readibilty for config changes and an easier and less error-prone process of doing such changes. Indirectly, this could be the first step to further clean up code around gas costs. (1) the gas profile could use these parameters to track costs instead of its own `enum Cost`. (2) The params-estimator can link estimations to params more easily. Test plan --------- The usage of `RuntimeConfigStore` has not changed, only how it is created. Therefore, tests can ensure all stored `RuntimeConfig` versions are exactly the same as before. And this should be sufficient to confirm this change has no side-effects. To implement this, we keep the old JSON files in the `legacy_configs` directory. The test `test_old_and_new_runtime_config_format_match` then iterates through all existing versions and compares the serde deserialization with configs loaded from TXT files.
core/primitives-core/src/config.rs
Outdated
alt_bn128_g1_sum_element: read_parameter(params, Parameter::WasmAltBn128G1SumElement), | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's maybe move these bulky statements to the parameter
module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean all the constructors? (from_parameters
, test
and free
) Or just the new one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My latest commit moves all new constructors to parameter_table
but left other constructors where they are. Not sure if this is what you had in mind.
MaxLocalsPerContract, | ||
} | ||
|
||
pub fn load_parameters_from_txt(arg: &[u8]) -> BTreeMap<Parameter, String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn load_parameters_from_txt(arg: &[u8]) -> BTreeMap<Parameter, String> { | |
pub(crate) fn load_parameters_from_txt(arg: &[u8]) -> BTreeMap<Parameter, String> { |
I think it makes sense to avoid pub
as much as possible. Cross-crate deps are the only deps that matter, so they should stand out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I defined this in primitives-core
crate, where as all callers are in primitives
crate. So your suggestion doesn't work out of the box. But is also seems wrong to have code define on crate A and only call it in crate B.
However, the type Parameter
must be in primitives-core
because it should be the new source-of-truth replacing some types in core/primitives-core/src/profile.rs
. And primitives-core
is not allowed to depend on other crates, according to our architecture.md.
So, in my latest commit, I moved all the logic to construct over to primitives
. To still have the constructor syntax like VMLimitConfig::from_parameters(params)
, I defined an extension trait FromParameterTable
. (I can also rewrite to params.vm_limit_config()
if you think that's better.)
This refactor sort of works but also brings new problems. I previously liked that each config struct had all constructors (for testing and for production) right next to each other, whereas now they are defined in different crates. And there are other things, like wasmer2_stack_limit_default
that used to be private but now turned public in the refactor.
Not sure if it is an overall improvement or not.
I wrote up the motivation behind this in a separate issue: #6710 |
The new type `ParameterTable` wraps the BTreeMap<Parameter,String> that holds a list of parameter - value pairs. Move all logic around the parameter table to the primitives crate, as opposed to `primitives-core` where it was before. Also includes other small fixes from review comments.
The `ParameterTable` now contains `serde_json::Value` instead of a `String`. To fill it in, a simple custom parsing function is used. The method `runtime_config_json` converts the table to a JSON representation using the correct structure. Derived serde code then translates to a `RuntimeConfig`.
Okay, this is ready for re-review. You probably want to look through all files again but here is the summary of my changes.
I think the big-picture questions left for me are:
|
// We do not have "arbitrary_precision" serde feature enabled, thus we | ||
// can only store up to `u64::MAX`, which is `18446744073709551615` and | ||
// has 20 characters. | ||
if raw_number.len() < 20 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we happy with the behaviour here when somebody specifies 10^19 which is 20 digits long but is still 263.11 and fits into u64? serde_json
does have an error code for out-of-range numbers internally but sadly it isn't exposed :(
If not, I'm worried that serde_json might not actually be the right tool for us tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I think I am okay with that. Mostly because I am committed to not touch the RuntimeConfig
and its old deserialization code in this PR. This makes it easier to argue that this is a non-observable change, based on the tests provided.
What I find more troublesome is that values using #[serde(with = "u128_dec_format")]
must use a string, not a number, regardless of whether it fits in a u64
. I tend towards keeping it as-is for now and fixing it when needed. (It will fail in many tests if we try to use values that are shorter than 20 characters for a value parsed with u128_dec_format
.)
But I am curious, do you have alternatives in mind that could be used in our case instead of serde_json
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably isn't terribly difficult to implement a custom deserializer just for the values. An example of an implementation that does this is the aforementioned config-rs
(I encourage to at least skim throug the link here). It has handling for u128 built-in too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I see, a custom value parser would make sense. Also thanks for the link with the example.
As things stand right now, I don't see a strong motivation to do it right now. As long as we have JSON as intermediary, we have to store numbers as strings anyway and as long as we don't lose precision on the way, I think it doesn't matter too much where the cut-off point is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks pretty good. I imagine we can change the exact implementation of value deserializer later on if we find that json is not actually suitable for our needs.
That said, please give other reviewers some time to look through this still.
Thanks! Yes, I am not really in a hurry here. Since some people are taking time off this week, I will probably wait to merge this until next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for making config updates more simple and transparent!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some minor style comments, but they are optional and can be followed-up!
@@ -0,0 +1,151 @@ | |||
# Gas economics config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good idea to add a preamble along the lines of
# Protocol config for the *initial* version of the protocol.
# Diffs are applied to get a config at a specifiv protocol version
(note -- wording is very provisional, please refine it :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea, I've written something along those lines.
.expect("Failed parsing initial testnet config") | ||
} | ||
|
||
pub(crate) fn from_parameters_txt(txt_file: &str) -> Result<RuntimeConfig, InvalidConfigError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
txt_file
is a bad name:
- ambiguity between Path and String (contents of the file)
- this doesn't have to be File
Maybe just impl FromStr for RuntimeConfig
and impl FromStr for ParameterTable
?
Also, does this need to be pub(crate)
? Another alternative would be to just new(t: ParameterTable)
, if this doesn't increase visibility of the table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done impl FromStr for ParameterTable
and impl FromStr for ParameterTableDiff
.
But for RuntimeConfig
, I went with new
, also replacing fn from_parameters(params: &ParameterTable) -> Result<Self, serde_json::Error>
on the way.
if value.is_empty() { | ||
return Ok(serde_json::Value::Null); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this check feels like somethig worth pushing to the caller perhaps. Just a gut automatic reaction, maybe this is wrong in context :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this. Most of the parameters are number types and I would rather have it as null
when they are unspecified. Also, serde will not like that, I think.
} | ||
|
||
#[test] | ||
fn test_old_runtime_config_data() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice assuarance that we didn't change values in the process of migration! Let's remove it once the PR is merged (that is, I think the only value of this test is to check this specific PR, and, once that it, wec an remove the test without the loss of confidence? )!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
`ci.sh`: No longer used because the buildkite pipeline has ben archived. The same task, among others, is performed by continuous estimation. `compare_costs.py` and `safety_multiplier.py`: Both operate on the replaced JSON runtime config format. We have not used these scripts in a long time, since estimations did not produce complete JSONs anyway. But there always was the chance that we might go that way again. Now, however, that near#6693 replaced the JSON format, these scipts are truely obsolete and better of deleted.
`ci.sh`: No longer used because the buildkite pipeline has ben archived. The same task, among others, is performed by continuous estimation. `compare_costs.py` and `safety_multiplier.py`: Both operate on the replaced JSON runtime config format. We have not used these scripts in a long time, since estimations did not produce complete JSONs anyway. But there always was the chance that we might go that way again. Now, however, that #6693 replaced the JSON format, these scripts are truly obsolete and better of deleted.
Replace the JSON runtime config file per version with a base
parameters.txt
file plus differential files per version.The
RuntimeConfig
remains untouched. But it is now loaded from theflat list of parameters, that converts it to JSON structured data following
the exact layout we use in code, that is then converted to an actual struct
using serde derived code.
The new TXT files are written in a custom format for key-values separated
by a column. Keys must match a defined parameter and values are strings
or integers. Differential files include old and new values with a
->
betweenthem.
The direct benefit of this is better readability for config changes and
an easier and less error-prone process of doing such changes.
Indirectly, this could be the first step to further clean up code
around gas costs.
(1) the gas profile could use these parameters to track costs instead
of its own
enum Cost
.(2) The params-estimator can link estimations to params more easily.
Test plan
The usage of
RuntimeConfigStore
has not changed, only how it iscreated. Therefore, tests can ensure all stored
RuntimeConfig
versions are exactly the same as before. And this should be sufficient
to confirm this change has no side-effects.
To implement this, we keep the old JSON files in the
legacy_configs
directory. The test
test_old_and_new_runtime_config_format_match
theniterates through all existing versions and compares the serde
deserialization with configs loaded from TXT files. This is all we need to
verify correctness of the parameters loaded through the new files.
The new TXT file format also comes with custom parsing logic. Therefore,
a few unit tests to make sure it behaves the way we think in all
circumstances.