-
Notifications
You must be signed in to change notification settings - Fork 316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Formalize template rendering context #4823
Conversation
Signed-off-by: Christopher Maier <cmaier@chef.io>
Previously, the data we made available to users in Habitat templates was based directly on the JSON serialization of several internal data structures. This needlessly coupled internal workings of Habitat with the public interface we expose to users. This often made it difficult to know exactly what data was presented to users, and made it difficult to safely change that data. Here, we explicitly decouple the internal data structures of the Supervisor from what is exposed to users. To do this in an efficient and clear way, we introduce a number of "templating proxies", which effectively wrap internal data structures, holding on to their relevant data using `Cow` types to reduce memory footprint (this also makes it easier to instantiate the templating proxies in tests, since we can simply provide owned copies of the important data). Not only does this give us a clear, typed way to refer to the data, but it allows us to implement `Serialize` on these types, which lets us control what is exposed to users in a very fine-grained and targeted way. For instance, `{{svc.me.persistent}}` should really be `{{svc.me.permanent}}`; before this change, we would need to introduce another field to `census::CensusMember` named `permanent`, and it would be unclear why. Now, however, we can simply modify the templating proxy's `Serialize` implementation to render that same field multiple times under different names, making schema evolution and deprecation much easier to reason about. Note, however, that no user-facing changes are introduced in this commit; it is a pure refactoring. By clearly separating out the templating data, we now have an opportunity to revisit some of the internal data structures that were previously serving "double duty"; it may be possible to simplify them now. Additionally, this commit introduces a JSON Schema definition of the entire render context. This is currently used in tests to validate our code, but could be used in the future as a source for generated documentation on the website, and possibly as part of a future template validation tool. Fixes #4761 Signed-off-by: Christopher Maier <cmaier@chef.io>
@@ -33,7 +33,7 @@ const PATH_KEY: &'static str = "PATH"; | |||
static LOGKEY: &'static str = "PK"; | |||
|
|||
#[derive(Clone, Debug, Deserialize, Serialize)] | |||
pub struct Env(HashMap<String, String>); | |||
pub struct Env(pub HashMap<String, String>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scanning other changes I can't see why this was made public. Btw, this struct does implement Deref<Target = HashMap<String, String>>
so you could call deref()
to get at it's insides if you need it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was actually the opposite; I have a HashMap
, but I needed to turn it into an Env
for testing, without having to mock out an installed package with a RUNTIME_ENVIRONMENT
file.
Should have implemented From
, I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I'd suggest, yep!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
struct SystemInfo<'a> { | ||
version: Cow<'a, String>, | ||
member_id: Cow<'a, String>, | ||
ip: IpAddr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the IpAddr
s, ports and permanent
not Cow
types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's because IpAddr
, integers, and bool all implement Copy
so this satisfied the compiler just fine. @cm I don't think memory really matters here since we're talking 1 to ~16 bytes per each of these types, but you do mean to point to something allocated elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, they all implement Copy
... I'll make them Cow
s for consistency (and future safety).
where | ||
S: Serializer, | ||
{ | ||
let mut map = serializer.serialize_map(Some(20))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing there's not any good way to derive this 20
or even catching that it needs updating if the Package
struct changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, but I'll review the Serde docs again to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I looked a bit and it wasn't terribly clear. It said that some serializers needed to know the size a priori, but what happened if it was not required and potentially differed from the number actually added seemed ambiguous.
My guess is that it's just a hint and if it's too low we'll just incur an additional alloc, but I think to really know we'd have to use my old master's advice: "use the source, Luke"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it sounds like "some serializers" need the hint, but others' don't. Apparently, the JSON serializer is one of the latter, as I can pass None
as the hint, and all the tests still pass.
If we were just doing a straight serialization of the struct, then we could just use the number of fields in the struct (though really, we'd probably just derive the serialization at that point). However, I envision adding new serialized fields that may be derived from existing fields (e.g., renaming of some fields, while keeping the old one for backwards compatibility, etc.), which means the value won't be so easily computable, apart from a programmer counting.
These types are pretty strongly JSON focused now, seeing as how they're ultimately validated by JSON schema 😄... we could just drop the size hint for these implementations. I know that we do add the hint in our other Serialize
implementations, though... probably a good practice to keep them, just for future-proofing's sake; I don't know what formats we may end up wanting to serialize to in the future.
.members() | ||
.iter() | ||
.map(|m| SvcMember::from_census_member(m)) | ||
.collect::<Vec<SvcMember<'a>>>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to build with just
.collect(),
Presumably the from_census_member
is a strong enough hint for the type-inference to work. Is it valuable to be more explicit here, or would it be preferable to change to the simpler version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code went through several iterations before I landed on the final implementation... one of those previous iterations required the type hint, but I guess the final one doesn't!
Thanks for catching this; I'll remove it.
Signed-off-by: Christopher Maier <cmaier@chef.io>
A previous incarnation of this code needed the type hint; this incarnation doesn't. Signed-off-by: Christopher Maier <cmaier@chef.io>
These `Serialize` implementations are all explicitly focused around JSON serialization, which does not require a length hint for map serialization. By using `None`, we remove one more thing a programmer has to remember to change when updating these implementations in the future. Signed-off-by: Christopher Maier <cmaier@chef.io>
This preserves the information-hiding we want in real code, but allows an escape hatch for easily creating test instances without having to mock out an entire package installation. Signed-off-by: Christopher Maier <cmaier@chef.io>
1ccb918
to
4a06d0e
Compare
This fixes a subtlety introduced with the addition of a `Health` field to the `CensusMember` struct (PR #4823). Previously, there was a small window where the health of a `CensusMember` was in an indeterminate state, before it's health was reconciled with Butterfly data. By using the `Health` enum, all new `CensusMember`s were created as `Alive`. Arguments could be made for the correctness or incorrectness of this decision, and we're all agreed that having four booleans to represent this data is not what we ultimately want. However, since we don't want to introduce subtle bugs into this part of the code in the course of solidifying our template data structures, we're backing this change out for now. Signed-off-by: Christopher Maier <cmaier@chef.io>
Previously, the data we made available to users in Habitat templates
was based directly on the JSON serialization of several internal data
structures. This needlessly coupled internal workings of Habitat with
the public interface we expose to users. This often made it difficult
to know exactly what data was presented to users, and made it
difficult to safely change that data.
Here, we explicitly decouple the internal data structures of the
Supervisor from what is exposed to users. To do this in an efficient
and clear way, we introduce a number of "templating proxies", which
effectively wrap internal data structures, holding on to their
relevant data using
Cow
types to reduce memory footprint (this alsomakes it easier to instantiate the templating proxies in tests, since
we can simply provide owned copies of the important data). Not only
does this give us a clear, typed way to refer to the data, but it
allows us to implement
Serialize
on these types, which lets uscontrol what is exposed to users in a very fine-grained and targeted
way. For instance,
{{svc.me.persistent}}
should really be{{svc.me.permanent}}
; before this change, we would need to introduceanother field to
census::CensusMember
namedpermanent
, and itwould be unclear why. Now, however, we can simply modify the
templating proxy's
Serialize
implementation to render that samefield multiple times under different names, making schema evolution
and deprecation much easier to reason about. Note, however, that no
user-facing changes are introduced in this commit; it is a pure
refactoring.
By clearly separating out the templating data, we now have an
opportunity to revisit some of the internal data structures that were
previously serving "double duty"; it may be possible to simplify them
now.
Additionally, this commit introduces a JSON Schema definition of the
entire render context. This is currently used in tests to validate our
code, but could be used in the future as a source for generated
documentation on the website, and possibly as part of a future
template validation tool.
Fixes #4761