Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formalize template rendering context #4823

Merged
merged 6 commits into from
Mar 29, 2018

Conversation

christophermaier
Copy link
Contributor

Previously, the data we made available to users in Habitat templates
was based directly on the JSON serialization of several internal data
structures. This needlessly coupled internal workings of Habitat with
the public interface we expose to users. This often made it difficult
to know exactly what data was presented to users, and made it
difficult to safely change that data.

Here, we explicitly decouple the internal data structures of the
Supervisor from what is exposed to users. To do this in an efficient
and clear way, we introduce a number of "templating proxies", which
effectively wrap internal data structures, holding on to their
relevant data using Cow types to reduce memory footprint (this also
makes it easier to instantiate the templating proxies in tests, since
we can simply provide owned copies of the important data). Not only
does this give us a clear, typed way to refer to the data, but it
allows us to implement Serialize on these types, which lets us
control what is exposed to users in a very fine-grained and targeted
way. For instance, {{svc.me.persistent}} should really be
{{svc.me.permanent}}; before this change, we would need to introduce
another field to census::CensusMember named permanent, and it
would be unclear why. Now, however, we can simply modify the
templating proxy's Serialize implementation to render that same
field multiple times under different names, making schema evolution
and deprecation much easier to reason about. Note, however, that no
user-facing changes are introduced in this commit; it is a pure
refactoring.

By clearly separating out the templating data, we now have an
opportunity to revisit some of the internal data structures that were
previously serving "double duty"; it may be possible to simplify them
now.

Additionally, this commit introduces a JSON Schema definition of the
entire render context. This is currently used in tests to validate our
code, but could be used in the future as a source for generated
documentation on the website, and possibly as part of a future
template validation tool.

Fixes #4761

Signed-off-by: Christopher Maier <cmaier@chef.io>
Previously, the data we made available to users in Habitat templates
was based directly on the JSON serialization of several internal data
structures. This needlessly coupled internal workings of Habitat with
the public interface we expose to users. This often made it difficult
to know exactly what data was presented to users, and made it
difficult to safely change that data.

Here, we explicitly decouple the internal data structures of the
Supervisor from what is exposed to users. To do this in an efficient
and clear way, we introduce a number of "templating proxies", which
effectively wrap internal data structures, holding on to their
relevant data using `Cow` types to reduce memory footprint (this also
makes it easier to instantiate the templating proxies in tests, since
we can simply provide owned copies of the important data). Not only
does this give us a clear, typed way to refer to the data, but it
allows us to implement `Serialize` on these types, which lets us
control what is exposed to users in a very fine-grained and targeted
way. For instance, `{{svc.me.persistent}}` should really be
`{{svc.me.permanent}}`; before this change, we would need to introduce
another field to `census::CensusMember` named `permanent`, and it
would be unclear why. Now, however, we can simply modify the
templating proxy's `Serialize` implementation to render that same
field multiple times under different names, making schema evolution
and deprecation much easier to reason about. Note, however, that no
user-facing changes are introduced in this commit; it is a pure
refactoring.

By clearly separating out the templating data, we now have an
opportunity to revisit some of the internal data structures that were
previously serving "double duty"; it may be possible to simplify them
now.

Additionally, this commit introduces a JSON Schema definition of the
entire render context. This is currently used in tests to validate our
code, but could be used in the future as a source for generated
documentation on the website, and possibly as part of a future
template validation tool.

Fixes #4761

Signed-off-by: Christopher Maier <cmaier@chef.io>
@@ -33,7 +33,7 @@ const PATH_KEY: &'static str = "PATH";
static LOGKEY: &'static str = "PK";

#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct Env(HashMap<String, String>);
pub struct Env(pub HashMap<String, String>);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scanning other changes I can't see why this was made public. Btw, this struct does implement Deref<Target = HashMap<String, String>> so you could call deref() to get at it's insides if you need it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was actually the opposite; I have a HashMap, but I needed to turn it into an Env for testing, without having to mock out an installed package with a RUNTIME_ENVIRONMENT file.

Should have implemented From, I suppose.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I'd suggest, yep!

Copy link
Collaborator

@reset reset left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tenor-132321516

Exactly what we needed. Thank you for finishing this work!

struct SystemInfo<'a> {
version: Cow<'a, String>,
member_id: Cow<'a, String>,
ip: IpAddr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the IpAddrs, ports and permanent not Cow types?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because IpAddr, integers, and bool all implement Copy so this satisfied the compiler just fine. @cm I don't think memory really matters here since we're talking 1 to ~16 bytes per each of these types, but you do mean to point to something allocated elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, they all implement Copy... I'll make them Cows for consistency (and future safety).

where
S: Serializer,
{
let mut map = serializer.serialize_map(Some(20))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing there's not any good way to derive this 20 or even catching that it needs updating if the Package struct changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, but I'll review the Serde docs again to make sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I looked a bit and it wasn't terribly clear. It said that some serializers needed to know the size a priori, but what happened if it was not required and potentially differed from the number actually added seemed ambiguous.

My guess is that it's just a hint and if it's too low we'll just incur an additional alloc, but I think to really know we'd have to use my old master's advice: "use the source, Luke"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it sounds like "some serializers" need the hint, but others' don't. Apparently, the JSON serializer is one of the latter, as I can pass None as the hint, and all the tests still pass.

If we were just doing a straight serialization of the struct, then we could just use the number of fields in the struct (though really, we'd probably just derive the serialization at that point). However, I envision adding new serialized fields that may be derived from existing fields (e.g., renaming of some fields, while keeping the old one for backwards compatibility, etc.), which means the value won't be so easily computable, apart from a programmer counting.

These types are pretty strongly JSON focused now, seeing as how they're ultimately validated by JSON schema 😄... we could just drop the size hint for these implementations. I know that we do add the hint in our other Serialize implementations, though... probably a good practice to keep them, just for future-proofing's sake; I don't know what formats we may end up wanting to serialize to in the future.

.members()
.iter()
.map(|m| SvcMember::from_census_member(m))
.collect::<Vec<SvcMember<'a>>>(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to build with just

                    .collect(),

Presumably the from_census_member is a strong enough hint for the type-inference to work. Is it valuable to be more explicit here, or would it be preferable to change to the simpler version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code went through several iterations before I landed on the final implementation... one of those previous iterations required the type hint, but I guess the final one doesn't!

Thanks for catching this; I'll remove it.

Signed-off-by: Christopher Maier <cmaier@chef.io>
A previous incarnation of this code needed the type hint; this
incarnation doesn't.

Signed-off-by: Christopher Maier <cmaier@chef.io>
These `Serialize` implementations are all explicitly focused around
JSON serialization, which does not require a length hint for map
serialization. By using `None`, we remove one more thing a programmer
has to remember to change when updating these implementations in the
future.

Signed-off-by: Christopher Maier <cmaier@chef.io>
This preserves the information-hiding we want in real code, but allows
an escape hatch for easily creating test instances without having to
mock out an entire package installation.

Signed-off-by: Christopher Maier <cmaier@chef.io>
@christophermaier christophermaier force-pushed the cm/4761-refactor-rendering-context branch from 1ccb918 to 4a06d0e Compare March 29, 2018 15:28
@baumanj
Copy link
Contributor

baumanj commented Mar 29, 2018

babies-everywhere

s/babies/cows/

New changes look great!

@christophermaier christophermaier merged commit baf410a into master Mar 29, 2018
@christophermaier christophermaier deleted the cm/4761-refactor-rendering-context branch March 29, 2018 17:58
christophermaier added a commit that referenced this pull request Mar 30, 2018
This fixes a subtlety introduced with the addition of a `Health` field
to the `CensusMember` struct (PR #4823). Previously, there was a small
window where the health of a `CensusMember` was in an indeterminate
state, before it's health was reconciled with Butterfly data. By using
the `Health` enum, all new `CensusMember`s were created as `Alive`.

Arguments could be made for the correctness or incorrectness of this
decision, and we're all agreed that having four booleans to represent
this data is not what we ultimately want. However, since we don't want
to introduce subtle bugs into this part of the code in the course of
solidifying our template data structures, we're backing this change
out for now.

Signed-off-by: Christopher Maier <cmaier@chef.io>
@christophermaier christophermaier mentioned this pull request Oct 25, 2019
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants