Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structured Tracing Serialization/Deserialization #2113

Closed
wants to merge 21 commits into from

Conversation

jamesmunns
Copy link
Contributor

@jamesmunns jamesmunns commented May 9, 2022

This implements a more structured version of tracing-serde, for now in a separate folder as tracing-serde-structs. This is intended to eventually replace tracing-serde (perhaps within this PR).

At the moment, this is almost functionally compatible with the previous use cases of tracing-serde, which seems to only/mainly have been used to produce JSON representations of tracing data.

This change allows for:

  • The ability to serialize and deserialize tracing data - though this is one-way, once you serialize FROM tracing-core types TO tracing-serde types, you can only ever deserialize that BACK into tracing-serde types.
  • The ability to support non-json formats, like postcard
  • The ability to deserialize on std and no_std targets

My primary use case for this is as mentioned in #275, though it may help for other cases, such as for IPC.

When comparing with my (very basic) end-to-end test located in this repo, I see the following "format diff" for serde-json:

- '{"name":"outer_span","target":"tracing_playground","level":"TRACE","module_path":"tracing_playground","file":"src/main.rs","line":34,"fields":[],"is_span":true,"is_event":false}'
+ '{"name":"outer_span","target":"tracing_playground","level":"TRACE","module_path":"tracing_playground","file":"src/main.rs","line":34,"fields":[],"is_span":true,"is_event":false}'
- '{"metadata":{"name":"outer_span","target":"tracing_playground","level":"TRACE","module_path":"tracing_playground","file":"src/main.rs","line":34,"fields":[],"is_span":true,"is_event":false},"parent":null,"is_root":false}'
+ '{"metadata":{"name":"outer_span","target":"tracing_playground","level":"TRACE","module_path":"tracing_playground","file":"src/main.rs","line":34,"fields":[],"is_span":true,"is_event":false},"parent":null,"is_root":false}'
- '[1]'
+ '{"id":1}'
- '[1]'
+ '{"id":1}'
- '{"name":"my span","target":"tracing_playground::do_thing","level":"TRACE","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":74,"fields":[],"is_span":true,"is_event":false}'
+ '{"name":"my span","target":"tracing_playground::do_thing","level":"TRACE","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":74,"fields":[],"is_span":true,"is_event":false}'
- '{"metadata":{"name":"my span","target":"tracing_playground::do_thing","level":"TRACE","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":74,"fields":[],"is_span":true,"is_event":false},"parent":null,"is_root":false}'
+ '{"metadata":{"name":"my span","target":"tracing_playground::do_thing","level":"TRACE","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":74,"fields":[],"is_span":true,"is_event":false},"parent":null,"is_root":false}'
- '[2]'
+ '{"id":2}'
- '{"name":"event src/main.rs:76","target":"tracing_playground::do_thing","level":"INFO","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":76,"fields":["message"],"is_span":false,"is_event":true}'
+ '{"name":"event src/main.rs:76","target":"tracing_playground::do_thing","level":"INFO","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":76,"fields":["message"],"is_span":false,"is_event":true}'
- '{"metadata":{"name":"event src/main.rs:76","target":"tracing_playground::do_thing","level":"INFO","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":76,"fields":["message"],"is_span":false,"is_event":true},"message":"something has happened!"}'
+ '{"fields":{"message":{"Debug":"something has happened!"}},"metadata":{"name":"event src/main.rs:76","target":"tracing_playground::do_thing","level":"INFO","module_path":"tracing_playground::do_thing","file":"src/main.rs","line":76,"fields":["message"],"is_span":false,"is_event":true},"parent":null}'
- '[2]'
+ '{"id":2}'

Before this gets merged, I can think of the following needs/wants to haves:

  • A more detailed "conceptual" review, to make sure this is how y'all want it to be
  • Decide whether tracing-serde-structs should totally replace tracing-serde
  • A more robust set of validation examples, rather than my "miniature pasted example from the docs", to make sure there are no unexpected edge cases
  • Some amount of tests (probably derived from the above validation examples)
  • Some amount of perf testing (probably derived from the above validation examples)
  • Maybe a squash/rebase (I don't keep my commits pristine)

Tagging @hawkw.

@jamesmunns
Copy link
Contributor Author

Looking at the MSRV failures (there's a lot), it seems MOST of them are lacking min-const-generics, which stabilized in 1.51, based on heapless' use of them

@jamesmunns
Copy link
Contributor Author

Some other test failures (doc tests) are due to the crate "rename" for now. I can fix them, or wait until we decide whether to replace the existing tracing-serde.

@hawkw
Copy link
Member

hawkw commented May 9, 2022

Regarding some of the additional checkboxes for merging:

Decide whether tracing-serde-structs should totally replace tracing-serde

I think it would be ideal to replace the existing tracing-serde crate with the new implementation, provided we don't produce backwards-incompatible JSON. If we can't produce compatible JSON...we'll have to think on it a bit more. (When I say "backwards compatible" here, I think changing field ordering and adding new fields is fine, but removing an existing field or changing it in a meaningful way might cause problems for existing consumers)

Maybe a squash/rebase (I don't keep my commits pristine)

Don't worry about this one...we always squash branches when merging. See https://github.com/tokio-rs/tracing/blob/master/CONTRIBUTING.md#commit-squashing for details :)


#[repr(usize)]
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq, Serialize, Deserialize)]
#[allow(non_camel_case_types)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think we could alternatively use

Suggested change
#[allow(non_camel_case_types)]
#[serde(rename_all = "UPPERCASE")]

here if we want to use the standard enum naming conventions. For the tracing_core::Level type, the reason they are all uppercase is because they are associated constants rather than enum variants, but that isn't the case here, so we should probably just follow the enum naming convention...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also we probably want to slap a

Suggested change
#[allow(non_camel_case_types)]
#[allow(non_camel_case_types)]
#[non_exhaustive]

on this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the upstream tracing Level/LevelInner is not-non_exhaustive, does it make sense to do here?

tracing-serde-structs/src/lib.rs Outdated Show resolved Hide resolved
tracing-serde-structs/src/lib.rs Outdated Show resolved Hide resolved
@jamesmunns
Copy link
Contributor Author

Okay, applied suggestions (modulo the non_exhaustive Level one), the next big question is what to do with the old tracing-serde::fields module I removed.

I didn't realize it was being used by other parts of the code (whoops), but now it seems a little out of place.

I'd say the two options are:

  1. Bring it back, maintaining a "second way to serialize these fields"
  2. Update the other libraries not to use the ::fields/AsMap items, and instead just use the Serialize* methods, which will already produce a map of that data (though also with other data).

I also updated the MSRV Versions check to use Rust 1.51.0, though the 1.53.0 MSRV used by tracing-appender would also be fine (and save you some CI time).

It seems that the ::fields issue is now the main tests-passing blocker.

args: --all --exclude=tracing-appender --exclude=tracing-serde

# TODO: remove this once tracing's MSRV is bumped.
check-msrv-serde:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hawkw let me know if you want me to roll this into the next item to save CI time. Technically, it does build just fine on 1.51 tho, if you want to keep those separate (probably doesn't matter for v0.2 release)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Either way, I should fix the copy/paste error in the comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i would really prefer not to do a global MSRV bump, since tracing/tracing-core are dependencies of tokio, and i don't want to have to bump tokio's MSRV. having separate MSRVs for non-core crates is annoying, but it's probably preferable --- esp. since we would want to backport the tracing-serde changes to v0.1.x.

this is part of why i think it's probably a good idea to eventually move all of the non-core crates (such as tracing-serde) into their own repos...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, to pull in heapless, we need 1.51. I could write my own collection types on top of a fixed size array (or use an older version of heapless I guess?) but it'd have relatively low value. You know your MSRV guarantees better than me tho.

Comment on lines 191 to 201
#[cfg(not(feature = "std"))]
type TracingVec<T> = heapless::Vec<T, 32>;

#[derive(Debug)]
pub struct SerializeField(Field);
#[cfg(not(feature = "std"))]
type TracingMap<K, V> = heapless::FnvIndexMap<K, V, 32>;

impl Serialize for SerializeField {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
serializer.serialize_str(self.0.name())
}
}
#[cfg(feature = "std")]
type TracingVec<T> = std::vec::Vec<T>;

#[derive(Debug)]
pub struct SerializeFieldSet<'a>(&'a FieldSet);
#[cfg(feature = "std")]
type TracingMap<K, V> = std::collections::HashMap<K, V>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is kind of unfortunate, but changing a publicly-visible type based on a feature flag feels kind of like a violation of the "feature flags should be additive" guideline.

in particular, the From<TracingVec<...>> and similar impls are concerning, because if you have some crate in a dependency tree that depends on tracing-serde with default-features = false, you can call the From impl with a heapless::Vec...but if some other crate in the dependency tree enables the std feature flag, that From impl now expects a different type and everything breaks.

I think we may have to figure out some solution where TracingVec and TracingMap are opaque newtypes which expose the same APIs on all feature combinations, and then have conversion methods from foreign types which can be feature-flagged. that way, enabling the std feature doesn't break any previously compiling code.

i think this means that the heapless dependency may have to be non-optional, which...isn't ideal, but i can live with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not against this idea, and I see what you're saying.

I think "make it a newtype, and impl From hvec always, and svec if std is active" isn't a bad plan. I can give it a try tomorrow.

Again, since we always have a known size of 32, I could impl the vec/map types locally, but I wouldn't be super happy about it (heapless is much more battle tested, I'd probably just end up copy/pasting the impl from there).

args: --all --exclude=tracing-appender --exclude=tracing-serde

# TODO: remove this once tracing's MSRV is bumped.
check-msrv-serde:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i would really prefer not to do a global MSRV bump, since tracing/tracing-core are dependencies of tokio, and i don't want to have to bump tokio's MSRV. having separate MSRVs for non-core crates is annoying, but it's probably preferable --- esp. since we would want to backport the tracing-serde changes to v0.1.x.

this is part of why i think it's probably a good idea to eventually move all of the non-core crates (such as tracing-serde) into their own repos...

@hawkw
Copy link
Member

hawkw commented May 10, 2022

  • Bring it back, maintaining a "second way to serialize these fields"

  • Update the other libraries not to use the ::fields/AsMap items, and instead just use the Serialize* methods, which will already produce a map of that data (though also with other data).

I think if we can easily upgrade the other libraries to use the new code without breaking a bunch of stuff, that would be ideal? But, if the new serializers for those types include additional data, I think it might still be worth having an API for serializing stuff as just a map, without the intention of round-tripping it?

@jamesmunns
Copy link
Contributor Author

jamesmunns commented May 10, 2022

I think it might still be worth having an API for serializing stuff as just a map, without the intention of round-tripping it?

Yeah, that's really the two sides of it:

  • Do you have two versions that do almost the same thing, leading to oddly duplicated code? (or at least duplicated entry points?), OR
  • Do you make users get "more than they need", and pass the "filter the data (e.g. a subfield) to what you actually need" cost down to your callers?

It seems like you're leaning more towards the first option, which is fine for me. Both options require work, so I wanted to check with you before I did either :)

@jamesmunns
Copy link
Contributor Author

As discussed with @hawkw, it is unlikely this can be done in a way that is drop-in compatible with the current tracing-serde + serde-json formatting.

I've opened #2152, which adds JUST the accessor functions introduced in this PR, and set up a new crate, tracing-serde-structured, which will serve as a stop gap until breaking changes in tracing-serde's format is acceptable.

The formatting differences are discussed in the README. These may be possible to overcome, but I don't see immediately how. The main issue is that currently, tracing-serde flattens the contents of fields into the serialized structure, which does not play nicely with binary formats like postcard, which don't like structs/maps with unknown numbers of fields.

I'm happy to iterate on my approach in the tracing-serde-structured crate, which has less of a "breaking change burden" than the upstream tracing-serde crate. Feel free to point folks my way, and let me know if there is renewed interest in upstreaming this work.

@jamesmunns jamesmunns closed this Jun 7, 2022
hawkw pushed a commit that referenced this pull request Jun 7, 2022
## Motivation

This PR adds two new accessor functions that are useful for creating a
structured serde implementation for tracing.

This is a sort of "distilled" version of
#2113, based on the `v0.1.x`
branch.

As it is unlikely that "structured serde" will be 1:1 compatible with
the existing JSON-based `tracing-serde` (or at least - I'm not sure how
to do it in a reasonable amount of effort), these functions will allow
me to make a separate crate to hold me over until breaking formatting
changes are possible in `tracing-serde`.

CC @hawkw, as we've discussed this pretty extensively
hawkw pushed a commit that referenced this pull request Jun 22, 2023
## Motivation

This PR adds two new accessor functions that are useful for creating a
structured serde implementation for tracing.

This is a sort of "distilled" version of
#2113, based on the `v0.1.x`
branch.

As it is unlikely that "structured serde" will be 1:1 compatible with
the existing JSON-based `tracing-serde` (or at least - I'm not sure how
to do it in a reasonable amount of effort), these functions will allow
me to make a separate crate to hold me over until breaking formatting
changes are possible in `tracing-serde`.

CC @hawkw, as we've discussed this pretty extensively
davidbarsky pushed a commit that referenced this pull request Sep 26, 2023
This PR adds two new accessor functions that are useful for creating a
structured serde implementation for tracing.

This is a sort of "distilled" version of
#2113, based on the `v0.1.x`
branch.

As it is unlikely that "structured serde" will be 1:1 compatible with
the existing JSON-based `tracing-serde` (or at least - I'm not sure how
to do it in a reasonable amount of effort), these functions will allow
me to make a separate crate to hold me over until breaking formatting
changes are possible in `tracing-serde`.

CC @hawkw, as we've discussed this pretty extensively
davidbarsky pushed a commit that referenced this pull request Sep 27, 2023
This PR adds two new accessor functions that are useful for creating a
structured serde implementation for tracing.

This is a sort of "distilled" version of
#2113, based on the `v0.1.x`
branch.

As it is unlikely that "structured serde" will be 1:1 compatible with
the existing JSON-based `tracing-serde` (or at least - I'm not sure how
to do it in a reasonable amount of effort), these functions will allow
me to make a separate crate to hold me over until breaking formatting
changes are possible in `tracing-serde`.

CC @hawkw, as we've discussed this pretty extensively
davidbarsky pushed a commit that referenced this pull request Sep 27, 2023
This PR adds two new accessor functions that are useful for creating a
structured serde implementation for tracing.

This is a sort of "distilled" version of
#2113, based on the `v0.1.x`
branch.

As it is unlikely that "structured serde" will be 1:1 compatible with
the existing JSON-based `tracing-serde` (or at least - I'm not sure how
to do it in a reasonable amount of effort), these functions will allow
me to make a separate crate to hold me over until breaking formatting
changes are possible in `tracing-serde`.

CC @hawkw, as we've discussed this pretty extensively
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants