
Add detail about reaggregation for safe attribute removal #1649

Closed
wants to merge 1 commit into from

Conversation

jmacd
Contributor

@jmacd jmacd commented Apr 27, 2021

Fixes #1297

Changes

Adds information about how to re-aggregate metric streams for safe semantic attribute removal.
Depends on: #1618 #1646 #1648.

@jmacd jmacd requested review from a team April 27, 2021 00:11
Contributor

@jsuereth jsuereth left a comment


I'd like to see a better example/image, but it's not a hard ask. Otherwise, a few questions about whether we need to call out additional identity concerns more clearly.

I think you nailed the important points already, so consider this a non-blocking request for changes.

Reaggregation should be applied after resolving overlaps, otherwise
overlapping points will be reaggregated together incorrectly.

After resolving overlaps, the first step in reaggregation is to
Contributor

Nit: Can we have an example of what this looks like / means? I can do a drawing if you'd like :)
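One possible illustration of the quoted point, added here rather than taken from the PR text, with invented attribute names and point values: if overlapping points are fed into reaggregation before the overlap is resolved, they fold into the same output identity and get double-counted.

```python
# A minimal sketch, assuming cumulative Sum points modeled as
# (start_unix_nano, time_unix_nano, attributes, value) tuples.
from collections import defaultdict

points = [
    (0, 60, (("host", "a"), ("pid", "1001")), 10.0),
    (0, 60, (("host", "a"), ("pid", "1001")), 10.0),  # overlapping duplicate, not yet resolved
    (0, 60, (("host", "a"), ("pid", "1002")), 5.0),
]

def reaggregate_dropping(points, dropped_key):
    """Drop one attribute key and sum the points that now share an identity."""
    out = defaultdict(float)
    for start, end, attrs, value in points:
        kept = tuple((k, v) for k, v in attrs if k != dropped_key)
        out[(start, end, kept)] += value
    return dict(out)

print(reaggregate_dropping(points, "pid"))
# {(0, 60, (('host', 'a'),)): 25.0} -- the unresolved overlap inflates what should be 15.0
```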

Points written to the output stream use `StartTimeUnixNano` to convey
an unbroken sequence of observations of the aggregated streams. When
the reaggregating processor forgets a stream because of the staleness
of its inpouts, it should be sure to use a new `StartTimeUnixNano`
Contributor

nit: s/inpouts/inputs/

However, it makes me think inpout is a way better term than inout. CORBA was wrong.

of its inpouts, it should be sure to use a new `StartTimeUnixNano`
if the stream is ever resumed.
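A hypothetical sketch of the start-time bookkeeping this paragraph describes; the names `STALENESS_NANOS` and `OutputState` are invented for illustration and are not defined by the spec.

```python
STALENESS_NANOS = 5 * 60 * 1_000_000_000  # assumed 5-minute staleness threshold

class OutputState:
    """Tracks per-output-stream start times across staleness gaps."""

    def __init__(self):
        self.streams = {}  # stream identity -> (start_time_unix_nano, last_seen_unix_nano)

    def start_time_for(self, identity, now_unix_nano):
        state = self.streams.get(identity)
        if state is not None and now_unix_nano - state[1] <= STALENESS_NANOS:
            start = state[0]       # unbroken sequence: keep the original StartTimeUnixNano
        else:
            start = now_unix_nano  # new stream, or resumed after being forgotten: new start
        self.streams[identity] = (start, now_unix_nano)
        return start
```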

Reaggregating processors are responsible for ensuring that their
Contributor

I assume this just means time overlap.

Do we want to call out "stream overlap" i.e. understanding whether or not the new metric stream being output from an aggregation processor overrides/conflicts with another stream?

I think implicitly we've already defined that as an "error scenario".


When a new output stream is produced by the reaggregation process, the
reaggregation processor is responsible for remembering the state of
its input streams (for correct ovelrap handling) and its start time.
Contributor

s/ovelrap/overlap
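For illustration only (the structure below is invented, not defined by this PR), the remembered input-stream state mentioned in the quoted paragraph could be as simple as the time windows already consumed from each input, so that an overlapping point can be flagged before it is folded into the output stream.

```python
class InputTracker:
    """Remembers the windows seen per input stream to detect overlap."""

    def __init__(self):
        self.windows = {}  # input stream identity -> list of (start, end) windows

    def observe(self, input_identity, start_unix_nano, end_unix_nano):
        """Return True if this point overlaps a window already seen for the input."""
        seen = self.windows.setdefault(input_identity, [])
        overlapping = any(start_unix_nano < e and s < end_unix_nano for (s, e) in seen)
        seen.append((start_unix_nano, end_unix_nano))
        return overlapping
```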

overlapping points will be reaggregated together incorrectly.

After resolving overlaps, the first step in reaggregation is to
perform temporal alignment on the desired streams to the desired
Contributor

@victlu victlu Apr 28, 2021


When we are doing temporal alignment, I assume we need to account for attribute combinations to know how to proportionally allocate values to the correct temporal slot? If so, it seems we need to do safe attribute removal in that stage first.

Thus, should we do attribute removal first and then temporal alignment? Or, worse, do we need to consider both temporal alignment and attributes at the same time?
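One way to picture the ordering question (a rough sketch with invented names and an assumed 60-second window, not the PR's definition): temporal alignment can operate on each input point with its attributes left intact, splitting delta values proportionally across fixed output windows, and attribute removal then runs on the aligned points.

```python
WINDOW_NANOS = 60 * 1_000_000_000  # assumed fixed-width output windows

def align(points, window=WINDOW_NANOS):
    """Split delta points (start, end, attrs, value) across window boundaries.

    Values are allocated proportionally to the covered duration; attributes
    are untouched, so attribute removal can run on the aligned output.
    Assumes end > start for every point.
    """
    aligned = []
    for start, end, attrs, value in points:
        duration = end - start
        t = start
        while t < end:
            boundary = min(end, (t // window + 1) * window)
            aligned.append((t, boundary, attrs, value * (boundary - t) / duration))
            t = boundary
    return aligned
```

Proportional allocation is only one possible policy here; which policy to use, and whether it interacts with attribute removal, is exactly what this comment is asking.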

@jsuereth jsuereth added the area:data-model (For issues related to data model) and spec:metrics (Related to the specification/metrics directory) labels May 4, 2021
Comment on lines +387 to +388
output point. Next, apply the natural aggregation function
corresponding to the stream point kind, yielding output point values.
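As a non-normative sketch of what a "natural aggregation function corresponding to the stream point kind" could look like (function names invented here): Sums add, Gauges keep the most recent observation, and Histogram bucket counts merge element-wise when their boundaries match.

```python
def aggregate_sums(values):
    return sum(values)

def aggregate_gauges(timestamped_values):
    # (time_unix_nano, value) pairs: keep the most recent observation
    return max(timestamped_values)[1]

def aggregate_histogram_bucket_counts(bucket_count_lists):
    # element-wise merge, assuming identical bucket boundaries
    return [sum(counts) for counts in zip(*bucket_count_lists)]
```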
Member

Not a blocker for this PR, but I want to get more perspectives regarding "the natural aggregation function corresponding to the stream point kind" - do we think there will always be a natural aggregation across all dimensions, or does it vary depending on which dimensions and combinations we are dealing with?

For example, it makes sense to say the total voltage of the battery on the Toyota 4Runner is 12 Volts (6 cells, each 2.0 V), but it doesn't make sense to say the total voltage across a 4Runner and an F-150 is 23.4 Volts:

| voltage (Volts) | battery cell | make | model | year |
|-----------------|--------------|--------|---------|------|
| 2.0 | 1 | Toyota | 4Runner | 2021 |
| 2.0 | 2 | Toyota | 4Runner | 2021 |
| 2.0 | 3 | Toyota | 4Runner | 2021 |
| 2.0 | 4 | Toyota | 4Runner | 2021 |
| 2.0 | 5 | Toyota | 4Runner | 2021 |
| 2.0 | 6 | Toyota | 4Runner | 2021 |
| 1.8 | 1 | Ford | F-150 | 2017 |
| 2.0 | 2 | Ford | F-150 | 2017 |
| 2.0 | 3 | Ford | F-150 | 2017 |
| 1.6 | 4 | Ford | F-150 | 2017 |
| 2.0 | 5 | Ford | F-150 | 2017 |
| 2.0 | 6 | Ford | F-150 | 2017 |
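Worked out in code (illustrative only; the data is copied from the table above): summing per (make, model) yields a meaningful per-vehicle total, while also dropping make and model yields the arithmetically correct but semantically dubious 23.4 V.

```python
from collections import defaultdict

rows = [  # (voltage, make, model)
    (2.0, "Toyota", "4Runner"), (2.0, "Toyota", "4Runner"), (2.0, "Toyota", "4Runner"),
    (2.0, "Toyota", "4Runner"), (2.0, "Toyota", "4Runner"), (2.0, "Toyota", "4Runner"),
    (1.8, "Ford", "F-150"), (2.0, "Ford", "F-150"), (2.0, "Ford", "F-150"),
    (1.6, "Ford", "F-150"), (2.0, "Ford", "F-150"), (2.0, "Ford", "F-150"),
]

per_vehicle = defaultdict(float)
for volts, make, model in rows:
    per_vehicle[(make, model)] += volts

print({k: round(v, 1) for k, v in per_vehicle.items()})
# {('Toyota', '4Runner'): 12.0, ('Ford', 'F-150'): 11.4}
print(round(sum(per_vehicle.values()), 1))  # 23.4 -- a valid sum, but not a useful one
```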

Contributor

While I think we have a goal of having a natural aggregation function across dimensions (and that's fundamentally why we have Gauge, Sum, Histogram and Aggregation Temporality, etc.), I think you call out a great point that effectively there are things you can do with data that don't make any sense.

I'm viewing our data types the same way I view programming data types: they are useful, up to a point. If we figure out an elegant way to express all possible operations, great. If we only define some useful limited operations and semantics for 80% of cases, I think we're ok.

My answer to your question is: "Are the natural aggregations we've defined useful in MOST (~80%) of scenarios, or do we find we have a lot of caveats?". I'm still leaning towards "these are useful most of the time".

Member

@reyang reyang May 7, 2021


I don't have enough understanding of the usage space.
But so far my gut feeling is "80% time it will be useful, and 20% time it will be confusing" and that's why I asked the question and also clarified that it is not a blocker. (and interestingly, @jsuereth we have the same number 80% 😆)

Member

Discussed during the May 11th, 2021 Metrics Data Model SIG meeting.

The default sum behavior makes logical sense in most cases, and it is good to have a default behavior plus the ability to adjust it if the default is not what people want.

@jmacd also mentioned that dimensions/attributes coming from the resource would normally have different semantics than the dimensions/attributes coming from the API.

@github-actions

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 19, 2021
@jmacd
Contributor Author

jmacd commented May 25, 2021

I will return to this after #1648 and a new PR about temporal alignment merge first.

Labels
area:data-model (For issues related to data model), spec:metrics (Related to the specification/metrics directory), Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metrics: Requirements for safe attribute removal
5 participants