This repository has been archived by the owner on Mar 15, 2024. It is now read-only.

Record metrics to prometheus #25

Merged
merged 7 commits into from
Mar 13, 2023

Conversation

hannahhoward
Collaborator

@hannahhoward hannahhoward commented Mar 8, 2023

Goals

Put some metrics into the event recorder so we can verify that we have a Prometheus flow working end to end.

Implementation

For now, I took the metrics recorder from Lassie directly and did a best-effort translation to record some metrics. Some of these have a few deficiencies, but it's all going to change once we ship filecoin-project/lassie#138.

For discussion

Remove the summary endpoint, because it never worked and was a terrible idea.

@hannahhoward hannahhoward force-pushed the feat/record-metrics branch from dfe659a to 570ded3 Compare March 8, 2023 23:53
Member

@masih masih left a comment

One blocker about the metrics endpoint being public. Otherwise this looks good. I left some questions and suggestions.

Thank you for replacing the summary endpoint 👍
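
One common way to address a blocker like this is to register the metrics handler on a separate mux that is served on an internal-only address, so the public listener never routes `/metrics`. The sketch below is illustrative only and uses the Go standard library; the route `/v1/retrieval-events`, the placeholder metrics handler, and all function names are assumptions, not code from this PR.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newPublicMux builds the routes exposed to external clients.
// The event route used here is a hypothetical placeholder.
func newPublicMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/retrieval-events", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// newMetricsMux builds a second mux that carries only the metrics route.
// It is meant to be served on a private address.
func newMetricsMux(metrics http.Handler) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/metrics", metrics)
	return mux
}

// placeholderMetrics stands in for a real metrics handler.
func placeholderMetrics() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics would be written here")
	})
}

// statusFor issues an in-memory GET against h and returns the status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	public := newPublicMux()
	private := newMetricsMux(placeholderMetrics())

	fmt.Println("public /metrics status:", statusFor(public, "/metrics"))
	fmt.Println("private /metrics status:", statusFor(private, "/metrics"))

	// In a real server the two muxes would be bound to different addresses,
	// for example a public port for `public` and a loopback or
	// cluster-internal address for `private`.
}
```

With this split, exposing the metrics route publicly requires an explicit change to the listener configuration rather than happening by default.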

eventrecorder/recorder.go
handleQueryAskEvent()
case types.QueryAskedFilteredCode:
handleQueryAskFilteredEvent()
}
Member

@masih masih Mar 9, 2023

Log a warning in the default case. This would help detect missing cases when the Lassie types are updated but this code is not.
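
A minimal sketch of that suggestion, using the standard `log` package. The `eventCode` type and its string values are local stand-ins for the Lassie `types` codes and are assumptions; the handler calls from the PR are shown only as comments.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// eventCode is a hypothetical stand-in for the event code type in Lassie.
type eventCode string

const (
	queryAskedCode         eventCode = "query-asked"          // assumed value
	queryAskedFilteredCode eventCode = "query-asked-filtered" // assumed value
)

// recordEvent dispatches on the event code and reports whether it was handled.
// Unknown codes are logged so that a newly added upstream code is noticed.
func recordEvent(logger *log.Logger, code eventCode) bool {
	switch code {
	case queryAskedCode:
		// handleQueryAskEvent() would be called here.
		return true
	case queryAskedFilteredCode:
		// handleQueryAskFilteredEvent() would be called here.
		return true
	default:
		logger.Printf("WARN: unrecognized event code %q, no metric recorded", code)
		return false
	}
}

// warningsFor runs recordEvent over the codes and returns everything logged.
func warningsFor(codes ...eventCode) string {
	var buf bytes.Buffer
	logger := log.New(&buf, "", 0)
	for _, c := range codes {
		recordEvent(logger, c)
	}
	return buf.String()
}

func main() {
	fmt.Print(warningsFor(queryAskedCode, eventCode("some-new-code")))
}
```

Known codes produce no log output, so any warning in the logs points directly at a code this switch does not yet cover.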

eventrecorder/recorder.go
redesign metrics to target opentelemetry, handle events in intelligent ways, and produce usable
lassie metrics
metrics/tempdata/tempdata.go
metrics/metrics.go
metrics/events.go
hannahhoward and others added 2 commits March 11, 2023 11:49
Co-authored-by: Masih H. Derkani <m@derkani.org>
Co-authored-by: Masih H. Derkani <m@derkani.org>
atomic.StoreUint32(&t.failedCount, 0)
}

var tempDataPool = sync.Pool{
Member

I gather that we're just relying on the sheer volume of traffic to wipe out the errors introduced by GC wiping away our data occasionally, giving us not-quite-accurate finality reports? With a very busy service, I wonder whether GC is going to get really disruptive here and make this very inaccurate?
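
The general behavior behind this concern can be sketched as follows. The `sync.Pool` documentation states that any item stored in a pool may be removed automatically at any time without notification, so state left in a pooled object is not guaranteed to come back from `Get`. Whether the PR actually keeps live counters in the pool is not visible in the snippet above; the `tempData` struct and `countAfterGC` function here are simplified assumptions for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// tempData is a simplified, hypothetical version of the pooled struct.
type tempData struct {
	failedCount uint32
}

var pool = sync.Pool{
	New: func() any { return new(tempData) },
}

// countAfterGC stores a non-zero counter in a pooled object, forces the given
// number of GC cycles, and returns the counter of whatever Get hands back.
// The result is 3 if the pooled object survived, or 0 if the pool dropped it
// and New supplied a fresh one.
func countAfterGC(cycles int) uint32 {
	t := pool.Get().(*tempData)
	atomic.StoreUint32(&t.failedCount, 3)
	pool.Put(t)

	for i := 0; i < cycles; i++ {
		runtime.GC()
	}

	got := pool.Get().(*tempData)
	return atomic.LoadUint32(&got.failedCount)
}

func main() {
	// The values printed depend on the runtime, so none are asserted here.
	fmt.Println("no forced GC:", countAfterGC(0))
	fmt.Println("two forced GCs:", countAfterGC(2))
}
```

If the counters must survive collection, the usual approach is to hold the live values somewhere with an explicit lifetime, such as a mutex-guarded map, and use the pool only to recycle objects that have already been reset.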

@hannahhoward hannahhoward merged commit 00b9507 into main Mar 13, 2023
@hannahhoward hannahhoward deleted the feat/record-metrics branch March 13, 2023 19:11

3 participants