-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add metrics for tracking index shipper operations #6278
add metrics for tracking index shipper operations #6278
Conversation
@@ -0,0 +1,36 @@ | |||
package downloads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,25 @@ | |||
package uploads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
old metrics.go for reference https://github.com/grafana/loki/blob/440fb7d74cad481bebb222b21fc063fa56c2f5ac/pkg/storage/stores/shipper/uploads/metrics.go
openExistingFileFailuresTotal
is only relevant for boltdb-shipper
so it is already added to it here
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. Approving it to unblock.
Started one small discussion about partial table upload failures. Can you clarify please?
for _, table := range tm.tables { | ||
err := table.Upload(ctx) | ||
if err != nil { | ||
status = statusFailure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: can this be confusing when only few tables failed to upload. But in the metrics, we record as though everything is failed?
tm.metrics.tablesUploadOperationTotal.WithLabelValues(status).Inc()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We track the overall upload operation. We can count individual table upload operations as well but the investigator would anyways have to check the logs to find more details about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
What this PR does / why we need it:
In PR #6226, we refactored the code to make the
boltdb-shipper
index client use the genericindexshipper
to manage the index on object storage. This PR takes care of follow up action to add metrics toindexshipper
.I have made sure we have the same metrics exposed as before by using
WrapRegistererWithPrefix
, i.e. when usingindexshipper
forboltdb-shipper
, metrics would be prefixed withloki_boltdb_shipper_
while fortsdb
, the prefix would beloki_tsdb_shipper_