Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add metrics for tracking index shipper operations #6278

Merged
merged 1 commit into from
Jun 1, 2022

Conversation

sandeepsukhani
Copy link
Contributor

What this PR does / why we need it:
In PR #6226, we refactored the code to make the boltdb-shipper index client use the generic indexshipper to manage the index on object storage. This PR takes care of follow up action to add metrics to indexshipper.

I have made sure we have the same metrics exposed as before by using WrapRegistererWithPrefix, i.e. when using indexshipper for boltdb-shipper, metrics would be prefixed with loki_boltdb_shipper_ while for tsdb, the prefix would be loki_tsdb_shipper_

@sandeepsukhani sandeepsukhani requested a review from a team as a code owner June 1, 2022 10:28
@@ -0,0 +1,36 @@
package downloads
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,25 @@
package uploads
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old metrics.go for reference https://github.com/grafana/loki/blob/440fb7d74cad481bebb222b21fc063fa56c2f5ac/pkg/storage/stores/shipper/uploads/metrics.go
openExistingFileFailuresTotal is only relevant for boltdb-shipper so it is already added to it here

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Contributor

@kavirajk kavirajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Approving it to unblock.

Started one small discussion about partial table upload failures. Can you clarify please?

for _, table := range tm.tables {
err := table.Upload(ctx)
if err != nil {
status = statusFailure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: can this be confusing when only few tables failed to upload. But in the metrics, we record as though everything is failed?
tm.metrics.tablesUploadOperationTotal.WithLabelValues(status).Inc()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We track the overall upload operation. We can count individual table upload operations as well but the investigator would anyways have to check the logs to find more details about it.

Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants