Add cache usage statistics #6317

Merged: dannykopping merged 11 commits into grafana:main from dannykopping/cache-stats on Jun 7, 2022

Contributor

@dannykopping dannykopping commented Jun 6, 2022

What this PR does / why we need it:
This PR adds cache statistics for the chunk, index, and result caches. The statistics are collected using the same mechanism currently used for ingester/querier stats. The PR also adds the pertinent info to metrics.go, which shows cache usage on a per-query, per-tenant basis.

This PR should be read with #6289.

Running logcli query <other args> --stats will now return extra stats:

Cache.Chunk.Requests 		 24
Cache.Chunk.EntriesRequested 	 36
Cache.Chunk.EntriesFound 	 36
Cache.Chunk.EntriesStored 	 0
Cache.Chunk.BytesSent 		 0 B
Cache.Chunk.BytesReceived 	 50 MB
Cache.Index.Requests 		 22
Cache.Index.EntriesRequested 	 22
Cache.Index.EntriesFound 	 22
Cache.Index.EntriesStored 	 0
Cache.Index.BytesSent 		 0 B
Cache.Index.BytesReceived 	 20 kB
Cache.Result.Requests 		 6
Cache.Result.EntriesRequested 	 6
Cache.Result.EntriesFound 	 6
Cache.Result.EntriesStored 	 0
Cache.Result.BytesSent 		 0 B
Cache.Result.BytesReceived 	 23 kB
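
As an illustration of how these counters can be read, here is a minimal Go sketch that derives a hit ratio from EntriesFound and EntriesRequested. The `CacheStats` struct and `HitRatio` method are hypothetical names chosen to mirror the printed fields; they are not taken from this PR.

```go
package main

import "fmt"

// CacheStats is a hypothetical container mirroring the counters printed by
// `logcli query --stats`. It is an assumption for illustration, not Loki's type.
type CacheStats struct {
	Requests         int64
	EntriesRequested int64
	EntriesFound     int64
	EntriesStored    int64
	BytesSent        int64
	BytesReceived    int64
}

// HitRatio returns EntriesFound / EntriesRequested, or 0 when nothing was requested.
func (s CacheStats) HitRatio() float64 {
	if s.EntriesRequested == 0 {
		return 0
	}
	return float64(s.EntriesFound) / float64(s.EntriesRequested)
}

func main() {
	// Values copied from the chunk cache lines in the sample output above.
	chunk := CacheStats{Requests: 24, EntriesRequested: 36, EntriesFound: 36}
	fmt.Printf("chunk cache hit ratio: %.2f\n", chunk.HitRatio())
}
```

In the sample output, every cache reports EntriesFound equal to EntriesRequested and EntriesStored of 0, which is consistent with a query served entirely from cache.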

Which issue(s) this PR fixes:
#6318

Special notes for your reviewer:
Statistics were not being collected correctly in queryrange.StatsCollectorMiddleware. I believe they were being discarded by the pkg/querier/queryrange/queryrangebase/results_cache.go middleware, but creating a new stats context in StatsCollectorMiddleware seems to have fixed the issue. I find this code hard to reason about, so I would appreciate @cyriltovena's review, since I believe he was the original author. See baa239c.

Checklist

  • Documentation added
  • Tests updated
  • Is this an important fix or new feature? Add an entry in the CHANGELOG.md.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

Danny Kopping added 6 commits June 6, 2022 11:16
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@dannykopping dannykopping changed the title Add cache statistics Add cache usage statistics Jun 6, 2022
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
ChunkCache CacheType = "chunk"
IndexCache = "index"
ResultCache = "result"
WriteDedupeCache = "write-dedupe"
Contributor Author

@dannykopping dannykopping Jun 6, 2022

I've added the write dedupe cache here for the sake of completeness, but it's not yet displayed along with the others.
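
A small sketch of what "defined but not yet displayed" could look like in code. The constant names and values come from the diff above; giving every constant the explicit `CacheType` type and the `displayed` helper are assumptions made for this example, not part of the PR.

```go
package main

import "fmt"

// CacheType names a cache whose usage can be tracked.
type CacheType string

// Each constant is given the explicit CacheType type here (an assumption of
// this sketch) so that all four can be passed where a CacheType is expected.
const (
	ChunkCache       CacheType = "chunk"
	IndexCache       CacheType = "index"
	ResultCache      CacheType = "result"
	WriteDedupeCache CacheType = "write-dedupe"
)

// displayed is a hypothetical helper reporting whether a cache type is
// included in the printed statistics summary.
func displayed(t CacheType) bool {
	switch t {
	case ChunkCache, IndexCache, ResultCache:
		return true
	default:
		return false
	}
}

func main() {
	for _, t := range []CacheType{ChunkCache, IndexCache, ResultCache, WriteDedupeCache} {
		fmt.Printf("%-12s displayed=%v\n", t, displayed(t))
	}
}
```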

Documenting function

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0.1%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@dannykopping dannykopping marked this pull request as ready for review June 6, 2022 14:04
@dannykopping dannykopping requested a review from a team as a code owner June 6, 2022 14:04
Contributor

@DylanGuedes DylanGuedes left a comment

few nits but looks great

CHANGELOG.md (outdated, resolved)
pkg/logqlmodel/stats/context.go (resolved)
Contributor

@MasslessParticle MasslessParticle left a comment

This looks awesome!

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Contributor Author

@dannykopping dannykopping left a comment

Thanks for the review! Addressed your comments

CHANGELOG.md (outdated, resolved)
pkg/logqlmodel/stats/context.go (resolved)
@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@dannykopping dannykopping mentioned this pull request Jun 6, 2022
4 tasks
Contributor

@DylanGuedes DylanGuedes left a comment

lgtm!

Contributor

@cyriltovena cyriltovena left a comment

LGTM

Danny Kopping added 2 commits June 7, 2022 12:48
…tion

If we keep the stats collection in pkg/storage/chunk/cache/instrumented.go, any implementation that wraps it will cause the collected stats to be incomplete. For example, with NewBackground(cacheName, cfg.Background, Instrument(cacheName, cache, reg), reg), the background cache requests are not collected.
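
The ordering concern can be sketched with simplified, hypothetical types (`Cache`, `statsCache`, `deferredCache`, and `noopCache` are not Loki's). The sketch assumes an intermediate wrapper that queues writes instead of forwarding them during the request; under that assumption, a collector placed inside the wrapper misses calls that a collector placed outermost would count.

```go
package main

import "fmt"

// Cache is a simplified, hypothetical cache interface (not Loki's).
type Cache interface {
	Store(keys []string)
}

// noopCache is the innermost backend; it discards everything.
type noopCache struct{}

func (noopCache) Store([]string) {}

// statsCache counts the entries passed to Store before delegating.
type statsCache struct {
	next   Cache
	stored *int
}

func (s statsCache) Store(keys []string) {
	*s.stored += len(keys)
	s.next.Store(keys)
}

// deferredCache stands in for a wrapper that queues writes rather than
// forwarding them to the next cache while the request is being served.
type deferredCache struct {
	next  Cache
	queue [][]string
}

func (d *deferredCache) Store(keys []string) {
	d.queue = append(d.queue, keys)
}

func main() {
	var inner, outer int

	// Collector inside the deferring wrapper: it never sees the request's writes.
	a := &deferredCache{next: statsCache{next: noopCache{}, stored: &inner}}
	a.Store([]string{"k1", "k2"})

	// Collector as the outermost wrapper: every call is counted.
	b := statsCache{next: &deferredCache{next: noopCache{}}, stored: &outer}
	b.Store([]string{"k1", "k2"})

	fmt.Println(inner, outer) // prints "0 2"
}
```

Placing the collector outermost keeps the counts independent of how the layers beneath it forward, batch, or defer work.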

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
-        distributor	-0.3%
+            querier	0%
+ querier/queryrange	0.1%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0.5%
+               loki	0%

@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

-           ingester	-0.1%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0.1%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@dannykopping dannykopping merged commit 36e0979 into grafana:main Jun 7, 2022
@dannykopping dannykopping deleted the dannykopping/cache-stats branch June 7, 2022 14:01
dannykopping pushed a commit to dannykopping/loki that referenced this pull request Jun 7, 2022
* Adding cache statistics

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Adding metrics to metrics.go

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Creating new stats context for use in metric queries middleware

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Clean up unnecessary log fields

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Fixing tests

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Adding stats tests

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* CHANGELOG entry

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Appeasing the linter

Documenting function

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Moving CHANGELOG entry to appropriate section

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Implementing a stats collector cache wrapper to simplify stats collection

If we keep the stats collection in pkg/storage/chunk/cache/instrumented.go, any implementation that wraps it will cause the collected stats to be incomplete. For example, with NewBackground(cacheName, cfg.Background, Instrument(cacheName, cache, reg), reg), the background cache requests are not collected.

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Fixing tests

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
dannykopping pushed a commit that referenced this pull request Jun 7, 2022
dannykopping pushed a commit to dannykopping/loki that referenced this pull request Jun 28, 2022