diff --git a/docs/sources/best-practices/_index.md b/docs/sources/best-practices/_index.md
index 9d0d4636f0fcd..65e2af0539df0 100644
--- a/docs/sources/best-practices/_index.md
+++ b/docs/sources/best-practices/_index.md
@@ -6,11 +6,11 @@ weight: 400
 
 Loki is under active development, and we are constantly working to improve performance. But here are some of the most current best practices for labels that will give you the best experience with Loki.
 
-## 1. Static labels are good
+## Static labels are good
 
 Things like, host, application, and environment are great labels. They will be fixed for a given system/app and have bounded values. Use static labels to make it easier to query your logs in a logical sense (e.g. show me all the logs for a given application and specific environment, or show me all the logs for all the apps on a specific host).
 
-## 2. Use dynamic labels sparingly
+## Use dynamic labels sparingly
 
 Too many label value combinations leads to too many streams. The penalties for that in Loki are a large index and small chunks in the store, which in turn can actually reduce performance.
 
@@ -26,13 +26,13 @@ What you want to avoid is splitting a log file into streams, which result in chu
 
 It’s not critical that every chunk be full when flushed, but it will improve many aspects of operation. As such, our current guidance here is to avoid dynamic labels as much as possible and instead favor filter expressions. For example, don’t add a `level` dynamic label, just `|= “level=debug”` instead.
 
-## 3. Label values must always be bounded
+## Label values must always be bounded
 
 If you are dynamically setting labels, never use a label which can have unbounded or infinite values. This will always result in big problems for Loki.
 
 Try to keep values bounded to as small a set as possible. We don't have perfect guidance as to what Loki can handle, but think single digits, or maybe 10’s of values for a dynamic label. This is less critical for static labels. For example, if you have 1,000 hosts in your environment it's going to be just fine to have a host label with 1,000 values.
 
-## 4. Be aware of dynamic labels applied by clients
+## Be aware of dynamic labels applied by clients
 
 Loki has several client options: [Promtail](https://github.com/grafana/loki/tree/master/docs/sources/clients/promtail) (which also supports systemd journal ingestion and TCP-based syslog ingestion), [Fluentd](https://github.com/grafana/loki/tree/master/fluentd/fluent-plugin-grafana-loki), [Fluent Bit](https://github.com/grafana/loki/tree/master/cmd/fluent-bit), a [Docker plugin](https://grafana.com/blog/2019/07/15/lokis-path-to-ga-docker-logging-driver-plugin-support-for-systemd/), and more!
 
@@ -63,11 +63,11 @@ This is a perfect example of something which should not be a label, `requestId`
 filter expressions should be used to query logs for a specific `requestId`. For example if `requestId` is found in
 the log line as a key=value pair you could write a query like this: `{logGroup="group1"} |= "requestId=32422355"`
 
-## 5. Configure caching
+## Configure caching
 
 Loki can cache data at many levels, which can drastically improve performance. Details of this will be in a future post.
 
-## 6. Logs must be in increasing time order per stream
+## Logs must be in increasing time order per stream
 
 One issue many people have with Loki is their client receiving errors for out of order log entries. This happens because of this hard and fast rule within Loki:
 
@@ -104,7 +104,7 @@ But I want Loki to fix this! Why can’t you buffer streams and re-order them fo
 
 It's also worth noting that the batching nature of the Loki push API can lead to some instances of out of order errors being received which are really false positives. (Perhaps a batch partially succeeded and was present; or anything that previously succeeded would return an out of order entry; or anything new would be accepted.)
 
-## 7. Use `chunk_target_size`
+## Use `chunk_target_size`
 
 This was added earlier in the [Loki v1.3.0](https://grafana.com/blog/2020/01/22/loki-1.3.0-released/) release, and we've been experimenting with it for several months. We have `chunk_target_size: 1536000` in all our environments now. This instructs Loki to try to fill all chunks to a target _compressed_ size of 1.5MB. These larger chunks are more efficient for Loki to process.
 
@@ -116,7 +116,7 @@ Lots of small, unfilled chunks are currently kryptonite for Loki. We are always
 
 If you have an application that can log fast enough to fill these chunks quickly (much less than `max_chunk_age`), then it becomes more reasonable to use dynamic labels to break that up into separate streams.
 
-## 8. Use `-print-config-stderr` or `-log-config-reverse-order`
+## Use `-print-config-stderr` or `-log-config-reverse-order`
 
 Starting in version 1.6.0 Loki and Promtail have flags which will dump the entire config object to stderr, or the log file, when they start.
 