Improve documentation based on what I learned when I did loki setup. #461

Merged
merged 5 commits into from
Apr 9, 2019
Changes from 2 commits
10 changes: 7 additions & 3 deletions docs/api.md
@@ -28,11 +28,15 @@ The Loki server has the following API endpoints (_Note:_ Authentication is out o

- `query`: a logQL query
- `limit`: max number of entries to return
- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970)
- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970)
- `direction`: `forward` or `backward`, useful when specifying a limit
- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always one hour ago.
- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time.
- `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward.
- `regexp`: a regex to filter the returned results, will eventually be rolled into the query language

Loki needs to query the index store to find the log streams for particular labels, and the store is spread out by time,
so you need to set the `start` and `end` parameters accordingly. Querying far back into the history puts additional
load on the index server and makes the query slower.
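
As a rough illustration of the parameters above, the following Python sketch builds a query string with explicit nanosecond timestamps. The function name `build_query_string`, the example query `{job="varlogs"}` and the base URL `http://localhost:3100/api/prom/query` are assumptions made for this example and are not taken from this document; adjust them to your setup.

```python
import time
from urllib.parse import urlencode

NANOS_PER_SECOND = 10**9


def build_query_string(query, limit, start_ns=None, end_ns=None,
                       direction="backward", now_ns=None):
    """Build a URL query string, spelling out the documented defaults."""
    if now_ns is None:
        now_ns = time.time_ns()
    if end_ns is None:
        end_ns = now_ns  # documented default: current time
    if start_ns is None:
        start_ns = now_ns - 3600 * NANOS_PER_SECOND  # documented default: one hour ago
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    if start_ns >= end_ns:
        raise ValueError("start must be earlier than end")
    return urlencode({
        "query": query,
        "limit": limit,
        "start": start_ns,
        "end": end_ns,
        "direction": direction,
    })


if __name__ == "__main__":
    # Hypothetical host, port and endpoint path; check them against your Loki instance.
    base_url = "http://localhost:3100/api/prom/query"
    print(base_url + "?" + build_query_string('{job="varlogs"}', limit=10))
```

Keeping the window between `start` and `end` short limits how much of the time-based index has to be consulted.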

Responses look like this:

```
18 changes: 17 additions & 1 deletion docs/operations.md
@@ -116,6 +116,18 @@ storage_config:
dynamodb: dynamodb://access_key:secret_access_key@region
```

You can also use an EC2 instance role instead of hard-coding credentials as in the example above.
In that case, the storage_config looks like this:

```yaml
storage_config:
  aws:
    s3: s3://region/bucket_name
    dynamodbconfig:
      dynamodb: dynamodb://region
```


#### S3

Loki uses S3 as object storage. It stores logs within directories based on
@@ -138,6 +150,10 @@ You can setup DynamoDB by yourself, or have `table-manager` setup for you.
You can find out more info about table manager at
[Cortex project](https://github.com/cortexproject/cortex).
There is an example table manager deployment inside the ksonnet deployment method. You can find it [here](../production/ksonnet/loki/table-manager.libsonnet).
The table-manager allows deleting old indices by rotating a number of different DynamoDB tables and deleting the oldest one. If you choose to
create the table manually, you cannot easily erase old data and your index grows indefinitely.

If you set your DynamoDB table manually, ensure you set the primary index key to `h`
(string) and use `r` (binary) as the sort key. Make sure to adjust your throughput based on your usage.
(string) and use `r` (binary) as the sort key. Also set the "period" attribute in the yaml to zero.
Make sure to adjust your throughput based on your usage.
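
To make the key layout concrete, here is a minimal Python sketch that assembles the parameters for a DynamoDB `CreateTable` request with `h` (string) as the hash key and `r` (binary) as the sort key. The helper name `index_table_definition`, the table name `loki_index` and the throughput numbers are hypothetical placeholders; the resulting dictionary could be passed to a client such as boto3's `create_table(**params)`, which is not imported or called here.

```python
def index_table_definition(table_name, read_capacity, write_capacity):
    """Return CreateTable parameters for a manually managed index table."""
    return {
        "TableName": table_name,
        # Attribute types: "S" is string, "B" is binary.
        "AttributeDefinitions": [
            {"AttributeName": "h", "AttributeType": "S"},
            {"AttributeName": "r", "AttributeType": "B"},
        ],
        # "HASH" is the primary (partition) key, "RANGE" is the sort key.
        "KeySchema": [
            {"AttributeName": "h", "KeyType": "HASH"},
            {"AttributeName": "r", "KeyType": "RANGE"},
        ],
        # Placeholder values; tune them to your own read and write volume.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_capacity,
            "WriteCapacityUnits": write_capacity,
        },
    }


if __name__ == "__main__":
    # Hypothetical table name and capacities, for illustration only.
    print(index_table_definition("loki_index", read_capacity=5, write_capacity=5))
```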

71 changes: 71 additions & 0 deletions docs/promtail.md
@@ -0,0 +1,71 @@
## Promtail and scrape_configs

Promtail is an agent which reads the Kubernetes pod log files and sends streams of log data to
the centralised Loki instances along with a set of labels. Each container in a single pod will usually yield a
single log stream with a set of labels based on that particular pod's Kubernetes labels.

Promtail finds the log locations and extracts the set of labels by using the *scrape_configs*
section in the Promtail yaml configuration. The syntax is the same as the one Prometheus uses.

The scrape_configs section contains one or more *entries*, all of which are executed for each container in each new pod running
in the instance. If more than one entry matches your logs, you will get duplicates, as the logs are sent in more than
one stream, likely with slightly different labels. Everything is based on different labels.
The term "label" is used here in more than one way, and the meanings are easily confused.

* Labels starting with `__` (two underscores) are internal labels. They are not stored in the Loki index and are
invisible after Promtail. They "magically" appear from different sources.
* Labels starting with `__meta_kubernetes_pod_label_*` are "meta labels" which are generated based on your Kubernetes
pod labels. Example: if your Kubernetes pod has a label "name" set to "foobar", then the scrape_configs section
will have a label `__meta_kubernetes_pod_label_name` with its value set to "foobar".
* There are other `__meta_kubernetes_*` labels based on the Kubernetes metadata, such as the namespace the pod is
running in (`__meta_kubernetes_namespace`) or the name of the container inside the pod (`__meta_kubernetes_pod_container_name`).
* The label `__path__` is a special label which Promtail reads to find out where the log files are located.

The most important part of each entry is the *relabel_configs*, a list of operations which create,
rename, modify or alter labels. A single scrape_config can also reject logs by doing an "action: drop", which means
that this particular scrape_config will not forward logs from a particular pod, but another scrape_config might.

Many of the scrape_configs read labels from `__meta_kubernetes_*` meta-labels, assign them to intermediate labels
such as `__service__` using a few different rules, possibly drop the processing if `__service__` is empty,
and finally set visible labels (such as "job") based on the `__service__` label.

In general, all of the default Promtail scrape_configs do the following:
* They read pod logs from under /var/log/pods/$1/*.log.
* They set the "namespace" label directly from `__meta_kubernetes_namespace`.
* They expect to see your pod name in the "name" label.
* They set a "job" label which is roughly "your namespace/your job name".

### Idioms and examples of different relabel_configs:

* Drop the processing if a label is empty:
```yaml
- action: drop
  regex: ^$
  source_labels:
  - __service__
```
* Drop the processing if any of these labels contains a value:
```yaml
- action: drop
  regex: .+
  separator: ''
  source_labels:
  - __meta_kubernetes_pod_label_name
  - __meta_kubernetes_pod_label_app
```
* Rename a metadata label into another so that it will be visible in the final log stream:
```yaml
- action: replace
  source_labels:
  - __meta_kubernetes_namespace
  target_label: namespace
```
* Convert all of the Kubernetes pod labels into visible labels:
```yaml
- action: labelmap
  regex: __meta_kubernetes_pod_label_(.+)
```
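
To see why these idioms behave as they do, the following Python sketch imitates a small part of the relabeling logic. It is only an approximation written for this explanation, not Promtail's or Prometheus's actual implementation: it assumes that source label values are joined with the separator (`;` by default), that the regex must match the whole joined value, and that `replace` uses the default replacement `$1`. The names `apply_relabel` and `visible` are made up for the example, and Python's `re` syntax differs in places from the regex syntax Prometheus uses.

```python
import re

DEFAULT_SEPARATOR = ";"


def apply_relabel(labels, rules):
    """Apply rules in order; return the new labels, or None if the target is dropped."""
    labels = dict(labels)  # work on a copy
    for rule in rules:
        action = rule.get("action", "replace")
        regex = re.compile(rule.get("regex", "(.*)"))
        if action == "labelmap":
            # labelmap matches label *names* and copies the value to the captured name.
            for name, value in list(labels.items()):
                match = regex.fullmatch(name)
                if match:
                    labels[match.group(1)] = value
            continue
        # drop and replace match the source label *values*, joined by the separator.
        separator = rule.get("separator", DEFAULT_SEPARATOR)
        joined = separator.join(labels.get(n, "") for n in rule.get("source_labels", []))
        match = regex.fullmatch(joined)
        if action == "drop" and match:
            return None
        if action == "replace" and match:
            labels[rule["target_label"]] = match.group(1)
    return labels


def visible(labels):
    """Labels starting with two underscores are internal and not kept."""
    return {k: v for k, v in labels.items() if not k.startswith("__")}


if __name__ == "__main__":
    rules = [
        {"action": "replace", "source_labels": ["__meta_kubernetes_namespace"],
         "target_label": "namespace"},
        {"action": "labelmap", "regex": "__meta_kubernetes_pod_label_(.+)"},
    ]
    pod = {"__meta_kubernetes_namespace": "default",
           "__meta_kubernetes_pod_label_name": "foobar"}
    print(visible(apply_relabel(pod, rules)))
```

Under these assumptions, the empty separator in the second idiom matters: with the default `;`, two empty labels would join to `;`, which `.+` matches, so every pod would be dropped.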


Additional reading:
* https://www.slideshare.net/roidelapluie/taking-advantage-of-prometheus-relabeling-109483749
4 changes: 3 additions & 1 deletion docs/troubleshooting.md
@@ -12,12 +12,14 @@ This can have several reasons:
- Restarting promtail will not necessarily resend log messages that have been read. To force sending all messages again, delete the positions file (default location `/tmp/positions.yaml`) or make sure new log messages are written after both promtail and Loki have started.
- Promtail is ignoring targets because of a configuration rule
- Detect this by turning on debug logging and then looking for `dropping target, no labels` or `ignoring target` messages.
- Promtail cannot find the location of your log files. Check that the scrape_configs contains a valid path setting for finding the logs on your worker nodes.
- Your pods are running but not with the labels Promtail is expecting. Check the Promtail scrape_configs.
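
One quick way to check the path case is to expand the pattern yourself on a worker node. The sketch below uses Python's `glob` module, which only approximates how Promtail matches paths; the helper name `files_matching` and the pattern `/var/log/pods/*/*.log` are assumptions for illustration, so substitute the value your scrape_configs actually produces.

```python
import glob


def files_matching(path_pattern):
    """Return the files on this machine that the glob pattern expands to."""
    return sorted(glob.glob(path_pattern))


if __name__ == "__main__":
    # Hypothetical pattern; use the path from your own Promtail configuration.
    pattern = "/var/log/pods/*/*.log"
    matches = files_matching(pattern)
    if matches:
        print("\n".join(matches))
    else:
        print("no files match " + pattern)
```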
Contributor Author


The first reason (cannot find the location of your log files) was added based on a discussion on Slack where one user had this problem.

I added the second bullet because I was personally hit by the fact that not all my streams were visible, since I wasn't using compatible labels.

Contributor Author


The "Connecting to a promtail pod to troubleshoot" chapter is a good one, with further explanation of how to debug the subject. I feel that mentioning these two cases specifically adds valuable information.


## Debug output

Both binaries support a log level parameter on the command-line, e.g.: `loki -log.level=debug ...`

## No labels:

## Failed to create target, "ioutil.ReadDir: readdirent: not a directory"
