Label endpoint returns empty despite labels being present #1308

Closed
rasple opened this issue Nov 22, 2019 · 7 comments

Comments

rasple commented Nov 22, 2019

Describe the bug
I have Loki + Promtail + Grafana deployed as a stack on a single-node Docker Swarm (latest images). Promtail scans logs on a volume mounted inside the container, and positions.yaml as well as Loki's storage are persisted on a mount. When I deploy the stack, everything works fine for a couple of minutes and I can query the logs via Grafana and Loki. After some time the following error message occurs:

Error connecting to datasource: Data source connected, but no labels received. Verify that Loki and Promtail is configured properly.

Querying Loki's API for the values of a specific label

curl -G -s "http://<host>:3100/loki/api/v1/label/log_name/values" | jq .

returns the values as it should:

{
  "values": [
    "access",
    "error"
  ]
}

However, the call that Grafana most likely makes to list the available labels,

curl -G -s "http://<host>:3100/loki/api/v1/label" | jq .

returns

{}

I have done everything in the troubleshooting section relating to this error. Promtail works perfectly until Loki stops serving labels for some reason, most likely because there are no new logs.
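
One way to check whether this is purely a recency issue: if the label endpoint honors explicit start and end parameters (nanosecond Unix epoch), widening the query window should bring the labels back even when nothing has been ingested recently. A sketch, with the time-range support assumed rather than verified:

# Ask for labels over the last 7 days instead of the default window
START=$(date -d '-7 days' +%s)000000000
END=$(date +%s)000000000
curl -G -s "http://<host>:3100/loki/api/v1/label" \
  --data-urlencode "start=${START}" \
  --data-urlencode "end=${END}" | jq .

It may also be worth noting that chunk_store_config.max_look_back_period is set to 30s in the config below; if that setting bounds label queries as well, it would explain why the results vanish as soon as the newest logs are more than 30 seconds old.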

Expected behavior
When I delete positions.yaml and redeploy the stack, it works for some time and returns the labels:

curl -G -s "http://<host>:3100/loki/api/v1/label" | jq .

returns

{
  "values": [
    "app_name",
    "container_id",
    "filename",
    "hostname",
    "http_method",
    "job",
    "level",
    "log_name",
    "pid",
    "protocol",
    "service_name",
    "stack_name",
    "tid",
    "user_agent"
  ]
}

Maybe I am getting this wrong, but since Loki persists logs, it should be able to serve their labels even when no new logs are coming in from Promtail.

Environment:
Single-node Docker Swarm on an Ubuntu server.

Screenshots, Promtail config, or terminal output
promtail-config.yaml (pipeline stages omitted, since they work correctly)

server:
  http_listen_port: 9080
  grpc_listen_port: 0
  http_listen_host: 0.0.0.0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/api/prom/push
    backoff_config:
      minbackoff: 1s
      maxbackoff: 5s
      maxretries: 10000

scrape_configs:

loki-config.yaml

auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 5m
  chunk_retain_period: 30s
  max_transfer_retries: 1

schema_config:
  configs:
  - from: 2018-04-15
    store: boltdb
    object_store: filesystem
    schema: v9
    index:
      prefix: index_
      period: 168h

storage_config:
  boltdb:
    directory: /tmp/loki/index

  filesystem:
    directory: /tmp/loki/chunks

limits_config:
  enforce_metric_name: false
  reject_old_samples: false
  #reject_old_samples_max_age: 168h

chunk_store_config:
  max_look_back_period: 30s

table_manager:
  chunk_tables_provisioning:
    inactive_read_throughput: 0
    inactive_write_throughput: 0
    provisioned_read_throughput: 0
    provisioned_write_throughput: 0
  index_tables_provisioning:
    inactive_read_throughput: 0
    inactive_write_throughput: 0
    provisioned_read_throughput: 0
    provisioned_write_throughput: 0
  retention_deletes_enabled: false
  retention_period: 0

docker-stack.yml (hostnames removed for privacy)

version: "3.4"

services:
  loki:
    image: private-dtr/repo/loki:latest
    volumes:
      - /mnt/loki_data_monitoring:/tmp/loki
      - ./loki-config.yaml:/etc/loki/loki-config.yaml:ro
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/loki-config.yaml
    networks:
      - loki

  grafana:
    image: private-dtr/repo/grafana:latest
    #ports:
    #  - "3000:3000"
    depends_on:
      - influxdb
      - loki
    environment:
      - GF_SERVER_ROOT_URL=http://<host>/grafana
    volumes:
      - /mnt/grafana_config_monitoring/:/var/lib/grafana/
    deploy:
      labels:
          - "traefik.frontend.rule=PathPrefixStrip:/grafana"
          - "traefik.port=3000"
          - "traefik.backend=grafana"
          - "traefik.docker.network=traefik-net"
    networks:
      - loki
      - influxdb
      - traefik-net

  promtail:
      image: private-dtr/repo/promtail:latest
      depends_on:
        - loki
      ports:
        - "9080:9080"
      volumes:
        - /var/log:/var/log:ro
        - /var/lib/docker/containers/:/var/lib/docker/containers/:ro
        - /var/lib/docker/volumes/:/var/lib/docker/volumes/:ro
        - ./promtail-config.yaml:/etc/promtail/promtail-config.yaml:ro
        - /tmp:/tmp
      command: -config.file=/etc/promtail/promtail-config.yaml -log.level=debug
      deploy:
        mode: global
      networks:
        - loki
        - traefik-net
networks:
  influxdb:
  loki:
  traefik-net:
    external: true

rasple commented Nov 25, 2019

Possible duplicate of #453, though I thought this was fixed a long time ago.

@slim-bean
Collaborator

This is interesting; the problem seems to be the low volume of logs in relation to how Grafana does a health check on Loki. I'm guessing others haven't seen this because they have some volume of logs that continues to trickle in.

Ultimately we should probably find a better way for Grafana to do a health check on Loki rather than running a label query.

For now I think you might need something that logs at least every few minutes to keep Grafana happy.
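
Something like the following could serve as a stopgap (a sketch only; it assumes the Promtail scrape_configs, omitted above, pick up files under /var/log, and the heartbeat file name is made up): write a heartbeat line every couple of minutes so Loki keeps receiving recent entries and the label query stays non-empty.

# Hypothetical heartbeat logger; the path must match a file Promtail actually scrapes
while true; do
  echo "$(date -Is) heartbeat" >> /var/log/loki-heartbeat.log
  sleep 120
done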


rasple commented Nov 26, 2019

Thank you for your explanation. This should definitely be changed in either Loki or Grafana, as it causes a vanilla setup to fail if there are not enough (recent) logs. I will close this as a duplicate of #453.

rasple closed this as completed Nov 26, 2019
@slim-bean
Collaborator

@davkal what are your thoughts on changing how Grafana does a healthcheck on Loki? Should Loki add a specific endpoint for this?

slim-bean reopened this Nov 26, 2019
@Mortega5

I think the problem is related to this, but version 1.0.0 solves it because the API always responds with the __name__ label.


davkal commented Dec 11, 2019

Fixed via grafana/grafana#20971

davkal closed this as completed Dec 11, 2019

davkal commented Dec 11, 2019

The health check should be cheap, but confirm that things are operating nominally. For Prometheus we use 1 as a constant query and call it a day. But for Loki we decided to use the labels API to make sure the Loki instance is in a state where it's useful. The idea being: if you don't have labels, you're gonna have a bad time in Grafana.
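
Roughly what that amounts to from the command line (illustrative only, not Grafana's actual implementation): query the label endpoint and treat an empty values list as unhealthy.

curl -G -s "http://<host>:3100/loki/api/v1/label" \
  | jq -e '.values | length > 0' > /dev/null \
  && echo "healthy: labels received" \
  || echo "unhealthy: no labels received"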
