After running one night, Loki lost connection. #1173

Closed
hellodudu opened this issue Oct 18, 2019 · 18 comments
Labels
stale A stale issue or PR that will automatically be closed.

Comments

@hellodudu

Describe the bug

  • I started two Docker containers on a remote server, Grafana and Loki. They worked well yesterday, but after running for about 20 hours Loki lost its connection. Now I can't set the datasource in Grafana, and loki/api/v1/label returns 404 page not found.

  • I use Loki to collect logs without Promtail, only by pushing to the api/prom/push API.

  • loki-local-config.yaml and grafana.ini both use the default settings.

To Reproduce
Steps to reproduce the behavior:

  1. Started from docker-compose.yaml
version: "3"
services:
  loki:
    image: grafana/loki
    container_name: loki
    ports:
      - "3100:3100"
    volumes:
      - ./config/loki/:/etc/loki
      - ./data/loki/:/tmp/loki
    command: -config.file=/etc/loki/loki-local-config.yaml
    restart: unless-stopped
    depends_on:
      - grafana

  grafana:
    image: grafana/grafana
    container_name: grafana
    volumes:
      - ./config/grafana.ini:/etc/grafana/grafana.ini
      - ./data/grafana/:/var/lib/grafana/
    ports:
      - "3000:3000"
    user: "472"
    environment:
      GF_EXPLORE_ENABLED: "true"
    logging:
      driver: loki
      options:
        loki-url: "http://212.64.58.168:3100/api/prom/push"
        loki-retries: "5"
        loki-batch-size: "400"
  2. Set the datasource with the URL http://loki:3100
  3. Let it run overnight (I don't know exactly how long)
  4. Loki loses its connection: setting the datasource in Grafana fails with 502 Bad Gateway, and curl http://212.64.58.168:3100/loki/api/v1/label returns 404 page not found. Pushing to api/prom/push still succeeds.

Expected behavior
Loki log queries succeed.

Environment:

  • centos7
  • docker-compose 1.24.1

Screenshots, Promtail config, or terminal output

@sandeepsukhani
Contributor

I think you might be experiencing issue #1087. Your Loki is working fine, since it can accept logs at the push endpoint.
Please try adding the master tag to run the latest Loki build, which has the v1 endpoints, or use /api/prom/label, i.e. the old endpoints without v1.

Please check and let me know whether it works.
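
To see which of the two label endpoints a given Loki build serves, both paths can be probed and their status codes compared. The following is a minimal Python sketch under stated assumptions: `probe_label_endpoints` and the injectable `fetch_status` callable are hypothetical helpers made up for illustration, and only the two paths come from this thread.

```python
from typing import Callable, Dict
from urllib.error import HTTPError
from urllib.request import urlopen

# The two label paths discussed in this thread.
LABEL_PATHS = {
    "v1": "/loki/api/v1/label",
    "legacy": "/api/prom/label",
}


def http_status(url: str) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urlopen(url, timeout=5) as resp:
            return resp.status
    except HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return err.code


def probe_label_endpoints(
    base_url: str, fetch_status: Callable[[str], int] = http_status
) -> Dict[str, int]:
    """Map each endpoint name (hypothetical keys) to its status code."""
    base = base_url.rstrip("/")
    return {name: fetch_status(base + path) for name, path in LABEL_PATHS.items()}
```

Calling `probe_label_endpoints("http://127.0.0.1:3100")` and getting 404 for `v1` but 200 for `legacy` would match the symptom described in this issue. Connection failures are not caught here and will surface as `URLError`.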

@hellodudu
Author

hellodudu commented Oct 18, 2019

I think you might be experiencing issue #1087. Your Loki is working fine, since it can accept logs at the push endpoint.
Please try adding the master tag to run the latest Loki build, which has the v1 endpoints, or use /api/prom/label, i.e. the old endpoints without v1.

Please check and let me know whether it works.

Hi sandlis,
Thanks a lot. I changed the query API to /api/prom/label and got a successful response. That means Loki works fine, right? But I still can't set the datasource in Grafana with http://loki:3100, http://host.docker.internal:3100, or even http://{public_id}:3100. Do you know where the problem is?

@sandeepsukhani
Contributor

sandeepsukhani commented Oct 18, 2019

Yes, it means Loki is working fine.
I can't say what the issue with adding Loki as a datasource might be without knowing what the error is.
If you want, you can also have a look at the compose file used by Joe, one of the maintainers of Loki.
https://github.com/joe-elliott/grafana-local/tree/master/loki

Please feel free to reach out if you need any help.

@hellodudu
Author

I added promtail and will see the result tomorrow.

@hellodudu
Author

It looks fine now, but the log labels disappeared after a while.

curl "http://127.0.0.1:3100/api/prom/label"
{}

The Grafana query failed too.
image

Meanwhile, querying from Postman or pushing a new log still returns a successful response.

http://127.0.0.1:3100/api/prom/query?direction=BACKWARD&limit=1000&regexp=&query=%7Bmoli2%3D%22zhcn%22%7D&start=1571035785254000000&end=1571640585254000000&refId=A

image

How can I keep labels available over a long period?

@cyriltovena
Contributor

cyriltovena commented Oct 29, 2019 via email

@hellodudu
Author

Label queries also have start and end parameters. If you haven't received logs in the last 5 minutes, it's normal not to get any labels back for that window. Have you tried increasing the period?

May I ask how to query labels from Grafana's Explore page? As you can see, when I open the Explore page I just get an error, even though I chose a long custom time range.

@cyriltovena
Contributor

Are you sending logs with the Docker driver? If so, can you check the daemon logs to verify there aren't any errors?

@hellodudu
Author

No, I'm sending logs with HTTP POST.
I have now added a new service named "loki_conn" that keeps sending logs to Loki every minute. The error disappeared, and I can always get the "loki_conn" label.
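
A heartbeat like this "loki_conn" service could build its request body roughly as follows. This is only a sketch: it assumes the legacy JSON format for api/prom/push (a `streams` list whose items carry a `labels` string and `ts`/`line` entries), and the helper name `build_push_body` and the `service` label key are hypothetical, not taken from this thread.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Optional


def build_push_body(
    labels: Dict[str, str], line: str, now: Optional[datetime] = None
) -> str:
    """Build a JSON body for one log line (assumed legacy push format)."""
    now = now or datetime.now(timezone.utc)
    # Render labels as {key="value",...}; json.dumps quotes and escapes values.
    pairs = ",".join(f"{k}={json.dumps(v)}" for k, v in sorted(labels.items()))
    payload = {
        "streams": [
            {
                "labels": "{" + pairs + "}",
                # A timezone-aware isoformat() string is RFC 3339 compatible.
                "entries": [{"ts": now.isoformat(), "line": line}],
            }
        ]
    }
    return json.dumps(payload)
```

The resulting string would be sent as the body of an HTTP POST with `Content-Type: application/json`, once a minute in the setup described above.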

@Mortega5

Mortega5 commented Nov 7, 2019

Label queries also have start and end parameters. If you haven't received logs in the last 5 minutes, it's normal not to get any labels back for that window. Have you tried increasing the period?

May I ask how to query labels from Grafana's Explore page? As you can see, when I open the Explore page I just get an error, even though I chose a long custom time range.

I have the same error. Checking the HTTP requests made by Grafana, I found that it tries to get labels by requesting https://<url>/api/datasources/proxy/10/api/prom/label?start=1573118415653000000. For some reason, it asks for labels starting from the current time (start=1573118415653000000). In my case the logs are too old, so Loki finds no labels in that window and returns an empty response, and the request fails.

Can anyone confirm whether this is a bug?

@gfdusc

gfdusc commented Nov 7, 2019

@Mortega5 Thanks for that!

For me, even requesting an old timestamp (via the Firefox devtools) still returns empty JSON.

But...

Removing the trailing zeroes gives me a response with the labels on the first attempt (it takes a few seconds to return).
After that, all requests (with the extra zeroes) work!
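
One possible reading of this, not confirmed anywhere in the thread, is that the `start` value is interpreted as Unix nanoseconds, so a shortened value points far into the past and the query window then covers everything. The sketch below only demonstrates the arithmetic. `ns_to_utc` is a hypothetical helper, and dropping exactly three zeroes is an assumption, since the comment above does not say how many were removed.

```python
from datetime import datetime, timezone


def ns_to_utc(ns: int) -> datetime:
    """Convert a Unix timestamp in nanoseconds to a UTC datetime."""
    secs, nanos = divmod(ns, 1_000_000_000)
    # datetime only has microsecond precision, so sub-microsecond digits are dropped.
    return datetime.fromtimestamp(secs, tz=timezone.utc).replace(
        microsecond=nanos // 1000
    )


# Value taken from the request quoted in this thread.
full = ns_to_utc(1573118415653000000)
# Same value with three trailing zeroes removed (assumption), still read as ns.
shortened = ns_to_utc(1573118415653000)
```

Here `full` falls on 2019-11-07, while `shortened` falls in January 1970.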

@cyriltovena
Contributor

@Mortega5

I have the same error. Checking the HTTP requests made by Grafana, I found that it tries to get labels by requesting https://<url>/api/datasources/proxy/10/api/prom/label?start=1573118415653000000. For some reason, it asks for labels starting from the current time (start=1573118415653000000).

You should have both a start and an end. The start should be the current time in Unix nanoseconds minus the time range (1h by default). If that is not the case, this is a bug.

What version of Grafana are you using?
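
The expected request described here can be sketched in Python as below. This is illustrative only: `label_query_url` is a hypothetical helper, the one-hour default mirrors the "1h by default" remark, and the legacy /api/prom/label path is the one used elsewhere in this thread.

```python
import time
from typing import Optional
from urllib.parse import urlencode

NANOS_PER_SECOND = 1_000_000_000


def label_query_url(
    base_url: str, range_seconds: int = 3600, now_ns: Optional[int] = None
) -> str:
    """Build a label query URL with start = now - range and end = now (in ns)."""
    end_ns = time.time_ns() if now_ns is None else now_ns
    start_ns = end_ns - range_seconds * NANOS_PER_SECOND
    query = urlencode({"start": start_ns, "end": end_ns})
    return f"{base_url.rstrip('/')}/api/prom/label?{query}"
```

For logs older than an hour, a larger `range_seconds` would be needed for the window to include them.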

@Mortega5

I'm using Grafana v6.4.3 (3a2bfb7)

@cyriltovena
Contributor

Can you confirm that with this version you get only the start query string, and that its value is the current time?

@Mortega5

Mortega5 commented Nov 26, 2019

Can you confirm that with this version you get only the start query string, and that its value is the current time?

Actually it makes two requests: one with only the start query string, and one with both start and end. In the first case, start is the current time in Unix nanoseconds. In the second case, it is the current time minus the time range (5 years in the image).

image

But I have upgraded Loki to the latest version (Loki 1.0.0), and its response is now {values: ["__name__"]}, so the request doesn't fail.

By the way, changing the time range doesn't refresh the labels; you need to refresh the page.

@cyriltovena
Contributor

@davkal ^^

@davkal
Contributor

davkal commented Dec 11, 2019

@stale

stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Jan 11, 2020
@stale stale bot closed this as completed Jan 18, 2020