Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Grafana integration #3913

Merged
merged 43 commits into from
Nov 22, 2019
Merged

Grafana integration #3913

merged 43 commits into from
Nov 22, 2019

Conversation

expenses
Copy link
Contributor

@expenses expenses commented Oct 25, 2019

See #814. This PR adds a HTTP server that (in a currently very bare-bones way) conforms to the grafana-json-datasource API.

Screen Shot 2019-10-25 at 5 58 25 PM

This PR probably has a fair bit to go before it can be merged. Some notes:

  • Currently, all metrics are stored as f32s, which is probably fine (although a bit awkward) as we're serializing into json anyway.
  • The server is currently turned on by default, but it might be a good idea to put it behind a flag instead.
  • I haven't yet implemented the table query response type, but I'm not sure if this is absolutely needed.
  • I also haven't bothered with the annotations API, not sure about that one either.
  • At the moment, the metrics grow indefinitely. This is definitely something that needs changing, but I'm not sure when a cleanup should occur.
  • Ideally, we want to switch to a small, embedded timeseries database.

@gavofyork gavofyork added the A3-in_progress Pull request is in progress. No review needed at this stage. label Oct 25, 2019
@expenses
Copy link
Contributor Author

Ok, I think I've gotten things to a good-enough place now. Ideally, we'd use a database like Akumuli in the same way that we use RocksDB, but there are no rust bindings for that (I'm looking into it, no promises) and it's probably overkill.

Grafana doesn't seem to complain that the tables mode doesn't return a table, so that's probably a non-issue.

I've added a future that removed metrics that are over a week old that is ran every day, which seems reasonable for the time being.

@expenses expenses changed the title [WIP]: Grafana integration Grafana integration Oct 27, 2019
@gavofyork gavofyork added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Oct 28, 2019
core/grafana-data-source/src/lib.rs Outdated Show resolved Hide resolved
core/grafana-data-source/test/src/main.rs Outdated Show resolved Hide resolved
core/service/Cargo.toml Outdated Show resolved Hide resolved
core/grafana-data-source/src/util.rs Outdated Show resolved Hide resolved
core/grafana-data-source/src/util.rs Outdated Show resolved Hide resolved
@tomaka tomaka added A4-gotissues and removed A0-please_review Pull request needs code review. labels Oct 28, 2019
expenses and others added 3 commits October 28, 2019 22:54
Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
@gavofyork gavofyork added this to the polkadot-0.6.18 milestone Nov 22, 2019
@gavofyork
Copy link
Member

@expenses please update the lock file. in the future i'd recommend you work in branches off this repo so that stuff like this can be done by anyone else without having to open a new PR.

@expenses
Copy link
Contributor Author

@gavofyork done :^)

@tomaka
Copy link
Contributor

tomaka commented Nov 22, 2019

i'd recommend you work in branches off this repo so that stuff like this can be done by anyone else without having to open a new PR.

For what it's worth, GitHub allows contributors to edit branches corresponding to PRs, even if they are outside of the repo.

@expenses
Copy link
Contributor Author

Also, I'm not a part of the parity github organization yet 😉😉

@gavofyork
Copy link
Member

image

@gavofyork
Copy link
Member

Full test suite passed on build server.

@gavofyork gavofyork merged commit d9ca975 into paritytech:master Nov 22, 2019
@gavofyork
Copy link
Member

i'd recommend you work in branches off this repo so that stuff like this can be done by anyone else without having to open a new PR.

For what it's worth, GitHub allows contributors to edit branches corresponding to PRs, even if they are outside of the repo.

that has never worked for me. can you explain how?

image

@tomaka
Copy link
Contributor

tomaka commented Nov 22, 2019

that has never worked for me. can you explain how?

Hmmm
The option is enabled by default but people can disable it, that's maybe why

@bkchr
Copy link
Member

bkchr commented Nov 23, 2019

For me it also never worked.

"You can set commit permissions when you first create a pull request from a fork. For more information, see "Creating a pull request from a fork." Additionally, you can modify an existing pull request to let repository maintainers make commits to your branch."

Most people probably have disabled it.

@expenses
Copy link
Contributor Author

Oh, I was assuming it was enabled on this PR (as I thought that was the default) but it's not. Sorry 😅

@expenses
Copy link
Contributor Author

checkbox

@Hyung-bharvest
Copy link

Hyung-bharvest commented Nov 26, 2019

@expenses @gavofyork

Hi. We are implementing a Prometheus exporter runtime in substrate and connecting into grafana server. I am afraid it is a duplicated job with almost same objective.

I am curious why you don't use already well known open source program "prometheus", but implemented subset of its feature by hand?

Our repository is here
https://github.com/nodebreaker-carlsagan/substrate/tree/prometheus_v0.1/client/prometheus

Thank you.

@expenses
Copy link
Contributor Author

@dlguddus I decided not to use Prometheus because it's overkill if you just want to record metrics. For most use-cases, the advantages of Prometheus such as persistent metrics and being able to store metrics from multiple nodes are not needed. I believe that we should be aiming for good defaults in Substrate, and using a simple in-memory database is a better default than needing Prometheus.

That said, it would be great if we could support both. With #4187 we could do this, or if you setup your Prometheus observer to use the existing record_metrics macro, that would be great!

@Hyung-bharvest
Copy link

Hyung-bharvest commented Nov 26, 2019

As a validator, we are running about 30 nodes. We expect it would be more than 100 next year. Definitely we need generalized data import from our monitoring center. This is our common usecase, as a validator because we are not only validating polkadot.

As you can see in our implementation, prometheus does not store any data in local PC, but just realtime passing the data to monitoring(grafana) server. So I expect it is as light as your implementation which stores and clear data in local memory.

Do I understand the specs correctly?

@Hyung-bharvest
Copy link

And, I think #4187 is definitely a must go work because it can minimize complexity of codebase for extracting data for various data exporter such as telemetry, prometheus, grafana-api, rpc, lcd, and more.

We will communicate there to contribute and discuss. Thanks for inviting!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants