The Filecoin Network Monitoring and Analysis System
Sentinel is a collection of services which monitor the health and function of the Filecoin network.
A Visor process collects permanent Filecoin chain metrics from a Lotus daemon and writes them to TimescaleDB, a time-series and relational datastore.
Many Drone instances collect ephemeral, node-specific Lotus metrics and write them to the same TimescaleDB.
The metrics are displayed in Grafana. A set of key queries is captured in Grafana to track and alert on critical performance and economic health indicators over time.
These steps set up Lotus to run against mainnet. Syncing the chain takes a long time, so this route gives you a fully synced node only if you can wait for the sync to complete.
$ git clone git@github.com:filecoin-project/sentinel.git
$ cd sentinel
$ make deps
$ make run-lotus
- In another window, run build/lotus sync wait. It blocks until lotus finishes syncing the chain.
- Run make run-docker to start Docker services.
- In separate windows, run make run-drone and make run-visor.
Clone the repo and fetch the submodules. make will help you out:
$ git clone git@github.com:filecoin-project/sentinel.git
$ cd sentinel
$ make deps
git submodule update --init --recursive
...
Now we need to get Lotus built and syncing the chain.
Sentinel requires a running lotus daemon that has finished syncing the chain locally. A full sync of Mainnet takes days. You can test against the Nerpa network to try Sentinel out.
Follow the Lotus install and setup instructions to install the dependencies, check out the branch for the network you want to join (e.g. ntwrk-nerpa), and build a local copy of lotus. If you need to test against mainnet, see Testing against Mainnet.
Run Lotus to start syncing.
$ lotus daemon
In a separate shell, ask lotus to tell us when it has finished syncing.
$ lotus sync wait
Worker 0: Target Height: 9099 Target: [bafy2bzacecqrt46shkioecxkt6mrdvi5xd73wh2gonrp3bhxfut6szc2labj6] State: complete Height: 9099
Done!
Now let's set up the Database.
A docker-compose file is provided that will spin up a TimescaleDB instance and a Grafana instance to query it. Check that you have docker and docker-compose installed. Docker Desktop is relatively painless.
$ make run-docker
Note that TimescaleDB is packaged as a Postgres extension, so some other components will refer to it as Postgres.
Now we need to generate some data with Visor and Drone
In a new shell, run Visor. By default it reads Lotus data from $(HOME)/.lotus and writes to the local TimescaleDB container we just started at localhost:5432.
$ make run-visor
If there are no ERRORs in the output, Visor is now writing Filecoin chain metrics to the database.
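If your database or Lotus repo is not at the default location, the LOTUS_DB and LOTUS_REPO environment variables described under the make run-visor target can be set before starting Visor. The following is a minimal sketch, assuming a POSIX shell. The default values are the ones quoted in this README and have not been checked against the Makefile itself.

```shell
#!/bin/sh
# Defaults copied from the `make run-visor` description in this README.
# Anything already exported in your environment takes precedence.
: "${LOTUS_DB:=postgres://postgres:password@localhost:5432/postgres?sslmode=disabled}"
: "${LOTUS_REPO:=$HOME/.lotus}"
export LOTUS_DB LOTUS_REPO

# Show what Visor would be started with before launching it.
echo "LOTUS_DB=$LOTUS_DB"
echo "LOTUS_REPO=$LOTUS_REPO"

# Then, from the sentinel checkout:
#   make run-visor
```

To override a value, export your own LOTUS_DB or LOTUS_REPO before running the snippet; the `:=` expansions only fill in variables that are unset or empty.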
In another shell, run Drone. By default it reads the Lotus data dir at $(HOME)/.lotus and Lotus Prometheus metrics from http://127.0.0.1:1234/debug/metrics, and writes to the TimescaleDB container at localhost:5432. Edit build/telegraf.conf if you need to customise those values.
$ make run-drone
...
2020-09-18T11:24:22Z D! [agent] Successfully connected to outputs.postgresql
2020-09-18T11:24:31Z D! [outputs.postgresql] Wrote batch of 38 metrics in 588.361501ms
2020-09-18T11:24:32Z D! [inputs.lotus] Recorded lotus info
2020-09-18T11:24:32Z D! [inputs.lotus] Recorded pending mpool messages
2020-09-18T11:24:32Z I! [inputs.lotus] Service workers started
...
Verify that it connects to Postgres (TimescaleDB is a Postgres extension) and is able to read data from Lotus without errors.
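One way to make that check repeatable is to scan captured Drone output for error-level lines. The sketch below assumes errors are marked with E!, by analogy with the D! and I! level markers in the sample output above. The names drone_log_ok and drone.log are hypothetical and used only for illustration.

```shell
#!/bin/sh
# drone_log_ok reads Drone log lines on stdin and fails if any line is
# logged at error level. Assumption: error lines carry an "E!" marker,
# matching the "D!" and "I!" markers seen in the sample output.
drone_log_ok() {
  if grep -q ' E! '; then
    echo "errors found in Drone output" >&2
    return 1
  fi
  echo "no error-level lines seen"
}

# Example use (drone.log is a hypothetical capture file):
#   make run-drone 2>&1 | tee drone.log
#   drone_log_ok < drone.log
```

A clean result only shows that no error-level lines were logged; it is still worth confirming that lines such as "Successfully connected to outputs.postgresql" appear.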
Grafana and a set of Sentinel dashboards are provisioned in a container along with the TimescaleDB as part of the make run-docker target.
Visit http://localhost:3000 to open Grafana and log in with both the username and the password set to admin. You should now see a sentinel folder containing dashboards. Have an explore, and make some more! Look at ./grafana/provisioning/dashboards/dashboards.yml to see how the existing dashboards are provisioned. You can export new dashboards as JSON from the Grafana UI and update that file to add new ones.
Note: Build artifacts are put into the ./build path. If you want to force a rebuild without running make clean first, you can also use make -B <target>.
make - produces all build targets (lotus, visor, and drone binaries)
make lotus - only builds the lotus daemon binary
make visor - only builds the visor binary
make drone - only builds the Sentinel Drone agent binary
make run-drone - start development Sentinel Drone process with debug output (uses configuration at build/drone.conf)
make run-lotus - start lotus daemon with default settings (lotus repo at $(HOME)/.lotus)
make run-visor - start visor binary. The database and repo path can be changed from default via LOTUS_DB and LOTUS_REPO environment variables. (Defaults to LOTUS_DB ?= postgres://postgres:password@localhost:5432/postgres?sslmode=disabled and LOTUS_REPO ?= $(HOME)/.lotus.)
make run-docker - start docker services (currently TimescaleDB, Grafana)
make stop-drone - stop development Sentinel Drone process
make stop-visor - stop visor process
make stop-lotus - stop lotus daemon process
make stop-docker - stop all docker services
make install-services - Install lotus-daemon, sentinel-drone, sentinel-visor as systemd services
make replace-services - Build lotus-daemon, sentinel-drone, sentinel-visor and replace existing binaries without deploying configuration files
make clean-services - Uninstall lotus-daemon, sentinel-drone, sentinel-visor as systemd services (not logs or configuration)
Install individual services:
make install-drone-service
make install-lotus-service
make install-visor-service
Replace individual services:
make replace-drone-service
make replace-lotus-service
make replace-visor-service
The same pattern works for their make clean-*-service counterparts.
make clean - removes build artifacts
make clean-state - stops and destroys docker service volumes (which resets TimescaleDB and Grafana settings and configuration)
A complete local sync of Mainnet takes a long time. To complete it in a reasonable time you need a copy of an already synced lotus data dir from a friendly Filecoin operator. (NOTE: At present, regular chain snapshots, as described in https://docs.filecoin.io/get-started/lotus/chain-snapshots, don't work for chainwatch because they are incomplete exports.)
To take a full copy of a Lotus data dir:
- shut down the existing lotus daemon
- run the lotus daemon on your new box for just a moment (to initialize the repo path)
- tar/copy/transport .../.lotus/datastore into the same path on your destination
- start the old and new daemons
- 🎉
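The copy step above can be sketched with tar for the case where both repo paths are reachable from one machine, for example when the destination disk is mounted locally. The function name copy_datastore and the example paths are hypothetical. Stop both daemons before running it.

```shell
#!/bin/sh
# copy_datastore SRC_REPO DEST_REPO
# Streams SRC_REPO/datastore into DEST_REPO/datastore through tar,
# preserving file permissions. Returns non-zero if the source repo has no
# datastore directory or the destination cannot be created.
copy_datastore() {
  src="$1"
  dest="$2"
  if [ ! -d "$src/datastore" ]; then
    echo "no datastore directory in $src" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  tar -C "$src" -cf - datastore | tar -C "$dest" -xpf -
}

# Example use (both paths are placeholders for your own setup):
#   copy_datastore "$HOME/.lotus" /mnt/newbox/.lotus
```

Piping one tar into another avoids writing an intermediate archive, which matters when the datastore is large. For a copy between hosts, the same stream can be sent over whatever transport you already use.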
Sentinel follows the Filecoin Project Code of Conduct. Before contributing, please acquaint yourself with our social courtesies and expectations.
New issues and pull requests are welcome.
The Filecoin Project and Sentinel are dual-licensed under Apache 2.0 and MIT terms:
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)