Send Prometheus metrics to Grafana Cloud (#2994)
## Motivation

We're migrating our internal validators to send data to Grafana Cloud.

## Proposal

This change starts with the local setup, which was useful for testing the changes we'll likely need for the actual internal validator setup. We probably won't want to send metrics to Grafana Cloud from every local run (especially for cost reasons), so this is disabled by default.

Grafana Cloud credentials are stored in a secret within GCP's Secret Manager. Since Grafana Cloud will be used for our internal validators, storing the secret on GCP made sense.
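As a rough sketch, opting in for a local run could look like the following. The variable names come from the helmfile changes in this commit. The `ref+gcpsecrets://` reference format is an assumption based on how helmfile's `fetchSecretValue` resolves secrets through vals, and the project and secret names are hypothetical placeholders.

```shell
# Opt in to remote-writing metrics; the helmfile defaults this to "false".
export LINERA_WRITE_TO_GRAFANA_CLOUD=true

# Hypothetical Secret Manager references (assumed vals syntax);
# replace the project and secret names with the real ones.
export LINERA_GRAFANA_CLOUD_USERNAME_SECRET="ref+gcpsecrets://my-gcp-project/grafana-cloud-username"
export LINERA_GRAFANA_CLOUD_API_TOKEN_SECRET="ref+gcpsecrets://my-gcp-project/grafana-cloud-api-token"

# Optional: value of the `validator` external label.
# When unset, the helmfile falls back to "local-$USER".
export LINERA_VALIDATOR_LABEL="local-${USER:-dev}"
```

With these exported, running `linera net up --kubernetes` as in the Test Plan should pick them up, assuming the command passes its environment through to helmfile.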

## Test Plan

Ran locally with `linera net up --kubernetes`, and saw the metrics in our Grafana Cloud instance:


![Screenshot 2024-12-02 at 11.37.04.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/HlciHFAoHZW62zn13apJ/4b46c8bc-a5d3-4aad-9e3f-151cf44be947.png)

## Release Plan

If we want to start sending metrics to Grafana Cloud from our devnet/testnet, then:
- These changes should be backported to the latest `devnet` branch
- These changes should be backported to the latest `testnet` branch, then
ndr-ds authored Jan 29, 2025
1 parent 7bc7fef commit ffef3ee
Showing 3 changed files with 48 additions and 17 deletions.
27 changes: 14 additions & 13 deletions kubernetes/linera-validator/helmfile.yaml
@@ -1,3 +1,11 @@
+environments:
+default:
+values:
+- writeToGrafanaCloud: {{ env "LINERA_WRITE_TO_GRAFANA_CLOUD" | default "false" }}
+validatorLabel: {{ env "LINERA_VALIDATOR_LABEL" | default (printf "local-%s" (env "USER")) }}
+
+---
+
 repositories:
 - name: scylla
 url: https://scylla-operator-charts.storage.googleapis.com/stable
@@ -16,22 +24,15 @@ releases:
 needs:
 - scylla/scylla
 values:
-- {{ env "LINERA_HELMFILE_VALUES_LINERA_CORE" | default "values-local.yaml" }}
+{{ if .Values.writeToGrafanaCloud -}}
+- grafanaCloudUsername: {{ fetchSecretValue (env "LINERA_GRAFANA_CLOUD_USERNAME_SECRET") | quote }}
+grafanaCloudAPIToken: {{ fetchSecretValue (env "LINERA_GRAFANA_CLOUD_API_TOKEN_SECRET") | quote }}
+{{- end }}
+- writeToGrafanaCloud: {{ .Values.writeToGrafanaCloud }}
+- {{ env "LINERA_HELMFILE_VALUES_LINERA_CORE" | default "values-local.yaml.gotmpl" }}
 set:
 - name: installCRDs
 value: "true"
-- name: validator.serverConfig
-value: {{ env "LINERA_HELMFILE_SET_SERVER_CONFIG" | default "working/server_1.json" }}
-- name: validator.genesisConfig
-value: {{ env "LINERA_HELMFILE_SET_GENESIS_CONFIG" | default "working/genesis.json" }}
-- name: numShards
-value: {{ env "LINERA_HELMFILE_SET_NUM_SHARDS" | default 10 }}
-- name: lineraImage
-value: {{ env "LINERA_HELMFILE_LINERA_IMAGE" | default "linera:latest" }}
-- name: staticIpGcpName
-value: {{ env "LINERA_HELMFILE_STATIC_IP_GCP_NAME" | default "" }}
-- name: validatorDomainName
-value: {{ env "LINERA_HELMFILE_VALIDATOR_DOMAIN_NAME" | default "" }}
 - name: scylla
 version: v1.13.0
 namespace: scylla
@@ -0,0 +1,10 @@
+{{- if .Values.writeToGrafanaCloud }}
+apiVersion: v1
+kind: Secret
+metadata:
+name: grafana-cloud-auth-secret
+type: kubernetes.io/basic-auth
+stringData:
+username: {{ .Values.grafanaCloudUsername | quote }}
+password: {{ .Values.grafanaCloudAPIToken | quote }}
+{{- end }}
@@ -1,12 +1,12 @@
 # Values for charts linera-validator for local validators.
 
 # Linera
-lineraImage: "" # Is set by helmfile.
+lineraImage: {{ env "LINERA_HELMFILE_LINERA_IMAGE" | default "linera:latest" }}
 lineraImagePullPolicy: Never
 logLevel: "debug"
 proxyPort: 19100
 metricsPort: 21100
-numShards: 10
+numShards: {{ env "LINERA_HELMFILE_SET_NUM_SHARDS" | default 10 }}
 
 # Loki
 loki-stack:
@@ -40,6 +40,26 @@ kube-prometheus-stack:
 - grafana-piechart-panel
 prometheus:
 prometheusSpec:
+{{- if .Values.writeToGrafanaCloud }}
+scrapeInterval: 90s
+remoteWrite:
+- url: https://prometheus-prod-13-prod-us-east-0.grafana.net/api/prom/push
+basicAuth:
+username:
+name: grafana-cloud-auth-secret
+key: username
+password:
+name: grafana-cloud-auth-secret
+key: password
+writeRelabelConfigs:
+- sourceLabels: [__name__]
+regex: (apiextensions|apiserver|csi|kube|kubelet|kubernetes|node|prober|prometheus|rest|storage|volume|etcd|net|grafana|authentication|code|workqueue|cluster|go|alertmanager|authorization|namespace|scrape|up|field|registered|process|scylla).+
+action: drop
+- regex: endpoint|instance|namespace|pod|prometheus|prometheus_replica|service|name|resource|id
+action: labeldrop
+externalLabels:
+validator: {{ .Values.validatorLabel }}
+{{- end }}
 retention: 2d
 retentionSize: 1GB
 storageSpec:
@@ -101,5 +121,5 @@ environment: "kind"
 
 # Validator
 validator:
-serverConfig: "" # Is set by helmfile.
-genesisConfig: "" # Is set by helmfile.
+serverConfig: {{ env "LINERA_HELMFILE_SET_SERVER_CONFIG" | default "working/server_1.json" }}
+genesisConfig: {{ env "LINERA_HELMFILE_SET_GENESIS_CONFIG" | default "working/genesis.json" }}
