This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

Merge pull request #2 from jrasell/add-telemetry-config-runtime-stats
Add initial support for statsite and statsd runtime telemetry.
jrasell authored May 14, 2019
2 parents 256ad65 + 5ee46d0 commit f922d06
Showing 125 changed files with 27,441 additions and 5,591 deletions.
7 changes: 5 additions & 2 deletions cmd/server/command.go
@@ -24,6 +24,7 @@ func RegisterCommand(rootCmd *cobra.Command) error {
 	}
 	serverCfg.RegisterConfig(cmd)
 	serverCfg.RegisterTLSConfig(cmd)
+	serverCfg.RegisterTelemetryConfig(cmd)
 	logCfg.RegisterConfig(cmd)
 	rootCmd.AddCommand(cmd)

@@ -33,6 +34,7 @@ func RegisterCommand(rootCmd *cobra.Command) error {
 func runServer(_ *cobra.Command, _ []string) {
 	serverConfig := serverCfg.GetConfig()
 	tlsConfig := serverCfg.GetTLSConfig()
+	telemetryConfig := serverCfg.GetTelemetryConfig()

 	if err := verifyServerConfig(serverConfig); err != nil {
 		fmt.Println(err)
@@ -47,8 +49,9 @@ func runServer(_ *cobra.Command, _ []string) {
 	}

 	cfg := &server.Config{
-		Server: &serverConfig,
-		TLS:    &tlsConfig,
+		Server:    &serverConfig,
+		TLS:       &tlsConfig,
+		Telemetry: &telemetryConfig,
 	}
 	srv := server.New(log.Logger, cfg)

2 changes: 2 additions & 0 deletions docs/configuration/README.md
@@ -16,6 +16,8 @@ The Sherpa server can be configured by supplying either CLI flags or using envir
* `--policy-engine-strict-checking-enabled` (bool: true) - When enabled, all scaling activities must pass through policy checks.
* `--storage-consul-enabled` (bool: false) - Use Consul as a storage backend when using the API policy engine.
* `--storage-consul-path` (string: "sherpa/policies/") - The Consul KV path that will be used to store policies.
* `--telemetry-statsd-address` (string: "") - Specifies the address of a statsd server to forward metrics to.
* `--telemetry-statsite-address` (string: "") - Specifies the address of a statsite server to forward metrics to.
* `--tls-cert-key-path` (string: "") - Path to the TLS certificate key for the Sherpa server.
* `--tls-cert-path` (string: "") - Path to the TLS certificate for the Sherpa server.
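
To make the telemetry flags more concrete, the following is a minimal sketch of what forwarding a single gauge to a statsd server could look like, using only the Go standard library. It is not Sherpa's implementation (the commit pulls in `github.com/armon/go-metrics` for that); the function names `statsdGauge` and `sendGauge` and the address `127.0.0.1:8125` are assumptions made for illustration. The only fixed part is the plain statsd gauge line format, `<name>:<value>|g`.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// statsdGauge renders one gauge in the plain statsd line format:
// <name>:<value>|g
func statsdGauge(name string, value float64) string {
	return name + ":" + strconv.FormatFloat(value, 'f', -1, 64) + "|g"
}

// sendGauge writes a single gauge line to addr over UDP.
// Hypothetical helper; addr plays the role of --telemetry-statsd-address.
func sendGauge(addr, name string, value float64) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	_, err = conn.Write([]byte(statsdGauge(name, value)))
	return err
}

func main() {
	// Assumed address: 8125 is the conventional statsd port.
	addr := "127.0.0.1:8125"
	if err := sendGauge(addr, "sherpa.runtime.num_goroutines", 7); err != nil {
		fmt.Fprintln(os.Stderr, "send failed:", err)
	}
}
```

Because UDP is connectionless, the write usually succeeds even when nothing is listening on the address, so a sketch like this cannot confirm delivery on its own.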

78 changes: 78 additions & 0 deletions docs/configuration/telemetry.md
@@ -0,0 +1,78 @@
# Telemetry

The Sherpa server collects runtime metrics about its own performance and retains them for one minute. This data can be viewed either by sending the Sherpa server process a signal or by [configuring](./README.md) the server to stream data to [statsite](https://github.com/statsite/statsite) or [statsd](https://github.com/statsd/statsd).

To view this data by signal, send the Sherpa process `USR1` on Unix or `BREAK` on Windows. Once Sherpa receives the signal, it dumps the current telemetry information to the server's `stderr`:

```bash
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.sys_bytes': 72220920.000
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.malloc_count': 76736.000
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.free_count': 41066.000
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.heap_objects': 35670.000
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.total_gc_pause_ns': 39109.000
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.total_gc_runs': 1.000
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.num_goroutines': 7.000
[2019-05-14 18:32:50 +0100 BST][G] 'sherpa.lluna.local.runtime.alloc_bytes': 3044160.000
[2019-05-14 18:32:50 +0100 BST][S] 'sherpa.runtime.gc_pause_ns': Count: 1 Sum: 39109.000 LastUpdated: 2019-05-14 18:32:54.504907 +0100 BST m=+1.125110442
```
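
The signal-triggered dump described above can be sketched with the Go standard library alone. This is an illustration of the mechanism under stated assumptions, not Sherpa's code: the helpers `gaugeLine` and `dumpRuntime` are hypothetical, the line layout is only modelled on the sample output, and the example is Unix-only because it relies on `syscall.SIGUSR1` and `syscall.Kill`.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/signal"
	"runtime"
	"syscall"
	"time"
)

// gaugeLine renders one gauge in a layout modelled on the sample dump.
func gaugeLine(ts, name string, value float64) string {
	return fmt.Sprintf("[%s][G] '%s': %.3f", ts, name, value)
}

// dumpRuntime writes two runtime gauges to w under the given prefix.
func dumpRuntime(w io.Writer, prefix string) {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	ts := time.Now().Format("2006-01-02 15:04:05 -0700 MST")
	fmt.Fprintln(w, gaugeLine(ts, prefix+".runtime.num_goroutines", float64(runtime.NumGoroutine())))
	fmt.Fprintln(w, gaugeLine(ts, prefix+".runtime.alloc_bytes", float64(ms.Alloc)))
}

func main() {
	// Register for USR1 before any signal can arrive.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR1)

	// Signal ourselves so the example terminates. A long-running server
	// would instead loop over sigs and be signalled externally,
	// for example with `kill -USR1 <pid>`.
	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR1); err != nil {
		fmt.Fprintln(os.Stderr, "kill failed:", err)
		return
	}

	<-sigs
	dumpRuntime(os.Stderr, "sherpa")
}
```

Writing to `stderr` keeps the dump separate from anything the process prints on `stdout`, which matches the behaviour described for the Sherpa server.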

## Runtime Metrics

Runtime metrics allow operators to get insight into how the Sherpa server process is functioning.

<table class="table table-bordered table-striped">
<tr>
<th>Metric</th>
<th>Description</th>
<th>Unit</th>
<th>Type</th>
</tr>
<tr>
<td><code>sherpa.runtime.num_goroutines</code></td>
<td>Number of goroutines; a general indicator of load pressure</td>
<td>Number of goroutines</td>
<td>Gauge</td>
</tr>
<tr>
<td><code>sherpa.runtime.alloc_bytes</code></td>
<td>Number of bytes allocated to the Sherpa process; this should remain in a steady state</td>
<td>Number of bytes</td>
<td>Gauge</td>
</tr>
<tr>
<td><code>sherpa.runtime.sys_bytes</code></td>
<td>Total bytes of memory held by the process, including memory in use by Sherpa's heap and memory that has been reclaimed but not yet returned to the operating system</td>
<td>Number of bytes</td>
<td>Gauge</td>
</tr>
<tr>
<td><code>sherpa.runtime.malloc_count</code></td>
<td>Cumulative count of allocated heap objects</td>
<td>Number of heap objects</td>
<td>Gauge</td>
</tr>
<tr>
<td><code>sherpa.runtime.free_count</code></td>
<td>Cumulative count of objects freed from the heap; this should steadily increase over time</td>
<td>Number of freed objects</td>
<td>Gauge</td>
</tr>
<tr>
<td><code>sherpa.runtime.heap_objects</code></td>
<td>Number of objects currently in the heap; a good general memory pressure indicator, worth establishing a baseline and alerting thresholds for</td>
<td>Number of objects in the heap</td>
<td>Gauge</td>
</tr>
<tr>
<td><code>sherpa.runtime.total_gc_pause_ns</code></td>
<td>The total garbage collector pause time since Sherpa was last started</td>
<td>Nanoseconds</td>
<td>Gauge</td>
</tr>
<tr>
<td><code>sherpa.runtime.total_gc_runs</code></td>
<td>Total number of garbage collection runs since Sherpa was last started</td>
<td>Number of operations</td>
<td>Gauge</td>
</tr>
</table>
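
As a rough guide to where these values come from, the sketch below maps the metric names in the table onto fields of Go's `runtime.MemStats`. The mapping is an assumption inferred from the metric names and is not taken from Sherpa's source; `runtimeGauges` and `liveObjects` are hypothetical helpers. It also shows one relationship that can be checked against the sample dump: the number of live heap objects equals allocations minus frees.

```go
package main

import (
	"fmt"
	"runtime"
)

// runtimeGauges returns the gauges from the table keyed by metric name.
// The field mapping is an assumption based on the metric names.
func runtimeGauges(ms *runtime.MemStats, goroutines int) map[string]float64 {
	return map[string]float64{
		"sherpa.runtime.num_goroutines":    float64(goroutines),
		"sherpa.runtime.alloc_bytes":       float64(ms.Alloc),
		"sherpa.runtime.sys_bytes":         float64(ms.Sys),
		"sherpa.runtime.malloc_count":      float64(ms.Mallocs),
		"sherpa.runtime.free_count":        float64(ms.Frees),
		"sherpa.runtime.heap_objects":      float64(ms.HeapObjects),
		"sherpa.runtime.total_gc_pause_ns": float64(ms.PauseTotalNs),
		"sherpa.runtime.total_gc_runs":     float64(ms.NumGC),
	}
}

// liveObjects derives the live heap object count from the two
// cumulative counters: allocations minus frees.
func liveObjects(mallocs, frees uint64) uint64 {
	return mallocs - frees
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	for name, value := range runtimeGauges(&ms, runtime.NumGoroutine()) {
		fmt.Printf("%s: %.3f\n", name, value)
	}

	// Values from the sample dump: malloc_count 76736, free_count 41066.
	fmt.Println("live objects:", liveObjects(76736, 41066))
}
```

With the sample values, 76736 allocations minus 41066 frees gives 35670, which matches the `heap_objects` line in the dump. This is why `malloc_count` and `free_count` are expected to grow steadily while `heap_objects` is the value worth alerting on.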
21 changes: 21 additions & 0 deletions go.mod
@@ -3,18 +3,39 @@ module github.com/jrasell/sherpa
go 1.12

require (
github.com/armon/go-metrics v0.0.0-20190430140413-ec5e00d3c878
github.com/beorn7/perks v1.0.0 // indirect
github.com/davecgh/go-spew v1.1.1
github.com/go-logfmt/logfmt v0.4.0 // indirect
github.com/gogo/protobuf v1.2.1 // indirect
github.com/golang/protobuf v1.3.1 // indirect
github.com/gorilla/mux v1.7.1
github.com/hashicorp/consul/api v1.1.0
github.com/hashicorp/go-cleanhttp v0.5.1
github.com/hashicorp/go-rootcerts v1.0.0
github.com/hashicorp/golang-lru v0.5.1 // indirect
github.com/hashicorp/nomad/api v0.0.0-20190508234936-7ba2378a159e
github.com/inconshreveable/mousetrap v1.0.0 // indirect
github.com/kisielk/errcheck v1.2.0 // indirect
github.com/konsorten/go-windows-terminal-sequences v1.0.2 // indirect
github.com/mattn/go-isatty v0.0.7
github.com/pkg/errors v0.8.1
github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90 // indirect
github.com/prometheus/common v0.4.0 // indirect
github.com/prometheus/procfs v0.0.0-20190507164030-5867b95ac084 // indirect
github.com/rs/zerolog v1.14.3
github.com/ryanuber/columnize v2.1.0+incompatible
github.com/sean-/sysexits v0.0.0-20171026162210-598690305aaa
github.com/sirupsen/logrus v1.4.1 // indirect
github.com/spf13/cobra v0.0.3
github.com/spf13/viper v1.3.2
github.com/stretchr/objx v0.2.0 // indirect
github.com/stretchr/testify v1.3.0
golang.org/x/crypto v0.0.0-20190513172903-22d7a77e9e5f // indirect
golang.org/x/net v0.0.0-20190514140710-3ec191127204 // indirect
golang.org/x/sys v0.0.0-20190514135907-3a4b5fb9f71f // indirect
golang.org/x/text v0.3.2 // indirect
golang.org/x/tools v0.0.0-20190514143549-2d081dbd584e // indirect
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 // indirect
gopkg.in/square/go-jose.v2 v2.3.1
)