Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruler: Recording Rules #3766

Merged
merged 41 commits into from
Jun 2, 2021
Merged

Conversation

dannykopping
Copy link
Contributor

@dannykopping dannykopping commented May 25, 2021

What this PR does / why we need it:
This PR adds Prometheus-style recording rules to Loki.

Recording rules are evaluated in the Ruler component. All recording rules are metric queries evaluated against the configured store, and the resulting samples are sent to any system that implements a Prometheus remote-write receiver; such systems include:

  • Prometheus (>=v2.25.0 with --enable-feature=remote-write-receiver)
  • Cortex
  • Thanos

Which issue(s) this PR fixes:
Fixes #3765

Special notes for your reviewer:
All rule groups are associated to a tenant. To ensure isolation, each rule group gets assigned its own remote-write appender, and each appender keeps its own FIFO queue of samples to write to the remote-write target. If the target cannot receive samples (e.g. service is down), the samples need to be kept in memory until the target is available again.

I've created an EvictingQueue to this end; ultimately we can't store an infinite number of samples, so this queue will evict the oldest samples when it reaches its defined capacity (globally/per-tenant configurable).

I've opted for this approach rather than a full WAL implementation for now due to its simplicity; we can always deprecate this in favour of a WAL should the need arise in the future.

Metrics added:

  • recording_rules_samples_evicted_total
  • recording_rules_samples_queued_total
  • recording_rules_samples_queued_current
  • recording_rules_samples_queue_capacity
  • recording_rules_remote_write_errors

This is obviously biasing towards newer rather than older samples; should a user want to disable this functionality (an argument could be made biasing towards older samples) and rather alert when a remote-write queue is full, that could be implemented fairly easily.

Checklist

  • Documentation added
  • Tests updated

Danny Kopping added 24 commits March 7, 2021 14:18
…heus

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Adding remote_write config for ruler

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
This also mirrors Cortex's package structure

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Refactoring

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Renaming "buffer size" to "queue capacity" to be more accurate

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Minor refactoring

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@dannykopping dannykopping self-assigned this May 25, 2021
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@dannykopping dannykopping force-pushed the dannykopping/recording_rules branch from 0adb92b to 858934a Compare May 25, 2021 22:06
Danny Kopping added 6 commits May 31, 2021 17:38
…eflective of usage

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Removing extraneous checks

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Danny Kopping added 2 commits June 1, 2021 09:10
Minor refactoring

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Refactoring for testability

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@sandeepsukhani sandeepsukhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


// RegisterFlags adds the flags required to config this to the given FlagSet.
func (c *RemoteWriteConfig) RegisterFlags(f *flag.FlagSet) {
f.BoolVar(&c.Enabled, "ruler.remote-write.enabled", false, "Remote-write recording rule samples to Prometheus-compatible remote-write receiver.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remote write config itself isn't specifiable via flags, but I think that is OK. It's not our package (thus would be hard to guarantee we always have flags for every field) and doesn't implement this function already. We've been increasingly favoring file configuration to flags as well.

"github.com/prometheus/client_golang/prometheus/promauto"
)

var samplesEvicted = promauto.NewCounterVec(prometheus.CounterOpts{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see these instantiated rather than globally defined (example: https://github.com/grafana/loki/blob/main/pkg/ingester/metrics.go)

Help: "Number of samples queued to be remote-written.",
}, []string{"user_id", "group_key"})

var samplesQueueCapacity = promauto.NewGaugeVec(prometheus.GaugeOpts{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a per tenant limit, but the metric is partitioned by group. Can we make this only partitioned by tenant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point, will do.

return nil
}

// PrepareRequest takes the given queue and serialized it into a compressed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// PrepareRequest takes the given queue and serialized it into a compressed
// PrepareRequest takes the given queue and serializes it into a compressed

defer q.Unlock()

if len(q.entries) >= q.capacity {
q.evictOldest()
Copy link
Member

@owen-d owen-d Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: What you're doing here looks good, but we could be a bit safer in the case the entries somehow become significantly larger than the capacity (doesn't look possible in the current code).

This change would make Append work as expected regardless of the prior length

Suggested change
q.evictOldest()
q.onEvict()
start := (len(q.entries)-q.Capacity()) + 1
q.entries = append(q.entries[:0], q.entries[start:]...)

"github.com/prometheus/prometheus/config"
)

// DefaultQueueCapacity defines the default size of the samples queue which will hold samples
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is unused

@@ -122,6 +123,7 @@ func (l *Limits) RegisterFlags(f *flag.FlagSet) {

f.IntVar(&l.RulerMaxRulesPerRuleGroup, "ruler.max-rules-per-rule-group", 0, "Maximum number of rules per rule group per-tenant. 0 to disable.")
f.IntVar(&l.RulerMaxRuleGroupsPerTenant, "ruler.max-rule-groups-per-tenant", 0, "Maximum number of rule groups per-tenant. 0 to disable.")
f.IntVar(&l.RulerRemoteWriteQueueCapacity, "ruler.remote-write.queue-capacity", 10000, "Capacity of remote-write queues; if a queue exceeds its capacity it will evict oldest samples.")
Copy link
Member

@owen-d owen-d Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit hesitant to add this to the overrides because the overrides are expected to be updated during runtime. This would mean that all internal queue capacities would need to be adjusted at some regular interval (for instance if the queue capacity configuration was increased or decreased). I'm not opposed to either approach:

  1. Removing this as a runtime configuration and making it a static config across all tenants in the ruler config block
  2. Exposing it as a runtime config per tenant and re-adjusting queue capacities on the fly

(2) Would entail some more work, though.

Edit:
After thinking about this a bit more, I'd like to expose it as a per tenant config, primarily because
(A) While unlikely, some tenants may wish to evaluate rules significantly faster than others.
(2) It's a pain to move from a global configuration in a component block to a per tenant configuration in the limits because there are then 3 places where a config can be set: (the component, the base limits, per tenant overrides)

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all this looks great. I've left a few nits and one issue about runtime configurations not reloading which I'd like us to solve.

// create or retrieve an appender associated with this groupKey (unique ID for rule group)
appender, found := a.groupAppender[groupKey]
if found {
return appender
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is ultimately called on every evaluation cycle in the Ruler, we can probably add a method WithCapacity which can mutate an appenders capacity and thus will be kept up to date with the overrides (would need a test)

Suggested change
return appender
return appender.WithCapacity(capacity)

Danny Kopping added 5 commits June 2, 2021 11:59
Refactoring

Unexporting and refactoring memstore metrics to match

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 🎉

Evaluations *prometheus.CounterVec
Samples prometheus.Gauge // in memory samples
CacheHits *prometheus.CounterVec // cache hits on in memory samples
type memstoreMetrics struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to make these private, but I did link you to ingesterMetrics as an example to follow which are private though 😆. In the long term, I think we should expose them consistently and I'd favor public Metric structs where possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Prometheus-style recording rules
4 participants