Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unique promtail_instance id to labels for gcptarget #3501

Merged
merged 2 commits into from
Mar 17, 2021
Merged

Add unique promtail_instance id to labels for gcptarget #3501

merged 2 commits into from
Mar 17, 2021

Conversation

kavirajk
Copy link
Contributor

@kavirajk kavirajk commented Mar 17, 2021

Rationale:
To make labelset unique per promtail instance to avoid out-of-order errors

What this PR does / why we need it:
Add unique promtail_instance label to gcplog target to avoid out-of-order errors.

Which issue(s) this PR fixes:
Fixes #3492

Special notes for your reviewer:

Checklist

  • Documentation added
  • Tests updated

Copy link
Contributor

@dannykopping dannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, couple small comments

@@ -53,6 +58,7 @@ func format(
// mandatory label for gcplog
lbs := labels.NewBuilder(nil)
lbs.Set("resource_type", ge.Resource.Type)
lbs.Set("promtail_instance", instanceID.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some explanation here of why we've put this unique instance ID in place will be helpful.
I also think promtail_instance might be somewhat misleading but I can't think of a clearer name right now - perhaps something that captures its purpose a bit more, and/or documentation to describe its purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment and doc explaining the rationale!

Regarding name, first I thought may be instance_id would be good fit. but its too good that original log entries may actually have it in future!. So just went with promtail_instance. If you have any better name feel free :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this! I don't think we should duplicate the docs in case they go out of sync - simply linking them with a very brief explanation should suffice.

Perhaps something like this:

// We add a `promtail_instance` label to avoid identical streams with different timestamps if the `use_incoming_timestamp` option is set to false, which can lead to out-of-order errors

Rationale:
To make labelset unique per promtail instance to avoid out-of-order errors
docs/sources/clients/promtail/scraping.md Outdated Show resolved Hide resolved
@@ -206,6 +206,10 @@ Before using `gcplog` target, GCP should be [configured](../gcplog-cloud) with p

It also support `relabeling` and `pipeline` stages just like other targets.

Log entries scraped by `gcplog` will add additional label called `promtail_instance`. This label uniquely identifies each promtail instances trying to scrape gcplog(from single `subscription_id`).
We need this unique identifier to avoid out-of-order errors from Loki servers.
Because say two promtail instances rewrite timestamp of log entries(with same labelset) at the same time may reach Loki servers at different times can cause Loki servers to reject it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here you should perhaps explain a bit more what causes this conflict in log lines - it may not be immediately obvious to users who are new to this. You've added a config option in the docs (use_incoming_timestamp: false # default rewrite timestamps) but this is optional - correct? If users don't notice that log line, they may be confused as to why we are rewriting the timestamp.

This adds another wrinkle: if users opt to not rewrite the timestamp, is the unique UUID even required? (we can leave it in either way, just curious)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the wrinkle: I tried with and without rewriting the timestamp during my deployment. In both cases out of order errors appeared, so this UUID is still needed to be able to ingest GCP logs.

Maybe this is not needed when you are limiting the amount of logs, but I already had issues with just the k8s_cluster resource logs.

Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Copy link
Collaborator

@slim-bean slim-bean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@slim-bean slim-bean merged commit 14b2c09 into grafana:master Mar 17, 2021
cyriltovena added a commit to cyriltovena/loki that referenced this pull request Jun 11, 2021
…3501)

This uses a buffer pool for when we parse the chunk time range value.

```
❯ benchcmp before.txt after.txt
benchmark                                      old ns/op     new ns/op     delta
BenchmarkParseIndexEntries500-16               265343        232446        -12.40%
BenchmarkParseIndexEntries2500-16              1393835       1232639       -11.56%
BenchmarkParseIndexEntries10000-16             5741372       5032199       -12.35%
BenchmarkParseIndexEntries50000-16             30158888      27272835      -9.57%
BenchmarkParseIndexEntriesRegexSet500-16       79101         50394         -36.29%
BenchmarkParseIndexEntriesRegexSet2500-16      386920        246015        -36.42%
BenchmarkParseIndexEntriesRegexSet10000-16     1570068       957154        -39.04%
BenchmarkParseIndexEntriesRegexSet50000-16     7445006       4757111       -36.10%

benchmark                                      old allocs     new allocs     delta
BenchmarkParseIndexEntries500-16               1504           1004           -33.24%
BenchmarkParseIndexEntries2500-16              7504           5004           -33.32%
BenchmarkParseIndexEntries10000-16             30006          20005          -33.33%
BenchmarkParseIndexEntries50000-16             150008         100007         -33.33%
BenchmarkParseIndexEntriesRegexSet500-16       1522           1022           -32.85%
BenchmarkParseIndexEntriesRegexSet2500-16      7522           5022           -33.24%
BenchmarkParseIndexEntriesRegexSet10000-16     30022          20022          -33.31%
BenchmarkParseIndexEntriesRegexSet50000-16     150022         100022         -33.33%

benchmark                                      old bytes     new bytes     delta
BenchmarkParseIndexEntries500-16               96397         32365         -66.43%
BenchmarkParseIndexEntries2500-16              482307        162164        -66.38%
BenchmarkParseIndexEntries10000-16             1928766       648454        -66.38%
BenchmarkParseIndexEntries50000-16             9608702       3207322       -66.62%
BenchmarkParseIndexEntriesRegexSet500-16       88665         24689         -72.15%
BenchmarkParseIndexEntriesRegexSet2500-16      441471        121573        -72.46%
BenchmarkParseIndexEntriesRegexSet10000-16     1764370       484644        -72.53%
BenchmarkParseIndexEntriesRegexSet50000-16     8803429       2404042       -72.69%
```

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@james-callahan
Copy link
Contributor

Now that loki accepts out-of-order writes should this be backed out?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GCPlog job - entry out of order.
5 participants