Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/promtail: IETF Syslog (RFC5424) Support #1275

Merged
merged 8 commits into from
Dec 13, 2019

Conversation

bastjan
Copy link
Contributor

@bastjan bastjan commented Nov 18, 2019

What this PR does / why we need it:

Allow promtail to ingest IETF syslog (RFC5424).

Limitations:

  • No remote time support (tried it but got out-of-order errors due to data races across the network)
  • No UDP/ No BSD syslog (RFC3164).

Performance:

Tested on my somewhat slow machine (Intel Core i5-3210M). Used the loggen tool from syslog-ng with 200 parallel log streams. Minimal relabeling.

Configuration
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
- job_name: syslog
  syslog:
    listen_address: localhost:1514
    labels:
      job: "syslog"
  relabel_configs:
  - source_labels: ['__syslog_message_hostname']
    target_label: 'hostname'
> loggen --syslog-proto --active-connections 200 --rate 10000 --stream localhost 1514
average rate = 78671.04 msg/sec, count=2124253, time=27.001, (average) msg size=260, bandwidth=19963.56 kB/sec

Which issue(s) this PR fixes:
Fixes #153, Fixes #935

Special notes for your reviewer:

Currently tracks the develop branch of github.com/influxdata/go-syslog since the v2.0.0 releases go mod support is broken.

Checklist

  • Documentation added
  • Tests updated

@CLAassistant
Copy link

CLAassistant commented Nov 18, 2019

CLA assistant check
All committers have signed the CLA.

@rfratto
Copy link
Member

rfratto commented Nov 18, 2019

This is great, and something we definitely want in Promtail! I talked with the team and it seems the UDP limitation is the first thing we might want to get resolved since it's so common in the syslog world.

cc @slim-bean

@bastjan
Copy link
Contributor Author

bastjan commented Nov 18, 2019

This is great, and something we definitely want in Promtail! I talked with the team and it seems the UDP limitation is the first thing we might want to get resolved since it's so common in the syslog world.

Great to hear! I'm planning to bring this merge request to a mergeable state in the next two days.

I would really appreciate a review of the code/concept before adding more features.

I never implemented an UDP server before in go but it does not look too complicated... I think I can do an initial version over the next two weeks.

@bastjan bastjan force-pushed the syslog-support branch 2 times, most recently from c5f041c to 2cd1104 Compare November 19, 2019 20:19
@bastjan bastjan marked this pull request as ready for review November 19, 2019 20:20
@bastjan
Copy link
Contributor Author

bastjan commented Nov 26, 2019

I used this feature for a while now in our staging environment with syslog-ng in front for UDP/TLS. Works great with an average of around 4400 lines/s (90% of the traffic is simulated). Memory usage hovers around 75MB.

@bastjan
Copy link
Contributor Author

bastjan commented Nov 27, 2019

After using loki and promtail some more I'm wondering if the syslog parser should move to the pipeline stage.

Something like this:

scrape_configs:
- job_name: syslog
  syslog:
    listen_address: 0.0.0.0:1514
    labels:
      job: "syslog"
  entry_parser: syslog-rfc5452
  entry_parser_options:
    label_structured_data: yes

or

scrape_configs:
- job_name: syslog
  syslog:
    listen_address: 0.0.0.0:1514
    labels:
      job: "syslog"
  entry_parser: raw
  pipeline_stages:
  - syslog-rfc5452:
      label_structured_data: yes
  - timestamp:
      source: timetamp
      format: 2006-01-02T15:04:05.999999Z07:00 # (RFC3339Micro)

I haven't used promtail enough to form an opinion on it.

RFC5452 is quite static and the spec is usually correctly implemented. So I don't feel like there is any advantage in flexibility. RFC3164 (BSD Syslog) on the other hand is rarely correctly implemented, the format varies a lot, and I wonder if the parsing it should just be a pipeline stage.

@cyriltovena
Copy link
Contributor

I don’t know much about syslog specifications (/cc @slim-bean ) but if there’s is no clear advantage I would go for the first config example.

I think it’s also better to keep the syslog configuration specific to the syslog scrape config.

I’m wondering if the use case of ingesting multiple specifications from one instance of promtail should be covered. Although we could run 2 instances with different config.

@bastjan
Copy link
Contributor Author

bastjan commented Dec 3, 2019

I’m wondering if the use case of ingesting multiple specifications from one instance of promtail should be covered. Although we could run 2 instances with different config.

You could always have two or more scrape configs.

A nice feature to have is support for both specifications on the same listener. Syslog-NG does not support this natively, and we had some problems because not every appliance does support setting a non-standard port. I've written a small syslog "router" at work that accepts both protocols and routes them to the correct Syslog-NG listener.

@pstibrany
Copy link
Member

Thank you for your continuous improvements on this PR. It is not yet clear to us, whether syslog receiving should be part of Promtail or not. Syslog ecosystem is quite big, and there may be better alternatives (eg. push logs from syslog-ng / rsyslog directly through their http or program outputs, or perhaps even adding direct support there). That's not to say that we don't want this contribution, but we're still investigating what the best way forward is.

@pstibrany
Copy link
Member

pstibrany commented Dec 3, 2019

I used this feature for a while now in our staging environment with syslog-ng in front for UDP/TLS.

Btw, since you're already running syslog-ng, how would direct exporter for syslog-ng to Loki work for you? I mean, without using promtail. I understand that you would lose some of the power of Promtail, on the other hand, direct export from syslog-ng would not lose logs if Promtail was down. It would also keep Promtail bit simpler, because it wouldn't need to understand different syslog formats or listening to additional tcp/udp ports.

@bastjan
Copy link
Contributor Author

bastjan commented Dec 3, 2019

Btw, since you're already running syslog-ng, how would direct exporter for syslog-ng to Loki work for you?

That would work great, but a C syslog-ng plugin won't arrive in the mainstream distros for a while, and a python plugin is hard to package and deploy.

Having at least IETF syslog over TCP makes the integration of rsyslog and syslog-ng a breeze without writing plugins for both (and every other new product).

syslog-ng
destination d_loki {
  syslog("localhost" transport("tcp") port(6514));
};
rsyslog
action(type="omfwd" protocol="tcp" port="6514" Template="RSYSLOG_SyslogProtocol23Format")

I haven't tried to use the http output plugin of syslog-ng for loki yet, but the config will get complicated fast.

direct export from syslog-ng would not lose logs if Promtail was down

For more complicated setups there can still be a syslog-ng/rsyslog in front with the "reliable" parameter set.

@pstibrany
Copy link
Member

I haven't tried to use the http output plugin of syslog-ng for loki yet, but the config will get complicated fast.

I looked at the documentation earlier today, and while basic http integration would not be that difficult, problem is that it would use one request per message. Syslog-ng http output also supports batching, but it would be difficult to massage output to what Loki currently expects.

@bastjan
Copy link
Contributor Author

bastjan commented Dec 3, 2019

I looked at the documentation earlier today, and while basic http integration would not be that difficult, problem is that it would use one request per message. Syslog-ng http output also supports batching, but it would be difficult to massage output to what Loki currently expects.

Yeah that's what I expected. I created a python plugin prototype some time ago which uses the provided gRPC definitions https://gist.github.com/bastjan/cc8f25b6ea7395689e2e110836e312ce. I can provide the full repo but have to strip company stuff out before...

@slim-bean
Copy link
Collaborator

Sorry I'm late to the party here. I'm in favor of getting this added to promtail, where are we at with this, what do we need to get this merged?

I looked previously at the example config and would like to point out the entry_parser key has been deprecated and we should avoid using it, maybe we just need another key name for this.

Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have few minor comments, but overall this looks pretty good, thank you!

After some thinking and internal discussion, I think we should start with TCP-only, and use it mostly as a writing destination for real syslog server, not as a replacement. In that light, perhaps we don't even need to support non-transparent-framing, but since it's so trivial to implement, we can keep that.

To your earlier question, I don't think this should be a pipeline, so we're good here.

docs/clients/promtail/configuration.md Outdated Show resolved Hide resolved
docs/clients/promtail/configuration.md Outdated Show resolved Hide resolved
docs/clients/promtail/configuration.md Outdated Show resolved Hide resolved
docs/clients/promtail/configuration.md Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogtarget.go Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogtarget.go Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogtarget.go Show resolved Hide resolved
pkg/promtail/targets/syslogtarget.go Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogtarget_test.go Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogparser/syslogparser.go Outdated Show resolved Hide resolved
@pstibrany
Copy link
Member

@bastjan btw, if you currently don't have time to take another look at this PR, I have some capacity now and can take over and polish it, if you want.

@bastjan
Copy link
Contributor Author

bastjan commented Dec 10, 2019

@bastjan btw, if you currently don't have time to take another look at this PR, I have some capacity now and can take over and polish it, if you want.

Thanks for the review!

I've got time to polish the PR, should be able to finish it by today. :)

@pstibrany
Copy link
Member

I've got time to polish the PR, should be able to finish it by today. :)

That's great, thank you! (No need to hurry :) )

* Use context to shutdown server.
* Use util.Backoff from cortex
* Do not embed mutex into TestLabeledClient
* Use strings.HasPrefix
* Better naming of connectionsClosed -> openConnections
@bastjan
Copy link
Contributor Author

bastjan commented Dec 10, 2019

@pstibrany I think I resolved most issues. Take a look and tell me what you think. :)

Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Changes look good!

I would like to avoid using 2 goroutines per connection (see suggestion in syslogparser.go).

pkg/promtail/targets/syslogtarget.go Show resolved Hide resolved
pkg/promtail/targets/syslogtarget.go Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogtarget.go Show resolved Hide resolved
pkg/promtail/targets/syslogtarget_test.go Show resolved Hide resolved
pkg/promtail/targets/syslogparser/syslogparser.go Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogparser/syslogparser.go Outdated Show resolved Hide resolved
pkg/promtail/targets/syslogtarget.go Outdated Show resolved Hide resolved
@bastjan bastjan requested a review from pstibrany December 10, 2019 16:40
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing my feedback! I've left one more comment about non-transparent-framing, please take a look.

pkg/promtail/targets/syslogtarget.go Show resolved Hide resolved
pkg/promtail/targets/syslogparser/syslogparser_test.go Outdated Show resolved Hide resolved
* Callback instead of channel in ParseStream.
  Removes one running Goroutine per connection.
* Improve parse error log level.
* Move mutex above field it protects.
* Use backoff.Ongoing()
@bastjan bastjan force-pushed the syslog-support branch 2 times, most recently from 2a43852 to 03b587f Compare December 11, 2019 16:17
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I especially like documentation changes. Thank you very much for addressing my feedback!

pkg/promtail/targets/syslogparser/syslogparser.go Outdated Show resolved Hide resolved
docs/clients/promtail/configuration.md Show resolved Hide resolved
Copy link
Member

@rfratto rfratto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small comments around the docs.

docs/clients/promtail/configuration.md Outdated Show resolved Hide resolved
docs/clients/promtail/configuration.md Outdated Show resolved Hide resolved
@bastjan bastjan requested a review from rfratto December 12, 2019 19:16
// Details returns target-specific details.
func (t *SyslogTarget) Details() interface{} {
return map[string]string{}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to get info useful for checking how does the target, I think we should create an issue, have you seen the /targets and /service-discovery page in Promtail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to get info useful for checking how does the target, I think we should create an issue, have you seen the /targets and /service-discovery page in Promtail?

Just took a look at those pages and indeed there is some useful information missing from the syslog target.

I think the most obvious candidates would be open connections in /targets and the listen address in /service-discovery.

Have to dig in a bit further to see how those pages look in different setups...

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is one of those awesome contribution !

You really did a great job.

Thanks a lot

@cyriltovena cyriltovena merged commit f0f6f24 into grafana:master Dec 13, 2019
@bastjan bastjan deleted the syslog-support branch December 13, 2019 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A way to ingest syslog messages over the network and into Loki Support syslog-ng
6 participants