
filebeat broken pipe #14

Closed
jchannon opened this issue Jan 11, 2016 · 4 comments
Comments

@jchannon
Contributor

Hi,

This is probably not your issue so feel free to tell me to create an issue on the elastic filebeat repo.

I've had your docker image working and had filebeat working fine.

However, after stopping Filebeat and the Docker container and starting everything up again, Filebeat sees a file but then throws errors that I don't understand. Have you seen these before?

2016/01/11 16:50:45.526612 registrar.go:83: INFO Starting Registrar
2016/01/11 16:50:45.526678 filebeat.go:122: INFO Start sending events to output
2016/01/11 16:50:45.526749 spooler.go:77: INFO Starting spooler: spool_size: 1024; idle_timeout: 5s
2016/01/11 16:50:45.526801 log.go:62: INFO Harvester started for file: /persist/blah/2015-12-01.log
2016/01/11 16:50:48.028564 single.go:75: INFO Error publishing events (retrying): write tcp 127.0.0.1:42956->127.0.0.1:5044: write: broken pipe
2016/01/11 16:50:48.028744 single.go:143: INFO send fail
2016/01/11 16:50:48.028802 single.go:150: INFO backoff retry: 1s
2016/01/11 16:50:49.030793 single.go:75: INFO Error publishing events (retrying): write tcp 127.0.0.1:42968->127.0.0.1:5044: write: broken pipe
2016/01/11 16:50:49.030878 single.go:143: INFO send fail
2016/01/11 16:50:49.030953 single.go:150: INFO backoff retry: 2s
2016/01/11 16:50:51.033594 single.go:75: INFO Error publishing events (retrying): read tcp 127.0.0.1:42980->127.0.0.1:5044: read: connection reset by peer
2016/01/11 16:50:51.033879 single.go:143: INFO send fail
@spujadas
Owner

Hi,

Nope, never seen these errors before I'm afraid (at least not using ELK and Beats). I tried to reproduce your issue (using Filebeat version 1.0.1), but to no avail: after I stop and start Filebeat and the container, everything's working normally (once Logstash is back up and running, of course).

The only errors I got were when I started Filebeat without starting the container, which threw

2016/01/11 20:36:39.319181 transport.go:125: ERR SSL client failed to connect with: dial tcp 172.17.0.23:5044: getsockopt: no route to host

and when I docker exec'ed into the container and stopped Logstash, which made Filebeat throw

2016/01/11 20:37:46.081347 transport.go:125: ERR SSL client failed to connect with: dial tcp 172.17.0.24:5044: getsockopt: connection refused

… but both of these behaviours are exactly the ones that would be expected (i.e. if the IP address or the TCP port is not available, then the connection fails). In your case, by contrast, the broken pipe and connection reset by peer suggest that the connections get set up properly to begin with, but then break down whilst data is being transferred to and – respectively – from Logstash, which is odd, admittedly. (Usually these errors are caused by temporary network problems, but I'm guessing that you're testing locally, so that seems unlikely here.)

Anyway, could very well be an issue with the container, especially if you're seeing this only when you restart it, but I'd need more information to figure out if that's actually the case. If you could send me your Filebeat config, test log file and steps/commands to reproduce your issue, I'd be happy to do some more testing to investigate.

@jchannon
Contributor Author

Hi, so we're not using the certificate stuff; we do that by overriding your file, as seen in the Dockerfile below.

FROM sebp/elk

ADD ./mixins/* /etc/logstash/conf.d/

EXPOSE 5044

#02-beats-input.conf
input {
  beats {
    port => 5044
  }
}

#12-acanomanager.conf
filter {
    if [type] == "vqlog" {
        grok {
            break_on_match => true
            match => {
                "message" => [
                    "%{TIMESTAMP_ISO8601:date}\s+RequestId\s:\s%{UUID:requestid}\sComponent\s:\s%{GREEDYDATA:component}\s+Method\s:\s%{GREEDYDATA:method}\s+Message\s:\s%{GREEDYDATA:info}",
                    "%{TIMESTAMP_ISO8601:date}\s%{GREEDYDATA:info}\s+RequestId\s:\s%{UUID:requestid}",
                    "%{TIMESTAMP_ISO8601:date}\s%{GREEDYDATA:info}"
                ]
            }
        }

        grok {
            break_on_match => true
            match => {
                "path" => [
                    "%{[^/]+(?=/[^/]+$):customer}"
                ]
            }
        }

        date {
            locale => "en"
            match => ["date", "ISO8601", "yyyy-MM-dd HH:mm:ss.SSSS"]
            remove_field => ["date"]
        }
    }
}

#start filebeat

filebeat -e -c filebeat.yml -v

#filebeat config

output:
  logstash:
    enabled: true
    hosts:
      - localhost:5044
    timeout: 15

filebeat:
  prospectors:
    -
      paths:
        - "/persist/**/*.log"
      document_type: vqlog

#start docker

docker run -v elkstorage:/var/lib/elasticsearch -p 5601:5601 -p 9200:9200 -p 5000:5000 -p 5044:5044 -it 192.168.1.45:5000/vqcomms/elk

@jchannon
Contributor Author

I think we've narrowed it down to an invalid grok regex, and the field wasn't path but should have been source. I'm surprised that it throws that error, however.
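For reference, here is a sketch of what the corrected second grok block could look like: it matches on source rather than path, and uses a named capture group, (?<customer>...), in place of the %{PATTERN:field} syntax, which only accepts the name of a predefined pattern:

```
grok {
  match => {
    "source" => [
      "(?<customer>[^/]+)/[^/]+$"
    ]
  }
}
```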

@spujadas
Owner

Well spotted! Yep, a broken regex would go towards explaining a few things.

I tried using "%{[^/]+(?=/[^/]+$):customer}" as a grok regex in the 11-nginx.conf file of the default image, and got this output from Logstash:

/var/log/logstash# tail -f logstash.*

==> logstash.log <==
{:timestamp=>"2016-01-12T20:19:02.244000+0000", :message=>"The error reported is: \n  pattern %{[^/]+(?=/[^/]+$):customer} not defined"}

==> logstash.stdout <==
The error reported is:
  pattern %{[^/]+(?=/[^/]+$):customer} not defined

So far so good... but as a side effect, Logstash terminates on the spot (fair enough I suppose), thus breaking the pipe that was established by Filebeat, which as a result now throws:

2016-01-12T20:19:06Z INFO Error publishing events (retrying): write tcp 172.17.0.3:54662->172.17.0.1:5044: write: broken pipe

Another way of breaking the pipe between Filebeat and Logstash is to run the ELK container normally, and then docker exec into it and stop the Logstash service.

So putting the pieces together, it all makes sense. What I find somewhat disconcerting is that Logstash is silently exiting without mentioning it anywhere. (I'm going out on a limb here as I'm not sure if this is related, but plugin failure and shutdown management seems to be a known Logstash issue/topic under discussion: elastic/logstash#2477)
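For completeness: the intended expression is sound once it's written as a named capture group rather than a %{...} pattern reference. Grok compiles its patterns with Oniguruma, the same regex engine Ruby uses, so the expression can be sanity-checked directly in Ruby (extract_customer is just a hypothetical helper for the demonstration):

```ruby
# The regex the filter intended: capture the path segment just above the
# log file, i.e. the "customer" in /persist/<customer>/<file>.log.
# (?<customer>...) is the named capture group that replaces the invalid
# %{regex:customer} form from the issue.
CUSTOMER_RE = /(?<customer>[^\/]+)(?=\/[^\/]+$)/

def extract_customer(path)
  m = CUSTOMER_RE.match(path)
  m && m[:customer]
end

puts extract_customer("/persist/blah/2015-12-01.log")  # => blah
```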
