memory leak or other allocation problems (Logstash #4745

Closed
wyardley opened this issue Mar 1, 2016 · 19 comments

Comments

@wyardley

wyardley commented Mar 1, 2016

Running 2.1.2 from the "official" RPM package on CentOS 7 with OpenJDK v1.8.0.71, I continue to see memory allocation grow over time, which implies a memory leak to me (though apparently not the same issue as #3722).

Error: Your application used more memory than the safety cap of 3783M.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

and

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000618e00000, 132513792, 0) failed; error='Cannot allocate memory' (errno=12)

# egrep -v "^(#|$)" /etc/sysconfig/logstash
LS_OPTS=" -w 8 "
LS_HEAP_SIZE="6809m"
LS_CONF_DIR=/etc/logstash/conf.d
LS_OPEN_FILES=65535
KILL_ON_STOP_TIMEOUT=0

The current system has 7566 MB of RAM. Let me know what other information I can provide to help diagnose this. I have the same behavior on 3 systems that I'm using as part of an ELK install (the logstash nodes are dedicated to logstash).
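
For context on the numbers above, here is a minimal sketch (not part of the original report; `heap_share_pct` is a made-up helper name) that computes what share of physical RAM the configured heap would take. With the reported values the heap setting comes to roughly 90% of the machine's memory, which may leave little headroom for the JVM's non-heap allocations and the rest of the system.

```shell
# Sketch only: heap_share_pct is a hypothetical helper, not a Logstash tool.
# Prints the configured heap as an integer percentage of physical RAM.
heap_share_pct() {
  heap_mb=$1
  ram_mb=$2
  echo $(( heap_mb * 100 / ram_mb ))
}

# Values from the report: LS_HEAP_SIZE=6809m on a host with 7566 MB of RAM.
heap_share_pct 6809 7566
```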

@suyograo
Contributor

suyograo commented Mar 3, 2016

Thanks for the report. We need a heap dump and config information to start with.

@wyardley
Author

wyardley commented Mar 4, 2016

Most of the config (the /etc/sysconfig/logstash) is above.

The various config files in /etc/logstash/conf.d (catted together) are:

input {
  tcp {
    port => 5544
    type => syslog
  }
  udp {
    port => 5544
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => ["XXXXXX:9200"]
    sniffing => true
  }
}

How can I get the heap dump? As you can see above, $LS_OPTS already seems to have a '-w' flag.

@purbon
Contributor

purbon commented Mar 7, 2016

There are several ways to generate the heap dump. For example, you could use the jvisualvm tool (https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jvisualvm.html) provided with the JVM, or commands like jmap and/or jconsole; see the Java documentation for specifics.
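
As one illustration of the jmap route, a minimal sketch follows. It assumes a JDK with `jmap` on the PATH and that the command runs as the same user as the Logstash JVM; the helper name `dump_heap` and the output path are hypothetical.

```shell
# Sketch only: assumes a JDK with jmap on the PATH and that the command is
# run as the same user as the Logstash JVM. dump_heap is a hypothetical helper.
dump_heap() {
  pid=$1
  out=$2
  # -dump:format=b writes a binary (hprof) heap dump to the given file.
  jmap -dump:format=b,file="$out" "$pid"
}

# Example (not executed here): find the JVM by command line, then dump it.
# pid=$(pgrep -f logstash | head -n 1)
# dump_heap "$pid" /tmp/logstash-heap.hprof
```

Note that a dump taken this way needs a live process, so it has to be captured before the JVM dies.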

@wyardley
Author

I will use the approach outlined here: https://discuss.elastic.co/t/how-to-capture-a-heap-dump-from-a-running-jvm-logstash/85
I'm running this on our logstash machines now; I'll see if we get a dump if / when they die again.

Is there a chance of any sensitive information being in the heap dump (and if so, is there a way to get this to you all out of band)?

@wyardley
Author

btw, it looks like Logstash is saving a heapdump on the OOM crash, in /opt/logstash. The ones I have are a little bit old, so I'm moving them out of the way, and will try to collect a fresher one. Using the approach outlined in the previous link doesn't seem to work as far as collecting from a process that crashes.
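
Moving older dumps out of the way, as described above, could look like the following sketch. The `.hprof` extension, the helper name `archive_heap_dumps`, and the destination directory are assumptions; only `/opt/logstash` comes from the comment.

```shell
# Sketch only: archive_heap_dumps is a hypothetical helper. It moves any
# *.hprof files found directly in src_dir into dest_dir, so that a dump
# written by a later crash is clearly the fresh one.
archive_heap_dumps() {
  src_dir=$1
  dest_dir=$2
  mkdir -p "$dest_dir"
  find "$src_dir" -maxdepth 1 -type f -name '*.hprof' -exec mv {} "$dest_dir"/ \;
}

# Example (not executed here), using the directory mentioned above:
# archive_heap_dumps /opt/logstash /var/tmp/old-heapdumps
```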

@jsvd
Member

jsvd commented Mar 14, 2016

This seems to be related to logstash-plugins/logstash-output-elasticsearch#392 and cheald/manticore#45

@suyograo
Contributor

This has been fixed in ES output version 2.5.3.

To install this you can do:

bin/plugin install --version 2.5.3 logstash-output-elasticsearch
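
To decide whether a given install already carries the fix, the installed plugin version can be compared against 2.5.3. The sketch below shows only the comparison; `version_ge` is a hypothetical helper that relies on `sort -V` from GNU coreutils, and how you obtain the installed version is left open.

```shell
# Sketch only: version_ge is a hypothetical helper; it relies on `sort -V`
# (version sort, available in GNU coreutils).
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 2.5.3 is the fixed release named above; 2.4.1 is an example older version.
for v in 2.4.1 2.5.3; do
  if version_ge "$v" 2.5.3; then
    echo "$v: has the fix"
  else
    echo "$v: needs updating"
  fi
done
```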

@wyardley
Author

Will this be backported to the packages for 2.1.x (e.g., 2.1.3), or only hit 2.2.x?

@jsvd
Member

jsvd commented Mar 15, 2016

@wyardley this is a plugin fix; you just need to update the plugin to 2.5.3. Since other plugins depend on manticore, you need to update all plugins with the same dependencies:

joaoduarte@Joaos-MBP /tmp % tar -zxf logstash-2.1.1.tar.gz
joaoduarte@Joaos-MBP /tmp % cd logstash-2.1.1
joaoduarte@Joaos-MBP /tmp/logstash-2.1.1 % bin/plugin update logstash-output-http  logstash-output-elasticsearch
Updating logstash-output-http, logstash-output-elasticsearch
Error Bundler::InstallError, retrying 1/10
An error occurred while installing manticore (0.5.5), and Bundler cannot continue.
Make sure that `gem install manticore -v '0.5.5'` succeeds before bundling.
WARNING: SSLSocket#session= is not supported
Updated logstash-output-elasticsearch 2.2.0 to 2.5.3
Updated logstash-output-http 2.0.5 to 2.1.1

don't mind the extra error, the update will work. I'm investigating why this is happening

@myth

myth commented Mar 15, 2016

Just ran the bin/plugin update logstash-output-http logstash-output-elasticsearch command, with identical output to yours. But now all my filebeat instances are reporting: /usr/bin/filebeat[3347]: transport.go:125: SSL client failed to connect with: dial tcp 127.0.0.1:5044: getsockopt: connection refused. Any ideas on where to dig out more info on this?

@jsvd
Member

jsvd commented Mar 15, 2016

any logging messages on the logstash side? is the logstash process running?

@myth

myth commented Mar 15, 2016

{:timestamp=>"2016-03-15T15:10:25.631000+0100", :message=>"The error reported is: \n \n\n\tyou might need to reinstall the gem which depends on the missing jar or in case there is Jars.lock then resolve the jars with `lock_jars` command\n\nno such file to load -- org/apache/httpcomponents/httpcore/4.4.1/httpcore-4.4.1 (LoadError)"}

That's the new content of logstash.log. logstash.err is empty for today. Got a medium urgency update to filebeat now through apt, gonna try that now and see if it might be related.

Update: My systemctl was taking some time to display logstash as active:exited. Logstash process will not start.

@jsvd
Member

jsvd commented Mar 15, 2016

@myth which version of logstash was this? I tested with 2.1.1, but there might be other versions that trigger this

@myth

myth commented Mar 15, 2016

2.2.2

@jsvd
Member

jsvd commented Mar 15, 2016

strange..please post the config. my experiment did not trigger that:

/tmp % tar -zxf logstash-2.2.2.tar.gz
/tmp % cd logstash-2.2.2
/tmp/logstash-2.2.2 % vim esoutput
/tmp/logstash-2.2.2 % cat esoutput
input {
  generator { }
}

output {
  elasticsearch {
    hosts => ["localhost:9200", "localhost:9201"]
    sniffing => true
  }
}
/tmp/logstash-2.2.2 % bin/plugin update logstash-output-http  logstash-output-elasticsearch
Updating logstash-output-http, logstash-output-elasticsearch
Error Bundler::InstallError, retrying 1/10
An error occurred while installing manticore (0.5.5), and Bundler cannot continue.
Make sure that `gem install manticore -v '0.5.5'` succeeds before bundling.
WARNING: SSLSocket#session= is not supported
Updated logstash-output-elasticsearch 2.5.1 to 2.5.3
/tmp/logstash-2.2.2 % bin/logstash -f esoutput
Settings: Default pipeline workers: 4
Logstash startup completed
^CSIGINT received. Shutting down the pipeline. {:level=>:warn}
Logstash shutdown completed
/tmp/logstash-2.2.2 %

@myth

myth commented Mar 15, 2016

I was going to try to re-create the steps to provide you with more details, and started by reinstalling logstash from the package repository and updating the plugins. The plugins were already updated (I did not purge), and that seemed to resolve the issue.

I can try to fire up a similar setup on another VM and see if I can re-create the steps using the config I have.

@jsvd
Member

jsvd commented Mar 15, 2016

ok. just for the sake of tidiness, since it's no longer the original concern of this issue, please open another if you're able to replicate this, thanks.

@myth

myth commented Mar 15, 2016

Sure, will do. I've been tracking these OOM issues and related issues in other plugins, and I appreciate the effort. So thanks, and keep up the good work.

@wyardley
Author

@jsvd understood, but we use the RPM packages; the jruby libraries (including the stock plugins) are vendored into the package. So I'm asking if the next iteration of the package will have those updated.
The logstash 2.1 package (now 2.1.3-1) still seems to have output plugin 2.4.1 and manticore 0.5.2.
