memory leak or other allocation problems (Logstash #4745)

wyardley · 2016-03-01T22:57:04Z

On Logstash 2.1.2, installed from the "official" RPM package on CentOS 7 with OpenJDK v1.8.0.71, I'm continuing to see memory allocation grow over time, which suggests a memory leak to me (though apparently not the same issue as #3722).

Error: Your application used more memory than the safety cap of 3783M.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

and

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000618e00000, 132513792, 0) failed; error='Cannot allocate memory' (errno=12)

# egrep -v "^(#|$)" /etc/sysconfig/logstash 
LS_OPTS=" -w 8 "
LS_HEAP_SIZE="6809m"
LS_CONF_DIR=/etc/logstash/conf.d
LS_OPEN_FILES=65535
KILL_ON_STOP_TIMEOUT=0

The current system has 7566 MB of RAM. Let me know what other information I can provide to help diagnose this. I have the same behavior on 3 systems that I'm using as part of an ELK install (the logstash nodes are dedicated to logstash).

suyograo · 2016-03-03T01:58:17Z

Thanks for the report. To start with, we need a heap dump and your config.

wyardley · 2016-03-04T23:14:52Z

Most of the config (the /etc/sysconfig/logstash) is above.

The various config files in /etc/logstash/conf.d (catted together) are:

input {
  tcp {
    port => 5544
    type => syslog
  }
  udp {
    port => 5544
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => ["XXXXXX:9200"]
    sniffing => true
  }
}

How can I get the heap dump? As you can see above, $LS_OPTS already seems to have a '-w' flag.

purbon · 2016-03-07T14:03:37Z

There are several ways to generate a heap dump. For example, you could use the jvisualvm tool (https://docs.oracle.com/javase/8/docs/technotes/tools/unix/jvisualvm.html) that ships with the JDK, or commands like jmap and/or jconsole; see the Java documentation for specifics.
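
As a rough sketch of the jmap route mentioned above: the helper below only assembles the command, so it can be reviewed before anything is run against a live process. The function name heap_dump_cmd and the default output path are illustrative assumptions and not something from this thread; -dump:format=b,file=... is the standard JDK 8 jmap option for writing a binary heap dump.

```shell
#!/bin/sh
# Hypothetical helper (name and default path are assumptions): print the
# jmap command that would write a binary heap dump for the given JVM pid.
heap_dump_cmd() {
  pid="$1"
  out="${2:-/tmp/logstash-heap.hprof}"
  printf 'jmap -dump:format=b,file=%s %s\n' "$out" "$pid"
}

# Example: print the command for pid 1234 without executing it.
heap_dump_cmd 1234 /tmp/logstash-heap.hprof
```

The printed command generally needs to run as the same user that owns the Logstash JVM. To have the JVM write a dump by itself when it runs out of memory, HotSpot also accepts -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath=<dir>; how those flags are passed to Logstash depends on the packaging.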

wyardley · 2016-03-10T18:23:08Z

I will use the approach outlined here: https://discuss.elastic.co/t/how-to-capture-a-heap-dump-from-a-running-jvm-logstash/85
I'm running this on our logstash machines now; I'll see if we get a dump if / when they die again.

Is there a chance of any sensitive information being in the heap dump (and if so, is there a way to get this to you all out of band)?

wyardley · 2016-03-14T18:28:18Z

btw, it looks like Logstash is saving a heapdump on the OOM crash, in /opt/logstash. The ones I have are a little bit old, so I'm moving them out of the way, and will try to collect a fresher one. Using the approach outlined in the previous link doesn't seem to work as far as collecting from a process that crashes.

jsvd · 2016-03-14T22:38:43Z

This seems to be related to logstash-plugins/logstash-output-elasticsearch#392 and cheald/manticore#45

suyograo · 2016-03-15T04:31:50Z

This has been fixed in version 2.5.3 of the ES output plugin.

To install this you can do:

bin/plugin install --version 2.5.3 logstash-output-elasticsearch

wyardley · 2016-03-15T05:55:08Z

Will this be backported to the packages for 2.1.x (e.g., 2.1.3), or only hit 2.2.x?

jsvd · 2016-03-15T10:22:41Z

@wyardley this is a plugin fix, you just need to update the plugin to 2.5.3. Since other plugins depend on manticore, you need to update all plugins with the same dependencies:

joaoduarte@Joaos-MBP /tmp % tar -zxf logstash-2.1.1.tar.gz
joaoduarte@Joaos-MBP /tmp % cd logstash-2.1.1
joaoduarte@Joaos-MBP /tmp/logstash-2.1.1 % bin/plugin update logstash-output-http  logstash-output-elasticsearch
Updating logstash-output-http, logstash-output-elasticsearch
Error Bundler::InstallError, retrying 1/10
An error occurred while installing manticore (0.5.5), and Bundler cannot continue.
Make sure that `gem install manticore -v '0.5.5'` succeeds before bundling.
WARNING: SSLSocket#session= is not supported
Updated logstash-output-elasticsearch 2.2.0 to 2.5.3
Updated logstash-output-http 2.0.5 to 2.1.1

Don't mind the extra error; the update will still work. I'm investigating why this is happening.
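
One way to double-check the result of an update like the one above is to compare the installed plugin version against 2.5.3, the version reported earlier in this thread as containing the fix. This is only a sketch: the helper name version_ge is hypothetical, it relies on sort -V from GNU coreutils, and the installed version is hard-coded here, whereas in practice it would be read from the plugin listing (bin/plugin list --verbose on Logstash 2.x, as far as I know).

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is >= version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example using versions quoted in this thread; the installed version is
# hard-coded for illustration rather than read from the plugin listing.
installed="2.2.0"
required="2.5.3"
if version_ge "$installed" "$required"; then
  echo "logstash-output-elasticsearch $installed is at or above $required"
else
  echo "logstash-output-elasticsearch $installed predates $required"
fi
```

Version sort is used instead of a plain string comparison so that a version such as 2.10.0 is treated as newer than 2.5.3.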

myth · 2016-03-15T13:30:27Z

Just ran the bin/plugin update logstash-output-http logstash-output-elasticsearch command, with output identical to yours. But now all my filebeat instances are reporting: /usr/bin/filebeat[3347]: transport.go:125: SSL client failed to connect with: dial tcp 127.0.0.1:5044: getsockopt: connection refused. Any ideas on where to dig out more info on this?

jsvd · 2016-03-15T13:48:46Z

any logging messages on the logstash side? is the logstash process running?

myth · 2016-03-15T14:14:21Z

{:timestamp=>"2016-03-15T15:10:25.631000+0100", :message=>"The error reported is: \n \n\n\tyou might need to reinstall the gem which depends on the missing jar or in case there is Jars.lock then resolve the jars with `lock_jars` command\n\nno such file to load -- org/apache/httpcomponents/httpcore/4.4.1/httpcore-4.4.1 (LoadError)"}

That's the new content of logstash.log. logstash.err is empty for today. I just got a medium-urgency update to filebeat through apt; I'm going to try that now and see if it might be related.

Update: systemctl took some time to display logstash as active:exited. The Logstash process will not start.

jsvd · 2016-03-15T15:10:04Z

@myth which version of logstash was this? I tested with 2.1.1, but there might be other versions that trigger this

myth · 2016-03-15T15:11:08Z

2.2.2

jsvd · 2016-03-15T15:16:33Z

strange..please post the config. my experiment did not trigger that:

/tmp % tar -zxf logstash-2.2.2.tar.gz
/tmp % cd logstash-2.2.2
/tmp/logstash-2.2.2 % vim esoutput
/tmp/logstash-2.2.2 % cat esoutput
input {
  generator { }
}

output {
  elasticsearch {
    hosts => ["localhost:9200", "localhost:9201"]
    sniffing => true
  }
}
/tmp/logstash-2.2.2 % bin/plugin update logstash-output-http  logstash-output-elasticsearch
Updating logstash-output-http, logstash-output-elasticsearch
Error Bundler::InstallError, retrying 1/10
An error occurred while installing manticore (0.5.5), and Bundler cannot continue.
Make sure that `gem install manticore -v '0.5.5'` succeeds before bundling.
WARNING: SSLSocket#session= is not supported
Updated logstash-output-elasticsearch 2.5.1 to 2.5.3
/tmp/logstash-2.2.2 % bin/logstash -f esoutput
Settings: Default pipeline workers: 4
Logstash startup completed
^CSIGINT received. Shutting down the pipeline. {:level=>:warn}
Logstash shutdown completed
/tmp/logstash-2.2.2 %

myth · 2016-03-15T15:32:26Z

I was going to try to re-create the steps to provide you with more details. I started by reinstalling logstash from the package repository and updating the plugins, but the plugins were already updated (I did not purge), and that seemed to resolve the issue.

I can try to fire up a similar setup on another VM and see if i can re-create the steps using the config I have.

jsvd · 2016-03-15T15:38:17Z

ok. just for the sake of tidiness, since it's no longer the original concern of this issue, please open another if you're able to replicate this, thanks.

myth · 2016-03-15T15:40:30Z

Sure, will do. I've been tracking these OOM issues and related issues in other plugins, and I appreciate the effort. So thanks and keep up the good work.

wyardley · 2016-03-15T16:35:00Z

@jsvd understood, but we use the RPM packages; the jruby libraries (including the stock plugins) are vendored into the package. So I'm asking if the next iteration of the package will have those updated.
The logstash 2.1 package (now 2.1.3-1) still seems to have output plugin 2.4.1 and manticore 0.5.2.

suyograo added the needs details label Mar 3, 2016

suyograo closed this as completed Mar 15, 2016

jsvd mentioned this issue Mar 16, 2016

Upgrading plugins with manticore throws an error and sometimes corrupts installation #4818

Closed
