TCP input miss data #51

Closed
kirillkrainov opened this issue Apr 18, 2016 · 6 comments

kirillkrainov commented Apr 18, 2016

I'm using the tcp input to accept JSON logs from other services. There is a bug: if you leave the -b parameter at its default and try to send 40-50k JSON logs to the socket without any pauses, you will get a few invalid JSON documents and a JSON parse error in Logstash. It looks like the tcp input doesn't finish reading the whole line up to the \n and instead concatenates it with the next line, up to the following \n.
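
For context, a minimal sketch of the kind of sender that triggers this, assuming newline-delimited JSON over a plain TCP socket; the host, port, and event count are placeholders, not values from the report:

    require "socket"
    require "json"

    # Hypothetical reproduction client: push many newline-delimited JSON
    # events at the tcp input as fast as possible, with no pauses between writes.
    socket = TCPSocket.new("localhost", 5000)
    50_000.times do |i|
      socket.write(JSON.generate("seq" => i, "message" => "event #{i}") + "\n")
    end
    socket.close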

@jordansissel
Contributor

I don't know what this means: parameter -b default - can you explain?

Can you attach a sample pcap or data set?

@kirillkrainov
Author

It means that the batch size for Logstash workers is left at its default, which is 125 events or so. When I increased the batch size, the bug disappeared.
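
For reference, the worker batch size can be raised on the Logstash command line with -b; the value and config path below are only examples:

    bin/logstash -f /etc/logstash/conf.d/pipeline.conf -b 500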

@andrewvc added the P4 label May 17, 2016
@feichaohao

How can this problem be solved?

@original-brownbear self-assigned this Jun 20, 2017
@original-brownbear
Contributor

@jordansissel hmmmmm :) I tried implementing the plugin in a single-threaded, non-blocking way this morning, but this one is a huge blocker and I'm not sure how to deal with it.

The problem here is this (in the current implementation):

We decode events from the incoming stream like so:

  • a blocking read of at most 16384 bytes
  • run the buffer through the codec, which then decodes a single event from it
  • read again into a fresh buffer (a rough sketch of this loop follows below)
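
A rough sketch of that loop, purely illustrative (connection, codec, and output_queue are stand-in names, not the plugin's actual objects):

    # Illustrative only - mirrors the read/decode cycle described above.
    loop do
      data = connection.readpartial(16384)   # blocking read of at most 16384 bytes
      codec.decode(data) do |event|          # decode whatever happens to be in this buffer
        output_queue << event
      end
      # the next iteration reads into a fresh buffer; nothing is carried over
    end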

To me, it looks like this approach is just working by accident at the moment, mainly because we do blocking reads. Of course, if we block and clients are sending events one by one, all is well, as long as the connection is super stable and you have enough cores so that you "never miss a read".
But in general, it looks like we're broken if:

  • a client writes more than one event in one packet (Nagle's algorithm being active on the client, i.e. TCP_NODELAY disabled, should cause this a lot); this code in the tests also demonstrates the issue:
      socket.puts("PROXY TCP4 1.2.3.4 5.6.7.8 1234 5678\r");
      socket.flush
      event_count.times do |i|
        # unicode smiley for testing unicode support!
        socket.puts("#{i} ☹")
        socket.flush

(constantly forcing a flush like this is necessary)

  • we get a partial read (I could see this happening easily if the client is using a buffered output stream to serialize JSON into directly)
  • an event is larger than 16384 bytes (not that likely I guess)

Do you have any guidance on how to approach this? To me, it feels like I need to know the length of an event so that I can feed it to our codecs if I want to fix the partial-read case. The multiple-events-in-a-single-read case is easy: I can just carry over the buffered bytes after decoding. But how do I detect that the beginning of my buffer does not contain a full event but only the start of one?
With blocking I/O I could see a straightforward fix by simply running the codec on an IO backed by the InputStream directly, in a blocking way.
But with NIO the situation seems a lot trickier without any sort of content-length header. So far the only approach I can see is buffering data up to a length that I expect to contain at least one event, then simply trying to decode from that data, catching exceptions, and subsequently either buffering more or compacting my buffer (if I decoded at least one event from it) and starting over. (This would be really horrible given JRuby's exception-handling performance.)
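
A self-contained toy version of that last idea - accumulate bytes, attempt a decode, rescue the failure, and buffer more - where the chunks array merely stands in for successive reads off the channel; none of this is the plugin's code:

    require "json"

    # Toy illustration of "buffer, try to decode, rescue, buffer more".
    chunks = ['{"a":', '1}', '{"b":2}']   # pretend non-blocking reads

    buffer = ""
    events = []
    chunks.each do |chunk|
      buffer << chunk
      begin
        events << JSON.parse(buffer)
        buffer = ""                        # got a full event, start over
      rescue JSON::ParserError
        # not a complete event yet - keep buffering and retry on the next read
      end
    end
    p events   # => [{"a"=>1}, {"b"=>2}]

Even in this toy form it shows why the exception path is worrying: every partial read costs a raise/rescue.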

@colinsurprenant
Contributor

This seems like something milling would solve? See elastic/logstash/issues/4858 and elastic/logstash/issues/5124.

But here, wouldn't a codec such as json_lines, which relies on BufferedTokenizer to buffer data and decode upon finding delimiters, just work?
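
For illustration, a toy tokenizer that mirrors what a BufferedTokenizer-backed codec like json_lines does - emit only delimiter-terminated tokens and carry the partial tail over to the next read. The real class lives in Logstash; this sketch only copies the idea:

    require "json"

    # Minimal delimiter-based tokenizer: split on "\n", keep the trailing
    # partial token buffered until more data arrives.
    class TinyTokenizer
      def initialize(delimiter = "\n")
        @delimiter = delimiter
        @carry = ""
      end

      def extract(data)
        parts = (@carry + data).split(@delimiter, -1)
        @carry = parts.pop   # incomplete tail, or "" if data ended on a delimiter
        parts
      end
    end

    tok = TinyTokenizer.new
    events = []
    ['{"a":1}' + "\n" + '{"b"', ':2}' + "\n"].each do |read|
      tok.extract(read).each { |line| events << JSON.parse(line) }
    end
    p events   # => [{"a"=>1}, {"b"=>2}]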

@original-brownbear
Contributor

@colinsurprenant This one was actually solved in the end with Jordan's help. That was just me not fully understanding how buffers work back then, plus there being an actual bug in the old implementation. It's fixed by the move to Netty/Java now :)
