TCP input miss data #51

Closed
kirillkrainov opened this issue Apr 18, 2016 · 6 comments

kirillkrainov commented Apr 18, 2016

I'm using the tcp input to accept JSON logs from other services. There is a bug: if you leave the -b parameter at its default and try to send 40-50k JSON logs to the socket without any pauses, you will get a few invalid JSON documents and a JSON parse error in Logstash. It looks like the tcp input doesn't finish reading the whole line up to the \n and instead concatenates it with the next line, up to the following \n.
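
For context, a minimal sketch of the kind of sender that triggers this, assuming newline-delimited JSON over a plain TCP socket; the host, port, and event count are placeholders, not values from the report:

    require "socket"
    require "json"

    # Hypothetical reproduction client: push many newline-delimited JSON
    # events at the tcp input as fast as possible, with no pauses between writes.
    socket = TCPSocket.new("localhost", 5000)
    50_000.times do |i|
      socket.write(JSON.generate("seq" => i, "message" => "event #{i}") + "\n")
    end
    socket.close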

@jordansissel
Contributor

I don't know what this means: parameter -b default - can you explain?

Can you attach a sample pcap or data set?

@kirillkrainov
Author

It means that the batch size for Logstash workers is left at its default, which is 125 events or so. When I increased the batch size, the bug disappeared.
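
For reference, the worker batch size can be raised on the Logstash command line with -b; the value and config path below are only examples:

    bin/logstash -f /etc/logstash/conf.d/pipeline.conf -b 500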

@andrewvc added the P4 label May 17, 2016
@feichaohao

How can this problem be solved?

@original-brownbear self-assigned this Jun 20, 2017
@original-brownbear
Contributor

@jordansissel hmmmmm :) I tried implementing the plugin in a single-threaded, non-blocking way this morning, but this one is a huge blocker and I'm not sure how to deal with it.

The problem here is this (in the current implementation):

We decode events from the incoming stream like so:

  • a blocking read of at most 16384 bytes
  • run the buffer through the codec, which then decodes a single event from it
  • read again into a fresh buffer (a rough sketch of this loop follows below)
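
A rough sketch of that loop, purely illustrative (connection, codec, and output_queue are stand-in names, not the plugin's actual objects):

    # Illustrative only - mirrors the read/decode cycle described above.
    loop do
      data = connection.readpartial(16384)   # blocking read of at most 16384 bytes
      codec.decode(data) do |event|          # decode whatever happens to be in this buffer
        output_queue << event
      end
      # the next iteration reads into a fresh buffer; nothing is carried over
    end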

To me, it looks like this approach is just working by accident at the moment, mainly because we do blocking reads. Of course, if we block and clients are sending events one by one, all is well, as long as the connection is super stable and you have enough cores so that you "never miss a read".
But in general, it looks like we're broken if:

  • a client writes more than one event in one packet (Nagle's algorithm being active on the client, i.e. TCP_NODELAY disabled, should cause this a lot); this code in the tests also demonstrates the issue:
      socket.puts("PROXY TCP4 1.2.3.4 5.6.7.8 1234 5678\r");
      socket.flush
      event_count.times do |i|
        # unicode smiley for testing unicode support!
        socket.puts("#{i} ☹")
        socket.flush

(constantly forcing a flush like this is necessary)

  • we get a partial read (I could see this happening easily if the client is using a buffered output stream to serialize JSON into directly)
  • an event is larger than 16384 bytes (not that likely I guess)

Do you have any guidance on how to approach this? To me, it feels like I need to know the length of an event so that I can feed it to our codecs if I want to fix the partial-read case. The multiple-events-in-a-single-read case is easy: I can just carry over the buffered bytes after decoding. But how do I detect that the beginning of my buffer does not contain a full event but only the start of one?
With blocking I/O I could see a straightforward fix by simply running the codec on an IO backed by the InputStream directly, in a blocking way.
But with NIO the situation seems a lot trickier without any sort of content-length header. So far the only approach I can see is buffering data up to a length that I expect to contain at least one event, then simply trying to decode from that data, catching exceptions, and subsequently either buffering more or compacting my buffer (if I decoded at least one event from it) and starting over. (This would be really horrible given JRuby's exception-handling performance.)
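
A self-contained toy version of that last idea - accumulate bytes, attempt a decode, rescue the failure, and buffer more - where the chunks array merely stands in for successive reads off the channel; none of this is the plugin's code:

    require "json"

    # Toy illustration of "buffer, try to decode, rescue, buffer more".
    chunks = ['{"a":', '1}', '{"b":2}']   # pretend non-blocking reads

    buffer = ""
    events = []
    chunks.each do |chunk|
      buffer << chunk
      begin
        events << JSON.parse(buffer)
        buffer = ""                        # got a full event, start over
      rescue JSON::ParserError
        # not a complete event yet - keep buffering and retry on the next read
      end
    end
    p events   # => [{"a"=>1}, {"b"=>2}]

Even in this toy form it shows why the exception path is worrying: every partial read costs a raise/rescue.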

@colinsurprenant
Contributor

This seems like something milling would solve? See elastic/logstash/issues/4858 and elastic/logstash/issues/5124.

But here, wouldn't a codec such as json_lines, which relies on BufferedTokenizer to buffer data and decode upon finding delimiters, just work?
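
For illustration, a toy tokenizer that mirrors what a BufferedTokenizer-backed codec like json_lines does - emit only delimiter-terminated tokens and carry the partial tail over to the next read. The real class lives in Logstash; this sketch only copies the idea:

    require "json"

    # Minimal delimiter-based tokenizer: split on "\n", keep the trailing
    # partial token buffered until more data arrives.
    class TinyTokenizer
      def initialize(delimiter = "\n")
        @delimiter = delimiter
        @carry = ""
      end

      def extract(data)
        parts = (@carry + data).split(@delimiter, -1)
        @carry = parts.pop   # incomplete tail, or "" if data ended on a delimiter
        parts
      end
    end

    tok = TinyTokenizer.new
    events = []
    ['{"a":1}' + "\n" + '{"b"', ':2}' + "\n"].each do |read|
      tok.extract(read).each { |line| events << JSON.parse(line) }
    end
    p events   # => [{"a"=>1}, {"b"=>2}]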

@original-brownbear
Contributor

@colinsurprenant This one was actually solved in the end with Jordan's help. That was just me not fully understanding how buffers work back then, plus there being an actual bug in the old implementation. It's fixed by the move to Netty/Java now :)
