Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jersey Apache Connector can hang #4321

Closed
jansupol opened this issue Nov 23, 2019 · 18 comments
Closed

Jersey Apache Connector can hang #4321

jansupol opened this issue Nov 23, 2019 · 18 comments
Assignees

Comments

@jansupol
Copy link
Contributor

We had reportings that Jersey Apache Connector can throw an exception:

And for Apache Http Client 4.5.1+, the Apache Connector as it was in Jersey pre 2.29 stopped working since Apache changed the way the connection is closed.

So the Apache Connector has been changed in 2.29, so that it works with Apache Connector 4.5.1+ and MalformedChunkCodingException can be thrown.

However, the Apache Connector can hang now:

The current issue is when migrating from Jersey pre 2.29 to 2.29, the connection hang as the timeout is not set.

@jansupol
Copy link
Contributor Author

One option is to keep behaviour of Jersey 2.28 for Apache HttpClient 4.5 and the behaviour of Jersey 2.29 for Apache HttpClient 4.5.1+. Optionally, this default behaviour can be overridden.
See #4322

Another option is to close the entity stream in an executor service thread with a timeout, but I am worried about performance, then.

@jansupol
Copy link
Contributor Author

@agherardi Any thoughts?

@jansupol jansupol self-assigned this Nov 23, 2019
@agherardi
Copy link
Contributor

I have a question (apologies - I am no longer working with Jersey. so I cannot easily test this myself):

  • If the client consumes the entire entity stream (e.g., by fully parsing a JSON response) then closes the stream, close() returns immediately and the connection is reused - correct? In other words, if the client consumes the entire stream, close() does not hang even if the client doesn't set a read timeout. Correct?

@ok2c
Copy link

ok2c commented Nov 24, 2019

@agherardi Correct.

@agherardi
Copy link
Contributor

Thanks.

Then I can think of the following scenarios that cause close() to hang:

  1. The server is misbehaved and doesn't send the whole response content
  2. The server keeps sending more content by design - e.g., the content is a live video stream
  3. The underlying TCP/IP connection goes into a blackhole due to a network issue

I like the idea of giving the client application a way to specify whether close() should either:

  • Try to complete the request/response cycle gracefully, so that the connection can be reused - with the caveat that close() may be blocked up to the read timeout; or
  • Tear down the connection and return immediately. That's the best way to handle Place holder issue for Pull Request 2 #2 above, regardless of whether or not the client sets a read timeout.

So @jansupol 's approach looks good to me. My only suggestion would be to rename the closing strategies to something that's easier to understand - for instance "graceful close" vs "immediate close".

@jansupol
Copy link
Contributor Author

@agherardi Sorry to hear you no longer use Jersey.

To answer your question, in the case of ChunkedInpuStream, it seems difficult to consume the entity stream, the ChunkedInputStream keeps reading from the underlying SocketInputStream...

I filed an issue to the Apache JIRA.

@jansupol
Copy link
Contributor Author

So as far as I understand it, Apache says SSE (Part of HTML5) violates the HTTP protocol. SSE uses Transfer-Encoding: chunked, which in turn is handled by ChunkedInputStream. The ChunkedInputStream expects the chunk stream to be closed by 0\r\n\r\n, but when closing from the client side (which is possible by SSE) 0\r\n\r\n will never be put to the chunk stream.

@ok2c
Copy link

ok2c commented Nov 25, 2019

So as far as I understand it, Apache says SSE (Part of HTML5) violates the HTTP protocol

@jansupol I am entitled to speak on behalf of the ASF or even Apache HttpComponents but such as my interpretation of the HTTP specification.

@jansupol
Copy link
Contributor Author

@agherardi Thank you for looking into this.

I agree that STREAM_RESPONSE and RESPONSE STREAM are not the best names. Maybe for backward compatibility "PRE_JERSEY_229" vs "JERSEY_229" would be an option. As I understand it, "graceful close" is for the new closing strategy that may hang and "immediate close" is the original way, which throws an exception with Apache Http Client 4.5.1+. Not bad names, I cannot come up with better ones.

@jansupol
Copy link
Contributor Author

@ok2c Thank you for the opinion. SSE is part of the HTML5 standard designed to work on the HTTP 1.1 protocol. As such, it makes sense to me to support it.

@ok2c
Copy link

ok2c commented Nov 25, 2019

@jansupol Being HTML5 specification compliant does not relieve SSE server endpoints from having to respect the HTTP specification requirements with regards to connection persistence and re-use.
If a particular client endpoint does not want or need to keep connections persistent there is nothing stopping it.

@agherardi
Copy link
Contributor

@jansupol up to you, of course, but I'd focus on the behavior - graceful vs immediate close - rather than the version.

With httpclient 4.5.1+ you can implement both graceful and immediate close. Can you implement both with < 4.5.1, possibly eating up the exception? If yes, great. If not, you can throw an exception if the app tries to set a close mode that is not supported.

After doing that, you can decide what the default close mode should be.

On a different note: The choice between graceful and immediate close is independent of the underlying stack - it is not an httpclient "thing". So perhaps the option should be generic rather than Apache-connector specific?

@jansupol
Copy link
Contributor Author

jansupol commented Dec 6, 2019

@ok2c If I read your comment correctly, you recommend not to close the content stream, only the response, the way it is described in the Apache Documentation. The connection is not reusable, but it is not the main issue. The bigger issue I find is that the server side does not get notified that the connection is closed even when the client is closed in that case.

@jansupol
Copy link
Contributor Author

jansupol commented Dec 6, 2019

@agherardi I like the idea of having the closing mechanism generic, but for the other connectors this is not needed to be set, so far this seems to be Apache Connector specific.

As I have dug deeper, there can be multiple options on how the user prefers to close the connection, so I prefer to have an interface that is to handle the close, and the user can provide whatever solution is preferred, based on timeout properties, given request URL, or other options. That way the names of the strategies do not need to be thought about...

@ok2c
Copy link

ok2c commented Dec 6, 2019

The bigger issue I find is that the server side does not get notified that the connection is closed even when the client is closed in that case.

@jansupol Why? The content stream never closes the underlying socket. The socket is closed by the response object if the content stream has not been fully consumed.
I am not sure what makes you think that the server side is not going to get notified.

@jansupol
Copy link
Contributor Author

jansupol commented Dec 6, 2019

@ok2c In the case only a CloseableHttpResponse is closed, it is still possible to write into the Chunk Stream on the Server side in my tests. I found a workaround to call HttpUriRequest#abort() that closes the server side properly. When trying to write to the server chunk stream, an IOException is thrown that the client closed the connection, but this does not happen when only the CloseableHttpClient is closed.

@ok2c
Copy link

ok2c commented Dec 6, 2019

In the case only a CloseableHttpResponse is closed, it is still possible to write into the Chunk Stream on the Server side in my tests.

@jansupol This looks like a server side issue.

@jansupol
Copy link
Contributor Author

jansupol commented Dec 6, 2019

I think the issue is solved by #4338. The ApacheConnectionClosingStrategy is the interface that can be provided by the user. By default, there are two implementations, one that used to work with Apache Client 4.5 (the ImmediateCloseStrategy), one that seems to solve all the issues with Apache Client 4.5.1+ (the GracefulCloseStrategy).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants