-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove premature call to workSocket() in TNonblockingServer #1476
Remove premature call to workSocket() in TNonblockingServer #1476
Conversation
The following tests FAILED: I think the failure in TNonblockingSSLServerTest is related. concurrency_test passes locally for me, so I am guessing that there may be a different problem with it (sporadic failure perhaps?). |
Yes, concurrency_test has sporadic failures - it is a lot better than it used to be; seems to be worse on Visual Studio 10. Perhaps the solution here is to handle EAGAIN properly? |
@jeking3 My latest fix worked fine, but failed to compile on windows. I'll find a variation that works for all platforms. |
@jeking3 This is ready for review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use peek() which is already well-soaked and should provide the same answer?
@@ -78,6 +78,7 @@ class TSSLSocket : public TSocket { | |||
bool peek(); | |||
void open(); | |||
void close(); | |||
bool hasPendingDataToRead(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this what peek() is for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Peek is designed to be blocking for blocking sockets. And for non-blocking ones, it throws an exception if at least 1 byte is not available after a few retries. It may be well-soaked, but not well-designed IMO. It implements MSG_PEEK semantics and not MSG_PEEK | MSG_DONT_WAIT. Not to mention its doxygen being out of touch with its implementation. MSG_PEEK means peak at the data, but do not remove it from the buffer. In order to peek, it waits for the data to arrive. That is very different than what hasPendingDataToRead does.
I can replace the existing peek() with this new implementation if you prefer that. However, I don't know if there will be any implications. It will certainly be a behavior change and my guess is that it will break things. I can test and see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be preferable to have one way to ask, "is there any data I can read" that does not block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the cross test suite currently does not test asynchronous. It is more of a protocol test; it would be interesting to add one more matrix choice of threaded vs. nonblocking for the languages that support it, and test both against both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeking3 You said it would be preferable to have one way to ask, "is there any data I can read" that does not block. I completely agree. But the existing peek
method is not designed to be non-blocking. So I have a few choices here:
- Update the doxygen comment of
peek
to reflect what it does and keephasPendingDataToRead
- Add an optional argument to
peek
that saysnonBlocking=false
and if it is provided astrue
, do whathasPendingDataToRead
used to do and removehasPendingDataToRead
. - Change
peek
to be always non-blocking.
How about #2 above? I think #3 is not easy, as it will require code changes in other places that I am not comfortable with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented option #2. In terms of your matrix choice suggestion: How can we make that happen? Is that something I can help with? I am bothered by the TNonblockingSeverTest
test not catching the problem reported in this bug. The moment we switched to 0.11, it started throwing unexpected exceptions for us, when using Java client against C++ TNonblockingSeverTest
server. This should have been caught by the existing tests. I would like to help with this. So if you have a suggestion for me, please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my comments were wrong; peek blocks until there is something to do when the socket is a blocking socket. I think that if the socket knew it was non-blocking then a call to peek() would behave like a call to hasPendingDataToRead, i.e. it would be a non-blocking call. Perhaps that's a better way to approach it, but I would have to look at the code a little more closely. I think the original intention of my comment was correct, there should be only one way to peek. Calling peek() on a non-blocking socket should not block; calling peek() on a blocking socket should block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same applies to read
. This goes back to my original point. TSocket
should behave differently for blocking vs non-blocking sockets. Look at how TSocket
and TSSLSocket
handle read
s on a non-blocking socket. They throw exceptions on EAGAIN
. Luckily for those of us using TNonBlockingServer
without SSL, it does not ever call read
on a non-SSL socket that is not ready to return data. However, that is not the case for SSL
(as there could be bytes in the socket but not enough to yield application level bytes). In such a case, the TSSLSocket
throws an exception. Read calls in TNonBlockingServer
are retried by catching the exception from TSSLSocket
and searching for the "retry"
string in the exception message. Uggh!
The correct way to fix all this mess is to make TSocket
aware of whether the socket is blocking or non-blocking, and behave accordingly. For instance, it should return that 0 bytes were read when the socket is non-blocking and EAGAIN is received.
My suggesting is to handle the issue of making TSocket
and TSSLSocket
aware of the blocking/non-blocking nature of the socket in a separate Jira issue.
hasPendingDataToRead
is a method that is non-blocking for both blocking and non-blocking sockets and is a valid operation on any type of socket. It corresponds to the MSG_PEEK | MSG_DONT_WAIT
combination on a recv
call, which behaves the same for blocking and non-blocking sockets. I think it has value independent of seek
.
I see now, I may have led you down the wrong path with my comments. So peek() blocks waiting for data the be available to read and the end result of peek is that there is data to read OR the socket disconnected. What you are looking for is a call to see if the socket has data available but does not block. As such, those are different code paths... Perhaps your original code before this last push was correct. I will take another look. |
} | ||
initializeHandshake(); | ||
if (!checkHandshake()) | ||
throw TSSLException("SSL_peek: Handshake is not completed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably say something other than SSL_peek?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
if (!checkHandshake()) | ||
throw TSSLException("SSL_peek: Handshake is not completed"); | ||
// data may be available in SSL buffers (note: SSL_pending does not have a failure mode) | ||
return TSocket::hasPendingDataToRead() || SSL_pending(ssl_) > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should SSL_pending be checked first for efficiency? First check the SSL buffers, then if those are clear then check the socket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Nice catch.
@@ -78,6 +78,7 @@ class TSSLSocket : public TSocket { | |||
bool peek(); | |||
void open(); | |||
void close(); | |||
bool hasPendingDataToRead(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my comments were wrong; peek blocks until there is something to do when the socket is a blocking socket. I think that if the socket knew it was non-blocking then a call to peek() would behave like a call to hasPendingDataToRead, i.e. it would be a non-blocking call. Perhaps that's a better way to approach it, but I would have to look at the code a little more closely. I think the original intention of my comment was correct, there should be only one way to peek. Calling peek() on a non-blocking socket should not block; calling peek() on a blocking socket should block.
This reverts commit 501c440.
Ok, reverted back. |
@jeking3 anything else I can do to get this into a mergable state? |
Squash to one commit, please. |
No description provided.