crash in ssl3_read_bytes/ tls_get_message_header #23650
Does this happen in a client or a server? |
This is on the client: I'll attach the result of |
the output of print *s in gdb. |
If I understand the description of the crash correctly you are experiencing a crash in p = (unsigned char *)s->init_buf->data; From the rest of your description it looks like it is attempting to re-enter "init" to deal with a new handshake message that has arrived after the initial handshake message has completed. It appears we are dealing with TLSv1.3 here (s->version == 772, which is TLSv1.3) - so typically this occurs when a server sends a NewSessionTicket or a KeyUpdate message. After the initial handshake completes we expect the state machine to be in the "MSG_FLOW_FINISHED" state (i.e. We are then supposed to hit this code: Lines 351 to 406 in 19cc035
You can see that, as long as we are in the Subsequently, a bit lower down beyond the end of the block above we change the state in A few possibilities spring to mind as to how this process has gone wrong:
Not sure if there is any debugging you can do to help figure out which of these scenarios is occurring? |
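The version check mentioned above (s->version == 772) can be confirmed by splitting the value into its two wire bytes. This is only a minimal sketch: the helper names and the EXAMPLE_ constant are made up for illustration and are not OpenSSL identifiers, although 0x0304 is the TLSv1.3 protocol version number.

```c
#include <stdio.h>

/* TLS protocol versions are two bytes on the wire: major, then minor.
 * Illustrative constant; it has the same value as OpenSSL's TLS1_3_VERSION. */
#define EXAMPLE_TLS1_3_VERSION 0x0304

/* High byte of the version value. */
static int version_major(int v) { return (v >> 8) & 0xff; }

/* Low byte of the version value. */
static int version_minor(int v) { return v & 0xff; }

/* Non-zero when the value is the TLSv1.3 version number. */
static int is_tls13(int v) { return v == EXAMPLE_TLS1_3_VERSION; }

/* Print a decimal version (as shown by gdb's "print *s") in hex form. */
static void print_version(int v)
{
    printf("%d = 0x%04x (major %d, minor %d)%s\n",
           v, (unsigned int)v, version_major(v), version_minor(v),
           is_tls13(v) ? " TLSv1.3" : "");
}
```

Calling print_version(772) shows the hex form of the value; in gdb, p/x s->version gives the same conversion directly.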
To the first question: yes. Unfortunately this is very hard to reproduce, it happens around once a week during startup. The server was just started before so maybe it was not completely initialized and sends some garbage in this state. The comment says "unexpected handshake message or protocol violation", it could well be some garbage over the line. |
Good news: it looks like I can reproduce it with a restart loop, at least it happened once again. Let me know what you need to be logged (preferably to stderr) or attach a patch and I'll try to reproduce it again. Meanwhile I'll get the source RPMs... |
Is the client possibly multithreaded and calling the SSL functions simultaneously from multiple threads? |
It would also help further investigating if you could build your client against a clean latest OpenSSL release tarball instead of the Red Hat build to see if the problem is still reproducible. |
yes it is multithreaded (the omniORB library). It's supposed to be one thread per connection or locked properly, but of course it could be a bug there as well. Maybe it's best if I put in some logging and try to recreate the crash and then either post it here or close the issue :-) |
You could definitely see bugs like this if SSL_*() functions are called simultaneously against a single SSL object from multiple threads without proper call serialization through locking. |
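The call serialization described above could look roughly like the following sketch. To keep it compilable without OpenSSL it uses a stand-in type (fake_ssl) instead of a real SSL object, and all names here (locked_conn, locked_conn_feed, and so on) are hypothetical. In a real client the bodies of the wrappers would be the SSL_read(), SSL_write() and SSL_pending() calls made on one SSL object.

```c
#include <pthread.h>
#include <stddef.h>

/* Stand-in for an SSL object (NOT OpenSSL's type): shared state that is
 * unsafe to touch from two threads at once. */
typedef struct {
    long bytes_pending;
} fake_ssl;

/* One connection: the object plus the mutex guarding every call on it. */
typedef struct {
    fake_ssl ssl;
    pthread_mutex_t lock;
} locked_conn;

static int locked_conn_init(locked_conn *c)
{
    c->ssl.bytes_pending = 0;
    return pthread_mutex_init(&c->lock, NULL);
}

static void locked_conn_destroy(locked_conn *c)
{
    pthread_mutex_destroy(&c->lock);
}

/* Modifying call; plays the role of a read or write on the connection. */
static void locked_conn_feed(locked_conn *c, long n)
{
    pthread_mutex_lock(&c->lock);
    c->ssl.bytes_pending += n;
    pthread_mutex_unlock(&c->lock);
}

/* Query call; takes the same lock as the modifying calls. */
static long locked_conn_pending(locked_conn *c)
{
    long n;
    pthread_mutex_lock(&c->lock);
    n = c->ssl.bytes_pending;
    pthread_mutex_unlock(&c->lock);
    return n;
}

static void *feed_worker(void *arg)
{
    locked_conn *c = arg;
    for (int i = 0; i < 100000; i++)
        locked_conn_feed(c, 1);
    return NULL;
}

/* Run up to 16 workers against ONE connection; return the final count,
 * or -1 if the mutex could not be initialised. */
static long run_feed_threads(int nthreads)
{
    locked_conn c;
    pthread_t tids[16];
    int started = 0;
    long total;

    if (nthreads > 16)
        nthreads = 16;
    if (locked_conn_init(&c) != 0)
        return -1;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, feed_worker, &c) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    total = locked_conn_pending(&c);
    locked_conn_destroy(&c);
    return total;
}
```

The point of the design is that every call on a given object, including queries that look read-only, goes through the same mutex. One caveat with a single lock per connection: a thread that blocks inside a read while holding the lock will also stall any thread that wants to send, so this pattern fits best with non-blocking I/O or short timeouts.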
I'm still on it. Building a libssl replacement on a Red Hat system is not that easy, with all their patches and special stuff, but I think I finally got it. |
I can now reproduce it with a self-built lib. In the core I find another thread in SSL code, but with a different SSL * handle. This would be allowed, I assume. I'll now try to log the calls to the crucial functions in memory so I can see exactly what happens; maybe the error happened a short time before... |
Yeah different SSL * in different threads should work with no problems. |
I might have found an unprotected bit of code where SSL_pending is called. I suppose this could cause such a problem when called in parallel? |
While investigating further, I found even more such problems. Especially when bidirectional mode is enabled, there seems to be a blocking read hanging around while a send is done. Now I wonder how this actually worked for 10 days under heavy load :-) But I think it's clear now that this is not an OpenSSL problem, but a usage problem. |
The version used is 3.0.7-24, as provided in RHEL 9.
I get this crash from time to time:
it looks like it does some kind of reset in ssl3_read_bytes() here:
but it has been forgotten that init_buf is freed and reset to NULL (added around 2020).