[Bug]: Double Close of File Descriptors by Libwebsockets Causing System Bugs #1807

Closed
2 tasks done
YOSI-yoshidayuji opened this issue Sep 14, 2023 · 8 comments
Labels
bug Something isn't working dependency Related to a dependency pending-action

Comments

@YOSI-yoshidayuji

Please confirm you have already done the following

  • I have searched the repository for related/existing bug reports
  • I have all the details the issue requires

Describe the bug

Applying libwebsockets-leak-pipe-fix.patch causes libwebsockets to close the same file descriptor twice.
This leads to a hard-to-reproduce and sometimes fatal bug in the system.
To resolve the issue, the following libwebsockets patch is also required:
warmcat/libwebsockets@f7aff78
Our organization has already reported this through support, but since it has not yet been resolved, I am also filing it as a bug here.

Expected Behavior

Ensure that libwebsockets does not arbitrarily close file descriptors that are unrelated to itself.
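
One common way to make a second close harmless is to invalidate the stored descriptor as soon as it has been closed. The following is only a minimal sketch of that idiom, not the libwebsockets fix referenced above; the helper name close_once is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of libwebsockets or the SDK):
 * closes *fd at most once and then marks it invalid, so a repeated
 * call cannot hit a descriptor number that has since been reused. */
int close_once(int *fd)
{
    assert(fd != NULL);

    if (*fd < 0)
        return 0;          /* already closed or never opened */

    int ret = close(*fd);
    *fd = -1;              /* invalidate even if close() reported an error */
    return ret;
}
```

Because the owner's copy of the descriptor is set to -1, a later call through the same variable becomes a no-op. This only helps when every close goes through the one variable that owns the descriptor.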

Current Behavior

If the process opens a new descriptor that is assigned the same number between the first close and the second close, that unrelated descriptor is closed by the second close.
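
The hazard can be shown in isolation, without libwebsockets. The sketch below assumes a POSIX system with /dev/null; the function name is made up for illustration. It relies on the POSIX rule that open() returns the lowest-numbered free descriptor, so in a single-threaded run the second open() gets the number that was just released.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Illustration only. Returns 1 if a stale second close() closed an
 * unrelated descriptor, 0 if the number was not reused. */
int stale_close_hits_unrelated_fd(void)
{
    int a = open("/dev/null", O_RDONLY);
    assert(a >= 0);

    close(a);                         /* first, legitimate close */

    /* Unrelated code opens a descriptor and receives the freed number. */
    int b = open("/dev/null", O_RDONLY);
    assert(b >= 0);
    if (b != a) {                     /* number not reused: nothing to show */
        close(b);
        return 0;
    }

    close(a);                         /* second, stale close: succeeds and closes b */

    /* b is now invalid although its owner never closed it. */
    errno = 0;
    return (fcntl(b, F_GETFD) == -1 && errno == EBADF) ? 1 : 0;
}
```

In a multi-threaded program the reuse depends on timing, so the same sequence only occasionally hits a descriptor that is still in use.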

Reproduction Steps

I found the double close in libwebsockets by building closehook.so from the close_hook.c below
and interposing on close() with LD_PRELOAD.

Compile command:
gcc -g -Wall -D_GNU_SOURCE -fPIC -shared -o closehook.so close_hook.c -ldl

Execution of the program:
LD_PRELOAD=/path/to/closehook.so target

-- close_hook.c
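#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

typedef int (*ORIGINAL_CLOSE)(int fd);

/* Interposes on close() and logs every failed call (e.g. EBADF on a double close). */
int close(int fd)
{
    /* Look up the real close() in the next object in the search order. */
    ORIGINAL_CLOSE original_close = (ORIGINAL_CLOSE)dlsym(RTLD_NEXT, "close");
    int ret = original_close(fd);
    if (ret == -1) {
        int saved_errno = errno; /* fprintf may overwrite errno */
        fprintf(stderr, "close error fd=%d errno(%d) %s\n", fd, saved_errno, strerror(saved_errno));
        errno = saved_errno;     /* keep errno intact for the caller */
    }
    return ret;
}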

WebRTC C SDK version being used

1.7.3 (1.8.1 behaves the same)

Compiler and Version used

gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

Operating System and version

Ubuntu 11.4.0

Platform being used

Linux

@YOSI-yoshidayuji YOSI-yoshidayuji added bug Something isn't working needs-triage labels Sep 14, 2023
@niyatim23
Contributor

Hi @YOSI-yoshidayuji, thanks for reporting this. We are already tracking this internally as well. We'll notify you once we have a fix out

@disa6302
Contributor

@YOSI-yoshidayuji ,

I appreciate the deep dive you have done! Before we investigate including this as a patch in the SDK, can you clarify what the fatal bug is? Is it a crash? Was the application run under gdb to confirm the source of the crash? If so, do you have a stack trace you can attach here?

@YOSI-yoshidayuji
Author

My program connects to the signaling server using a token with a fixed validity period.
To extend the connection time, it reconnects to the signaling server with a new token.
The libwebsockets bug became apparent during this disconnect-and-reconnect process.
The issue is rare, occurring about once in every 2000 times.
What happens when it occurs is unpredictable.
Most often, the program crashes on an assert in libwebsockets.
A core dump from such a crash is shown at the end of this comment for reference.

Even when the program does not crash, other problems occur,
such as a sockfd returned by socket() becoming unusable with a "Bad file descriptor" error.
It is unpredictable in which thread, and when, the failure will occur in a program linked with the WebRTC SDK.

--
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:57
57 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0xa1d86420 (LWP 28206))]
(gdb) backtrace
#0 0xb66eb5e0 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:57
#1 0xb66ec92c in __GI_abort () at abort.c:89
#2 0xb66e45ac in __assert_fail_base (fmt=0xb67f80a4 "", assertion=0xb6030aa8 "context->lws_lookup[wsi->desc.sockfd - (0)] == 0",
assertion@entry=0x2 <error: Cannot access memory at address 0x2>, file=0xb6030a2c "/usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/plat/unix/unix-fds.c",
file@entry=0xa1d86420 "\001", line=133,
line@entry=3061809316, function=function@entry=0xb60309f4 <PRETTY_FUNCTION.36885> "insert_wsi") at assert.c:92
#3 0xb66e4668 in __GI___assert_fail (assertion=0x2 <error: Cannot access memory at address 0x2>, file=0xa1d86420 "\001", line=3061809316,
line@entry=133, function=0xb60309f4 <PRETTY_FUNCTION.36885> "insert_wsi") at assert.c:101
#4 0xb6000f90 in insert_wsi (context=0xb67f80a4 , context@entry=
0xb4c90000, wsi=wsi@entry=0x9aa73a00)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/plat/unix/unix-fds.c:132
#5 0xb6014a8c in __insert_wsi_socket_into_fds (context=
0xb4c90000, wsi=wsi@entry=0x9aa73a00)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/pollfd.c:315
#6 0xb601c030 in lws_client_connect_3_connect (wsi=wsi@entry=0x9aa73a00, ads=<optimized out>,
ads@entry=0xb568035c "m-26d02974.kinesisvideo.ap-northeast-1.amazonaws.com", result=<optimized out>, n=16, n@entry=0, opaque=<optimized out>,
opaque@entry=0x0)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/client/connect3.c:363
#7 0xb601b44c in lws_client_connect_2_dnsreq (wsi=0x9aa73a00)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/client/connect2.c:388
#8 0xb601dc54 in lws_header_table_attach (wsi=wsi@entry=0x9aa73a00, autoservice=autoservice@entry=0)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/http/parsers.c:291
#9 0xb6021244 in rops_client_bind_h1 (wsi=0x9aa73a00, i=<optimized out>)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/h1/ops-h1.c:1007
#10 0xb601adf0 in lws_client_connect_via_info (i=i@entry=0xa1d8363c)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/client/connect.c:412
#11 0xb636fcb4 in lwsCompleteSync (pCallInfo=pCallInfo@entry=0xb4cd6000)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/src/source/Signaling/LwsApiCalls.c:647
#12 0xb6371844 in lwsListenerHandler (args=0xb4cd6000)
at /usr/src/debug/webrtc-sdk/1.7.3-r0/git/src/source/Signaling/LwsApiCalls.c:1493
#13 0xb6b58b6c in start_thread (arg=0xa1d86420) at pthread_create.c:311
#14 0xb678e5d0 in ()
at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92
(gdb)

@YOSI-yoshidayuji
Author

As additional information: although the failure itself is rare, the double close of the file descriptor by libwebsockets occurs every time.

@disa6302
Contributor

Got it. Thank you for the stack trace, @YOSI-yoshidayuji. Will take a look at it and get back to you in a few days.

@YOSI-yoshidayuji
Author

YOSI-yoshidayuji commented Nov 21, 2023

I know that lws is planned to be updated to v4.3.2 in "Update lws version #1820", so I tried v4.3.2 locally.
This version does not double-close file descriptors. However, it leaks the file descriptors created for pipes.
I was planning to report this to lws, but it has already been fixed in lws v4.3.3.
If you are going to upgrade lws, I think v4.3.2 is not appropriate and you should upgrade to v4.3.3 instead.

[Fixes leaking fds created by 'pipe()' call #2745]
(warmcat/libwebsockets@e440419)

Thank you and best regards.

@disa6302
Contributor

Thanks @YOSI-yoshidayuji ,

Will look into updating it.

@sirknightj sirknightj added the dependency Related to a dependency label Jan 1, 2024
@disa6302
Contributor

Closing since LWS has been updated.
