Flooding messages causes a crash (LTS_07_2022, any OS, MQTT protocol, Low Level API) #2449
Comments
The LL APIs are not thread safe. I would avoid multithreading when calling these APIs. Consider using the non-LL APIs, as these are thread safe and implement locking in the SDK. Also, for some reason you are calling IoTHubDeviceClient_LL_DoWork in multiple locations. This is very unusual and should not be required. The expected pseudo implementation for LL APIs is as follows:
Samples are located here: |
I had read this advice, both in the documentation and in the iothub examples. On the first point: my only use of a thread is for signal handling, so it has no impact on anything happening in the iothub C SDK (no SDK function is ever called from that thread, and no shared resource is accessed from it). Regarding the several calls to the DoWork function: in the normal path it is called only once per loop, as advised in your pseudo implementation. In error cases, however, I may force extra calls before returning, to attempt to send all pending messages. This code can be reduced to your pseudo implementation, and the problem still persists for me. Flooding messages seems, after some time, to trigger a heap-use-after-free.
Any news?
As stated above, this is probably a reentry issue caused by calling non-locking APIs from another thread. If possible, please provide a C99 non-multithreaded sample that repros this issue. As always, we welcome external contributions to this open source repo!
The above example is C99, non-multithreaded, calls do_work in only one place, and on my machines reproduces the issue in under 10 minutes on average (it sometimes takes a few tries).
Logs uploaded: OPTION_LOG_TRACE enabled on the LL API. Crash1: compiled in debug.
Hi @BillyTheFrog ,
I have modified one of our samples to have both functions above run in the same loop with no delay, and I'm running it under valgrind to verify if there are any crashes. I'll share details if we get any repro.
"I have modified one of our samples to have both functions above run in the same loop with no delay": why not use my sample? Does anything in it look wrong? If so, my sample could be at fault. But if everything seems compliant with both the library and ISO 9899 TC3, then I think it's worth investigating the sample I provided.
Development Machine, OS, Compiler (and Other Relevant Toolchain Info)
Bug reproduced on Arch Linux, Debian, Ubuntu, and Yocto, on both x86 and ARM architectures.
SDK Version (Please Give Commit SHA if Manually Compiling)
LTS_07_2022_Ref02
Protocol
MQTT
Describe the Bug
After a period of intensive sending (between 1 and 2 minutes), a crash occurs. It appears to be a doubly linked list issue in which a recently freed area is accessed.
The crash traces back to IoTHubDeviceClient_LL_DoWork.
I also saw some errors in a thread sanitizer. To monitor this issue I used valgrind and Google's sanitizers (address and thread). See the attached logs below for details.
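For readers unfamiliar with this class of bug, the sketch below shows in generic terms how a doubly linked list node can be freed during traversal without the loop touching freed memory: the next pointer is saved before the current node is released. It is a hypothetical illustration only. The types and function names (`node`, `list`, `remove_even`, and so on) are invented for this example, it is not code from the SDK, and it makes no claim about the actual cause of the crash reported here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical doubly linked list with a circular sentinel head. */
typedef struct node {
    struct node *prev;
    struct node *next;
    int id;
} node;

typedef struct {
    node head; /* sentinel: head.next is the first item, head.prev the last */
} list;

static void list_init(list *l)
{
    l->head.prev = &l->head;
    l->head.next = &l->head;
}

/* Append a node; returns 0 on success, -1 if allocation fails. */
static int list_push_back(list *l, int id)
{
    node *n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->id = id;
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
    return 0;
}

static void unlink_and_free(node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    free(n);
}

/* Remove every node with an even id; returns how many were removed.
 * Reading cur->next AFTER unlink_and_free(cur) would be a use-after-free,
 * so the successor is captured first. */
static size_t remove_even(list *l)
{
    size_t removed = 0;
    node *cur = l->head.next;
    while (cur != &l->head) {
        node *next = cur->next; /* saved before cur may be freed */
        if (cur->id % 2 == 0) {
            unlink_and_free(cur);
            removed++;
        }
        cur = next;
    }
    return removed;
}

static size_t list_count(const list *l)
{
    size_t n = 0;
    for (const node *cur = l->head.next; cur != &l->head; cur = cur->next) {
        n++;
    }
    return n;
}

/* Free all remaining nodes, using the same save-next pattern. */
static void list_clear(list *l)
{
    node *cur = l->head.next;
    while (cur != &l->head) {
        node *next = cur->next;
        unlink_and_free(cur);
        cur = next;
    }
}
```

Built with an address sanitizer, a variant of `remove_even` that reads `cur->next` after the free would typically be reported as a heap-use-after-free, which is the same category of report described in this issue.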
Here is a minimal example that managed to reproduce the issue for me.
I checked the documentation and hopefully I didn't misuse the SDK.
Please replace '[ YOUR CONNECTION STRING GOES HERE ]' with your own connection string.
Once compiled, running the program should reproduce the issue.
To call 'IoTHubDeviceClient_LL_DoWork' less frequently, an argument can be given (arg. #1 of the compiled binary) to call the function every X messages passed to the SDK (putting 5 as first argument will pass 5 messages to the SDK before calling the function).
By default the function is called after every message passed.
The second and third arguments of the binary are also optional; they let you specify the key/value pair in the message sent. The defaults are "Alive" for the key (arg. #2) and "[$count]" for the value (arg. #3), so by default the message contains an increasing value (to keep track, like a message ID).
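As a rough illustration of the calling pattern described above (this is not the original reproducer), the sketch below queues messages in a single-threaded loop and runs the work function once every N messages, with N defaulting to 1. Everything here is an assumption made for the example: `fake_client`, `fake_send_async` and `fake_do_work` are hypothetical stand-ins that only count calls and do not touch the SDK, and the JSON-like payload shape is a guess at what a key/value message might look like.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the SDK client handle: it only keeps counters. */
typedef struct {
    unsigned long queued;        /* messages handed over, not yet processed */
    unsigned long processed;     /* messages drained by fake_do_work */
    unsigned long do_work_calls; /* number of fake_do_work invocations */
} fake_client;

/* Stand-in for an asynchronous send: queues the message, performs no I/O. */
static void fake_send_async(fake_client *c, const char *payload)
{
    (void)payload;
    c->queued++;
}

/* Stand-in for the DoWork call: drains whatever is currently queued. */
static void fake_do_work(fake_client *c)
{
    c->processed += c->queued;
    c->queued = 0;
    c->do_work_calls++;
}

/* Arg #1: messages per DoWork call. Missing or invalid input falls back to 1. */
static unsigned long parse_batch(const char *arg)
{
    if (arg == NULL) {
        return 1;
    }
    char *end = NULL;
    unsigned long v = strtoul(arg, &end, 10);
    return (end == arg || *end != '\0' || v == 0) ? 1 : v;
}

/* Args #2 and #3: key and value. With no value, the running count is used.
 * Returns the snprintf result (length the full payload needs). */
static int build_payload(char *buf, size_t len, const char *key,
                         const char *value, unsigned long count)
{
    if (key == NULL) {
        key = "Alive";
    }
    if (value != NULL) {
        return snprintf(buf, len, "{\"%s\":\"%s\"}", key, value);
    }
    return snprintf(buf, len, "{\"%s\":%lu}", key, count);
}

/* Queue `total` messages, calling the work function once per `batch` messages,
 * then once more at the end if anything is still pending. */
static void flood(fake_client *c, unsigned long total, unsigned long batch,
                  const char *key, const char *value)
{
    assert(batch > 0);
    char payload[128];
    for (unsigned long count = 1; count <= total; count++) {
        build_payload(payload, sizeof payload, key, value, count);
        fake_send_async(c, payload);
        if (count % batch == 0) {
            fake_do_work(c);
        }
    }
    if (c->queued > 0) {
        fake_do_work(c);
    }
}
```

A `main` wrapping this would pass `parse_batch(argc > 1 ? argv[1] : NULL)` as the batch size and forward the optional second and third arguments as key and value. In a real program the two stand-ins would be replaced by the SDK's send and DoWork calls, all made from the same thread.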
Console Logs
Thread sanitizer log:
Memory sanitizer log: