Random timeouts waiting for CONNACK cause process crashes in iothubtransport_mqtt_common module #842
Comments
@RivaStyle How are you building the SDK on the ESP32? I'm looking at recreating this, and I'd like to get as close to your configuration as possible. Thanks. |
These are my CMake flags:
where $(USER_FLAGS) is:
"build_all/linux/toolchain-uwp.cmake" is:
As a side note, I didn't have to specify the C_STANDARD and CXX_STANDARD until 1.2.12 (my update steps have been 1.2.10->1.2.12->1.2.13 in a matter of a couple of weeks). In any case, the issue was also present with 1.2.10. |
(I forgot to mention that "build_serializer" is a custom option I added to skip building the serializer module, as it wasn't needed.) |
@RivaStyle (As a side note, I really like that build_serializer flag; I will try to get the team to incorporate it.) I guess there are two issues here: the wait for CONNACK and the segfault. For the CONNACK, did you lose network connectivity around this time? It's unusual for the IoTHub to not send back a CONNACK unless the device loses its network connection. For the segfault, do you know what communication stack you're using (tlsio_openssl, tlsio_mbedtls, tlsio_wolfssl, ...)? I believe that there is some clue in the issue. Thanks, |
No, when the CONNACK timeout happens I don't lose network connectivity, but it could be some kind of overload, as other heavy processes are running alongside it. I'll try disabling them and check further. Regarding the communication stack, it's tlsio_openssl. Thanks, |
@RivaStyle @jebrando I could reproduce this issue. I suspect the problem is with … I am noting some pain points at line 2331: … I think the RETRY_ACTION_RETRY_LATER retry action is not being handled. When the retry action is … @jebrando, can you confirm whether this is true, and let me know if you would like me to raise a PR to resolve this issue? Thanks. |
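To illustrate the kind of gap described in the comment above, here is a minimal, standalone sketch of handling every retry action after a CONNACK timeout. It is not the SDK's code and not the change made in PR #875: the `CONN_STATE` enum and `next_state_after_timeout` function are hypothetical, and the `RETRY_ACTION` enum is a local stand-in that only borrows the action names mentioned in this thread.

```c
#include <stdio.h>

/* Local stand-in for the retry actions named in the thread (assumption:
 * this is NOT the SDK header, only the same names for illustration). */
typedef enum {
    RETRY_ACTION_RETRY_NOW,
    RETRY_ACTION_RETRY_LATER,
    RETRY_ACTION_STOP_RETRYING
} RETRY_ACTION;

/* Hypothetical connection states for this sketch. */
typedef enum {
    CONN_STATE_CONNECTING,        /* reconnect immediately */
    CONN_STATE_WAITING_TO_RETRY,  /* stay idle until the policy allows a retry */
    CONN_STATE_FAILED             /* give up and report the error */
} CONN_STATE;

/* Decide what to do after a CONNACK timeout. Every action gets an explicit
 * branch, so RETRY_LATER cannot fall through unhandled. */
static CONN_STATE next_state_after_timeout(RETRY_ACTION action)
{
    switch (action) {
    case RETRY_ACTION_RETRY_NOW:
        return CONN_STATE_CONNECTING;
    case RETRY_ACTION_RETRY_LATER:
        return CONN_STATE_WAITING_TO_RETRY;
    case RETRY_ACTION_STOP_RETRYING:
    default:
        return CONN_STATE_FAILED;
    }
}

/* Small helper to print a state name when experimenting with the sketch. */
static const char* conn_state_name(CONN_STATE state)
{
    switch (state) {
    case CONN_STATE_CONNECTING:       return "CONNECTING";
    case CONN_STATE_WAITING_TO_RETRY: return "WAITING_TO_RETRY";
    default:                          return "FAILED";
    }
}
```

Calling `next_state_after_timeout(RETRY_ACTION_RETRY_LATER)` yields the waiting state rather than leaving the transport in an undefined condition, which is the behavior the comment suggests was missing.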
@jitin17 if you wanna check if that can intercept my case, the retry policy I'm using is |
@RivaStyle This is the default retry policy. And this fails because |
…val without the need of changing sleep time. Also, added support for the IOTHUB_CLIENT_RETRY_INTERVAL connection retry policy, which handles internet connection loss and WiFi connection loss. By default, the retry policy is IOTHUB_CLIENT_RETRY_EXPONENTIAL_BACKOFF_WITH_JITTER, but it causes a bug due to the known issue Azure/azure-iot-sdk-c#842. As soon as this bug is resolved, we will make the retry policy setting configurable.
@RivaStyle I have raised a PR #875 to fix this issue. |
Hi @RivaStyle , PR #875 has been merged. |
I confirm that the library, as it was cloned from Git last Friday, definitely fixes the issue. |
Alright, thanks for confirming. |
OS and version used: Ubuntu 16.04
SDK version used: 2019-01-31 (v1.2.13)
Description of the issue:
When connecting to IoT Hub to push data, it randomly happens (after many successful attempts) that the MQTT connection times out at initialization, waiting for CONNACK.
This triggers DisconnectFromClient where the case is:
Despite the isDestroyCalled flag, xio_destroy is still called on transport_data->xioTransport, causing a crash.
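A common defensive pattern against this kind of crash is to clear the handle right after destroying it, so that a second disconnect becomes a no-op. The sketch below is standalone and purely illustrative: `FAKE_XIO`, `FAKE_TRANSPORT`, `fake_xio_destroy`, and `disconnect_from_client` are hypothetical names, not the SDK's types or functions, and this is not necessarily how the issue was fixed upstream.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an I/O handle; not the SDK's XIO_HANDLE. */
typedef struct {
    int socket_fd;
} FAKE_XIO;

/* Hypothetical stand-in for the transport data in the report. */
typedef struct {
    FAKE_XIO* xioTransport;
    int destroyCount;  /* instrumentation for this example only */
} FAKE_TRANSPORT;

/* Releases the handle; calling it twice on the same pointer would be a
 * double free, which is the class of crash the guard below avoids. */
static void fake_xio_destroy(FAKE_XIO* xio)
{
    free(xio);
}

/* Destroy the I/O handle at most once: skip if it is already gone, and
 * clear the pointer immediately after destroying it. */
static void disconnect_from_client(FAKE_TRANSPORT* transport)
{
    if (transport == NULL || transport->xioTransport == NULL) {
        return;  /* nothing to destroy, or already destroyed */
    }
    fake_xio_destroy(transport->xioTransport);
    transport->xioTransport = NULL;
    transport->destroyCount++;
}
```

With this guard, calling `disconnect_from_client` twice on the same transport destroys the handle only once, regardless of which code path triggered the second call.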
Could it be an issue similar to #446?
Code sample exhibiting the issue:
iothub_client_sample_mqtt_esp8266, wrapped in a state machine and set to push a single message at a user-selectable interval (between 5 minutes and 1 hour).
Modifications made:
Console log of the issue: