Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MQTT transport getting into connection retry expired state for incorrect trigger (message retransmission) #2070

Closed
ewertons opened this issue Aug 30, 2021 · 5 comments

Comments

@ewertons
Copy link
Contributor

Development Machine, OS, Compiler (and Other Relevant Toolchain Info)

N/A

SDK Version (Please Give Commit SHA if Manually Compiling)

Commit a88ce6d

Protocol

MQTT

Describe the Bug

The MQTT transport seems to get into "Connection Retry Policy Expired" state from more than one trigger event:

  • From multiple unsuccessful reconnections (expected)
  • From multiple unsuccessful attempts to send a message to the IoT Hub (unexpected).

The second is unexpected because message re-transmission is not the same as connection retry. One might lead to the other, but it should not get the SDK client into a state where it might not attempt to reconnect.

Please investigate.

Possible offending code:

transport_data->transport_callbacks.connection_status_cb(IOTHUB_CLIENT_CONNECTION_UNAUTHENTICATED, IOTHUB_CLIENT_CONNECTION_RETRY_EXPIRED, transport_data->transport_ctx);

The only location that should be raising a connection status callback with reconnection expired is this:

transport_data->transport_callbacks.connection_status_cb(IOTHUB_CLIENT_CONNECTION_UNAUTHENTICATED, IOTHUB_CLIENT_CONNECTION_RETRY_EXPIRED, transport_data->transport_ctx);

@ericwolz
Copy link
Contributor

#2037

@ericwolz
Copy link
Contributor

ericwolz commented Sep 2, 2021

We retry the publish if we timeout waiting for the PUBACK. If that fails again, we treat that as a protocol error and terminate the connection.

Creating IoTHub Device handle
Sending message 1 to IoTHub
Sending message 2 to IoTHub
Sending message 3 to IoTHub
Sending message 4 to IoTHub
Sending message 5 to IoTHub
-> 11:01:23 CONNECT | VER: 4 | KEEPALIVE: 240 | FLAGS: 192 | USERNAME: ericwol-hub.azure-devices.net/snoopy/?api-version=2020-09-30&DeviceClientType=iothubclient%2f1.8.0%20(native%3b%20WindowsProduct%3a0x00000004%206.2%3b%20x64%3b%20%7bF9FA04EF-2602-43AB-8505-A1EDE028ADD8%7d) | PWD: XXXX | CLEAN: 0
<- 11:01:23 CONNACK | SESSION_PRESENT: true | RETURN_CODE: 0x0
connection_status_callback result:0 reason:6
The device client is connected to iothub
-> 11:01:23 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_LEAST_ONCE | TOPIC_NAME: devices/snoopy/messages/events/property_key=property_value | PACKET_ID: 2 | PAYLOAD_LEN: 12
-> 11:01:23 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_LEAST_ONCE | TOPIC_NAME: devices/snoopy/messages/events/property_key=property_value | PACKET_ID: 3 | PAYLOAD_LEN: 12
-> 11:01:23 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_LEAST_ONCE | TOPIC_NAME: devices/snoopy/messages/events/property_key=property_value | PACKET_ID: 4 | PAYLOAD_LEN: 12
-> 11:01:23 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_LEAST_ONCE | TOPIC_NAME: devices/snoopy/messages/events/property_key=property_value | PACKET_ID: 5 | PAYLOAD_LEN: 12
-> 11:01:23 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_LEAST_ONCE | TOPIC_NAME: devices/snoopy/messages/events/property_key=property_value | PACKET_ID: 6 | PAYLOAD_LEN: 12
<- 11:01:29 PUBACK | PACKET_ID: 3
Confirmation callback received for message 1 with result IOTHUB_CLIENT_CONFIRMATION_OK
<- 11:01:29 PUBACK | PACKET_ID: 4
Confirmation callback received for message 2 with result IOTHUB_CLIENT_CONFIRMATION_OK
<- 11:01:29 PUBACK | PACKET_ID: 5
Confirmation callback received for message 3 with result IOTHUB_CLIENT_CONFIRMATION_OK
<- 11:01:29 PUBACK | PACKET_ID: 6
Confirmation callback received for message 4 with result IOTHUB_CLIENT_CONFIRMATION_OK
-> 11:02:24 PUBLISH | IS_DUP: false | RETAIN: 0 | QOS: DELIVER_AT_LEAST_ONCE | TOPIC_NAME: devices/snoopy/messages/events/property_key=property_value | PACKET_ID: 2 | PAYLOAD_LEN: 12
Confirmation callback received for message 5 with result IOTHUB_CLIENT_CONFIRMATION_MESSAGE_TIMEOUT
-> 11:03:25 DISCONNECT
connection_status_callback result:1 reason:3
The device client has been disconnected
-> 11:03:27 CONNECT | VER: 4 | KEEPALIVE: 240 | FLAGS: 192 | USERNAME: ericwol-hub.azure-devices.net/snoopy/?api-version=2020-09-30&DeviceClientType=iothubclient%2f1.8.0%20(native%3%20WindowsProduct%3a0x00000004%206.2%3b%20x64%3b%20%7bF9FA04EF-2602-43AB-8505-A1EDE028ADD8%7d) | PWD: XXXX | CLEAN: 0
<- 11:03:27 CONNACK | SESSION_PRESENT: true | RETURN_CODE: 0x0
connection_status_callback result:0 reason:6
The device client is connected to iothub

@ericwolz
Copy link
Contributor

ericwolz commented Sep 2, 2021

If the retry of PUB timeout due to not receiving a PUBACK, we send the following events:

SendMessageComplete: IOTHUB_CLIENT_CONFIRMATION_MESSAGE_TIMEOUT
ConnectionStatusCallback: IOTHUB_CLIENT_CONNECTION_RETRY_EXPIRED

@ericwolz
Copy link
Contributor

#2078 2078

@ericwolz
Copy link
Contributor

resolved with PR #2078

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants