Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GDKES-01TZXD switch doesn't work after the power failure. #3870

Closed
r-vit opened this issue Jul 8, 2020 · 16 comments · Fixed by Koenkk/zigbee-herdsman#207
Closed

GDKES-01TZXD switch doesn't work after the power failure. #3870

r-vit opened this issue Jul 8, 2020 · 16 comments · Fixed by Koenkk/zigbee-herdsman#207

Comments

@r-vit
Copy link

r-vit commented Jul 8, 2020

Bug Report

What happened

I've got a bunch of GDKES-01TZXD switches. Most of the time they work pretty well, but every time after a power failure they stop responding. When I generate the map of the zigbee network I see that each of these switches get two connections - one to the coordinator with lqi=170 and the other to one of the routers.

It looks like the connections to the coordinator are not valid, so if I try to send a command to the switch, the device select route that doesn't exist.
The error in the log looks like that:

Publish 'set' 'state' to 'LightSwitchOffice' failed: 'Error: Command 0xec1bbdfffe66e506/1 genOnOff.on({}, {"timeout":10000,"disableResponse":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null}) failed (Error: Data request failed with error: 'MAC transaction expired' (240))'

At the same time if I press the switch manually the state in Home Assistant changes and reflects the actual state of the switch

After putting the device to the pairing mode it reconnects and starts working properly. But only till the next power failure.

It looks like when the power resumes but the coordinator hasn't started yet the switches connect to any available routers, but then when the zigbee2mqtt finally starts, the software believes that there's another route to the devices with better lqi (170), so it tries to send the commands that way.

Is there a way to 'expire' those routes?

What did you expect to happen

Devices resume working properly after the power restored

How to reproduce it (minimal and precise)

Debug Info

Zigbee2mqtt version: 1.14.1
Adapter hardware:
CC26X2R1
(it was the same with CC2531 as well)
Adapter firmware version: Coordinator firmware version: '{"type":"zStack3x0","meta":{"transportrev":2,"product":1,"majorrel":2,"minorrel":7,"maintrel":1,"revision":20200417}}'

@Koenkk
Copy link
Owner

Koenkk commented Jul 8, 2020

Can you try replacing /app/zigbee2mqtt/node_modules/zigbee-hedsman/dist/adapter/z-stack/adapter/zStackAdapter.js with https://gist.github.com/Koenkk/38934204fc14239f0ef8e8793d876965 and see if that fixes the issue?

@r-vit
Copy link
Author

r-vit commented Jul 8, 2020

yes, thank you, I will try to do that as soon as possible and will report the results

@r-vit
Copy link
Author

r-vit commented Jul 9, 2020

What I tried today:

  1. stopped zigbee2mqtt daemon
  2. turned off power by the main power switch in the apartment for a few minutes
  3. turned the power on
  4. after a couple of minutes started the zigbee2mqtt daemon

Most of the switches were ok, but two of them (which coincidentally were the closet to the coordinator) showed the same symptoms as before - two routes and an error in the log:
error 2020-07-09 11:02:24: Publish 'set' 'state' to 'LightSwitchBathroom' failed: 'Error: Command 0xec1bbdfffe688588/1 genOnOff.on({}, {"timeout":10000,"disableResponse":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null}) failed (Error: Data request failed with error: 'MAC transaction expired' (240))'

Screenshot_2020-07-09 Node-RED Dashboard

@Koenkk
Copy link
Owner

Koenkk commented Jul 9, 2020

Can you sniff the traffic of a failed command? https://www.zigbee2mqtt.io/how_tos/how_to_sniff_zigbee_traffic.html

@r-vit
Copy link
Author

r-vit commented Jul 9, 2020

When I send a command to any of the switches that are connected normally I see the OnOFF message in the wireshark log; but if I send it to the switches that doesn't respond to commands apparently nothing is being send to them! I'm confused.
Screenshot_20200709_232434

@Koenkk
Copy link
Owner

Koenkk commented Jul 10, 2020

@r-vit When getting the 240 error the commands itself are indeed not send, however I expect the route requests to be send. I've updated https://gist.github.com/Koenkk/38934204fc14239f0ef8e8793d876965, with this can you reproduce the issue again and provide me the herdsman debug logging?

To enable herdsman debug logging, see https://www.zigbee2mqtt.io/information/debug.html#zigbee-herdsman-debug-logging

@r-vit
Copy link
Author

r-vit commented Jul 11, 2020

I tried to remove the lines that didn't look relevant to me from the log, but if you need I'll attach the full log.
This time it was another device that lost connection.

debug 2020-07-11 09:50:21: Received MQTT message on 'zigbee2mqtt/LightSwitchBalcony/set' with data 'ON'
debug 2020-07-11 09:50:21: Publishing 'set' 'state' to 'LightSwitchBalcony'
debug 2020-07-11 09:50:26: Received MQTT message on 'zigbee2mqtt/bridge/config/devices/get' with data '1594450226189'
debug 2020-07-11 09:50:26: Received MQTT message on 'zigbee2mqtt/bridge/config/groups' with data '1594450226191'
debug 2020-07-11 09:50:44: Error: Command 0xec1bbdfffe7bd389/1 genOnOff.on({}, {"timeout":10000,"disableResponse":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null}) failed (Error: Timeout - 36210 - 1 - 4 - 6 - 11 after 10000ms) at Endpoint.<anonymous> (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/dist/controller/model/endpoint.js:365:23) at Generator.throw (<anonymous>) at rejected (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/dist/controller/model/endpoint.js:6:65)
info 2020-07-11 09:50:44: MQTT publish: topic 'zigbee2mqtt/bridge/log', payload '{"type":"zigbee_publish_error","message":"Publish 'set' 'state' to 'LightSwitchBalcony' failed: 'Error: Command 0xec1bbdfffe7bd389/1 genOnOff.on({}, {"timeout":10000,"disableResponse":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null}) failed (Error: Timeout - 36210 - 1 - 4 - 6 - 11 after 10000ms)'","meta":{"friendly_name":"LightSwitchBalcony"}}'

@r-vit
Copy link
Author

r-vit commented Jul 11, 2020

I think the previous log file shows different situation - that switch completely went off the network despite being shown on the network map. I will need to do another experiment.

@r-vit
Copy link
Author

r-vit commented Jul 12, 2020

Screenshot_2020-07-12 Node-RED Dashboard
ok, here's the log file from the actual situation (relevant lines):

debug 2020-07-12 09:25:23: Received MQTT message on 'zigbee2mqtt/LightSwitchEntrance/set' with data 'ON'
debug 2020-07-12 09:25:23: Publishing 'set' 'state' to 'LightSwitchEntrance'
error 2020-07-12 09:26:19: Publish 'set' 'state' to 'LightSwitchEntrance' failed: 'Error: Command 0xec1bbdfffe6884de/1 genOnOff.on({}, {"timeout":10000,"disableResponse":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null}) failed (Error: Data request failed with error: 'MAC transaction expired' (240))'
debug 2020-07-12 09:26:19: Error: Command 0xec1bbdfffe6884de/1 genOnOff.on({}, {"timeout":10000,"disableResponse":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null}) failed (Error: Data request failed with error: 'MAC transaction expired' (240))
    at Endpoint.<anonymous> (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/dist/controller/model/endpoint.js:365:23)
    at Generator.throw (<anonymous>)
    at rejected (/opt/zigbee2mqtt/node_modules/zigbee-herdsman/dist/controller/model/endpoint.js:6:65)
info  2020-07-12 09:26:19: MQTT publish: topic 'zigbee2mqtt/bridge/log', payload '{"type":"zigbee_publish_error","message":"Publish 'set' 'state' to 'LightSwitchEntrance' failed: 'Error: Command 0xec1bbdfffe6884de/1 genOnOff.on({}, {\"timeout\":10000,\"disableResponse\":false,\"disableDefaultResponse\":false,\"direction\":0,\"srcEndpoint\":null,\"reservedBits\":0,\"manufacturerCode\":null,\"transactionSequenceNumber\":null}) failed (Error: Data request failed with error: 'MAC transaction expired' (240))'","meta":{"friendly_name":"LightSwitchEntrance"}}'

@Koenkk
Copy link
Owner

Koenkk commented Jul 12, 2020

This is not the herdsman debug logging, note that the herdsman debug logging is not logged to the log files but only to STDOUT. Also provide me the log from the point where you execute the failing command until 1 minute after this and attach this as a file or post it on pastebin.com.

@r-vit
Copy link
Author

r-vit commented Jul 12, 2020

log.txt
I hope I made it correctly this time. I think the command was issued at 2020-07-12T13:37:05.313Z
herdsman_error_log.txt

@Koenkk
Copy link
Owner

Koenkk commented Jul 14, 2020

Thanks, I cannot find the cause of the issue in the log yet, it seems that there is no response to the route discover request. Could you share me the sniff of the problem? https://www.zigbee2mqtt.io/how_tos/how_to_sniff_zigbee_traffic.html

@r-vit
Copy link
Author

r-vit commented Jul 14, 2020

Sure
What is interesting, with the time most of the wrong routes were dropped, and now there's only one device that doesn't respond
I'm attaching 2 log-files and the Wireshark intercepted traffic
The first command was issued at 2020-07-14T18:12:44.404Z (21:12:44 local time)
I issued it several times just in case

log.txt
herdsman_error_log.txt
dump.zip

Upd. nope, the wrong routes were not dropped

@Koenkk
Copy link
Owner

Koenkk commented Jul 15, 2020

Can you also send me your network key? (otherwise I cannot decrypt the sniff, in case you want to share it privately send it to me on telegram: @Koenkk)

Koenkk added a commit to Koenkk/zigbee-herdsman-converters that referenced this issue Aug 4, 2020
@Koenkk
Copy link
Owner

Koenkk commented Aug 6, 2020

For others: @r-vit and me are testing a possible fix, to test:

cd /app/zigbee2mqtt/node_modules
rm -rf zigbee-herdsman
git clone https://github.com/Koenkk/zigbee-herdsman.git -b fix_enddevice_not_reachable
cd zigbee-herdsman
npm ci
npm run build
  • Start z2m

@Koenkk
Copy link
Owner

Koenkk commented Aug 12, 2020

Fix is now in the dev branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants