-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Request unicast responses for resolving nodes when using minmdns #11635
Request unicast responses for resolving nodes when using minmdns #11635
Conversation
fast track: trivial change |
Did you figure out why unicast responses weren't working on linux with chip-device-ctrl? Also, there's another spot in that file where we set answer via unicast to false. |
Updated the browse-nodes code to also use unicast, however I am unsure how to test that bit. |
Re-tested after changing unicast replies to other times. Always used m5stack, the following work with and without avahi running (to check for conflicts):
watched traffic with wireshark and confirmed replies were unicast. |
PR #11635: Size comparison from a27a311 to fc0bac5 Increases (7 builds for efr32, linux, p6)
Decreases (14 builds for k32w, linux, qpg, telink)
Full report (21 builds for efr32, k32w, linux, p6, qpg, telink)
|
Not sure this is a trivial change given this changes behavior? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't run tests as "root"
Refer to #2688 I don't want to rehash that entire discussion here. We don't need root. |
Please explain the request a bit more. Looked at #2688 and it seemed to request no root, however happy internally does root. Current script does not require root, however for commands it knows need root it will execute sudo unless current user is already root. You can run |
Please read it again and expand the whole discussion. I provided specific commands for running it without root and we have integrated those.
I don't agree with it and it is not necessary. We've had developers spend hours debugging cirque making global network configuration changes on their machine. |
Here's a link to the important part: |
Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test wrapper look changes looks good now.
I do have some concerns about the RFC compliance of the functional change (violates a SHOULD). But I understand the immediate need re: min MDNS.
Should we perhaps add some TODOs here? And note that making repairs to whatever is blocking use of avahi is an another solution.
(andy314): IPv6 does Duplicate Address Detection even thoughconnectedhomeip/scripts/tests/test_suites.sh Lines 91 to 101 in 03f4d6a
This comment was generated by todo based on a
|
PR #11635: Size comparison from 641aacd to 03f4d6a Full report (38 builds for efr32, esp32, k32w, linux, mbed, nrfconnect, p6, qpg, telink)
|
I'm OK with this in practice, but we will need some kind of developer PSA for this - given that chip-device-ctrl is still built as default minimal mdns (not sure about chip-tool), we'll probably end up with folks who are unable to resolve nodes because avahi is eating their packets. It's not that easy to kill either - just stopping the service isn't sufficient. It will come back even if the config isn't set that way. It's the zombie packet eater that won't die. As far as I can tell, you actually need to mask off the socket to stop the service from restarting itself. Not exactly intuitive and we're likely to lose a fair number of dev hours on that if we don't send out instructions and warnings far and wide. |
…ject-chip#11635) * Request unicast responses for resolving nodes when using minmdns * Set unicast answer to true in browse nodes as well * Make unit tests use platform mdns: chip tool and all clusters all need to be mdns servers (because tv commisioning) so having a single central point seems more correct than having several mdns servers * Set the chip_mdns value as a string * Escape quotes for platform mdns build * Revert platform mdns usage in tests: avahi does not properly work for testing (tried locally, could not get it to work but minimal can work, need to figure out how) * Add support for using netns namespaces for unit test runs - check to see if that makes linux unit tests happy * Do not restyle test_suites.sh * Some script cleanup * Temporarely run the tests on ipv4 only until I figure out ipv6 multicast support * Cleanup netns as a final step, just in case * Move cleanup again - cleanup is added again post-netns setup * Code review comments, enable ipv6 back * Add note on why we stop restuling testsuites.sh * Use unshare instead of sudo to run unit tests in namespaces * Update remount logic to only apply for unshare environments * Updated messaging logic * Added a sleep for IPv6 and a comment as to why
Problem
Requesting multicast replies will get throttled according to mDNS rfc. Chip-tool restarts on every operation, effectively forgetting mDNS responses (unless using the platform mdns where something like avahi/bonjour caches for us) and requesting mDNS discovery will result in throttling.
Change overview
Resolving nodes sets the 'reply via unicast' flag, disabling broadcast, making network traffic less chatty and bypassing throttling (chip-tool is much more responsive)
Additional changes for scripts to work:
specifically for linux, use network namespaces. reason is minmdns listens on 5353 and having 2 apps (chip-tool and all-clusters-app) competing for the same port does not work well when unicast is used. Darwin works because it uses mdns platform and that is centralized (no competing for the port). Using avahi on linux did not work and took too long to try to fix
disabled auto-formatting for test runner script - formatting hardener was both changing lines I never touched and tried to 'correct' lines that I explicitly did not want string-escaped.
Testing
Proviosioned an m5stack with this change in chip-tool (success) and toggled an on/off a few times (successfull and much more responsive and reliable than without this flag).
Unit tests on linux pass.