Nodes hang after discovery server restart [13704] #2289

Closed
amfern opened this issue Oct 24, 2021 · 6 comments · Fixed by #2470

amfern commented Oct 24, 2021

Node hangs on closing when using TCP discovery server

Expected Behavior

The node shuts down on SIGINT (Ctrl+C).

Current Behavior

The node hangs after printing [INFO] [1635080629.958593776] [rclcpp]: signal_handler(signal_value=2) in the terminal.

Steps to Reproduce

  1. Run the discovery server:
RMW_IMPLEMENTATION=rmw_fastrtps_cpp FASTRTPS_DEFAULT_PROFILES_FILE=./discovery_server_tcp.xml ros2 run demo_nodes_cpp listener --ros-args -r __ns:=/discovery_server_ns -r __node:=discovery_server
  2. Run the talker node and wait 3 seconds for it to connect:
RMW_IMPLEMENTATION=rmw_fastrtps_cpp FASTRTPS_DEFAULT_PROFILES_FILE=./discovery_client_tcp_2.xml ros2 run demo_nodes_cpp talker
  3. Stop the discovery server with Ctrl+C and wait 3 seconds.
  4. Start the discovery server again and wait 3 seconds.
  5. Stop the talker node with Ctrl+C.
  6. The node doesn't stop; it hangs with [rclcpp]: signal_handler(signal_value=2).
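
The steps above could be scripted roughly as follows. This is only a sketch, not something from the original report: the helper names (start_server, start_talker, interrupt, wait_for_exit, run_repro), the RUN_REPRO switch, the pause after the first server start, and the 10-second timeout are all assumptions made for illustration. It also assumes ros2 is on PATH and that both XML profiles are in the current directory.

```bash
#!/usr/bin/env bash
# Sketch of the reproduction steps; all helper names are hypothetical.

export RMW_IMPLEMENTATION=rmw_fastrtps_cpp

start_server() {
  FASTRTPS_DEFAULT_PROFILES_FILE=./discovery_server_tcp.xml \
    ros2 run demo_nodes_cpp listener --ros-args \
    -r __ns:=/discovery_server_ns -r __node:=discovery_server &
  SERVER_PID=$!
}

start_talker() {
  FASTRTPS_DEFAULT_PROFILES_FILE=./discovery_client_tcp_2.xml \
    ros2 run demo_nodes_cpp talker &
  TALKER_PID=$!
}

interrupt() {
  # With job control on, each background job leads its own process group,
  # so signalling -PID reaches both "ros2 run" and the node, like Ctrl+C.
  kill -INT -- "-$1"
}

wait_for_exit() {
  # Returns 0 if process $1 exits within $2 seconds, 1 otherwise.
  local pid=$1 timeout=$2 waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

run_repro() {
  set -m                      # job control: jobs keep default SIGINT handling
  start_server; sleep 3       # assumption: give the server time to start
  start_talker; sleep 3
  interrupt "$SERVER_PID"; sleep 3
  start_server; sleep 3
  interrupt "$TALKER_PID"
  if wait_for_exit "$TALKER_PID" 10; then   # 10 s timeout is an assumption
    echo "talker exited after SIGINT"
  else
    echo "talker still running 10 s after SIGINT (hang reproduced)"
    kill -KILL -- "-$TALKER_PID"
  fi
  interrupt "$SERVER_PID"
}

# Run only when asked, so the file can also be sourced for its helpers.
if [ "${RUN_REPRO:-0}" = "1" ]; then
  run_repro
fi
```

Saved under any file name, it would be started with RUN_REPRO=1 in the environment. Signalling the whole process group is meant to mimic what the terminal does on Ctrl+C.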

System information

Tested inside a Docker container; all nodes run inside the same container.

  • Fast-RTPS version: ii ros-galactic-fastrtps 2.3.4-1focal.20210805.154711 amd64 Implementation of RTPS standard.
  • OS: Linux ilya.linux 5.13.13-zen1-1-zen #1 ZEN SMP PREEMPT Thu, 26 Aug 2021 19:14:35 +0000 x86_64 GNU/Linux
  • Network interfaces:
root@1b19ecb7a6fe:/ws# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
17: eth0@if18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
  • ROS2: ros2 galactic docker

Additional context

Additional resources

discovery_server_tcp.xml

<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <transport_descriptors>
        <transport_descriptor>
            <transport_id>TCPv4_SERVER</transport_id>
            <type>TCPv4</type>
            <listening_ports>
                <port>27811</port>
            </listening_ports>
            <calculate_crc>false</calculate_crc>
            <check_crc>false</check_crc>
        </transport_descriptor>
    </transport_descriptors>

    <participant profile_name="TCP_server" is_default_profile="true">
        <rtps>
            <userTransports>
                <transport_id>TCPv4_SERVER</transport_id>
            </userTransports>
            <useBuiltinTransports>false</useBuiltinTransports>
            <prefix>4d.49.47.55.45.4c.5f.42.41.52.52.4f</prefix>
            <builtin>
                <discovery_config>
                    <discoveryProtocol>SERVER</discoveryProtocol>
                    <leaseAnnouncement><sec>1</sec><nanosec>0</nanosec></leaseAnnouncement>
                    <leaseDuration><sec>3</sec><nanosec>0</nanosec></leaseDuration>
                    <clientAnnouncementPeriod>
                        <nanosec>250000000</nanosec>
                    </clientAnnouncementPeriod>
                </discovery_config>
                <metatrafficUnicastLocatorList>
                    <locator>
                        <tcpv4>
                            <address>127.0.0.1</address>
                            <physical_port>27811</physical_port>
                            <port>6339</port>
                        </tcpv4>
                    </locator>
                </metatrafficUnicastLocatorList>
            </builtin>
        </rtps>
    </participant>
</profiles>

discovery_client_tcp_2.xml

<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <transport_descriptors>
        <transport_descriptor>
            <transport_id>LAN publisher tcp transport</transport_id>
            <type>TCPv4</type>
            <listening_ports>
                <port>64752</port> <!-- publisher devoted tcp port -->
            </listening_ports>
        </transport_descriptor>
    </transport_descriptors>

    <participant profile_name="TCP_client_1" is_default_profile="true">
        <rtps>
            <prefix>63.6c.69.65.6e.74.32.5f.73.31.5f.5f</prefix>
            <userTransports>
                <transport_id>LAN publisher tcp transport</transport_id>
            </userTransports>
            <useBuiltinTransports>false</useBuiltinTransports>
            <builtin>
                <discovery_config>
                    <discoveryProtocol>CLIENT</discoveryProtocol>
                    <leaseAnnouncement><sec>1</sec><nanosec>0</nanosec></leaseAnnouncement>
                    <leaseDuration><sec>3</sec><nanosec>0</nanosec></leaseDuration>
                    <clientAnnouncementPeriod>
                        <nanosec>250000000</nanosec>
                    </clientAnnouncementPeriod>
                    <discoveryServersList>
                        <RemoteServer prefix="4d.49.47.55.45.4c.5f.42.41.52.52.4f">
                            <metatrafficUnicastLocatorList>
                                <locator>
                                    <tcpv4>
                                        <address>127.0.0.1</address>
                                        <physical_port>27811</physical_port>
                                        <port>6339</port>
                                    </tcpv4>
                                </locator>
                            </metatrafficUnicastLocatorList>
                        </RemoteServer>
                    </discoveryServersList>
                </discovery_config>
            </builtin>
        </rtps>
    </participant>
</profiles>
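
For the client to reach the server, the prefix attribute on RemoteServer in the client profile has to match the prefix element in the server profile. A quick way to check that is sketched below. The function names are hypothetical, and the sed text matching is an assumption that only holds for the flat layout of the two files above; it is not a general XML parser.

```bash
#!/usr/bin/env bash
# Sketch: compare the server's GUID prefix with the one the client expects.
# Function names are hypothetical; sed matching assumes one element per line.

server_prefix() {
  # Prints the text of the first <prefix> element in the given profile file.
  sed -n 's:.*<prefix>\(.*\)</prefix>.*:\1:p' "$1" | head -n 1
}

client_remote_prefix() {
  # Prints the prefix attribute of the first <RemoteServer> element.
  sed -n 's:.*<RemoteServer prefix="\([^"]*\)".*:\1:p' "$1" | head -n 1
}

prefixes_match() {
  # Succeeds only if the server prefix is non-empty and equals the
  # prefix the client has configured for its remote server.
  local server_file=$1 client_file=$2
  [ -n "$(server_prefix "$server_file")" ] &&
    [ "$(server_prefix "$server_file")" = "$(client_remote_prefix "$client_file")" ]
}
```

With the two files from this report, the call would be prefixes_match discovery_server_tcp.xml discovery_client_tcp_2.xml, which should succeed because both use 4d.49.47.55.45.4c.5f.42.41.52.52.4f.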

amfern commented Oct 26, 2021

Here is a video of the issue: https://www.youtube.com/watch?v=_BKYorPGjkc

amfern changed the title from "Node hangs with discovery server over TCP" to "Nodes hang after discovery server restart" on Oct 26, 2021
JLBuenoLopez commented

Hi @amfern,

Sorry for the late reply. Would you mind testing with the latest release? #2246 fixed some Discovery Server re-connection issues and it might have solved this issue as well.

JLBuenoLopez changed the title from "Nodes hang after discovery server restart" to "Nodes hang after discovery server restart [13704]" on Feb 1, 2022

amfern commented Feb 1, 2022

Hi @JLBuenoLopez-eProsima

I can still reproduce it on the rolling release. Here are the versions:

ii  ros-rolling-fastrtps                               2.3.4-1focal.20220120.180137         amd64        *eprosima Fast DDS* (formerly Fast RTPS) is a C++ implementation of the DDS (Data Distribution Service) standard of the OMG (Object Management Group).
ii  ros-rolling-fastrtps-cmake-module                  2.0.4-1focal.20220120.194458         amd64        Provide CMake module to find eProsima FastRTPS.
ii  ros-rolling-rmw-fastrtps-cpp                       6.1.2-1focal.20220121.220647         amd64        Implement the ROS middleware interface using eProsima FastRTPS static code generation in C++.
ii  ros-rolling-rmw-fastrtps-shared-cpp                6.1.2-1focal.20220121.214111         amd64        Code shared on static and dynamic type support of rmw_fastrtps_cpp.
ii  ros-rolling-rosidl-typesupport-fastrtps-c          2.0.4-1focal.20220121.211520         amd64        Generate the C interfaces for eProsima FastRTPS.
ii  ros-rolling-rosidl-typesupport-fastrtps-cpp        2.0.4-1focal.20220121.210627         amd64        Generate the C++ interfaces for eProsima FastRTPS.

EduPonz commented Feb 2, 2022

Hi @amfern ,

I think this branch should solve this too. You can check with this

amfern commented Feb 2, 2022

Yes, it did fix it. Thank you

EduPonz commented Feb 2, 2022

That's great! I'll leave the issue open until we merge the PR.
