Periodic crash in services test #11

Closed
gerkey opened this issue Dec 31, 2015 · 14 comments

@gerkey
Member

gerkey commented Dec 31, 2015

Following #10, I'm trying to get system_tests.test_rclcpp.test_services_cpp__rmw_fastrtps_cpp to pass reliably, and I can't. I've seen a variety of segfaults, aborts for double-free, and deadlocks. I just spent some time digging through the code and have failed to figure out the problem. If you have everything built, then you can reproduce the problem like so (I've been testing on Linux):

ulimit -c unlimited
cd src/ros2/system_tests/test_rclcpp
while nosetests3 -s ../../../../build/test_rclcpp/test_services_cpp__rmw_fastrtps_cpp.py; do true; done

You should, after a short period of time, see a problem. If there's a crash, you should get a core file. I haven't yet gotten any crashes to happen in gdb or valgrind (most of the time, I just get a deadlock in that situation).

I'm happy to provide more information to help with the investigation.

@gerkey
Member Author

gerkey commented Dec 31, 2015

Looking at our Jenkins jobs, it seems that we're seeing intermittent crashes in multiple tests when using Fast-RTPS.

We don't see such crashes with any other RMW implementation. It seems like either there are memory management issues internal to Fast-RTPS or we're using it incorrectly via the RMW layer. Any idea what's happening or where to look for a fix?

@richiprosima
Contributor

We will analyze the problem. Thanks for the info.

@richiprosima
Contributor

I believe I found the problem. I'm testing a solution.

@gerkey
Member Author

gerkey commented Jan 4, 2016

That sounds promising. Let me know if we can help with testing.

@richiprosima
Contributor

I've updated FastCDR and FastRTPS. The update includes changes intended to resolve the segmentation fault.

@gerkey
Member Author

gerkey commented Jan 5, 2016

Thanks for looking into the problem. I updated Fast-CDR and Fast-RTPS and rebuilt everything. I'm still seeing occasional crashes from the while loop mentioned in the description of this issue. E.g., here's a segfault (return code -11) in the service server process:

(test_services_server_cpp) pid 22260: ['/home/gerkey/ros2_ws/build/test_rclcpp/test_services_server_cpp__rmw_fastrtps_cpp'] (stderr > stdout, all > console)
(test_services_client_cpp) pid 22261: ['/home/gerkey/ros2_ws/build/test_rclcpp/test_services_client_cpp__rmw_fastrtps_cpp', 'test_services_server_cpp'] (stderr > stdout, all > console)
[test_services_client_cpp] [==========] Running 3 tests from 1 test case.
[test_services_client_cpp] [----------] Global test environment set-up.
[test_services_client_cpp] [----------] 3 tests from test_services_client__rmw_fastrtps_cpp
[test_services_client_cpp] [ RUN      ] test_services_client__rmw_fastrtps_cpp.test_add_noreqid
(test_services_server_cpp) rc -11
() tear down
(test_services_client_cpp) signal SIGINT
(test_services_client_cpp) signal SIGTERM
(test_services_client_cpp) rc -15
F
======================================================================
FAIL: test_services_cpp__rmw_fastrtps_cpp.test_services_cpp
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/gerkey/ros2_ws/build/test_rclcpp/test_services_cpp__rmw_fastrtps_cpp.py", line 26, in test_services_cpp
    assert rc == 0, "The launch file failed with exit code '" + str(rc) + "'. "
nose.proxy.AssertionError: The launch file failed with exit code '-11'. 

In this situation, I'm not getting a core file, so I can't provide any more detail (maybe nosetests and/or asyncio are getting in the way of the core file production?).

@gerkey
Member Author

gerkey commented Jan 5, 2016

Also, more often than the segfaults, I'm still seeing deadlocks in this test. I.e., both client and server are still running (haven't crashed), but they're not meeting the termination condition for the test. Any idea what's causing that problem?

Here's a backtrace from attaching to the service server:

(gdb) thread apply all bt

Thread 5 (Thread 0x7f149248f700 (LWP 22320)):
#0  0x00007f1493c42b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1492d46308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007f1492d49c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007f1492d4e547 in eprosima::fastrtps::rtps::ResourceEvent::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007f1492ab8a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007f1493517182 in start_thread (arg=0x7f149248f700) at pthread_create.c:312
#6  0x00007f1493c4247d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 4 (Thread 0x7f1491c8e700 (LWP 22321)):
#0  0x00007f1493c42b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1492d46308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007f1492d49c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007f1492d42abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007f1492ab8a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007f1493517182 in start_thread (arg=0x7f1491c8e700) at pthread_create.c:312
#6  0x00007f1493c4247d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7f149148d700 (LWP 22322)):
#0  0x00007f1493c42b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1492d46308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007f1492d49c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007f1492d42abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007f1492ab8a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007f1493517182 in start_thread (arg=0x7f149148d700) at pthread_create.c:312
#6  0x00007f1493c4247d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7f1490c8c700 (LWP 22323)):
#0  0x00007f1493c42b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1492d46308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007f1492d49c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007f1492d42abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007f1492ab8a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007f1493517182 in start_thread (arg=0x7f1490c8c700) at pthread_create.c:312
#6  0x00007f1493c4247d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7f14951b8780 (LWP 22318)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f14941d14bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007f149444844f in rmw_wait () from /home/gerkey/ros2_ws/install/lib/librmw_fastrtps_cpp.so
#3  0x00007f1494addc20 in rclcpp::executor::Executor::wait_for_work(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) () from /home/gerkey/ros2_ws/install/lib/librclcpp__rmw_fastrtps_cpp.so
#4  0x00007f1494aded90 in rclcpp::executor::Executor::get_next_executable(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) () from /home/gerkey/ros2_ws/install/lib/librclcpp__rmw_fastrtps_cpp.so
#5  0x00007f1494ae51ed in rclcpp::executors::single_threaded_executor::SingleThreadedExecutor::spin() ()
   from /home/gerkey/ros2_ws/install/lib/librclcpp__rmw_fastrtps_cpp.so
#6  0x00007f1494ae211c in rclcpp::spin(std::shared_ptr<rclcpp::node::Node>) ()
   from /home/gerkey/ros2_ws/install/lib/librclcpp__rmw_fastrtps_cpp.so
#7  0x0000000000415d3d in main ()

Here's a backtrace from attaching to the service client:

(gdb) thread apply all bt

Thread 7 (Thread 0x7fbe9c70e700 (LWP 22324)):
#0  0x00007fbe9dca3b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fbe9cfc5308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007fbe9cfc8c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007fbe9cfcd547 in eprosima::fastrtps::rtps::ResourceEvent::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007fbe9cd37a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007fbe9e490182 in start_thread (arg=0x7fbe9c70e700) at pthread_create.c:312
#6  0x00007fbe9dca347d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 6 (Thread 0x7fbe9bf0d700 (LWP 22325)):
#0  0x00007fbe9dca3b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fbe9cfc5308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007fbe9cfc8c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007fbe9cfc1abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007fbe9cd37a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007fbe9e490182 in start_thread (arg=0x7fbe9bf0d700) at pthread_create.c:312
#6  0x00007fbe9dca347d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 5 (Thread 0x7fbe9b70c700 (LWP 22326)):
#0  0x00007fbe9dca3b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fbe9cfc5308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007fbe9cfc8c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007fbe9cfc1abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007fbe9cd37a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007fbe9e490182 in start_thread (arg=0x7fbe9b70c700) at pthread_create.c:312
#6  0x00007fbe9dca347d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 4 (Thread 0x7fbe9af0b700 (LWP 22327)):
#0  0x00007fbe9dca3b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fbe9cfc5308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007fbe9cfc8c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007fbe9cfc1abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007fbe9cd37a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007fbe9e490182 in start_thread (arg=0x7fbe9af0b700) at pthread_create.c:312
#6  0x00007fbe9dca347d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7fbe9a70a700 (LWP 22328)):
#0  0x00007fbe9dca3b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fbe9cfc5308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007fbe9cfc8c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007fbe9cfc1abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007fbe9cd37a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007fbe9e490182 in start_thread (arg=0x7fbe9a70a700) at pthread_create.c:312
#6  0x00007fbe9dca347d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7fbe99f09700 (LWP 22329)):
#0  0x00007fbe9dca3b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fbe9cfc5308 in boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue<boost::asio::detail::task_io_service_operation>&) () from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#2  0x00007fbe9cfc8c78 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#3  0x00007fbe9cfc1abf in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service() ()
   from /home/gerkey/ros2_ws/install/lib/libfastrtps.so
#4  0x00007fbe9cd37a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#5  0x00007fbe9e490182 in start_thread (arg=0x7fbe99f09700) at pthread_create.c:312
#6  0x00007fbe9dca347d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7fbe9f437780 (LWP 22319)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007fbe9e2324bc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007fbe9e6c744f in rmw_wait () from /home/gerkey/ros2_ws/install/lib/librmw_fastrtps_cpp.so
#3  0x00007fbe9ed5cc20 in rclcpp::executor::Executor::wait_for_work(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) () from /home/gerkey/ros2_ws/install/lib/librclcpp__rmw_fastrtps_cpp.so
#4  0x00007fbe9ed5dd90 in rclcpp::executor::Executor::get_next_executable(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) () from /home/gerkey/ros2_ws/install/lib/librclcpp__rmw_fastrtps_cpp.so
#5  0x00007fbe9ed5bbac in rclcpp::executor::Executor::spin_once(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) ()
   from /home/gerkey/ros2_ws/install/lib/librclcpp__rmw_fastrtps_cpp.so
#6  0x000000000048e8f0 in rclcpp::executor::FutureReturnCode rclcpp::executor::Executor::spin_until_future_complete<std::shared_ptr<test_rclcpp::srv::AddTwoInts_Response_<std::allocator<void> > >, std::ratio<1l, 1000l> >(std::shared_future<std::shared_ptr<test_rclcpp::srv::AddTwoInts_Response_<std::allocator<void> > > >&, std::chrono::duration<long, std::ratio<1l, 1000l> >)
    ()
#7  0x000000000048d4a0 in rclcpp::executor::FutureReturnCode rclcpp::executors::spin_node_until_future_complete<std::shared_ptr<test_rclcpp::srv::AddTwoInts_Response_<std::allocator<void> > >, std::ratio<1l, 1000l> >(rclcpp::executor::Executor&, std::shared_ptr<rclcpp::node::Node>, std::shared_future<std::shared_ptr<test_rclcpp::srv::AddTwoInts_Response_<std::allocator<void> > > >&, std::chrono::duration<long, std::ratio<1l, 1000l> >) ()
#8  0x000000000048bd62 in rclcpp::executor::FutureReturnCode rclcpp::spin_until_future_complete<std::shared_ptr<test_rclcpp::srv::AddTwoInts_Response_<std::allocator<void> > >, std::ratio<1l, 1000l> >(std::shared_ptr<rclcpp::node::Node>, std::shared_future<std::shared_ptr<test_rclcpp::srv::AddTwoInts_Response_<std::allocator<void> > > >&, std::chrono::duration<long, std::ratio<1l, 1000l> >) ()
#9  0x0000000000488593 in test_services_client__rmw_fastrtps_cpp_test_add_noreqid_Test::TestBody() ()
#10 0x00000000004b4776 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
#11 0x00000000004afd66 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
#12 0x000000000049d81d in testing::Test::Run() ()
#13 0x000000000049df22 in testing::TestInfo::Run() ()
#14 0x000000000049e47e in testing::TestCase::Run() ()
#15 0x00000000004a2f24 in testing::internal::UnitTestImpl::RunAllTests() ()
#16 0x00000000004b56b7 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
#17 0x00000000004b0c70 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) ()
#18 0x00000000004a1e8f in testing::UnitTest::Run() ()
#19 0x0000000000489032 in main ()
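
In both backtraces above, the main thread is parked in std::condition_variable::wait inside rmw_wait. One generic way to end up in that state (an assumption for illustration, not a diagnosis of this bug) is a lost wakeup: the notification fires before the waiter starts waiting, and the wait has no predicate to check. The sketch below shows the usual predicate-plus-timeout pattern. WaitState, notify_data, and wait_for_data are hypothetical names, not part of rmw_fastrtps_cpp.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the state a wait call blocks on.
struct WaitState {
  std::mutex mutex;
  std::condition_variable cv;
  bool has_data = false;  // guarded by mutex
};

// Listener side: record the event under the lock, then notify.
void notify_data(WaitState & s) {
  {
    std::lock_guard<std::mutex> lock(s.mutex);
    s.has_data = true;
  }
  s.cv.notify_one();
}

// Waiter side: the predicate makes an earlier notification visible,
// and the timeout bounds the wait if nothing ever arrives.
bool wait_for_data(WaitState & s, std::chrono::milliseconds timeout) {
  std::unique_lock<std::mutex> lock(s.mutex);
  bool ready = s.cv.wait_for(lock, timeout, [&s] { return s.has_data; });
  if (ready) {
    s.has_data = false;  // consume the event
  }
  return ready;
}

// The notification arrives before the waiter starts waiting.
bool demo_notify_before_wait() {
  WaitState s;
  notify_data(s);
  return wait_for_data(s, std::chrono::milliseconds(100));
}

// Nothing arrives: the wait gives up instead of blocking forever.
bool demo_timeout() {
  WaitState s;
  return wait_for_data(s, std::chrono::milliseconds(10));
}
```

With a bare wait() and no flag, the first demo would block indefinitely, which would look the same from gdb as the traces above.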

@gerkey
Member Author

gerkey commented Jan 5, 2016

Turns out that I wasn't getting a core file because I had an old one already present in the same directory; rookie mistake.

After rebuilding in RelWithDebInfo, here's a stack trace from a server-side segfault, showing that info_->request_subscriber_ is null here. The crash results from trying to invoke takeNextData() on that null pointer (this=0x0 in the bottom frame).

#0  eprosima::fastrtps::Subscriber::takeNextData (this=0x0, data=data@entry=0x7efec0000ad0, info=info@entry=0x7efeccd53980)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/subscriber/Subscriber.cpp:38
#1  0x00007efecf4e9a4f in ServiceListener::onNewDataMessage (this=0xad01310, sub=<optimized out>)
    at /home/gerkey/ros2_ws/src/eProsima/ROS-RMW-Fast-RTPS-cpp/rmw_fastrtps_cpp/src/functions.cpp:904
#2  0x00007efecde2877a in eprosima::fastrtps::rtps::StatelessReader::change_received (this=this@entry=0xad98540, change=0xad92530)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/reader/StatelessReader.cpp:101
#3  0x00007efecde28a04 in eprosima::fastrtps::rtps::StatelessReader::processDataMsg (this=0xad98540, change=0xfb3fd0)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/reader/StatelessReader.cpp:182
#4  0x00007efecde2f57d in eprosima::fastrtps::rtps::MessageReceiver::proc_Submsg_Data (this=this@entry=0xfa15c0, msg=msg@entry=0xfa15d0, 
    smh=smh@entry=0x7efeccd53b80, last=last@entry=0x7efeccd53b7f)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/messages/MessageReceiver.cpp:493
#5  0x00007efecde318ba in eprosima::fastrtps::rtps::MessageReceiver::processCDRMsg (this=0xfa15c0, RTPSParticipantguidprefix=..., 
    loc=loc@entry=0xfa13b0, msg=0xfa15d0) at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/messages/MessageReceiver.cpp:190
#6  0x00007efecde0995b in eprosima::fastrtps::rtps::ListenResourceImpl::newCDRMessage (this=0xfa1310, err=..., msg_size=<optimized out>)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/resources/ListenResourceImpl.cpp:121
#7  0x00007efecde0c013 in operator() (a2=<optimized out>, a1=..., p=<optimized out>, this=0x7efeccd53cd0)
    at /usr/include/boost/bind/mem_fn_template.hpp:280
#8  operator()<boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, const boost::system::error_code&, long unsigned int>, boost::_bi::list2<const boost::system::error_code&, long unsigned int const&> > (a=<synthetic pointer>, f=..., this=0x7efeccd53ce0)
    at /usr/include/boost/bind/bind.hpp:392
#9  operator()<boost::system::error_code, long unsigned int> (a2=@0x7efeccd53cf8: 76, a1=..., this=0x7efeccd53cd0)
    at /usr/include/boost/bind/bind_template.hpp:102
#10 operator() (this=0x7efeccd53cd0) at /usr/include/boost/asio/detail/bind_handler.hpp:127
#11 asio_handler_invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long> > (function=...)
    at /usr/include/boost/asio/handler_invoke_hook.hpp:64
#12 invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> > > (context=..., function=...)
    at /usr/include/boost/asio/detail/handler_invoke_helpers.hpp:37
#13 boost::asio::detail::reactive_socket_recvfrom_op<boost::asio::mutable_buffers_1, boost::asio::ip::basic_endpoint<boost::asio::ip::udp>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> > >::do_complete (owner=0xfa1450, base=0xfc4450) at /usr/include/boost/asio/detail/reactive_socket_recvfrom_op.hpp:120
#14 0x00007efecde1071f in complete (bytes_transferred=<optimized out>, ec=..., owner=..., this=0xfc4450)
    at /usr/include/boost/asio/detail/task_io_service_operation.hpp:37
#15 do_run_one (ec=..., this_thread=..., lock=..., this=<optimized out>) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:384
#16 boost::asio::detail::task_io_service::run (this=0xfa1450, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#17 0x00007efecde0966f in run (this=0xfa1330) at /usr/include/boost/asio/impl/io_service.ipp:59
#18 eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service (this=0xfa1310)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/resources/ListenResourceImpl.cpp:333
#19 0x00007efecdb7ea4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#20 0x00007efece5d3182 in start_thread (arg=0x7efeccd54700) at pthread_create.c:312
#21 0x00007efececf047d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
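
The trace above ends with a listener callback calling a member function through a null subscriber pointer. As a rough illustration of that failure mode and one defensive option, the sketch below has the callback load the pointer once and skip the message if it has not been assigned yet. FakeSubscriber, FakeServiceInfo, and on_new_data_message are hypothetical stand-ins, not the Fast-RTPS or rmw_fastrtps_cpp types, and this is not a claim about how the real code was fixed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a subscriber with queued requests.
struct FakeSubscriber {
  int pending = 0;
  bool take_next() {
    if (pending == 0) {
      return false;
    }
    --pending;
    return true;
  }
};

// Hypothetical stand-in for the per-service bookkeeping struct.
struct FakeServiceInfo {
  // Stored last by the creating thread, read by the listener thread.
  std::atomic<FakeSubscriber *> request_subscriber{nullptr};
};

// Listener callback: returns true if a request was taken, false if the
// message was skipped because the pointer is not set or nothing is queued.
bool on_new_data_message(FakeServiceInfo & info) {
  FakeSubscriber * sub = info.request_subscriber.load(std::memory_order_acquire);
  if (sub == nullptr) {
    return false;  // callback ran ahead of creation; do not dereference
  }
  return sub->take_next();
}

// Callback fires before the pointer is assigned: skipped, no crash.
bool demo_callback_before_assignment() {
  FakeServiceInfo info;
  return on_new_data_message(info);
}

// Callback fires after the pointer is assigned: the request is taken.
bool demo_callback_after_assignment() {
  FakeServiceInfo info;
  FakeSubscriber sub;
  sub.pending = 1;
  info.request_subscriber.store(&sub, std::memory_order_release);
  return on_new_data_message(info);
}
```

A guard like this trades a crash for a dropped request, so it only hides the race; whether dropping is acceptable depends on the middleware's redelivery behavior.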

@gerkey
Member Author

gerkey commented Jan 5, 2016

In the stack trace I posted, there's a missing step between frame 2, where onNewCacheChangeAdded() is called, and frame 1, where we're in onNewDataMessage(). Looking through the code, I would guess that we're going through this function, and that it's unavailable in the debugger due to inlining or optimization.

@gerkey
Member Author

gerkey commented Jan 5, 2016

After adding some debug prints, I can see that, when the segfault occurs, this line is about to execute (i.e., I see a print from immediately before), but it doesn't complete (i.e., I don't see a print from immediately after). Looks like somebody is trying to use the service object before it's finished being created.
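
If the hypothesis above is right, one way to rule out that ordering problem by construction is two-phase setup: finish wiring the service object first, and only then allow callbacks to be delivered. The sketch below simulates this with a hypothetical FakeReader that drops messages until enable() is called. None of these names come from Fast-RTPS, and whether the real API offers an equivalent hook is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical middleware reader: delivers callbacks only once enabled.
class FakeReader {
public:
  explicit FakeReader(std::function<void()> on_data)
  : on_data_(std::move(on_data)) {}

  void enable() { enabled_ = true; }

  // Simulates a packet arriving from the network thread.
  void inject_message() {
    if (enabled_) {
      on_data_();
    }
  }

private:
  std::function<void()> on_data_;
  bool enabled_ = false;
};

struct Counts {
  int handled = 0;          // callbacks that saw a fully wired service
  int seen_incomplete = 0;  // callbacks that saw a half-built service
};

// Hypothetical service object that the callback depends on.
struct FakeService {
  FakeReader * reader = nullptr;
  Counts counts;
};

Counts demo_two_phase_setup() {
  FakeService svc;
  FakeReader reader([&svc] {
    if (svc.reader == nullptr) {
      ++svc.counts.seen_incomplete;
    } else {
      ++svc.counts.handled;
    }
  });
  reader.inject_message();  // arrives during creation: not delivered
  svc.reader = &reader;     // finish wiring the service
  reader.enable();          // only now allow callbacks
  reader.inject_message();  // delivered against a complete object
  return svc.counts;
}
```

Here the early message is dropped rather than queued; a real implementation would have to decide which of the two the protocol requires.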

@richiprosima
Contributor

Hi, I couldn't replicate the segmentation fault, but I know what is happening. I've updated the FastRTPS and ROS-RMW-Fast-RTPS-cpp repositories. I also fixed a deadlock.

@gerkey
Member Author

gerkey commented Jan 11, 2016

Thanks!

The nightly build (which runs each test 20 times to check for sporadic failures) has been passing on Linux for several days now, which is very promising: http://ci.ros2.org/view/nightly/job/nightly_linux/.

There are still some FastRTPS test failures in the nightly for OSX: http://ci.ros2.org/view/nightly/job/nightly_osx/182/testReport/, but that might not be the fault of FastRTPS. I'll look into it, and reopen this ticket or open a new one if needed.

@gerkey gerkey closed this as completed Jan 11, 2016
@gerkey gerkey reopened this Jan 11, 2016
@gerkey
Member Author

gerkey commented Jan 11, 2016

When built in Debug mode (pass --cmake-args -DCMAKE_BUILD_TYPE=Debug to ament.py when building), I'm still seeing pretty frequent deadlocks. And I just got another segfault in the server process. Below is the backtrace from the two threads that seem relevant; maybe they're fighting over access to the logging system (both threads are at this line).

(gdb) thread 2
[Switching to thread 2 (Thread 0x7f6f8affd700 (LWP 15246))]
#0  0x00007f6f928adbe8 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) bt
#0  0x00007f6f928adbe8 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007f6f928add06 in std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, long) const ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007f6f928ae2bd in std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, long) const () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f6f928ba06e in std::ostream& std::ostream::_M_insert<long>(long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f6f913d5d4e in eprosima::fastrtps::rtps::operator<< (output=..., enI=...)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/include/fastrtps/rtps/writer/../common/Guid.h:280
#5  0x00007f6f913e640d in eprosima::fastrtps::rtps::operator<< (output=..., guid=...)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/include/fastrtps/rtps/writer/../common/Guid.h:433
#6  0x00007f6f9140e93d in eprosima::fastrtps::rtps::MessageReceiver::proc_Submsg_Data (this=0xddbe60, msg=0xddbe70, smh=0x7f6f8affc7a0, 
    last=0x7f6f8affc718) at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/messages/MessageReceiver.cpp:486
#7  0x00007f6f9140d1ac in eprosima::fastrtps::rtps::MessageReceiver::processCDRMsg (this=0xddbe60, RTPSParticipantguidprefix=..., 
    loc=0xddb5c0, msg=0xddbe70) at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/messages/MessageReceiver.cpp:190
#8  0x00007f6f913c311b in eprosima::fastrtps::rtps::ListenResourceImpl::newCDRMessage (this=0xddb520, err=..., msg_size=420)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/resources/ListenResourceImpl.cpp:121
#9  0x00007f6f913d3fee in boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>::operator() (this=0x7f6f8affcaa0, p=0xddb520, a1=..., a2=420) at /usr/include/boost/bind/mem_fn_template.hpp:280
#10 0x00007f6f913d3e9e in boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()>::operator()<boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list2<boost::system::error_code const&, unsigned long const&> > (this=0x7f6f8affcab0, f=..., a=...)
    at /usr/include/boost/bind/bind.hpp:392
#11 0x00007f6f913d3dd0 in boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >::operator()<boost::system::error_code, unsigned long> (this=0x7f6f8affcaa0, a1=..., a2=@0x7f6f8affcac8: 420)
    at /usr/include/boost/bind/bind_template.hpp:102
#12 0x00007f6f913d3d49 in boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long>::operator() (this=0x7f6f8affcaa0)
    at /usr/include/boost/asio/detail/bind_handler.hpp:127
#13 0x00007f6f913d3b7b in boost::asio::asio_handler_invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long> > (function=...)
    at /usr/include/boost/asio/handler_invoke_hook.hpp:64
#14 0x00007f6f913d3724 in boost_asio_handler_invoke_helpers::invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> > > (function=..., 
    context=...) at /usr/include/boost/asio/detail/handler_invoke_helpers.hpp:37
#15 0x00007f6f913d3119 in boost::asio::detail::reactive_socket_recvfrom_op<boost::asio::mutable_buffers_1, boost::asio::ip::basic_endpoint<boost::asio::ip::udp>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> > >::do_complete (owner=0xddb610, base=0xe00590) at /usr/include/boost/asio/detail/reactive_socket_recvfrom_op.hpp:120
#16 0x00007f6f913c65d6 in boost::asio::detail::task_io_service_operation::complete (this=0xe00590, owner=..., ec=..., bytes_transferred=0)
    at /usr/include/boost/asio/detail/task_io_service_operation.hpp:37
#17 0x00007f6f913c8ce9 in boost::asio::detail::task_io_service::do_run_one (this=0xddb610, lock=..., this_thread=..., ec=...)
    at /usr/include/boost/asio/detail/impl/task_io_service.ipp:384
#18 0x00007f6f913c8801 in boost::asio::detail::task_io_service::run (this=0xddb610, ec=...)
    at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#19 0x00007f6f913c9047 in boost::asio::io_service::run (this=0xddb540) at /usr/include/boost/asio/impl/io_service.ipp:59
#20 0x00007f6f913c45f6 in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service (this=0xddb520)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/resources/ListenResourceImpl.cpp:333
#21 0x00007f6f913d4e89 in boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>::operator() (this=0xe007d8, p=0xddb520)
    at /usr/include/boost/bind/mem_fn_template.hpp:49
#22 0x00007f6f913d4dec in boost::_bi::list1<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*> >::operator()<boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>, boost::_bi::list0> (this=0xe007e8, f=..., a=...) at /usr/include/boost/bind/bind.hpp:253
#23 0x00007f6f913d4a83 in boost::_bi::bind_t<void, boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>, boost::_bi::list1<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*> > >::operator() (this=0xe007d8)
    at /usr/include/boost/bind/bind_template.hpp:20
#24 0x00007f6f913d45e8 in boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>, boost::_bi::list1<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*> > > >::run (this=0xe00620)
    at /usr/include/boost/thread/detail/thread.hpp:117
#25 0x00007f6f91011a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#26 0x00007f6f91c19182 in start_thread (arg=0x7f6f8affd700) at pthread_create.c:312
#27 0x00007f6f9234447d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f6f8bfff700 (LWP 15243))]
#0  0x00007f6f913d5cfd in eprosima::fastrtps::rtps::operator<< (output=..., enI=...)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/include/fastrtps/rtps/writer/../common/Guid.h:280
280     output<<(int)enI.value[0]<<"."<<(int)enI.value[1]<<"."<<(int)enI.value[2]<<"."<<(int)enI.value[3];
(gdb) bt
#0  0x00007f6f913d5cfd in eprosima::fastrtps::rtps::operator<< (output=..., enI=...)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/include/fastrtps/rtps/writer/../common/Guid.h:280
#1  0x00007f6f9142b523 in eprosima::fastrtps::SubscriberHistory::received_change (this=0xaa40ba8, a_change=0xaad2540)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/subscriber/SubscriberHistory.cpp:132
#2  0x00007f6f91403503 in eprosima::fastrtps::rtps::StatelessReader::change_received (this=0xaad8a60, change=0xaad2540)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/reader/StatelessReader.cpp:94
#3  0x00007f6f91403af6 in eprosima::fastrtps::rtps::StatelessReader::processDataMsg (this=0xaad8a60, change=0xda3000)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/reader/StatelessReader.cpp:182
#4  0x00007f6f9140e9d0 in eprosima::fastrtps::rtps::MessageReceiver::proc_Submsg_Data (this=0xd8ff60, msg=0xd8ff70, smh=0x7f6f8bffe7a0, 
    last=0x7f6f8bffe718) at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/messages/MessageReceiver.cpp:493
#5  0x00007f6f9140d1ac in eprosima::fastrtps::rtps::MessageReceiver::processCDRMsg (this=0xd8ff60, RTPSParticipantguidprefix=..., 
    loc=0xd8f670, msg=0xd8ff70) at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/messages/MessageReceiver.cpp:190
#6  0x00007f6f913c311b in eprosima::fastrtps::rtps::ListenResourceImpl::newCDRMessage (this=0xd8f5d0, err=..., msg_size=76)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/resources/ListenResourceImpl.cpp:121
#7  0x00007f6f913d3fee in boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>::operator() (this=0x7f6f8bffeaa0, p=0xd8f5d0, a1=..., a2=76) at /usr/include/boost/bind/mem_fn_template.hpp:280
#8  0x00007f6f913d3e9e in boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()>::operator()<boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list2<boost::system::error_code const&, unsigned long const&> > (this=0x7f6f8bffeab0, f=..., a=...)
    at /usr/include/boost/bind/bind.hpp:392
#9  0x00007f6f913d3dd0 in boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >::operator()<boost::system::error_code, unsigned long> (this=0x7f6f8bffeaa0, a1=..., a2=@0x7f6f8bffeac8: 76)
    at /usr/include/boost/bind/bind_template.hpp:102
#10 0x00007f6f913d3d49 in boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long>::operator() (this=0x7f6f8bffeaa0)
    at /usr/include/boost/asio/detail/bind_handler.hpp:127
#11 0x00007f6f913d3b7b in boost::asio::asio_handler_invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long> > (function=...)
    at /usr/include/boost/asio/handler_invoke_hook.hpp:64
#12 0x00007f6f913d3724 in boost_asio_handler_invoke_helpers::invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> >, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> > > (function=..., 
    context=...) at /usr/include/boost/asio/detail/handler_invoke_helpers.hpp:37
#13 0x00007f6f913d3119 in boost::asio::detail::reactive_socket_recvfrom_op<boost::asio::mutable_buffers_1, boost::asio::ip::basic_endpoint<boost::asio::ip::udp>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, eprosima::fastrtps::rtps::ListenResourceImpl, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*>, boost::arg<1> (*)(), boost::arg<2> (*)()> > >::do_complete (owner=0xd8f710, base=0xdb4830) at /usr/include/boost/asio/detail/reactive_socket_recvfrom_op.hpp:120
#14 0x00007f6f913c65d6 in boost::asio::detail::task_io_service_operation::complete (this=0xdb4830, owner=..., ec=..., bytes_transferred=0)
    at /usr/include/boost/asio/detail/task_io_service_operation.hpp:37
#15 0x00007f6f913c8ce9 in boost::asio::detail::task_io_service::do_run_one (this=0xd8f710, lock=..., this_thread=..., ec=...)
    at /usr/include/boost/asio/detail/impl/task_io_service.ipp:384
#16 0x00007f6f913c8801 in boost::asio::detail::task_io_service::run (this=0xd8f710, ec=...)
    at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#17 0x00007f6f913c9047 in boost::asio::io_service::run (this=0xd8f5f0) at /usr/include/boost/asio/impl/io_service.ipp:59
#18 0x00007f6f913c45f6 in eprosima::fastrtps::rtps::ListenResourceImpl::run_io_service (this=0xd8f5d0)
    at /home/gerkey/ros2_ws/src/eProsima/Fast-RTPS/src/cpp/rtps/resources/ListenResourceImpl.cpp:333
#19 0x00007f6f913d4e89 in boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>::operator() (this=0xdb4a78, p=0xd8f5d0)
    at /usr/include/boost/bind/mem_fn_template.hpp:49
#20 0x00007f6f913d4dec in boost::_bi::list1<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*> >::operator()<boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>, boost::_bi::list0> (this=0xdb4a88, f=..., a=...) at /usr/include/boost/bind/bind.hpp:253
#21 0x00007f6f913d4a83 in boost::_bi::bind_t<void, boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>, boost::_bi::list1<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*> > >::operator() (this=0xdb4a78)
    at /usr/include/boost/bind/bind_template.hpp:20
#22 0x00007f6f913d45e8 in boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf0<void, eprosima::fastrtps::rtps::ListenResourceImpl>, boost::_bi::list1<boost::_bi::value<eprosima::fastrtps::rtps::ListenResourceImpl*> > > >::run (this=0xdb48c0)
    at /usr/include/boost/thread/detail/thread.hpp:117
#23 0x00007f6f91011a4a in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
#24 0x00007f6f91c19182 in start_thread (arg=0x7f6f8bfff700) at pthread_create.c:312
#25 0x00007f6f9234447d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

@gerkey
Member Author

gerkey commented Feb 9, 2016

With the latest changes, we haven't seen any failures in our nightly CI in some time, which is great!

When locally testing on Linux, I'm still able to get the test to hang sometimes, but I'm not seeing any crashes. I'll close this ticket and open a new one if and when I can provide more information about the hangs.
