
Make intra-process manager thread safe, rename IPMState to IPMImpl #165

Merged: 2 commits merged into master on Dec 3, 2015

Conversation

jacquelinekay (Contributor)

To implement a lock-free IntraProcessManagerImpl in the future, I will extend from IntraProcessManagerImplBase and use lock-free structures instead of mutexes.
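A rough sketch of the split this change sets up: a common base interface, the mutex-guarded implementation this PR provides, and room to add a lock-free implementation later behind the same base. The method names and members below are hypothetical, chosen for illustration only; the real interface in rclcpp is larger.

// Illustrative sketch only; method names and members are hypothetical.
#include <cstdint>
#include <mutex>
#include <set>

class IntraProcessManagerImplBase
{
public:
  virtual ~IntraProcessManagerImplBase() = default;
  virtual void add_subscription(uint64_t id) = 0;
  virtual void remove_subscription(uint64_t id) = 0;
  virtual bool has_subscription(uint64_t id) const = 0;
};

// Current implementation: every accessor takes a coarse lock.
class IntraProcessManagerImpl : public IntraProcessManagerImplBase
{
public:
  void add_subscription(uint64_t id) override
  {
    std::lock_guard<std::mutex> lock(runtime_mutex_);
    subscription_ids_.insert(id);
  }
  void remove_subscription(uint64_t id) override
  {
    std::lock_guard<std::mutex> lock(runtime_mutex_);
    subscription_ids_.erase(id);
  }
  bool has_subscription(uint64_t id) const override
  {
    std::lock_guard<std::mutex> lock(runtime_mutex_);
    return subscription_ids_.count(id) != 0;
  }

private:
  std::set<uint64_t> subscription_ids_;
  mutable std::mutex runtime_mutex_;
};

// A future lock-free variant would derive from the same base and replace the
// mutex-guarded containers with lock-free data structures.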

jacquelinekay added the "in progress" label (Actively being worked on, Kanban column) on Dec 1, 2015
jacquelinekay self-assigned this on Dec 1, 2015
tfoote added the "in review" label (Waiting for review, Kanban column) and removed the "in progress" label (Actively being worked on, Kanban column) on Dec 1, 2015
@@ -184,6 +186,7 @@ class IntraProcessManagerState : public IntraProcessManagerStateBase
size_t & size
)
{
std::lock_guard<std::mutex> lock(runtime_mutex_);
Member

Can this lock be moved into the next block, where we iterate over publishers_? Or do we need to hold it all the way through the other two blocks?

jacquelinekay (Contributor, Author)

To implement this more optimally, I could have one mutex for publishers_ and then a mutex for each PublisherInfo entry in publishers_. That way one thread could look up an entry in publishers_ while another is looking up something in the map owned by an entry in publishers_ (I believe that would be fine). And yes, that implementation would include moving the mutex for publishers_ into the block where find is invoked on the map.
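A rough sketch of that two-level locking idea (all names and types here are hypothetical, and, as the next reply notes, this finer-grained scheme is not what was merged): the map-level mutex is held only while locating the entry, and each entry carries its own mutex.

#include <cstdint>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical per-publisher record with its own lock.
struct PublisherInfo
{
  std::mutex mutex;
  std::map<uint64_t, uint64_t> sequences;  // placeholder per-publisher state
};

class FineGrainedManager
{
public:
  bool lookup(uint64_t publisher_id, uint64_t key, uint64_t & value)
  {
    std::shared_ptr<PublisherInfo> info;
    {
      // Hold the map-level lock only while finding the entry...
      std::lock_guard<std::mutex> lock(publishers_mutex_);
      auto it = publishers_.find(publisher_id);
      if (it == publishers_.end()) {
        return false;
      }
      info = it->second;
    }
    // ...then work on the entry under its own lock, so another thread can
    // use a different publisher's entry at the same time.
    std::lock_guard<std::mutex> entry_lock(info->mutex);
    auto it = info->sequences.find(key);
    if (it == info->sequences.end()) {
      return false;
    }
    value = it->second;
    return true;
  }

private:
  std::mutex publishers_mutex_;
  std::map<uint64_t, std::shared_ptr<PublisherInfo>> publishers_;
};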

Member

A mutex per item sounds like overkill; I would leave it as is.

I hadn't dug into the data types being handled, and didn't realize that you still need exclusion after pulling the items out of publishers_.

gerkey (Member) commented Dec 3, 2015

Without this change, I pretty reliably get a segfault after a dozen or so iterations of the test added in ros2/system_tests#72. With this change, I'm now at > 100 iterations with no problem.

FYI, the segfault (from a Debug build of ROS 2, but with a non-Debug build of OpenSplice) starts like this:

#0  0x00007f721d621b63 in std::_Rb_tree_increment(std::_Rb_tree_node_base const*) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007f721dd4bb49 in std::_Rb_tree_const_iterator<unsigned long>::operator++ (this=0x7f721960b930)
    at /usr/include/c++/4.8/bits/stl_tree.h:270
#2  0x00007f721dd4a05e in std::__find<std::_Rb_tree_const_iterator<unsigned long>, unsigned long> (__first=..., 
    __last=..., __val=@0x7f721960b9b8: 161) at /usr/include/c++/4.8/bits/stl_algo.h:140
#3  0x00007f721dd49280 in std::find<std::_Rb_tree_const_iterator<unsigned long>, unsigned long> (__first=..., 
    __last=..., __val=@0x7f721960b9b8: 161) at /usr/include/c++/4.8/bits/stl_algo.h:4441
#4  0x00007f721dd48865 in rclcpp::intra_process_manager::IntraProcessManagerState<std::allocator<void> >::take_intra_process_message (this=0x973a78, intra_process_publisher_id=158, message_sequence_number=1, 
    requesting_subscriptions_intra_process_id=161, size=@0x7f721960ba68: 0)
    at /home/gerkey/ros2_ws/src/ros2/rclcpp/rclcpp/include/rclcpp/intra_process_manager_state.hpp:209
#5  0x00000000004ed496 in rclcpp::intra_process_manager::IntraProcessManager::take_intra_process_message<test_rclcpp::msg::UInt32_<std::allocator<void> >, std::allocator<void>, std::default_delete<test_rclcpp::msg::UInt32_<std::allocator<void> > > > (this=0x979380, intra_process_publisher_id=158, message_sequence_number=1, 
    requesting_subscriptions_intra_process_id=161, message=...)
    at /home/gerkey/ros2_ws/install/include/rclcpp/intra_process_manager.hpp:316
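For context, frame #2 above is std::find walking a std::set (a red-black tree) while, without a lock, another thread can be inserting into or erasing from the same set; the merged change serializes that access with runtime_mutex_. A minimal sketch of the guarded pattern, with container and member names that are illustrative rather than the exact rclcpp fields:

#include <algorithm>
#include <cstdint>
#include <mutex>
#include <set>

class SubscriptionTargets
{
public:
  void add(uint64_t id)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    ids_.insert(id);
  }

  bool contains(uint64_t id)
  {
    // Without this guard, std::find could traverse tree nodes that another
    // thread is concurrently inserting, erasing, or rebalancing, which is
    // the kind of crash shown in the backtrace above.
    std::lock_guard<std::mutex> lock(mutex_);
    return std::find(ids_.begin(), ids_.end(), id) != ids_.end();
  }

private:
  std::set<uint64_t> ids_;
  std::mutex mutex_;
};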

jacquelinekay (Contributor, Author)

I can run up to iteration 650 of the multithreaded test and then I get an error from OpenSplice:

Description : The Handle Server ran out of handle space

I get a similar error when I repeat single-threaded tests indefinitely, however, so I don't think it's relevant.

gerkey (Member) commented Dec 3, 2015

@jacquelinekay I get the same OpenSplice error, in my case after 116 iterations. Agreed that it's not related to this change (though it may indicate a resource management issue that we'll have to tackle at some point).

gerkey (Member) commented Dec 3, 2015

+1

jacquelinekay added a commit that referenced this pull request Dec 3, 2015
Make intra-process manager thread safe, rename IPMState to IPMImpl
jacquelinekay merged commit f73ebcb into master on Dec 3, 2015
jacquelinekay deleted the intra_process_lock branch on December 3, 2015
jacquelinekay removed the "in review" label (Waiting for review, Kanban column) on Dec 3, 2015
dirk-thomas (Member)

Please create a ticket for the resource problem to keep track of it.

jacquelinekay (Contributor, Author)

ros2/rmw_opensplice#99

nnmm pushed a commit to ApexAI/rclcpp that referenced this pull request Jul 9, 2022
* add timer test

* more tests

* another one just for fun

* uncrustify
DensoADAS pushed a commit to DensoADAS/rclcpp that referenced this pull request Aug 5, 2022
Generate the rclcpp::Node before start_recording, since rclcpp::Node sets the use_sim_time parameter and publishes a message to parameter_events; otherwise this causes a wrong message count in the test.

Signed-off-by: evshary <evshary@gmail.com>