-
Notifications
You must be signed in to change notification settings - Fork 436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support dedicated executor for component nodes #1781
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably we would want to discuss on implementation details, and test would be ideal too.
I think it's better to add an item |
@iuhilnehc-ynos i thought that too, i think that also makes sense. |
@iuhilnehc-ynos is it necessary to add a new item |
From my point of view, I think that I think that the user does not want to get What do you think? |
Sorry, I made a mistake. After reconsideration, I think it's OK to use |
according to suggestions from @fujitatomoya @iuhilnehc-ynos , i load |
@fujitatomoya @iuhilnehc-ynos Can you take another look? Zhengpeng made the requested changes:
|
@gezp i think i am good to go with test cases. |
With your approval, what's the block to merging? I actually don't know for rclcpp, do you require 2 approvals from maintainers? |
i am not sure if that is a rule, but i would like to hear at least one maintainer on this one. |
i fixed some issues and added test for
|
Any update? |
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2021-10-28/22947/1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
i rebased this PR, please retrigger CI. @fujitatomoya |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments, could you please check them?
rclcpp_components/include/rclcpp_components/component_manager.hpp
Outdated
Show resolved
Hide resolved
rclcpp_components/include/rclcpp_components/component_manager.hpp
Outdated
Show resolved
Hide resolved
thanks for your suggestions @fujitatomoya @iuhilnehc-ynos , i updated this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Love that green ❤️ |
@ivanpauno @wjwwood @jacobperron requesting maintainer review on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding this option seems like too much configurability to me.
I would rather have an specialized component manager in another repo for this use case or just use manual composition when needed.
I understand this can be handy though, so I'm happy with it if another maintainer approves.
I have left some minor comments in the implementation.
rclcpp_components/include/rclcpp_components/component_manager.hpp
Outdated
Show resolved
Hide resolved
exec->remove_node(wrapper.second.get_node_base_interface()); | ||
} | ||
} | ||
for (auto & exec : dedicated_executors_) { | ||
exec.second->cancel(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cancel() has some issues, particularly it's ignored if it happens before the call to spin()
.
That can causes races if someone starts and interrupt the component manager fast.
I would rather use spin_until_future_complete(), and complete the promise when you need to cancel the executor thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rclcpp/rclcpp_components/src/component_manager.cpp
Lines 257 to 258 in 5fcb789
dedicated_executors_[node_id] = exec; | |
dedicated_threads_[node_id] = std::thread([exec]() {exec->spin();}); |
It is added to dedicated_executors_
right before being spun, so it should always have been called spin
to be on that map in the first place. I suppose the only way to make it more bulletproof is if those 2 lines are swapped so that its only added to dedicated_executors_
after its been spinning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Swapping the lines doesn't help.
The issue is that though the thread is already spawned, you don't actually know when there will be a context switch and spin()
will be actually called.
In an extremely unlucky scenario, the thread is spawn and an unload request is received before spin()
is called, resulting in the cancellation request being lost.
the "right way" to do it is:
std::promise<void> prom;
dedicated_threads_[node_id] = std::thread([exec, cancel_token=prom->get_future()]() {exec->spin_until_future_complete(cancel_token);});
dedicated_promises_[node_id] = std::move(promise);
- dedicated_executor->second->cancel();
+ dedicated_promise->second->set_value();
(the code will need some more changes than that, that's just the idea)
there's no race possible in that case, as a future/promise channel is stateful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the specific suggestion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for more details about how to implement, it's helpful to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to agree with @ivanpauno; rather than introducing a new option, I think it would be better to create a different component container that uses this dedicated executor approach (e.g. then you don't necessarily need the extra argument use_dedicated_thread
).
Technically, we can create a new executable where we pass a different executor to ComponentManager
or create an extension of ComponentManager
to use multiple single-threaded executors (like in this PR).
For example, we ended up creating a custom container in the domain_bridge that extends ComponentManager
to add a new option for setting the domain ID, versus adding more options in the base class here.
options.use_intra_process_comms(extra_argument.get_value<bool>()); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about the decision to move this out of create_node_options
. Is there a particular reason?
It technically could break any subclasses that were depending on this option being set in this method (though I admit it is not particularly well-documented).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for pointing this problem, i agree with you that subclasses could be affected. use_intra_process_comms
and use_dedicated_thread
both are also extra argument, so i wanted to avoid duplicated code by using map to search extra argument, but i didn't notice this problem.
Agreed, I figure it better to be 'complete' though and its only an extra 6 lines of code to provide it out of the box initially. |
I am also good to go with this. |
i use a simple way to implement isolate executor container, which only support dedicated single threaded executor. accoding to @SteveMacenski suggestions, in addition, i use |
Its hard to tell without seeing it. It sounds like you would only need to change
I think that sounds the cleanest to me. You'll obviously need a little bit of code in the constructor / destructor as well, but little to any duplication if you call the base class constructor / destructors' to handle their own resources. What do you think? |
i updated PR.
need |
std::unordered_map<uint64_t, std::shared_ptr<rclcpp::Executor>> dedicated_executors_; | ||
std::unordered_map<uint64_t, std::thread> dedicated_threads_; | ||
std::unordered_map<uint64_t, std::promise<void>> dedicated_promises_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can create a custom struct for the three values (executor, thread, promise) and create only one unordered map.
You could also use a tuple instead of defining a custom struct (defining a custom struct makes the code more readable than using tuples).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. A custom struct is the cleanest way
} | ||
|
||
void | ||
ComponentManagerIsolated::on_load_node( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the code here is repeated from the base class and could be shared with it.
Same in on_unload_node().
Anyways the approach seems reasonable!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to ignore this comment
|
||
TEST_F(TestComponentManager, components_api) | ||
{ | ||
test_components_api(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nicer to use a parameterized test for this, see https://github.com/google/googletest/blob/main/docs/advanced.md#value-parameterized-tests.
If you feel that's an overkill here, add scoped traces so it's more clear where the errors come from.
You could also return an AssertionResult to avoid this pitfall.
I would like that, yes. I think that would make this future proof. This may require making the normal
👍 |
@@ -0,0 +1,33 @@ | |||
// Copyright 2019 Open Source Robotics Foundation, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be your own copyright 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, the copyright can remain OSRF, and there may be (legal) reasons to prefer leaving it as-is for both the contributor and OSRF. But, there's precedent for allowing additional copyright holders in ROS 2 source code. I would defer to @clalancette and/or @tfoote for clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's certainly not required to put your own name there, but it's also totally fine as long as the changes to the file are non-trivial or it's a new file (which is the case here).
So, putting yourself there is ok, and leaving it as-is is also ok. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, putting yourself there is ok, and leaving it as-is is also ok. :)
Yes, exactly. Since this is a non-trivial amount of new work, it is perfectly acceptable to make yourself the copyright holder. In that case, you would change the line to be // Copyright 2021 <my name here>
. On the other hand, it is also perfectly fine to "assign" the copyright to the Open Source Robotics Foundation, which gives the foundation most of the legal power over the code. Totally up to you which way you want to go.
(the usual disclaimer of "I am not a lawyer" and "this is not legal advice" applies here)
protected: | ||
RCLCPP_COMPONENTS_PUBLIC | ||
void | ||
on_load_node( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll want to add documentation here like you see in the other .hpp files for auto-doc generation
std::unordered_map<uint64_t, std::shared_ptr<rclcpp::Executor>> dedicated_executors_; | ||
std::unordered_map<uint64_t, std::thread> dedicated_threads_; | ||
std::unordered_map<uint64_t, std::promise<void>> dedicated_promises_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. A custom struct is the cleanest way
const std::shared_ptr<LoadNode::Request> request, | ||
std::shared_ptr<LoadNode::Response> response) | ||
{ | ||
(void) request_header; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entire code block L46-L87 and L103-114 are exactly the same as in the other container.
What you could do instead is what I suggested above. Put the things in L88-L102 into a separate, virtual function with the implementation that is overrided. Then you won't even need to override or implement on_load_node
. This is really good not to duplicate code + makes it so any change to the main container are automatically applied here.
If you did that too, it would resolve Ivan's comment above on duplication as well, because you wouldn't have on_load_node
at all.
i updated PR.
please review again @SteveMacenski @ivanpauno . |
|
||
#include "rclcpp_components/component_manager.hpp" | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its up to core maintainers (@ivanpauno ) but I would rather the other containers follow suit with the ExecutorT
so that you can remove the constructor field. The template for type + executor constructor argument are redundant. You can have the same effect by removing all constructor arguments from the existing component_manager.hpp
and making that also templated, which constructs its internal executor by instantiating ExecutorT()
.
I'm not strongly opinioned one way or another - my thoughts are only for consistency which we may not need to strive for here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, needs the function documentation inline (see component_manager.hpp, you added there, just add here too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we're not taking any advantage of the template parameter here.
We can delete it or be consistent in the base class, I would just remove it as it doesn't seem to be needed.
(with the template parameter we could've not added the new virtual methods, but both approaches are fine)
This looks really good to me. This has my stamp of approval after the inline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I would prefer to see the component_container_mt_isolated
removed before merging.
See my other comment.
according to suggestions from @ivanpauno , @wjwwood , @SteveMacenski , i remove |
@gezp this seems to need a rebase to make CI pass, thanks! |
Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> update default value of use_dedicated_executor Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> create ComponentManagerIsolated Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> use executor wrapper Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> remove duplicated code Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> template ComponentManagerIsolated and add component_container_mt_isolated Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> update test Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> updaate CMakeLists.txt Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> add some docs comment for function Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com> remove component_container_mt_isolated and add CLI option Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com>
I approve this, though I would feel better if the normal container also contained the multi-threaded option and removed the separate container, but I leave it to maintainers. |
That seems fine to me. We should check the impact of removing the executable, I guess some launch files will break. I would do that in a new PR instead of here, as deprecating/removing an executable will likely require changes in other core repos, which will block this PR further. |
Thanks @gezp for the contribution!! |
👨🌾 I think this PR broke our Windows CI builds: The error:
This PR was modifying |
Signed-off-by: zhenpeng ge <zhenpeng.ge@qq.com>
Signed-off-by: zhenpeng ge zhenpeng.ge@qq.com
Related to #1774
it offer an “executor per component” behavior for container, we can use
use_dedicated_thread
parameter to enable this.