-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test for multithreaded execution #72
Conversation
#endif | ||
|
||
static inline void multi_consumer_pub_sub_test(bool intra_process) { | ||
rclcpp::init(0, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This calls init
twice which currently is not a problem but might be as soon as the function does some real initialization work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't have a custom main which calls init before starting the tests. Where does init get called twice?
Also, I call shutdown
in all of the test cases, generally to stop the while loop in spin
. Do I have to call init
at the beginning of each test case since it is likely that the previous test case called shutdown
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both tests multi_consumer_single_producer
and multi_consumer_intraprocess
are calling this function and therefore init
.
Currently we don't have a function symmetric to init
. So I can't recommend anything but calling it only once in the main.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't shutdown
be made symmetric to init
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be made symmetric. I was just referring to that it currently is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (calling init, then shutdown, then init) actually is giving me a problem when I run all the test cases at once. I believe ros2/rclcpp#152 fixes the problem.
ament_add_gtest( | ||
gtest_multithreaded__${middleware_impl} | ||
"test/test_multithreaded.cpp" | ||
TIMEOUT 30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long is the expected runtime? Could the timeout be shorter?
I'm moving this back to In Progress to work on a segfault I've discovered in this test. |
580c94e
to
aebfb2d
Compare
ee18287
to
154e6c3
Compare
With the fix @gerkey pushed yesterday, I ran this test on repeat and got a segfault in the multi_access_publisher_intra_process case after 245 iterations. Previously it would take 1-10 iterations to get a segfault on my machine. I also no longer see deadlock. I will investigate the segfault and try to reproduce it. The core dump appears to have a corrupt stack and the code wasn't built in Debug mode. |
For better debugging I would also recommend to consider using a debug build of the rmw impl. you are using. |
@jacquelinekay, are you testing with the changes in ros2/rclcpp#165? |
With the changes made in ros2/rclcpp#165, I have run the test for over 400 iterations without seeing a segfault or deadlock. |
154e6c3
to
6aacc17
Compare
I've rebased this branch |
+1 (with ros2/rclcpp#165 required for reliable execution) This test helped us find some nasty races. Good stuff. |
@@ -12,6 +12,9 @@ endif() | |||
|
|||
find_package(ament_cmake REQUIRED) | |||
|
|||
#set (CMAKE_CXX_COMPILER "clang++") | |||
#set (CMAKE_LINKER "llvm-ld") | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed before merging.
80b57dc
to
31e9014
Compare
dbe538c
to
7feaba6
Compare
Test for multithreaded execution
This PR seem to be the reason that the master build on Windows is failing now: http://ci.ros2.org/job/ci_windows_opensplice/733/ Why did the previous CI jobs pass? Did you change further stuff after that? |
With the PR ros2/rclcpp#167 the code builds at least. But some tests fail: http://ci.ros2.org/job/ci_windows_opensplice/734/testReport/ |
Those test cases pass reliably on my Linux machine. I suspect the failures could be due to quirks in the Windows thread scheduler. |
While running these tests I also saw the following output. The test passed but the values for the first iterations look pretty wrong. Shouldn't this fail the test somehow?
|
The tests actually never check the data, they only check the number of times the callback was called (store in atomic uint called counter). I also never see that error on my machine. Which platform are you running on? And is it every time you run the test case or does it happen randomly? |
The snippet is from http://ci.ros2.org/job/ci_osx/541/consoleFull at time 33:57.272. |
This PR tests MultiThreadedExecutor
spin
andspin_some
in 4 test cases:Connects to ros2/ros2#92