
Create SLAM launch files analog to localization launch files #1630
Closed · SteveMacenski opened this issue Apr 8, 2020 · 22 comments

@SteveMacenski (Member) commented Apr 8, 2020

This is to make it easy to switch between positioning systems and give a clear separation of them from the core navigation code ("look, you can use AMCL or SLAM Toolbox without changing anything else; all positioning systems are fair game"). Add the map saver server to this launch file so it is analogous to the map server in the localization launch file.
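For reference, a minimal sketch of what such a slam_launch.py could look like, assuming the Foxy-style launch API, SLAM Toolbox's online_sync_launch.py, and a map_saver_server executable in nav2_map_server (names illustrative, not the final implementation):

import os

from ament_index_python.packages import get_package_share_directory
from launch import LaunchDescription
from launch.actions import IncludeLaunchDescription
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch_ros.actions import Node


def generate_launch_description():
    slam_toolbox_dir = get_package_share_directory('slam_toolbox')
    return LaunchDescription([
        # Reuse SLAM Toolbox's own launch file instead of duplicating its nodes
        IncludeLaunchDescription(
            PythonLaunchDescriptionSource(
                os.path.join(slam_toolbox_dir, 'launch', 'online_sync_launch.py'))),
        # Map saver server, mirroring the map server in localization_launch.py
        Node(
            package='nav2_map_server',
            executable='map_saver_server',
            name='map_saver',
            output='screen'),
    ])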

@ruffsl (Member) commented Apr 16, 2020

I'd like to get around to updating the turtlebot3_demo from the security working group to ROS 2 Foxy soon after its release, as it gives a viable example of a modest stack to showcase SROS2 with. Switching from Cartographer to SLAM Toolbox with the launch files provided here might be nice.
https://github.com/ros-swg/turtlebot3_demo

BTW, is there a ticket we could follow to track the Foxy release of navigation2?

@SteveMacenski (Member, Author)

Foxy milestone here: https://github.com/ros-planning/navigation2/milestone/15, though from where things stand right now, I think most of those won't be done in the next few weeks.

I've been working off this queue: https://github.com/ros-planning/navigation2/projects/2, focused on stability and completeness over new feature development after Matt and Co. left the project. For the last month or so I've been working on a new planner, which is one of the reasons my relative push traffic has dropped.

I'm pretty sure a minor adaptation of the launch file in SLAM Toolbox will be sufficient. I just haven't gotten to A) porting/testing, B) testing on hardware to make some nice demo content, and C) writing up a wiki/blog/website page on it.

I’d be more than happy to have an SROS2 demo/use-case/case study as well to show some tangible integration into a relatively complete application!

@SteveMacenski (Member, Author) commented May 5, 2020

@AlexeyMerzlyakov while we're designing the semantics stuff, this would be a good ticket to complement your map server work. This launch file should include the SLAM Toolbox launch file/config and your new map saver server. See https://github.com/ros-planning/navigation2/blob/master/nav2_bringup/bringup/launch/localization_launch.py as a guide.

You can also add it to https://github.com/ros-planning/navigation2/blob/master/nav2_bringup/bringup/launch/bringup_launch.py with a conditional, based on a param, to launch in SLAM or localization mode.
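A hedged sketch of what that conditional could look like in bringup_launch.py, assuming a boolean slam launch argument and a new slam_launch.py next to the existing localization_launch.py (argument name and paths illustrative):

import os

from ament_index_python.packages import get_package_share_directory
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument, IncludeLaunchDescription
from launch.conditions import IfCondition, UnlessCondition
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch.substitutions import LaunchConfiguration


def generate_launch_description():
    launch_dir = os.path.join(
        get_package_share_directory('nav2_bringup'), 'launch')
    slam = LaunchConfiguration('slam')
    return LaunchDescription([
        DeclareLaunchArgument(
            'slam', default_value='False',
            description='Whether to run SLAM instead of localization'),
        # slam:=True -> SLAM Toolbox + map saver server
        IncludeLaunchDescription(
            PythonLaunchDescriptionSource(
                os.path.join(launch_dir, 'slam_launch.py')),
            condition=IfCondition(slam)),
        # slam:=False -> AMCL + map server, as today
        IncludeLaunchDescription(
            PythonLaunchDescriptionSource(
                os.path.join(launch_dir, 'localization_launch.py')),
            condition=UnlessCondition(slam)),
    ])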

That would give direct exposure to your work. Optionally, update the SLAM tutorial to use it and explain the map saver server: https://navigation.ros.org/tutorials/docs/navigation2_with_slam.html

(Additionally, given the state of the design discussions, it might be a good idea to start looking at how to change the map server to load YAML or XML based on the file extension.)
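Purely illustrative, the extension dispatch could be as simple as the following (load_yaml_map and load_xml_map are hypothetical helpers, not existing map server API):

import os


def load_map(path):
    # Hypothetical dispatch on file extension
    ext = os.path.splitext(path)[1].lower()
    if ext in ('.yaml', '.yml'):
        return load_yaml_map(path)  # hypothetical: existing YAML loading path
    if ext == '.xml':
        return load_xml_map(path)   # hypothetical: new XML loading path
    raise ValueError('Unsupported map format: ' + ext)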

@AlexeyMerzlyakov (Collaborator) commented May 7, 2020

WIP. I am currently on AlexeyMerzlyakov@02e6815. There will be two new parameters for tb3_simulation_launch.py and the nested bringup_launch.py:

  • run_slam
  • run_navigation

Each parameter toggles its own mode: SLAM, navigation, or SLAM & navigation (see the sketch after the usage examples below). Default values are set to run in navigation-only mode for backward compatibility.
Examples of usage:

ros2 launch nav2_bringup tb3_simulation_launch.py
ros2 launch nav2_bringup tb3_simulation_launch.py run_slam:=False run_navigation:=True
ros2 launch nav2_bringup tb3_simulation_launch.py run_slam:=True run_navigation:=False
ros2 launch nav2_bringup tb3_simulation_launch.py run_slam:=True run_navigation:=True
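Under the hood this could be wired with two launch arguments gating the includes, something like the following sketch (file paths assumed):

from launch.actions import DeclareLaunchArgument, IncludeLaunchDescription
from launch.conditions import IfCondition
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch.substitutions import LaunchConfiguration

slam_launch_file = '/path/to/slam_launch.py'              # path assumed
navigation_launch_file = '/path/to/navigation_launch.py'  # path assumed

declare_run_slam = DeclareLaunchArgument(
    'run_slam', default_value='False',
    description='Whether to launch SLAM Toolbox + map saver')
declare_run_navigation = DeclareLaunchArgument(
    'run_navigation', default_value='True',
    description='Whether to launch the navigation stack')

# Each include is gated by its own argument, so all four combinations work
slam = IncludeLaunchDescription(
    PythonLaunchDescriptionSource(slam_launch_file),
    condition=IfCondition(LaunchConfiguration('run_slam')))
navigation = IncludeLaunchDescription(
    PythonLaunchDescriptionSource(navigation_launch_file),
    condition=IfCondition(LaunchConfiguration('run_navigation')))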

There are still some problems. The biggest one is that when I run SLAM Toolbox, it produces the correct odom -> map transform, as seen in view_frames -> frames.pdf. However, this transform is not visible in rviz2 or to the navigation nodes, so SLAM & navigation mode does not work. I also hit the same problem when going through the https://navigation.ros.org/tutorials/docs/navigation2_with_slam.html tutorial. So it looks like a local TF problem. Will look further to fix it.
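(A generic way to check what transform a fresh TF listener actually sees, not specific to this setup:

ros2 run tf2_ros tf2_echo map odom

If this prints the transform while rviz2 does not see it, the problem is likely in timing or QoS rather than in the publisher.)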

@SteveMacenski (Member, Author)

I think you can just do a slam param defaulting to false. If they just want SLAM and no navigation, they really shouldn't be using this package's launch files 😉

@AlexeyMerzlyakov (Collaborator)

There are still some problems

Status update:
I've hit 3 problems so far. Two of them have been fixed; one is in progress.

[Issue No.1: TF time issue] <- Fixed
The odom->map SLAM transform and the other transforms were published on different time bases:
[Screenshot: Screenshot_2020-05-22_16-47-17]
This was resolved by PR SteveMacenski/slam_toolbox#203 in SLAM Toolbox, adding support for the use_sim_time parameter in its launch files.
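For context, the general pattern for forwarding use_sim_time from a launch file to a node looks like this (a minimal sketch; the actual fix lives in SteveMacenski/slam_toolbox#203 and its details may differ):

from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import Node


def generate_launch_description():
    use_sim_time = LaunchConfiguration('use_sim_time')
    return LaunchDescription([
        DeclareLaunchArgument(
            'use_sim_time', default_value='true',
            description='Use the simulation (Gazebo) clock'),
        # With use_sim_time set, the node stamps its transforms from /clock,
        # so they share a time basis with the rest of the simulated system
        Node(
            package='slam_toolbox',
            executable='sync_slam_toolbox_node',
            parameters=[{'use_sim_time': use_sim_time}],
            output='screen'),
    ])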

@AlexeyMerzlyakov (Collaborator) commented May 22, 2020

[Issue No.2: cannot start navigation] <- Fixed
After navigation with SLAM started, the robot could not produce a path to any goal.
The history of the problem starts in ticket #1521, where the robot's RealSense camera began to block the laser scan and caused some flickering. After the fix was applied, the flickering visually disappeared. But in my case it looks like it sometimes still appears: not noticeable visually, but causing an artifact on the map:
[Images: map_artifact, camera_artifact_gazebo]
This causes the costmap to be generated with high and lethal costs around the robot, which effectively won't allow it to move:
[Image: lethal_costmap_around_robot_2]
The problem was fixed for me just by slightly lowering the camera in the waffle model:

--- a/nav2_bringup/bringup/worlds/waffle.model
+++ b/nav2_bringup/bringup/worlds/waffle.model
@@ -413,7 +413,7 @@
           <mass>0.035</mass>
         </inertial>
         <collision name="collision">
-          <pose>0 0.047 -0.004 0 0 0</pose>
+          <pose>0 0.047 -0.005 0 0 0</pose>
           <geometry>
             <box>
               <size>0.008 0.130 0.022</size>
@@ -441,7 +441,7 @@
         </sensor>
 
         <collision name="collision">
-          <pose>0 0.047 -0.004 0 0 0</pose>
+          <pose>0 0.047 -0.005 0 0 0</pose>
           <geometry>
             <box>
               <size>0.008 0.130 0.022</size>

@AlexeyMerzlyakov (Collaborator) commented May 22, 2020

[Issue No.3: robot sometimes hangs] <- Not fixed yet
The problem appears after a navigation goal is set and the robot starts navigating with SLAM running simultaneously. Sometimes during navigation the robot stops with the message:

[controller_server-9] [ERROR] [1590144725.885012683] [tf_help]: Transform data too old when converting from odom to map
[controller_server-9] [ERROR] [1590144725.885060537] [tf_help]: Data time: 17s 0ns, Transform time: 16s 766000000ns
[controller_server-9] [ERROR] [1590144725.885617956] [controller_server]: Unable to transform robot pose into global plan's frame
[controller_server-9] [WARN] [1590144725.885676212] [controller_server_rclcpp_node]: [follow_path] [ActionServer] Aborting handle.

After that, the robot won't react to any goal (it looks like the controller stopped working after an exception was thrown along with this message).
The root cause of the problem looks like the one described by @SteveMacenski in #1585 (comment). I've tried making SLAM Toolbox stamp its TF messages 1 second ahead when publishing:

   msg.header.frame_id = "/map";
-  msg.header.stamp = this->now();
+  msg.header.stamp = this->now() + rclcpp::Duration(1000000000);
   tfB_->sendTransform(msg);

After this, the problems appear more rarely, but the issue is not fully fixed. Continuing to look into it.

@AlexeyMerzlyakov (Collaborator)

I think you can just do a slam param default to false. If they just want slam and no navigation, they really shouldn’t be using this package’s launch files wink

This was done: there will be only one slam parameter (False by default) that enables SLAM Toolbox + the map saver server during navigation. I am currently on AlexeyMerzlyakov@3579850.

@SteveMacenski (Member, Author)

I'm really confused about why you're running into so many issues. Several users and I have been working with SLAM Toolbox in Navigation2 for months now using the launch files in the repo, and I haven't experienced any of these issues. All this should require is a launch file including that one and a bool param to toggle localization or SLAM.

What's the current state of things? You posted a number of issues that have been fixed; what, specifically, is still unfixed and blocking this? SLAM Toolbox already has a transform timeout param to lead the transforms: https://github.com/SteveMacenski/slam_toolbox/blob/eloquent-devel/src/slam_toolbox_common.cpp#L233. What's the blocker to shipping this?

@AlexeyMerzlyakov (Collaborator) commented May 25, 2020

Yes, the remaining Issue No.3 is the blocking problem for now. Today I've tested increasing the transform_timeout SLAM Toolbox parameter from 0.2 to 1.0. Surprisingly, it works well if the FollowPath.transform_tolerance DWB parameter is increased from 0.2 to 1.0 as well. That helped: in Navigation+SLAM mode the robot now works stably:
[Animation: navigation_with_slam]

It looks like when SLAM starts to publish messages 1 second ahead, DWB should have its tolerance increased by the same amount. Maybe this increases the number of valid TFs available to DWB, which appears to be critical in a high-workload environment.

What do you think about such a workaround? If you are OK with it, I may submit one more PR to SLAM Toolbox to support setting the transform_timeout parameter via launch files, and will set 1.0 for these two parameters when slam:=True.
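Expressed as parameter overrides in a launch file, the workaround would look roughly like this (node and plugin names assumed from the default nav2/slam_toolbox setup; illustrative only):

from launch_ros.actions import Node

# SLAM Toolbox: publish the map->odom transform 1.0 s into the future
slam_toolbox_node = Node(
    package='slam_toolbox',
    executable='sync_slam_toolbox_node',
    parameters=[{'transform_timeout': 1.0}])

# Controller server: let the DWB FollowPath plugin accept transforms
# up to 1.0 s old, to match
controller_server_node = Node(
    package='nav2_controller',
    executable='controller_server',
    parameters=[{'FollowPath.transform_tolerance': 1.0}])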

@SteveMacenski (Member, Author)

That should have nothing to do with it. If AMCL runs with 0.2, then SLAM can as well; there's nothing different between them as far as TF is concerned. Having to change DWB too doesn't build confidence that this is the right choice, since DWB is fully decoupled from this issue.

@AlexeyMerzlyakov (Collaborator)

Confirmed: problem No.3 is a local PC performance problem. It looks like TF arrives late because of the high topic workload from the huge PointCloud2 data published during the TB3 simulation (a RealSense depth camera is included in the Navigation2 waffle.model; the PointCloud2 data rate is about ~10 MBytes/second at the default RealSense publish rate of 5 Hz). After I temporarily disabled the RealSense depth camera, the problem completely disappeared on my PC when running navigation with SLAM Toolbox. Also, after increasing the RealSense publish rate, I observed once or twice the same messages appearing for AMCL in navigation mode as well. This might also explain why the problem was not observed when launching the simulation through the turtlebot3_gazebo package: the TB3 model from that package does not include the RealSense depth simulation.

So it looks to be a local performance problem, and for now it does not need a fix in the mainline. I think we can return to this if it appears somewhere else.

Everything is already done locally, and Navigation+SLAM works as intended. So, if you don't mind, I would like to complete this task: SteveMacenski/slam_toolbox#204 is to be closed as invalid, and #1768 will be re-submitted with simplifications made.

@SteveMacenski (Member, Author)

If you don't have a strong computer, it could also be due to the high CPU load. That depth camera plugin is brutal.

@ruffsl (Member) commented May 28, 2020

That depth camera plugin is brutal.

I think things could improve with this PR, but it needs someone to prod it into mainline:

[ros2] CUDA accelerate depth camera: ros-simulation/gazebo_ros_pkgs#981

@SteveMacenski (Member, Author)

-1 for CUDA; it should be OpenCL so it works on all machines. CUDA doesn't help me (or a lot of people) at all.

@ruffsl (Member) commented May 28, 2020

I wish CUDA didn't have such a monopoly in the ML community. As an aside, I haven't seen others using AMD hardware acceleration in Docker. Have you had any luck with non-NVIDIA GPUs?

@SteveMacenski (Member, Author) commented May 28, 2020

Intel CPUs have small GPUs in them that you can use for things like this with OpenCL. Overall, for open source, I want everything to be OpenCL so that it's good for all use cases. I agree that in ML there is a bit of a reason for CUDA, because you couldn't get AMD or other branded GPUs to do what you needed for training and deployment.

Robotics is a different beast entirely: it uses a bunch of Intel hardware, and on the robot itself the GPU is basically unutilized, since most robots don't have screens driven from the host machine (or no screen at all). It's a wasted resource that something like that could really utilize. If someone implemented some absolutely amazing thing in navigation2 with CUDA, I'd probably accept it, but I'd recommend anyone, before starting, to use OpenCL to stay flexible.

Good examples would be the obstacle/voxel layer, AMCL, maybe DWB, TEB for sure, and maybe the inflation layer; these really should be GPU optimized. You don't need much: the embedded GPU would likely be enough for most, if not all, of those tasks.

But yes, I have had luck before with non-NVIDIA GPUs for smaller GPU tasks. I wouldn't run my 3-stage CNN or a deep net on one, but it's totally suitable for typical non-deep-net applications like ray casting, particle updates, planning and control sampling, and even less-deep AI applications like random forests and other techniques. I suppose it would also work well for shallow and small networks (e.g. non-image-data networks: think IMU, laser, etc.).

@ruffsl (Member) commented May 28, 2020

Robotics is a different beast entirely...
It's a wasted resource that something like that could really utilize.

I've seen a number of robotics planning and control projects already using GPUs on embedded platforms:

The core idea behind MPPI is to sample thousands of trajectories really fast. This is accomplished by implementing the sampling step on a GPU, for which you will need CUDA.
https://github.com/AutoRally/autorally#1-install-prerequisites

@SteveMacenski (Member, Author)

Your link makes a lot of sense: for things like DWA or other controllers that sample a bunch of paths, GPUs make a ton of sense. And especially because TEB, DWB, etc. never run as fast as I want, optimizing there would be really high-impact. If we can speed up the controllers and the ray casting for the costmap layers, we could potentially run the local costmap at 10 Hz or something and the controller at 50 Hz.

@SteveMacenski (Member, Author) commented May 28, 2020

Great link, I haven't seen that before; I'll have to look it over later today. Do you know if they have local planners worth using / porting to ROS 2?

@ruffsl (Member) commented May 29, 2020

Not sure, but we could ask on their repo.
