Integrate 2D Lidar data into Donkeycar #910
## Integrating 2D Lidar data into Donkeycar

### The 2D Lidar data

A spinning lidar captures data in a 360 degree arc around it. It provides each data point as a distance from the lidar and the angle at which that reading was taken. It has some maximum working range; that range can be thought of as the radius of a circle with the lidar at its center. Beyond that range we may still get some readings, but we treat those as noise. The lidar's 'zero' angle is relative to itself, and so is all the data it provides (it's super narcissistic). But we care about where the points are relative to the vehicle the lidar is mounted on. If we treat directly forward as the 'zero' angle for the vehicle, then we need to know the offset of the lidar's zero angle relative to the forward direction of the vehicle. We can then adjust the angle component of each data point provided by the lidar to turn it into an angle relative to the vehicle. It may well be that the lidar is mounted so that its zero angle exactly coincides with forward on the vehicle, but we should include this offset (which would be zero in that case) anyway, so we have a more general model that can be applied to other lidars that may not have their zero in the same place.

### Plotting the 2D Lidar data

Once we turn the lidar data into a distance and angle relative to the vehicle, we can use trigonometry to calculate (x, y) coordinates relative to the vehicle; the vehicle is defined as being at (0, 0). We would likely want to set a bounding box in the 2D data space so we can clip data we don't want. We would also want the dimensions of the bitmap we are to render onto. A minimal sketch of this transformation is shown below.
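As an illustration only (this is not the existing `donkeycar/parts/lidar.py` API; the function name, the `(distance, angle)` input format, and the axis convention are assumptions), the polar-to-cartesian conversion with clipping and rendering might look like this:

```python
import math
import numpy as np

def lidar_to_bitmap(measurements, angle_offset_deg=0.0,
                    bounds=(-2.0, 2.0, -2.0, 2.0), size=(120, 120)):
    """Render lidar (distance_meters, angle_degrees) pairs into a top-down
    bitmap centered on the vehicle at (0, 0).
    bounds = (min_x, max_x, min_y, max_y) in meters; size = (height, width) in pixels."""
    min_x, max_x, min_y, max_y = bounds
    height, width = size
    bitmap = np.zeros((height, width), dtype=np.uint8)
    for distance, angle in measurements:
        # shift the lidar's zero so that zero degrees points 'forward' on the vehicle
        theta = math.radians(angle + angle_offset_deg)
        x = distance * math.sin(theta)   # x: meters to the right of the vehicle
        y = distance * math.cos(theta)   # y: meters forward of the vehicle
        # clip points outside the bounding box (out-of-range points and noise)
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue
        # scale world coordinates into pixel coordinates; row 0 is the far edge
        col = int((x - min_x) / (max_x - min_x) * (width - 1))
        row = int((max_y - y) / (max_y - min_y) * (height - 1))
        bitmap[row, col] = 255
    return bitmap
```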
### A complication

Another thing to think about is the skew inherent in data taken from a moving vehicle. Remember that each data point provided by the lidar is given as a distance from the lidar and the angle at which the lidar was pointing when it took the reading. But if the lidar is on a moving vehicle, then the position from which each point is taken is changing. The lidar has some rate at which it can make a full 360 degree sweep; this is typically 5 to 10 Hz. If the vehicle is stopped, then all data points are from a circle with the same center. However, as the vehicle moves, each point is taken from a different location. If the vehicle is moving 1 meter per second, which is pretty slow for a race car, and the lidar is spinning at 5 Hz, then the lidar has moved 1/5 of a meter between taking the first point in a sweep and the last point. There is additional skew if the vehicle is turning.

### A solution

If we know the vehicle's position and orientation as it moves, then we can simply adjust each point provided by the lidar using the vehicle's position and orientation at the time that point was taken. If we have good odometry, then we can apply a kinematic model that estimates the vehicle's position and orientation as it moves. Such estimates are good over short distances, and we only need them to be good over a meter or so. If we have wheel encoders or an encoder on the drive motor, then we can use those. If we have an IMU, then we can also use that for the same purpose. Having both would be ideal.

But what if we don't have good odometry? That is typical of a Donkeycar: we know the throttle value and steering value, but we don't actually know the speed of the car. However, with a little calibration we can roughly estimate the speed of the vehicle from the throttle value. The calibration would involve taking measurements over a known distance at different constant throttle values and measuring the time to traverse the distance; then we can interpolate to estimate the speed. That would provide a very rough estimate: it is affected by the particular surface the vehicle is on and by the battery charge level, and if the vehicle is changing speeds it does not account for the lag in acceleration and deceleration. However, it is better than not doing any adjustment of the data. We also need to calibrate the steering angle: measure the radius of a full right and a full left turn, then interpolate to estimate the turn angle given a steering value. With those two things in place, we can use the speed and steering angle as input to a kinematic model to estimate the vehicle's relative position and orientation over time. A minimal sketch of such a kinematic model follows.
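As a sketch of the kind of kinematic model described above (a simple bicycle model; the function name and the assumption that speed and steering angle come from the throttle and steering calibrations are mine, not existing Donkeycar code):

```python
import math

def bicycle_step(x, y, heading, speed, steering_angle, wheelbase, dt):
    """Advance a simple bicycle kinematic model by one time step.
    x, y           -- position in meters relative to where the sweep started
    heading        -- orientation in radians
    speed          -- estimated speed in m/s (interpolated from throttle calibration)
    steering_angle -- estimated front wheel angle in radians (from steering calibration)
    wheelbase      -- distance between front and rear axles in meters
    dt             -- time step in seconds"""
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    heading += (speed / wheelbase) * math.tan(steering_angle) * dt
    return x, y, heading
```

Each lidar point would then be translated and rotated by the pose estimated for the time it was taken, so all points in a sweep end up in a single, common vehicle frame.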
### A Donkeycar 2D Lidar pipeline

Acquiring the data from the 2D lidar and transforming it into data useful to the CNN involves integrating a number of new parts into the vehicle pipeline. The 2D Lidar pipeline involves four parts that work together to gather data from the 2D Lidar, adjust for the changing position of the vehicle, and render it to a bitmap so it can be saved to the tub.

We also want a part that will convert the normal RGB camera images into a bird's eye view image using a camera calibration matrix. This bird's eye view would then be the input image for the Lidar Imaging Part, and the resulting combined image would be the input to the CNN. See the more detailed discussion below.

### A Neural Network using 2D Lidar data

There may be other neural network architectures that would use the 2D lidar data directly, but Donkeycar is built around a CNN that takes images in and outputs steering and throttle. So if we stick with that, then we want to render the 2D lidar data as an image and use that image as an input to the CNN. I can think of two ways to do that; one is pretty simple and one is more complex, but likely better.

One way to use the lidar data is to concatenate the 2D lidar image with the camera image we already have. So if we are capturing a 160x120 camera image, we would want our 2D lidar image to be the same width (160 pixels); then we would use OpenCV to 'stack' the two images and make them into a single image. That single image would then be the input to our CNN. Note that we would need to change the code that builds the CNN to use the dimensions of this new image, not the dimensions of the camera image. Further, when we drive on autopilot, we need to do this concatenation step and use the concatenated image to infer the steering and throttle. A model built on that might be ok, but it's hard to know without testing it. It definitely would slow things down because the image would be larger. Also, it's not super intuitive that this would work well: the two images are in two different 'spaces'; it would not be obvious to a person how they correlate, and it might not be obvious to a CNN either.
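A minimal sketch of that stacking idea, assuming both images are numpy arrays with the same width and channel count (the helper name is hypothetical, not part of the Donkeycar code base):

```python
import numpy as np

def stack_images(camera_image, lidar_image):
    """Vertically concatenate the camera image and the rendered 2D lidar image
    so the result can be fed to the CNN as a single, taller input image."""
    assert camera_image.shape[1] == lidar_image.shape[1], "image widths must match"
    return np.vstack((camera_image, lidar_image))
```

The CNN's input shape would then become (camera_height + lidar_height, width, channels) instead of the camera dimensions alone.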
Another way, which I think would be better, would be to overlay the 2D lidar image onto the camera image in a way where the lidar pixels are correctly placed in the image. Each lidar data point represents a point in space around the vehicle where something exists, because it reflected the light. We should be able to place the 2D lidar pixel onto the pixel in the camera image that represents the spot in 3D space that reflected that light. I think there are two possible ways to do this: the first approach projects the lidar data into the 3D space that the camera image represents; the second projects the camera image into a bird's eye view and then overlays the 2D lidar data on it. These two alternatives are discussed in more detail below.

Option A: We could project the lidar data points onto the camera image, applying a 3D transform that positions the pixels correctly in the image. To do this we must calibrate the camera and use that calibration to understand how to project the points in the 2D lidar data into the camera's world view. Remember that each lidar point represents a point in the 3D world space around the vehicle. In this option we take that 3D point, calculate the corresponding pixel in the camera image, and draw it there. This is kinda cool, but I don't think it is the best way to do this because of perspective: as points get farther away, they are likely to occupy the same pixel as a nearby point, so we start to lose data.

Option B: To avoid the perspective data loss, we could use the camera calibration values to reformat the camera image so it appears as a bird's eye view, as though we are looking at it from above. We essentially reformat the image so it looks like it was taken from directly above the Donkeycar. Then it is basically trivial to overlay the 2D lidar data, because it is already effectively a bird's eye view. We merge the pixels from the 2D lidar image into the bird's eye camera view. Even better, we can generate this bird's eye camera view before we render the 2D lidar image and, rather than render the 2D lidar image to its own bitmap, draw the pixels directly onto the bird's eye camera image. That would be faster because we avoid merging two bitmaps. I think this bird's eye image would make the 2D lidar data much more powerful in predicting steering angle and throttle (assuming we slow down for obstacles). This approach would require another part: a part that takes the camera calibration data and the camera image and generates a bird's eye camera image. We would also make the Lidar Image Part accept an input image and, rather than image dimensions, a second bounding box that represents the area on the input image (the bird's eye camera image) where the 2D lidar data should be drawn. A sketch of the warp-and-overlay step is shown below. I'm getting pretty excited about this second approach. It has other benefits:
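As a sketch only (the source and destination points are placeholders that would come from camera calibration or a measured ground-plane trapezoid, and the function names are hypothetical), the warp and overlay might look like this with OpenCV:

```python
import cv2
import numpy as np

def birdseye_view(image, src_points, dst_points, out_size=(120, 120)):
    """Warp a forward-facing camera image into an approximate top-down view.
    src_points: four (x, y) pixel corners of a ground-plane trapezoid in the camera image.
    dst_points: the corresponding rectangle corners in the output image.
    out_size: (width, height) of the output image."""
    matrix = cv2.getPerspectiveTransform(np.float32(src_points), np.float32(dst_points))
    return cv2.warpPerspective(image, matrix, out_size)

def overlay_lidar(birdseye_image, lidar_points_px, color=(0, 0, 255)):
    """Draw lidar points (already scaled to pixel coordinates) directly onto
    the bird's eye camera image, avoiding a separate lidar bitmap."""
    for col, row in lidar_points_px:
        if 0 <= row < birdseye_image.shape[0] and 0 <= col < birdseye_image.shape[1]:
            birdseye_image[row, col] = color  # assumes a 3-channel BGR image
    return birdseye_image
```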
This approach does require another calibration step: calibrating the camera. This is a well-known process, and there are plenty of Python code samples out there to generate the necessary calibration data. Further, there is plenty of Python code to apply this calibration data to turn a camera image into a bird's eye view; this is a common step in many robotics projects. The bird's eye view work is broken out into issue 1096.
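For reference, the well-known OpenCV chessboard calibration looks roughly like this (a generic sketch, not Donkeycar code; the board size and image path pattern are placeholders):

```python
import glob
import cv2
import numpy as np

def calibrate_camera(image_glob, board_size=(9, 6)):
    """Compute the camera matrix and distortion coefficients from a set of
    chessboard photos taken with the camera to be calibrated."""
    # 3D positions of the chessboard corners in board coordinates (z = 0 plane)
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)

    obj_points, img_points = [], []
    gray = None
    for path in glob.glob(image_glob):
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    _, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)
    return camera_matrix, dist_coeffs
```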
Great preparation, Ed! After digging a bit into lidars during the iros2020 race, and while currently trying to find the bug when saving sim lidar data into tubs (SIM_RECORD_LIDAR = True), I would like to add the following feature requests and offer to help:
Maybe we can have a look at the Autoware.auto pipeline as well: https://www.apex.ai/autoware-course, see Lecture 7 | Object Perception: LIDAR.
@Ezward, I presently have three Slamtec RPLidars available to support this effort: an A1M8-R5, an A1M8-R6 (latest firmware) and an A2M8 (latest firmware). I have one running on a RPi 4B, one running on a Nano 4GB, and the A2M8 running on a Xavier NX.
I really like option B as it would allow me to operate my robot chassis without the need for lanes, as in a traditional DC environment, and allow for autonomous movement around an outside or inside perimeter.
@Ezward,
- Fast RPLidar: uses C/C++ and a Python wrapper to improve speed
- Adafruit CircuitPython RPLidar Library
- A compilation of GitHub lidar interface code
- lidar_dewarping: the purpose of the code here is to remove the distortions from the lidar scan
- SLAMTEC RPLiDAR A2 C++ Library
- Clear Path: an interesting look at the A1M8
fyi: Maxime did an update on gym-donkeycar for solving this: "ability to set position of cameraA, cameraB and lidar independent from each other." I will test this today.
@Heavy02011 thanks for those links. I see in the gist that you are using the trapezoidal method for warping the image perspective; I think we should allow for that method as it is pretty easy to configure. We can also allow the use of a camera calibration matrix if the user has calibrated their camera, as that also eliminates other distortions (like those from a wide-angle lens). A sketch of that undistortion step is below.
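If the user has calibrated their camera, applying that calibration is essentially a one-liner in OpenCV (a generic sketch; the names are illustrative and the calibration data would come from a step like the chessboard calibration above):

```python
import cv2

def undistort_image(image, camera_matrix, dist_coeffs):
    # remove lens distortion (e.g. wide-angle barrel distortion) using the
    # camera matrix and distortion coefficients from camera calibration
    return cv2.undistort(image, camera_matrix, dist_coeffs)
```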
How will this affect depth cameras like the Intel D435i, as I would like to use the Slamtec A2M8 together with that depth camera?
In terms of the data format generated by the 2D Lidar part, I think it makes sense to save all the data with each frame, even though we may only be getting some of it on each frame. What I mean is that because the lidar streams data in segments as it collects it, we will constantly be reading some subset of the 360 degrees. However, we want to maintain a full array of data with each frame we save to the tub. So that means we want to stream data in and update any prior data as new data comes in. The donkey pipeline runs at 20 Hz or more and the lidar only runs somewhere between 4 Hz and 10 Hz, so we don't want to wait for a whole 360 degrees of data before updating the data. I think most 2D lidars will stream data and tell you the angle range it represents, which will be some subset of the 360 degree arc. The ROS message for 2D lidar is here: http://docs.ros.org/en/noetic/api/sensor_msgs/html/msg/LaserScan.html . The nice thing about this format is that it can hold some subset of the 360 degree arc or it can be used to hold all 360 degrees. I can also see that we may want to tell the Donkeycar part that we only care about data from 0 to 180 degrees, for instance, so we are not writing more data than we will actually use. Again, we can still update this range as it streams in; we would just ignore data from 180 to 360 degrees in that case. A sketch of such a streaming buffer follows.
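A minimal sketch of that idea, assuming one reading per whole degree and a `(distance, angle)` tuple format (the class name and those simplifications are mine, not existing Donkeycar code):

```python
class LidarScanBuffer:
    """Maintain one measurement per degree (0-359) and update entries as
    partial sweeps stream in, so a full array can be written with every frame.
    Angles outside [min_angle, max_angle) are ignored."""
    def __init__(self, min_angle=0, max_angle=360):
        self.min_angle = min_angle
        self.max_angle = max_angle
        self.distances = [0.0] * 360   # 0.0 means 'no reading yet'

    def update(self, measurements):
        # measurements: iterable of (distance_meters, angle_degrees) from a partial sweep
        for distance, angle in measurements:
            degree = int(angle) % 360
            if self.min_angle <= degree < self.max_angle:
                self.distances[degree] = distance

    def run(self):
        # return a copy so the rest of the pipeline cannot mutate the buffer
        return list(self.distances)
```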
I would hope camera and lidar would be totally orthogonal; they should not interfere with each other. Of course if you have both of these, you may have a bunch of redundant data and so are writing more data than you need, but it should work, short of some underlying driver incompatibilities anyway. I'm going to try the 2D lidar without the D435, with just a normal RPi 160 degree camera. But I don't see any reason why the D435 and RPLidar would interfere, unless we define the RPLidar part as a camera; since we only configure one camera at a time, the 2D Lidar part should not be defined as a camera. Rather, I think we should have a new kind of part for 2D planar lidar; then we can implement config and code for the various vendors. So for cameras we have
We would have additional configuration common to all 2D lidars. I'm not sure how to handle configuring the number of samples we can expect in that desired range; that is determined by how fast the lidar is spinning and the number of readings per second. I believe readings per second are generally fixed, but the motor speed is commonly adjustable via PWM. So perhaps we have config for
Ok, after writing all that I see we do have
Yes, I was going to recommend that you look at zlite's lidar.py part to see how the data capture range is limited and how many lidars are presently supported. Unfortunately, this data capture range limiting only works for the A1M8, which references the zero degree start point of the turret scan at the back of the chassis where the motor is located.
I think we can detect the zero point on any 2D lidar: we simply look at the last reading we saved; if the current reading's angle is less than the previous reading's angle, then we have crossed the zero point. We can double-buffer the scan; when we cross zero we start a new buffer and move the last buffer (which should be a full 360 degrees) to a 'last-buffer' status. Further, I think we can merge these two buffers in a way that gets us a single 360 degree scan, which may include points from the latest scan and points from the previous scan, but no redundant angles. So I don't think we actually need the zero point provided by the RPLidar, but we do need to know where the lidar considers zero degrees relative to 'forward', so we can adjust the output of all lidars so they have zero as forward. A sketch of that zero-crossing and double-buffering idea follows.
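A rough sketch of that zero-crossing detection and double-buffering (the class and method names are hypothetical, and readings are assumed to arrive as (distance, angle) pairs with the angle in degrees):

```python
class ScanAssembler:
    """Detect the zero crossing in a stream of (distance, angle) readings and
    double-buffer the scan: when the angle wraps around, the working buffer
    becomes the last complete scan and a new working buffer is started."""
    def __init__(self):
        self.working = []     # readings since the last zero crossing
        self.last_scan = []   # the previous full sweep
        self.prev_angle = None

    def add(self, distance, angle):
        if self.prev_angle is not None and angle < self.prev_angle:
            # the angle decreased, so we crossed the lidar's zero point
            self.last_scan = self.working
            self.working = []
        self.working.append((distance, angle))
        self.prev_angle = angle

    def full_scan(self):
        # merge the previous sweep with the newest readings, preferring the
        # latest reading for any degree covered by both (no redundant angles)
        merged = {round(angle): (distance, angle) for distance, angle in self.last_scan}
        merged.update({round(angle): (distance, angle) for distance, angle in self.working})
        return [merged[key] for key in sorted(merged)]
```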
Even though there is a YDLidar class in the lidar.py part, zlite told me that it is not presently functional.
Lidar and camera data fusion GitHub links:
- Lidar and Camera Fusion for 3D Object Detection based on Deep Learning for Autonomous Driving
- CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection
- Sensor Fusion NanoDegree - Camera Course
- lidar-camera-fusion (looks to be the best)
- Lidar/Camera calibration
Here are a couple of libraries that can manipulate point clouds and do a few other things (like scene segmentation)
Discussion of lidar odometry and deskewing lidar data (skew that happens when scan data is acquired on a moving platform): https://youtu.be/9FhKgAEQTOg
- KISS ICP library (includes Python bindings): https://github.com/PRBonn/kiss-icp
- The associated paper: https://arxiv.org/pdf/2209.15397.pdf
- An article on lidar odometry based on ICP using 2D lidar: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8587105/
- Another article on 2D lidar odometry using ICP, with a code sample: http://andrewjkramer.net/lidar-odometry-with-icp/
This is a discussion issue. The topic is how to integrate 2D lidar data, as from an RPLidar or YDLidar, into the Donkeycar framework. It is likely that this will result in a number of follow-on issues to actually implement the design we decide upon. I'll start the discussion by adding a strawman as the first comment in this ticket.
So I realize there is a lot of work already done here. donkeycar/parts/lidar.py has code to read RPLidar and YDLidar and has code for plotting the data. It even has BreezySLAM. So when reading the strawman below, know that a bunch of this work is already done. I think the big piece we need to do is to integrate the camera and 2D lidar images so they can be fed to the CNN. Then we need to make sure our support for the various lidar models and brands works in that regard. Finally, we may want to handle the issue of data skew while moving so our scans are more accurate. Also, I think we can optimize the RPLidar driver a little more: it can filter the data as it is collected, and it can insert the data already sorted so we can avoid sorting each time we read the data (a sketch of that idea is below).
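For example, filtering on collection and keeping the scan sorted by angle could be as simple as this (a sketch, not the actual RPLidar driver; the distance limits are made-up placeholders):

```python
import bisect

def insert_measurement(scan, distance, angle, min_distance=0.05, max_distance=12.0):
    """Filter a reading as it is collected and insert it into 'scan', a list of
    (angle, distance) tuples kept sorted by angle, so the data never needs to be
    sorted again when it is read."""
    if not (min_distance <= distance <= max_distance):
        return  # drop obviously bad or out-of-range readings immediately
    bisect.insort(scan, (angle, distance))
```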