Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need help with aligning RGB and depth images from D435i using pyrealsense2 and Orange Pi #11758

Closed
CharmingZh opened this issue May 2, 2023 · 10 comments
Labels

Comments

@CharmingZh
Copy link

Dear community,

I am facing an issue while trying to align RGB and depth images from a D435i camera using pyrealsense2 on an Orange Pi (similar to Raspberry Pi). The operation of aligning frames is taking a lot of time, resulting in a transfer rate of only 2.5 images per second.

frames         = pipeline.wait_for_frames()                   # Get Frame from D435i
aligned_frames = align.process(frames)      # Almost cost 400 ms on OrangePi, 80 times longer than other process 

To overcome this issue, I tried to serialize the composite_frame object using pickle.dumps() and send it over TCP. However, I got the following error "TypeError: cannot pickle 'pyrealsense2.pyrealsense2.composite_frame' object". I would appreciate any suggestions on how I can serialize this object so that I can send it over TCP.

frames_serial = pickle.dumps(frames) gives the following error
TypeError: cannot pickle 'pyrealsense2.pyrealsense2.composite_frame' object

I also attempted to align the frames manually by first extracting the depth and color frames and aligning them using the camera's internal and external reference, but I found that the depth map after coordinate mapping and RGB cannot be aligned. I have attached my code and the result of my attempt below. Can someone please point out my mistake or help me with aligning the images correctly?

Thank you all in advance for your help.

def pixel_to_pixel(u1, v1, K1, R1, t1, R1to2, t1to2, K2):
    # Convert pixel coordinates to coordinates in the normalized camera coordinate system
    x1 = (u1 - K1[0,2]) / K1[0,0]
    y1 = (v1 - K1[1,2]) / K1[1,1]
    z1 = 1

    # Convert the coordinates in the normalized camera coordinate system to the coordinates in the camera 1 coordinate system
    X1 = np.array([x1, y1, z1])
    X1 = X1 / np.linalg.norm(X1)

    # Convert the coordinates in the camera 1 coordinate system to the world coordinate system
    X = R1.dot(X1) + t1

    #  Convert the coordinates in the world coordinate system to the coordinates in the camera 2 coordinate system
    X2 = R1to2.dot(X) + t1to2

    # Convert the coordinates in the camera 2 coordinate system to the coordinates in the normalized camera coordinate system
    x2, y2, z2 = X2[0], X2[1], X2[2]
    x2 /= z2
    y2 /= z2

    # Convert coordinates under normalized camera coordinate system to pixel coordinates
    u2 = K2[0,0] * x2 + K2[0,2]
    v2 = K2[1,1] * y2 + K2[1,2]

    return int(u2), int(v2)

if __name__ == '__main__':
    color_image = cv2.imread("RGB_1.png")
    depth_image = np.load("2.npy")

    K_1 = np.eye(3)
    R_1 = np.eye(3)
    t_1 = np.array([0, 0, 0])
    u_1, v_1 = 470, 470
    x_1, y_1 = pixel_to_pixel(u_1, v_1,
                              color_intrinsics,
                              R_1, t_1,
                              color_to_depth_rotation,
                              color_to_depth_translation,
                              depth_intrinsics)

    depth_temp = np.zeros_like(depth_image, dtype=np.float32)
    for i in range(color_image.shape[0]):
        for j in range(color_image.shape[1]):

            x, y = pixel_to_pixel(i, j,
                                  color_intrinsics,
                                  R_1, t_1,
                                  color_to_depth_rotation,
                                  color_to_depth_translation,
                                  depth_intrinsics)
            depth_temp[i][j] = depth_image[x][y] / 1000.0

            print(depth_image[x][y])

    cv2.imshow("test", depth_temp)
    color_image[depth_temp == 0.0] = (0, 0, 255)
    cv2.imshow("test_2", color_image)
    cv2.waitKey()
    cv2.destroyAllWindows()

image
The Intrinsic parameters of RGB camera, depth camera and Extrinsic parameters from RGB to depth camera are all obtained by rs-sensor-control and written to the program manually.

# Intrinsic parameters of depth camera
depth_intrinsics = np.array([[387.873657226562, 0.0,              323.414398193359],
                             [0.0,              387.873657226562, 238.799575805664],
                             [0.0,              0.0,              1.0]])

# Intrinsic parameters of color camera
color_intrinsics = np.array([[604.584350585938, 0.0,              309.198364257812],
                             [0.0,              604.356323242188, 254.28271484375],
                             [0.0,              0.0,              1.0]])

# Extrinsic parameters from depth to color camera
depth_to_color_rotation = np.array([[ 0.999885,   0.00396869, 0.0146583],
                                    [-0.00393271, 0.999989,  -0.0024827],
                                    [-0.014668,   0.00242477, 0.999889]])

depth_to_color_translation = np.array([0.0148551082238555,
                                       3.56109194399323e-05,
                                       0.000355124968336895])

# Extrinsic parameters from color to depth camera
color_to_depth_rotation = np.array([[0.999885,  -0.00393271, -0.014668],
                                    [0.00396869, 0.999989,    0.00242477],
                                    [0.0146583, -0.0024827,   0.999889]])

color_to_depth_translation = np.array([-0.0649821, -9.5427e-05, -0.000572748])
@MartyG-RealSense
Copy link
Collaborator

Hi @CharmingZh Have you tested the RealSense SDK's pyrealsense2 depth-color alignment example program align_depth2color.py please?

https://github.com/IntelRealSense/librealsense/blob/master/wrappers/python/examples/align-depth2color.py

@CharmingZh
Copy link
Author

CharmingZh commented May 3, 2023

Thanks for responding, @MartyG-RealSense
I know the code, which creates an align object and gets the aligned depth and color frames from the align object. But this code is very slow on my raspberry pi:

frames = pipeline.wait_for_frames()

I was hoping to send the frames instance to my desktop via tcp and get the aligned frames on the desktop, but I don't know how to convert the frames to a byte stream. That's why I was hoping to align them manually, with the help of camera parameters.

@MartyG-RealSense
Copy link
Collaborator

MartyG-RealSense commented May 3, 2023

The main alternative to performing alignment with align_to is to map color onto depth with the pc.map_to and pc.calculate instructions to create a textured pointcloud, as demonstrated in a script at #4612 (comment) - though this would not be suitable if you do not want a 3D pointcloud.

Aligning manually would not be easy on a D435i as its color sensor's field of view (FOV) size is smaller than its depth sensor's FOV size. This difference is normally adjusted for automatically when performing align_to or map_to.

If you only need XYZ coordinates for a specific coordinate instead of the entire image then you can save processing by using the rs2_project_color_pixel_to_depth_pixel instruction to convert a single XY color pixel coordinate to an XYZ depth pixel coordinate instead of performing alignment.

@CharmingZh
Copy link
Author

Thank You VVVVVVVVERY MUCHHHHHHHH MY BRO!!!@MartyG-RealSense, You really rescued me.
And you, my friend, you are the real hero. @gachiemchiep
These two examples solved my problem perfectly.
#5603 (comment)
https://blog.csdn.net/m0_46649553/article/details/121249649

@MartyG-RealSense
Copy link
Collaborator

You are very welcome, @CharmingZh - it's great news that you achieved a solution. :)

@CharmingZh CharmingZh reopened this May 3, 2023
@CharmingZh
Copy link
Author

@MartyG-RealSense
In practice, I found that I still needed to pass in the pyrealsense2.pyrealsense2.frame object at the first argument position if I wanted to align the frames on the receiver side, which went against my idea of not being able to use it on other machines. The problem of alignment on other machines is itself caused by the inability to transfer frame objects over the network. Is there any way I can construct a frame object myself from a numpy array transferred over the network?

@MartyG-RealSense
Copy link
Collaborator

In the past other RealSense Python users have attempted a numpy to rs2::frame conversion, if that is what you are aiming for, but to date there has not been any reports of success in doing so. A RealSense user at #8394 tried converting numpy to BufData instead but also did not succeed.

@gachiemchiep
Copy link

@CharmingZh
Thank you, man.
It's great that you overcame the bug.

@MartyG-RealSense
Copy link
Collaborator

Hi @CharmingZh Do you require further assistance with this case, please? Thanks!

@MartyG-RealSense
Copy link
Collaborator

Case closed due to no further comments received.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants