Compressing depth frames with Python #8117
Hi @yeric1789 I researched your request about Python compression extensively. The best option available for compressed bag recording is to do so in the RealSense Viewer. In its settings interface (accessed by left-clicking the gear-wheel in the top corner and selecting the Settings menu option), there is an option to enable bag compression under the Playback & Record tab. Bag files can also be reduced in size by recording them in ROS and using the split function of its rosbag record system to automatically split a recording into multiple smaller bag files.
I still see very large file sizes even when compressed. The more I develop the project, the more I realize that Python might not have enough support or options, and I have started transitioning to developing the system in C++. I found one white paper on compressing depth data (https://dev.intelrealsense.com/docs/depth-image-compression-by-colorization-for-intel-realsense-depth-cameras), but I am unsure whether this could be applied to the stream on a frame-by-frame basis as the file is being recorded. Are there any other compression encoding/decoding schemes I could employ on a recording within C++ that might have a greater compression ratio? I would prefer lossless, but lossy is okay.
An uncompressed 1280x720 image made up of unsigned 16bit pixels should be 1.8432 MB. 30 frames should be 55.296 MB. Are you recording color frames in the .bag as well? Personally I am not a huge fan of the colorization method. It is a great idea in theory; you can quickly convert a depthmap to a hue color ramp using a GLSL shader, and 8bit hardware-accelerated encoding is ubiquitous. However it has a few critical issues:
For high quality realtime recording of depthmaps I use a custom Python capture script with lz4 compression (https://pypi.org/project/lz4/) for each individual depth frame. There are better compression algorithms for file size, but lz4 is simple, fast, and has a BSD license. Because the raw byte array of a 16bit depthmap tends to repeat neighboring large bytes, it compresses very well. Using the default settings you can losslessly compress depth maps down to 30%-40% of their original size at 30fps on something as low-powered as a Raspberry Pi 4. If you get clever and split the array into two 8bit image planes of big and little bytes, you can sometimes get that down to 20%. All of that said, a single-channel 12bit HEVC-based codec would probably be ideal for recording lossy depthmap videos, as long as the gamma stays universally consistent.
Thanks so much @sam598 for your highly detailed advice for @yeric1789 :)
Hi @sam598 thank you for the advice. Over the past few days I have been attempting to implement a Python script for what you're describing. I am, however, running into quite a bit of trouble encoding the depth frames, as I am quite new to this. For example, in one code snippet
I am attempting to store each frame, not the frame data, in a file. However, I am uncertain what file type to store it as, or how to convert it into a bytes-like object, and I am confused about how I might go about decompressing it in a readable manner. I am able to produce compressed files, but nothing I can decode to get the original file back. Any pointers you can give would be much appreciated. Thanks for all the suggestions.
A lot of this is still uncharted territory, so really we are all new to this 🙂 I've never tried compressing a numpy object directly, but I don't know why it wouldn't work. Usually I compress just the raw data first:
Then I convert the decompressed raw data into a numpy array, or whatever it is I need:
All you are really doing here is compressing and decompressing a one-dimensional array of uint16 pixels. So it does require you to remember the dimensions of the image in order to reconstruct it. You could start to define your own custom depthmap file format by adding a header: a constant number of bytes at the start of each file that stores the dimensions and any other metadata you may need. I'm doing something along those lines for my projects, because there is no "standard depthmap format". I know Open3D uses 16bit PNG files, but there's not really any program that can open those (if you try opening one in Photoshop, Adobe's color space will destroy your depth data).
Hi @sam598, so I tried what you suggested and attempted compression and decompression of a recorded stream. I was able to compress the file; however, I am having trouble decompressing it into something usable.
Above I have a visualize_and_comp_measurements method which visualizes each of the frames I am compressing. I am compressing the frames into a file with a .lz4 extension. However, when I finish decompressing, instead of getting the original sized files I get 4 files (one for each of 4 cameras) that are all 1,800 KB in size, which is clearly incorrect. I am confused about how exactly I should implement the compression and decompression so that when I decompress a frame I get files of the original size back. Please see the attached images for the file sizes. I know this is a lot of info, but I have been struggling with this for the past few days, and any guidance you could provide would be much appreciated. Thank you so much for all your help so far.
An image with a resolution of 1280x720 and uint16 pixels would be around 1,800 KB (1280 pixels x 720 pixels x 2 bytes = 1,843,200 bytes), so it looks like it is only decompressing a single frame. When you compress a byte array using the LZ4 algorithm, the first few bytes of the compressed data (its header) contain information on how to decompress that data. If there is additional data appended to that compressed byte array, I assume it will not be decompressed. What is likely happening is that it is decompressing the first frame and ignoring the rest. If you want to append all the frames into one single file you will need a way to separate those compressed frames. You could do something like:
This way you can separate each frame because you can calculate where each frame starts and stops. But for simplicity I have every frame recorded as a single file in a sub folder for each capture. For example:
It may not be the most efficient way to deal with it, but this way all that has to be done with each frame is decompress it into a raw 16bit array. Also, to be clear, everything I have been talking about is an alternative to the .bag format that Intel uses. I am not sure if there is a way to mix and match them.
Hi @sam598, your comment was extremely helpful; however, I believe I am looking for a way to have all capture data stored in one file per camera. The organization I was thinking of is that each frame would be appended to the same file. However, I am having difficulty coming up with a solution to read each frame back. For example, say I have frames in one camera file.
When I go to decompress and read the file I will want to decompress it in parts. So when I call
output_arr will be my compressed data for frame 1
output_arr will be my compressed data for frame 2, and so on.
While it would be a lot easier to work with, unfortunately you cannot restrict the compression to a target size. Even if you could I would not recommend it for the following reasons:
In the previous message I described a possible way to record the size of each frame as you append it to a single file. It also seems like LZ4 has the ability to
Hi @sam598,
I don't know if the lz4 package has support for that. If you make your own custom format as I suggested earlier (appending the length of each compressed frame between the appended frames), you could first run through the whole file to get the start and end bytes of each frame in the file. That way you could create a script that lets you quickly seek through frames to decompress.
Hi @yeric1789 Do you require further assistance with this case, please? Thanks! |
Hi, I was using the lz4 compression library and noticed that at higher compression levels I see a noticeable bottleneck in collecting enough frames. Is there any way to have the compression operate in another process? I tried using the threading module to no avail, and when I attempt to use the multiprocessing module I get an error that pyrealsense2 objects are not picklable. Is there any way to rectify this?
It has been observed in the past that on some low-end devices the SDK's recorder may not be able to keep up when recording with compression enabled. There is also a past reference to LZ4 compression having been disabled for the D435i due to frame drops on high-frequency streams. I do not have information about whether this is still the case.
@yeric1789 are you setting a higher compression level in LZ4? At the default settings it does lossless compression extremely fast. The higher the compression level, the less likely you are to get realtime performance, especially on low-end devices. FWIW I haven't noticed significantly smaller file sizes at the higher compression settings; LZ4 is meant to be fast, not efficient. There are better and more efficient algorithms, but I haven't been able to find any good ones that run at realtime video rates. It is possible to use the threading module to gain some performance, but threading in Python is a tricky and complicated subject. Because of Python's GIL (global interpreter lock), Python code actually only runs on one thread at a time, even when using the threading module. But certain functions (like network and disk IO, LZ4 compression, and parts of the RealSense SDK) do run outside of the GIL, so a lot of the heavy lifting can be done in parallel. As a very simple example, you could have an empty list for depth frames, and on the RealSense capture thread you could add to the list:
and then on the compression thread:
I also recently discovered that the SDK has a built-in frame queue function, which is a huge help in making sure that frames are not dropped. I haven't tried it yet without using threading, but it may be possible. @MartyG-RealSense I don't believe we have been talking about using the SDK's recorder functionality for a while now, but I do want to make sure that @yeric1789 and I are on the same page that we are talking about a custom compression solution.
Hi @yeric1789 Do you require further assistance with this case, please? Thanks! |
Case closed due to no further comments received. |
Hi,
I am attempting to compress bag files as they are being written; however, I would prefer to use Python. I noticed issue #3594, which mentions a C++ function called rs2_create_record_device_ex. Is there something similar in pyrealsense2? If not, are there other ways of compressing using pyrealsense2?
I noted some methods described in the pyrealsense2 documentation: https://intelrealsense.github.io/librealsense/python_docs/_generated/pyrealsense2.html
One of these is the method depth_huffman_decoder(), which "Decompresses Huffman-encoded Depth frame to standardized Z16 format". This implies there is a method for Huffman-encoding a frame; however, there doesn't seem to be any method with that utility. Would I have to create a custom Huffman encoding algorithm for a depth frame, or are there other options?
Another function of note is the decimation_filter, which downsamples frames.
Any help with this issue for implementing a compression scheme for frames would be much appreciated.
Thanks