
Compressing depth frames with python #8117

Closed
yeric1789 opened this issue Jan 5, 2021 · 18 comments

@yeric1789


Required Info
Camera Model: D435i
Firmware Version: 5.12.09.00
Operating System & Version: Raspbian GNU/Linux 10 (buster), Release 10
Platform: Raspberry Pi 4 (4 GB)
SDK Version: 2.40
Language: Python

Hi,
I am attempting to compress bag files as they are being written, and I would prefer to do it in Python. I noticed issue #3594, which mentions a C++ function called rs2_create_record_device_ex. Is there something similar in pyrealsense2?
If not, are there other ways to compress with pyrealsense2?
I noted some methods described in the pyrealsense2 documentation https://intelrealsense.github.io/librealsense/python_docs/_generated/pyrealsense2.html
One of them is depth_huffman_decoder(), which "Decompresses Huffman-encoded Depth frame to standardized Z16 format". This implies there is a method for Huffman-encoding a frame, yet there doesn't seem to be any method with that utility. Would I have to write a custom Huffman encoder for depth frames, or are there other options?

Another function of note is the decimation_filter, which downsamples frames.

Any help implementing a compression scheme for these frames would be much appreciated.
Thanks

@MartyG-RealSense
Collaborator

Hi @yeric1789 I researched your question about Python compression extensively. The best solution available for compressed bag recording is to do it in the RealSense Viewer. In its settings interface (accessed by left-clicking the gear-wheel in the top corner and selecting the Settings menu option), there is an option to enable bag compression under the Playback & Record tab.


Bag files can also be reduced in size by recording them in ROS and using the split function of its rosbag record system to automatically split a recording into multiple smaller bag files.

#6578 (comment)

@yeric1789
Author

I still see very large file sizes even when compression is enabled.
I am planning long-term continuous use (overnight recording of 12-18 hours) of up to 8 depth cameras streaming simultaneously, which will take up a significant amount of data.
Based on the file sizes I am currently getting from the compressed-bag option in the RealSense Viewer recorder, I am seeing approximately 0.0715 GB per second per camera at 720p@30fps. At that rate, with just 4 cameras I would fill more than a 4 TB hard drive in a 24-hour period.

The more I develop the project, the more I realize that Python might not have enough support or options, and I have started transitioning the system to C++.

I found one white paper on compressing depth data, https://dev.intelrealsense.com/docs/depth-image-compression-by-colorization-for-intel-realsense-depth-cameras, but I am unsure whether it could be applied to the stream on a frame-by-frame basis while the file is being recorded.

Are there any other compression encoding/decoding schemes I could employ on a recording within C++ that might offer a better compression ratio? I would prefer lossless, but lossy is okay.

@sam598

sam598 commented Jan 12, 2021

An uncompressed 1280x720 image made up of unsigned 16bit pixels should be 1.8432 MB. 30 frames should be 55.296 MB. Are you recording color frames in the .bag as well?
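For reference, the arithmetic behind those numbers (a quick sketch, using decimal megabytes as in the comment):

```python
# Raw data rate of an uncompressed 16-bit 1280x720 depth stream at 30 fps.
width, height, bytes_per_pixel, fps = 1280, 720, 2, 30

frame_bytes = width * height * bytes_per_pixel  # bytes per frame
second_bytes = frame_bytes * fps                # bytes per second of stream

print(frame_bytes / 1e6, "MB per frame")    # 1.8432
print(second_bytes / 1e6, "MB per second")  # 55.296
```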

Personally I am not a huge fan of the colorization method. It is a great idea in theory; you can quickly convert a depthmap to a hue color ramp using a GLSL shader, and 8bit hardware-accelerated encoding is ubiquitous. However it has a few critical issues:

  • Standard video (and image) compression is fine-tuned for natural, realistic scenes, where patches of texture move around slightly from frame to frame. It does not handle quickly changing gradients well, which often causes major artifacts when decoding the color gradient.
  • H.264 video and JPEG images are stored as YUV planes, with the UV chroma channels at half the resolution of the Y channel or even less. By storing depth data in the color portion of the image, you would need a resolution of 2560x1440 to store a pixel-matched 720p depth map.
  • The UV channels are often compressed much more heavily than the Y channel, because most of the detail in human perception comes from luminance.
  • Encoders and decoders are often black boxes whose gamma and color handling depend on the hardware and software. You could encode something with a library on one machine, decode it with a different library on another, and end up with a gamma-warped depth map.

For high quality realtime recording of depthmaps I use a custom python capture script with lz4 compression (https://pypi.org/project/lz4/) for each individual depth frame. There are better compression algorithms for file size, but lz4 is simple, fast and has a BSD license.

Because the raw byte array of a 16-bit depthmap tends to repeat the high bytes of neighboring values, it compresses very well. Using the default settings you can losslessly compress depth maps down to 30%-40% of their original size at 30fps, on something as low-powered as a Raspberry Pi 4. If you get clever and split the array into two 8-bit image planes of high and low bytes, you can sometimes get that down to 20%.
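A minimal sketch of the byte-plane split (zlib stands in for lz4 here purely so the example is self-contained; lz4.frame.compress has the same bytes-in/bytes-out shape, and the synthetic ramp stands in for real frame.get_data() output):

```python
import zlib  # stand-in for lz4.frame; both take bytes and return bytes

def split_planes(raw):
    """Split interleaved little-endian uint16 bytes into low/high byte planes."""
    return raw[0::2], raw[1::2]

def merge_planes(low, high):
    """Re-interleave the two 8-bit planes back into uint16 bytes."""
    out = bytearray(len(low) * 2)
    out[0::2] = low
    out[1::2] = high
    return bytes(out)

# Synthetic smooth ramp standing in for a 1280x720 uint16 depth frame.
width, height = 1280, 720
raw = b"".join(((x // 8) & 0xFFFF).to_bytes(2, "little")
               for x in range(width * height))

low, high = split_planes(raw)
compressed = zlib.compress(low) + zlib.compress(high)  # lz4 in practice
assert merge_planes(low, high) == raw  # the split itself is lossless
print(f"{len(raw)} -> {len(compressed)} bytes")
```

The high-byte plane is mostly long runs of identical values, which is why it compresses so well.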

All of that said, a single channel 12bit HEVC based codec would probably be ideal for recording lossy depthmap videos, as long as the gamma stays universally consistent.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 12, 2021

Thanks so much @sam598 for your highly detailed advice for @yeric1789 :)

@yeric1789
Author

Hi @sam598, thank you for the advice. Over the past few days I have been attempting to implement a Python script for what you're describing. However, I am running into quite a bit of trouble encoding the depth frames, as I am quite new to this. For example, in one code snippet:

        import lz4.frame as frm

        # Function that gets each frame
        for (device, frame) in frames_devices.items():
            # Device is of type string, frame is a dict , get depth frame by indexing.
            depth_image = np.asarray(frame[rs.stream.depth].get_data())
            
            # Want to take each depth image and convert it into data that can be compressed in a file.

            input_data = depth_image  # redundant
            compressed = frm.compress(input_data)
            path = self.device_cfg_map[device]['compressed_path'] # A file path string
            with frm.open(path, mode='ab') as fp:
                bytes_written = fp.write(compressed)



            time_diff = str( int(time.time() - self.time_started))
            text_str = device + "  " + time_diff + " seconds"
            colormap = cv2.applyColorMap(cv2.convertScaleAbs(depth_image, alpha=0.5), cv2.COLORMAP_JET)
            cv2.putText(colormap, text_str, (50,50), cv2.FONT_HERSHEY_PLAIN, 2, (0,255,0), 2 )
            # Visualise the results
            text_str = 'Visual Representation, Device: ' + device
            cv2.namedWindow(text_str)
            cv2.imshow(text_str, colormap)

# ..................................................................................................................
#Depth compression attempts
           # file is a File type
           # file_decompressed is a BAG file
            with frm.open(file, mode='rb') as fp:
                output_data = fp.read()
            decompressed = frm.decompress(output_data)
            #os.remove(file)
            # Writing to decompressed file ?
            with frm.open(file_decompressed, mode='ab') as fp:
                fp.write(decompressed)

I am attempting to store each frame (not just the raw frame data) in a file. However, I am uncertain what file type to store it as, or how to convert it into a bytes-like object, and I am confused about how to decompress it back into a readable form. I can produce compressed files, but nothing I can decode to recover the original. Any pointers would be much appreciated. Thanks for all the suggestions.

@sam598

sam598 commented Jan 15, 2021

A lot of this is still uncharted territory, so really we are all new to this 🙂

I've never tried compressing a numpy object directly, but I don't know why it wouldn't work. Usually I compress just the raw data first:

depth_image = frames.get_depth_frame().get_data()
compressed = frm.compress(depth_image)
np_depth_image = np.asarray(depth_image)

Then I convert the decompressed raw data into a numpy array, or whatever it is I need:

decompressed = frm.decompress(output_data)
np_depth_image = np.frombuffer(decompressed, np.uint16).reshape([480, 848, 1])

All you are really doing here is compressing and decompressing a one-dimensional array of uint16 pixels, so it does require you to remember the dimensions of the image in order to reconstruct it.

You could start to define your own custom depthmap file format by adding a header: a constant number of bytes at the start of each file that stores the dimensions and any other metadata you may need. I'm doing something along those lines for my projects, because there is no "standard depthmap format". I know Open3D uses 16-bit PNG files, but there's not really any program that can open those (if you try opening one in Photoshop, Adobe's color space will destroy your depth data).
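A minimal version of that header idea might look like this. The magic number, field layout, and file extension are all just illustrative choices, and zlib stands in for lz4's bytes-in/bytes-out API:

```python
import os
import struct
import tempfile
import zlib

MAGIC = b"DPTH"  # hypothetical 4-byte magic for this toy format

def write_depth(path, raw, width, height):
    """Write one frame: 4-byte magic, uint16 width/height,
    uint32 payload size, then the compressed payload."""
    payload = zlib.compress(raw)  # lz4.frame.compress(raw) in practice
    with open(path, "wb") as fp:
        fp.write(MAGIC)
        fp.write(struct.pack("<HHI", width, height, len(payload)))
        fp.write(payload)

def read_depth(path):
    """Read a frame back; returns (raw_bytes, width, height)."""
    with open(path, "rb") as fp:
        assert fp.read(4) == MAGIC
        width, height, size = struct.unpack("<HHI", fp.read(8))
        raw = zlib.decompress(fp.read(size))
    return raw, width, height

# Round trip with an all-zero 848x480 stand-in frame.
raw = bytes(848 * 480 * 2)
path = os.path.join(tempfile.mkdtemp(), "frame_0001.depth")
write_depth(path, raw, 848, 480)
out, w, h = read_depth(path)
```

Because the dimensions travel inside the file, the reader no longer has to hard-code the reshape size.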

@yeric1789
Author

Hi @sam598,

So I tried what you have suggested and attempted compression and decompression of a recorded stream. I was able to compress the file however I am having trouble decompressing it into something that is usable.
For the code I am using

    def visualise_and_comp_measurements(self, frames_devices):

        import lz4.frame as frm

        # Function that gets each frame
        for (device, frame) in frames_devices.items():
            # Device is of type string, frame is a dict , get depth frame by indexing.
            depth_data = frame[rs.stream.depth].get_data()
            depth_image = np.asarray(depth_data)
            
            # Want to take each depth image and convert it into data that can be compressed in a file.
            self.compress_depth_image(device, depth_data, frame[rs.stream.depth].get_data_size())


            self.device_cfg_map[device]['frame_count'] += 1
            time_diff = str( int(time.time() - self.time_started))
            text_str = device + "  " + time_diff + " seconds"
            colormap = cv2.applyColorMap(cv2.convertScaleAbs(depth_image, alpha=0.5), cv2.COLORMAP_JET)
            cv2.putText(colormap, text_str, (50,50), cv2.FONT_HERSHEY_PLAIN, 2, (0,255,0), 2 )
            # Visualise the results
            text_str = 'Visual Representation, Device: ' + device
            cv2.namedWindow(text_str)
            cv2.imshow(text_str, colormap)


    def decompress(self) -> None:

        import lz4.frame as frm
        import glob
        k = self.save_cmprsd_path + "*.lz4"
        lister = list(glob.glob(k))
        for device in self.device_cfg_map:
            file_compressed = self.device_cfg_map[device]['compressed_path']
            file_decompressed_path = self.device_cfg_map[device]['decompressed_path']
            with frm.open(file_compressed, mode='rb') as fp:
                output_data = fp.read()
            decompressed = frm.decompress(output_data)
            np_depth_image = np.frombuffer(decompressed, np.uint16).reshape([720, 1280, 1])
            with open(file_decompressed_path, mode='ab') as fp:
                fp.write(np_depth_image) 

    def compress_depth_image(self, device, depth_data, size) -> None:
        import lz4.frame as frm
        ##for device in self.device_cfg_map:
        compressed = frm.compress(depth_data)
        #np_depth_image = np.asarray(compressed)
        path = self.device_cfg_map[device]['compressed_path']
        with frm.open(path, mode='ab', source_size=size) as fp:
            bytes_written = fp.write(compressed)

Above, the visualise_and_comp_measurements method visualizes each of the frames I am compressing, and I compress the frames into a file with a .lz4 extension. However, when I decompress, instead of getting back files of the original size I get 4 files (for 4 cameras) that are each 1,800 KB, which is clearly incorrect. I am confused about how to implement the compression and decompression so that decompressing gives me files of the same size as the originals.

Please see the images attached for the file size.
Depth_Data_compressed
Depth_Data_Decompressed
Depth_Data_Uncompressed

I know this is a lot of info but I have been struggling with this for the past few days, and any guidance you could provide for me would be much appreciated. Thank you so much for all your help so far.

@sam598

sam598 commented Jan 19, 2021

An image with a resolution of 1280x720 and uint16 pixels would be around 1,800 KB (1280 pixels x 720 pixels x 2 bytes = 1,843,200 bytes). It would look like it is only decompressing a single frame.

When you compress a byte array using the LZ4 algorithm, the first few bytes of the compressed data (its header) contain information on how to decompress that data. If there is additional data appended to that compressed byte array, I assume it will not be decompressed. What is likely happening is that it decompresses the first frame and ignores the rest.

If you want to append all the frames into one single file you will need to come up with a way to separate those compressed frames.

You could do something like:

4 bytes - int32: X, the number of bytes in the first compressed frame
X bytes - compressed depth data of the first frame

4 bytes - int32: Y, the number of bytes in the second compressed frame
Y bytes - compressed depth data of the second frame

4 bytes - int32: Z, the number of bytes in the third compressed frame
Z bytes - compressed depth data of the third frame

This way you can separate each frame because you can calculate where each frame starts and stops. But for simplicity I have every frame recorded as a single file in a sub folder for each capture. For example:

> Camera_01
    > Capture_001
        > Frame_0001.depth
        > Frame_0002.depth
        > Frame_0003.depth
        > Frame_0004.depth
        ...
    > Capture_002
        > Frame_0001.depth
        > Frame_0002.depth
        > Frame_0003.depth
        > Frame_0004.depth
        ...
    ...

It may not be the most efficient way to deal with it, but this way all that has to be done with each frame is decompress it into a raw 16bit array.

Also to be clear everything I have been talking about is an alternative to the .bag format that Intel uses. I am not sure if there is a way to mix and match them.
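The length-prefixed layout described above might be implemented roughly like this (a sketch, with zlib standing in for lz4 since both take and return plain bytes):

```python
import io
import struct
import zlib

def append_frame(fp, raw):
    """Append one frame as a 4-byte little-endian length prefix
    followed by the compressed payload."""
    payload = zlib.compress(raw)  # lz4.frame.compress(raw) in practice
    fp.write(struct.pack("<I", len(payload)))
    fp.write(payload)

def iter_frames(fp):
    """Yield each frame back, decompressed, in recording order."""
    while True:
        prefix = fp.read(4)
        if len(prefix) < 4:
            return  # clean end of file
        (size,) = struct.unpack("<I", prefix)
        yield zlib.decompress(fp.read(size))

# Round trip three fake depth frames through an in-memory "file".
frames = [bytes([i]) * 1_000 for i in range(3)]
buf = io.BytesIO()
for f in frames:
    append_frame(buf, f)
buf.seek(0)
assert list(iter_frames(buf)) == frames
```

With a real capture, the file object would be opened once in 'ab' mode and each incoming depth frame appended as it arrives.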

@yeric1789
Author

Hi @sam598,

Your comment was extremely helpful; however, I am looking for a way to have all capture data stored in one file per camera. The organization I had in mind is that each frame would be appended to the same file. However, I am having difficulty coming up with a solution to read each frame back. For example, say one camera's file contains frames:

<Frame 1>
Some Data.........
<Frame 2>
Some Data.....
<Frame 3>
Some Data......
.
.
.
<Frame n>
Some Data .....

When I go to decompress and read the file I will want to decompress it in parts. So when I call

with frm.open(depth_compressed_path, 'rb') as fp:
    output_arr = fp.read()

output_arr will be my compressed data for frame 1
Then when I call it again

with frm.open(depth_compressed_path, 'rb') as fp:
    output_arr = fp.read()

output_arr will be my compressed data for frame 2 and so on.
Is there a way I can store each frame with a restriction on the number of bytes given to that frame, and then iterate through all the frames in this one file using separators or by reading in a certain number of bytes? If so, could you provide some guidance on how I could go about doing this? Thanks.

@sam598

sam598 commented Jan 20, 2021

While it would be a lot easier to work with, unfortunately you cannot restrict the compression to a target size. Even if you could I would not recommend it for the following reasons:

  1. Lossy compression algorithms (like JPG or H264) can adjust their settings to try and fit a specific size by throwing out detail. LZ4 is a lossless algorithm which preserves the data exactly as it was, while compressing it quickly.

  2. If a depth frame contains a large similar or empty area, it will compress to a much smaller size. For example, if half of a frame has no depth data and those pixels have a value of 0, that frame will probably compress to about half the size of a similar full frame. If you arbitrarily pad frames to fit a specific size, you end up wasting space.

  3. There are very rare instances where, if every single value is random and there are no repeating patterns in the data, the LZ4 output will actually be larger than the original. This should be nearly impossible in this use case, because depth frames repeat a lot of values from pixel to pixel, but the possibility is there, and if reliability is a concern it is the one-in-a-million edge case you have to be prepared for.

In the previous message I described a possible way to record the size of each frame as you append it to a single file. It also seems like LZ4 has the ability to compress_chunk and decompress_chunk which might fit the workflow you are looking for.
https://readthedocs.org/projects/python-lz4/downloads/pdf/stable/
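Point 3 is easy to demonstrate with any lossless codec (zlib here, as a stdlib stand-in for LZ4): incompressible random bytes pick up container overhead, while repetitive depth-like data shrinks dramatically.

```python
import os
import zlib

# zlib stands in for LZ4; the behaviour is generic to lossless codecs.
random_data = os.urandom(100_000)  # incompressible: no repeating patterns
depth_like = bytes(100_000)        # an empty depth region: all zeros

# Random data gains a few bytes of framing overhead...
assert len(zlib.compress(random_data)) > len(random_data)
# ...while repetitive depth-like data shrinks by orders of magnitude.
assert len(zlib.compress(depth_like)) < len(depth_like) // 100
```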

@yeric1789
Author

Hi @sam598,
If there is no option to restrict the size of each frame, is it possible to go through a file and extract each individual frame, i.e. iterate over a set of frames with the lz4 package? I can compress a chunk, but the documentation describes each frame as having its own header. Is there a way to find that header and extract an individual frame for each matching header?

@sam598

sam598 commented Jan 20, 2021

I don't know if the lz4 package has support for that.

If you make your own custom format like how I suggested earlier (where you append the length of each compressed frame in between each appended frame) you could first run through the whole file to get the start and end bytes of each frame in the file. That way you could create a script that lets you quickly seek through frames to decompress.
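Building that seek index could look like this, assuming the hypothetical length-prefixed layout from earlier in the thread (a 4-byte size before each compressed payload):

```python
import io
import struct

def index_frames(fp):
    """Scan a length-prefixed file once; return (offset, size) for each
    compressed payload, without decompressing anything."""
    index = []
    while True:
        pos = fp.tell()
        prefix = fp.read(4)
        if len(prefix) < 4:
            return index
        (size,) = struct.unpack("<I", prefix)
        index.append((pos + 4, size))
        fp.seek(size, 1)  # skip over the payload

def read_payload(fp, offset, size):
    """Random-access one compressed payload via an index entry."""
    fp.seek(offset)
    return fp.read(size)

# Build a tiny three-frame file in memory and seek through it.
payloads = [b"aaa", b"bbbbb", b"c"]
buf = io.BytesIO()
for p in payloads:
    buf.write(struct.pack("<I", len(p)) + p)
buf.seek(0)
idx = index_frames(buf)
assert [read_payload(buf, o, s) for o, s in idx] == payloads
```

Each indexed payload would then be handed to the decompressor (lz4 in practice) to recover the raw 16-bit frame.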

@MartyG-RealSense
Collaborator

Hi @yeric1789 Do you require further assistance with this case, please? Thanks!

@yeric1789
Author

Hi,

I was using the lz4 compression library and noticed that at higher compression levels there is a noticeable bottleneck in collecting enough frames. Is there any way to have the compression operate in another process? I tried the threading module to no avail, and when I attempt to use the multiprocessing module I get an error that a pyrealsense2 object is not picklable. Is there any way to rectify this?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 28, 2021

It has been observed in the past that on some low-end devices the SDK's recorder may not be able to keep up when recording with compression enabled.

#2102 (comment)

There is also a past reference to LZ4 compression having been disabled for the D435i, due to frame drops on high-frequency streams. I do not have information about whether this is still the case.

#3594 (comment)

@sam598

sam598 commented Feb 1, 2021

@yeric1789 are you setting a higher compression level in LZ4? At the default settings it does lossless compression extremely fast. The higher the compression level, the less likely you are to get realtime performance, especially on low-end devices. FWIW I haven't noticed significantly smaller file sizes at the higher compression settings. LZ4 is meant to be fast, not efficient. There are better, more space-efficient algorithms, but I haven't been able to find any good ones that run at realtime video rates.

It is possible to use the threading module to gain some performance, but threading in Python is a tricky and complicated subject. Because of Python's GIL (global interpreter lock), Python code effectively runs on only one thread at a time, even when using the threading module.

But certain operations in Python (like network and disk IO, LZ4 compression, and parts of the RealSense SDK) release the GIL, so a lot of the heavy lifting can still be done in parallel.

As a very simple example, you could have an empty list for Depth frames, and on the RealSense capture thread you could add to the list:

frames = queue.wait_for_frame().as_frameset()

rawDepthFrame = frames.get_depth_frame()

if rawDepthFrame is not None:
    threadLock.acquire()
    depthFrameBuffer.append(rawDepthFrame)
    threadLock.release()

and then on the compression thread:

rawDepthFrame = None

threadLock.acquire()
if len(depthFrameBuffer) > 0:
    rawDepthFrame = depthFrameBuffer.pop(0)
threadLock.release()

if rawDepthFrame is not None:
    # Do compression processing
else:
    time.sleep(0.001)

I also recently discovered that the SDK has a built-in frame queue function, which is a huge help in making sure that frames are not dropped. I haven't tried it without threading yet, but it may be possible.
https://github.com/IntelRealSense/librealsense/blob/master/wrappers/python/examples/frame_queue_example.py

@MartyG-RealSense I don't believe we have been talking about using the SDK's recorder functionality for a while now, but I do want to make sure that @yeric1789 and I are on the same page that we are talking about a custom compression solution.

@MartyG-RealSense
Collaborator

Hi @yeric1789 Do you require further assistance with this case, please? Thanks!

@MartyG-RealSense
Collaborator

Case closed due to no further comments received.
