Yiming Zuo · Willow Yang · Zeyu Ma · Jia Deng
Princeton Vision & Learning Lab (PVL)
We present a depth completion model that works well on unseen datasets and various depth patterns (zero-shot). It can be used to regularize Gaussian Splatting models to achieve better rendering quality, or work with LiDARs for dense mapping.
We recommend creating a Python environment with Anaconda.
conda create -n OMNIDC python=3.8
conda activate OMNIDC
# For CUDA Version == 11.3
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install mmcv==1.4.4 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
pip install mmsegmentation==0.22.1
pip install timm tqdm thop tensorboardX tensorboard opencv-python ipdb h5py ipython Pillow==9.5.0 plyfile einops
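After installation, it can help to confirm that the pinned packages resolved to the versions listed above. The following is a minimal sketch using Python's standard importlib.metadata module; the PINNED table and the helper names installed_version and find_version_mismatches are illustrative and not part of the repository.

```python
from importlib import metadata

# Pins copied from the install commands above (Python distribution names).
PINNED = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "torchaudio": "0.10.1",
    "mmcv": "1.4.4",
    "mmsegmentation": "0.22.1",
    "Pillow": "9.5.0",
}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_version_mismatches(pinned, lookup=installed_version):
    """Map each package that deviates from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Some builds append a local tag such as "+cu113"; compare the public part only.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_version_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Running it inside the activated environment prints one line per package that is missing or differs from its pin, and prints nothing when everything matches.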
We used NVIDIA Apex for multi-GPU training. Apex can be installed as follows:
git clone https://github.com/NVIDIA/apex
cd apex
git reset --hard 4ef930c1c884fdca5f472ab2ce7cb9b505d26c1a
conda install cudatoolkit-dev=11.3 -c conda-forge
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
You may hit the error ImportError: cannot import name 'container_abcs' from 'torch._six'. In this case, change line 14 of apex/apex/_amp_state.py to "import collections.abc as container_abcs" and re-install apex.
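The fix swaps the removed torch._six import for its standard-library equivalent. A minimal sketch of what such a patched import could look like follows; the try/except fallback is an assumption added here so the same lines work whether or not torch._six still provides the name, and it is not necessarily how the apex file is written.

```python
# Older PyTorch exposed container_abcs via torch._six; newer versions removed it.
try:
    from torch._six import container_abcs
except ImportError:
    # Standard-library equivalent, used whenever the torch._six import fails.
    import collections.abc as container_abcs

# Either way, container_abcs offers abstract container types such as Mapping.
print(issubclass(dict, container_abcs.Mapping))
```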
Download these .pt files to src/pretrained:
https://drive.google.com/drive/folders/1z2sOkIJHtg1zTYiSRhZRzff0AANprx4O?usp=sharing
Download the model checkpoint from
https://drive.google.com/file/d/1SBRfdhozd-3j6uorjKOMgYGmrd578NvG/view?usp=sharing
and put it under the checkpoints folder.
We save all evaluation datasets in a unified format (uniformat), which you can download directly from here.
Put all npy files under the datasets/uniformat_release folder:
uniformat_release
├───ARKitScenes_test_100
│   ├──000000.npy
...  ├──000001.npy
     └──...
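To confirm that the download unpacked into the layout shown above, a short script can count the npy files in each dataset folder. This is a sketch under the assumption that every direct subfolder of datasets/uniformat_release is one dataset; the function name count_npy_per_dataset is hypothetical and not part of the repository.

```python
from pathlib import Path


def count_npy_per_dataset(root):
    """Return {dataset_folder_name: number_of_npy_files} for each subfolder of root."""
    root = Path(root)
    counts = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[folder.name] = sum(1 for _ in folder.glob("*.npy"))
    return counts


if __name__ == "__main__":
    release_dir = Path("datasets/uniformat_release")
    if release_dir.is_dir():
        for name, n in count_npy_per_dataset(release_dir).items():
            print(f"{name}: {n} samples")
    else:
        print(f"{release_dir} not found; download the data first")
```

A folder that reports zero samples usually points to an archive that was extracted one level too deep or too shallow.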
We also provide instructions on how to process the original datasets to get these npy files; check this link.
cd src
# the real and virtual patterns from the 5 datasets reported in Tab. 1 and Tab. 2 of the paper
sh testing_scripts/test_robust_DC.sh
# additional results on the VOID and NYUv2 datasets
sh testing_scripts/test_void_nyu.sh
We recommend writing a dataloader to save your own dataset into the uniformat. You will need to provide an RGB image and a sparse depth map (with 0 indicating invalid pixels). A good starting point is the ibims dataloader.
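The zero-as-invalid convention can be illustrated with a small NumPy sketch. It subsamples a dense depth map into a sparse one and recovers the validity mask. The function name make_sparse_depth, the random sampling scheme, and the array shapes are assumptions for illustration and do not reflect the repository's dataloader API.

```python
import numpy as np


def make_sparse_depth(dense_depth, num_samples, seed=0):
    """Keep depth at up to num_samples random valid pixels; set all others to 0 (invalid)."""
    rng = np.random.default_rng(seed)
    valid_idx = np.flatnonzero(dense_depth > 0)  # only sample where depth is known
    n = min(num_samples, valid_idx.size)
    keep = rng.choice(valid_idx, size=n, replace=False)
    sparse = np.zeros_like(dense_depth)
    sparse.flat[keep] = dense_depth.flat[keep]
    return sparse


# Toy 4x6 "dense" depth map in meters, with one missing (zero) pixel.
dense = np.full((4, 6), 2.5, dtype=np.float32)
dense[0, 0] = 0.0
sparse = make_sparse_depth(dense, num_samples=5)

valid_mask = sparse > 0  # pixels that carry a depth measurement
print(valid_mask.sum(), "valid pixels out of", valid_mask.size)
```

Because 0 marks missing data, a mask like valid_mask is all that is needed to tell measurements from holes; no separate mask file has to be stored.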
Then follow the instructions here to convert your dataset into uniformat. Specifically, look into src/robust_dc_protocol/save_uniformat_dataset.py.
Once done, you can run evaluation just as on any of the datasets we tested. See the examples under src/testing_scripts.
If you want to use our model for view synthesis (e.g., Gaussian Splatting), you may find the instructions here helpful. The ETH3D section describes how to convert COLMAP models to sparse depth maps.
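As a rough illustration of the idea behind such a conversion, the sketch below projects 3D points into a pinhole camera and writes their depths into an otherwise-zero map. It is not the repository's ETH3D script and does not parse COLMAP files. The function name points_to_sparse_depth, the world-to-camera convention (x_cam = R x_world + t), and the nearest-pixel rounding are assumptions.

```python
import numpy as np


def points_to_sparse_depth(points_world, R, t, K, height, width):
    """Project Nx3 world points into an HxW sparse depth map (0 = no measurement)."""
    cam = points_world @ R.T + t            # world -> camera coordinates
    z = cam[:, 2]
    front = z > 0                           # discard points behind the camera
    cam, z = cam[front], z[front]

    uv = cam @ K.T                          # pinhole projection (homogeneous)
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]

    # Keep the nearest point when several land on the same pixel.
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), z)
    depth[~np.isfinite(depth)] = 0.0        # untouched pixels become invalid (0)
    return depth.astype(np.float32)


K = np.array([[100.0, 0.0, 4.0], [0.0, 100.0, 3.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
pts = np.array([[0.0, 0.0, 2.0],     # projects to pixel (u=4, v=3)
                [0.0, 0.0, 5.0],     # same pixel but farther away
                [0.0, 0.0, -1.0]])   # behind the camera, dropped
depth = points_to_sparse_depth(pts, R, t, K, height=6, width=8)
print(depth[3, 4], np.count_nonzero(depth))
```

Keeping the minimum depth per pixel is a simple stand-in for visibility handling; real reconstructions may also need per-image visibility information to avoid projecting points that are occluded in a given view.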
We train on 10x48GB GPUs (e.g., RTX A6000) for ~6 days. You can adjust the batch size depending on the memory and number of GPUs you have.
See here for instructions.
cd src
sh training_scripts/train_full.sh
If you find our work helpful, please consider citing our paper:
@article{zuo2024omni,
title={OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration},
author={Zuo, Yiming and Yang, Willow and Ma, Zeyu and Deng, Jia},
journal={arXiv preprint arXiv:2411.19278},
year={2024}
}