MegaSaM

This code accompanies the paper

MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos
Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, Noah Snavely

This is not an officially supported Google product.

Clone

Make sure to clone the repository together with its submodules:
git clone --recursive git@github.com:mega-sam/mega-sam.git
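If the repository was already cloned without --recursive, the submodules can usually be fetched afterwards with git submodule update --init --recursive. As a quick sanity check, the sketch below lists submodule paths declared in .gitmodules that are missing or empty in the working tree. The function name missing_submodules is hypothetical and not part of the MegaSaM codebase.

```python
import re
from pathlib import Path


def missing_submodules(repo_root):
    """Return submodule paths from .gitmodules that are absent or empty."""
    root = Path(repo_root)
    gitmodules = root / ".gitmodules"
    if not gitmodules.is_file():
        return []  # nothing declared, nothing to check
    text = gitmodules.read_text()
    # Each submodule stanza contains a line such as "path = base".
    paths = re.findall(r"^\s*path\s*=\s*(.+?)\s*$", text, flags=re.MULTILINE)
    missing = []
    for rel in paths:
        target = root / rel
        # An uninitialized submodule is either absent or an empty directory.
        if not target.is_dir() or not any(target.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_submodules("."):
        print(f"submodule not initialized: {rel}")
```

Run from the root of the checkout, the script prints nothing when every declared submodule directory has content.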

Instructions for installing dependencies

Python Environment

This codebase was successfully run with Python 3.10, CUDA 11.8, and PyTorch 2.0.1. We suggest installing the libraries in a virtual environment such as Anaconda.

To install the main libraries, run:
conda env create -f environment.yml
To install xformers for the UniDepth model, follow the instructions at https://github.com/facebookresearch/xformers. If you encounter installation issues, we suggest installing it from a prebuilt package. For example, for Python 3.10 + CUDA 11.8 + PyTorch 2.0.1, run:
wget https://anaconda.org/xformers/xformers/0.0.22.post7/download/linux-64/xformers-0.0.22.post7-py310_cu11.8.0_pyt2.0.1.tar.bz2

conda install xformers-0.0.22.post7-py310_cu11.8.0_pyt2.0.1.tar.bz2
Compile the extensions for the camera tracking module:
cd base; python setup.py install
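Once the environment is created, it can help to compare the installed package versions with the configuration listed above. The following is a minimal sketch using the standard-library importlib.metadata module. It assumes the packages are registered under the distribution names torch and xformers, which may not hold for every install method; the helper names installed_version and report are hypothetical.

```python
import sys
from importlib import metadata

# Versions this README reports as tested.
TESTED = {"torch": "2.0.1", "xformers": "0.0.22.post7"}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(tested=TESTED):
    """Build one status line for Python and one per package."""
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    lines = [f"python {py} (tested: 3.10)"]
    for name, wanted in tested.items():
        found = installed_version(name)
        if found is None:
            lines.append(f"{name}: not installed (tested: {wanted})")
            continue
        # Local build tags such as "+cu118" are ignored for the comparison.
        base = found.split("+")[0]
        status = "matches" if base == wanted else "differs from"
        lines.append(f"{name} {found}: {status} tested version {wanted}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A mismatch is not necessarily a problem; it only indicates that the setup differs from the one the authors report as working.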

Downloading pretrained checkpoints

Download the DepthAnything checkpoint to mega-sam/Depth-Anything/checkpoints/depth_anything_vitl14.pth
Download the RAFT checkpoint and place it at mega-sam/cvd_opt/raft-things.pth
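Before running the pipeline, it may be worth confirming that both checkpoints ended up where the scripts expect them. The sketch below checks the two paths listed above relative to the root of the checkout; the function name missing_checkpoints is hypothetical and not part of the codebase.

```python
from pathlib import Path

# Checkpoint locations named in this README, relative to the mega-sam checkout.
EXPECTED_CHECKPOINTS = [
    "Depth-Anything/checkpoints/depth_anything_vitl14.pth",
    "cvd_opt/raft-things.pth",
]


def missing_checkpoints(repo_root, expected=EXPECTED_CHECKPOINTS):
    """Return the expected checkpoint paths that are not regular files."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_checkpoints("."):
        print(f"missing checkpoint: {rel}")
```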

Running MegaSaM on Sintel

Download and unzip the Sintel data
Precompute mono-depth (please modify img-path in the script): ./mono_depth_scripts/run_mono-depth_sintel.sh
Run camera tracking (please modify DATA_PATH in the script; add the --opt_focal argument to enable focal length optimization): ./tools/evaluate_sintel.sh
Run consistent video depth optimization given the estimated cameras (please modify datapath in the script): ./cvd_opt/cvd_opt_sintel.sh
Evaluate camera poses and depths:
python ./evaluations_poses/evaluate_sintel.py

python ./evaluations_depth/evaluate_depth_ours_sintel.py

Running MegaSaM on DyCheck

Download the DyCheck data
Precompute mono-depth (please modify img-path in the script): ./mono_depth_scripts/run_mono-depth_dycheck.sh
Run camera tracking (please modify DATA_PATH in the script; add the --opt_focal argument to enable focal length optimization): ./tools/evaluate_dycheck.sh
Run consistent video depth optimization given the estimated cameras (please modify datapath in the script): ./cvd_opt/cvd_opt_dycheck.sh
Evaluate camera poses and depths:
python ./evaluations_poses/evaluate_dycheck.py

python ./evaluations_depth/evaluate_depth_ours_dycheck.py

Running MegaSaM on in-the-wild video, for example from DAVIS videos

Download the example DAVIS data
Precompute mono-depth (please modify img-path in the script): ./mono_depth_scripts/run_mono-depth_demo.sh
Run camera tracking (please modify DATA_PATH in the script; add the --opt_focal argument to enable focal length optimization): ./tools/evaluate_demo.sh
Run consistent video depth optimization given the estimated cameras (please modify datapath in the script): ./cvd_opt/cvd_opt_demo.sh
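The Sintel, DyCheck, and demo sections all follow the same three-step order: mono-depth, camera tracking, then consistent video depth optimization. A small driver along the following lines could run the steps in sequence and stop at the first failure. This is only a sketch: the names build_commands and run_pipeline are hypothetical, it assumes the scripts are launched with bash from the root of the checkout, and the paths inside each script still have to be edited by hand as described above.

```python
import subprocess

# Script name patterns taken from this README; {name} is the dataset suffix.
STEPS = [
    "./mono_depth_scripts/run_mono-depth_{name}.sh",  # precompute mono-depth
    "./tools/evaluate_{name}.sh",                     # camera tracking
    "./cvd_opt/cvd_opt_{name}.sh",                    # video depth optimization
]
DATASETS = ("sintel", "dycheck", "demo")


def build_commands(name):
    """Return the three script paths for one dataset, in execution order."""
    if name not in DATASETS:
        raise ValueError(f"unknown dataset {name!r}; expected one of {DATASETS}")
    return [step.format(name=name) for step in STEPS]


def run_pipeline(name, repo_root=".", dry_run=False):
    """Run each step in order; check=True aborts on the first non-zero exit."""
    commands = build_commands(name)
    for cmd in commands:
        print(f"running {cmd}")
        if not dry_run:
            subprocess.run(["bash", cmd], cwd=repo_root, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default so nothing is executed until the scripts are edited.
    run_pipeline("demo", dry_run=True)
```

Running the steps strictly in this order matters because each later stage consumes the output of the one before it.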

Contact

For any questions related to our paper, please email zl548@cornell.edu.

Bibtex

@inproceedings{li2024_megasam,
  title     = {MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos},
  author    = {Li, Zhengqi and Tucker, Richard and Cole, Forrester and Wang, Qianqian and Jin, Linyi and Ye, Vickie and Kanazawa, Angjoo and Holynski, Aleksander and Snavely, Noah},
  booktitle = {arxiv},
  year      = {2024}
}

Copyright

All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0

All other materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode

Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.

This is not an official Google product.
