This repo, together with image-play and pose-hg-train (branch image-play), holds the code for reproducing the results in the following paper:
Forecasting Human Dynamics from Static Images
Yu-Wei Chao, Jimei Yang, Brian Price, Scott Cohen, Jia Deng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
Check out the project site for more details.
- The role of this repo is to implement training step 2 (Sec. 3.3), i.e. pre-training a 3D skeleton converter to recover 3D joint locations from 2D heatmaps.
- This is later used to initialize the 3D skeleton converter sub-network in training step 3 (Sec. 3.3), i.e. training the full system.
Please cite Skeleton2D3D if it helps your research:
@INPROCEEDINGS{chao:cvpr2017,
author = {Yu-Wei Chao and Jimei Yang and Brian Price and Scott Cohen and Jia Deng},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
title = {Forecasting Human Dynamics from Static Images},
year = {2017},
}
This repo contains one submodule (pose-hg-train), so make sure you clone with --recursive:
git clone --recursive https://github.com/ywchao/skeleton2d3d.git
- Download Pre-Computed Models and Prediction
- Dependencies
- Setting Up Human3.6M
- Setting Up Penn Action
- Training 3D Skeleton Converter on Ground-Truth Heatmaps
- Fine-Tuning 3D Skeleton Converter on Predicted Heatmaps
- Comparison with Zhou et al. [40]
- Evaluation
If you just want to run the training of the full system, i.e. image-play, you can simply download the pre-computed models and prediction (108M) and skip the remaining content.
./scripts/fetch_s2d3d_models_prediction.sh
./scripts/setup_symlinks_models.sh
This will populate the exp folder with precomputed_s2d3d_models_prediction and set up a set of symlinks.
You can also now set up Human3.6M and run the evaluation demo with the downloaded prediction. This will ensure exact reproduction of the paper's results.
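To see what the two scripts above set up, the following minimal Python sketch lists every symlink under a folder together with its target. The helper name `list_symlinks` is hypothetical and not part of this repo; only the folder name exp comes from the text above.

```python
import os

def list_symlinks(root):
    """Return sorted (link_path, target) pairs for every symlink under root."""
    links = []
    # followlinks=False: symlinked directories are reported but not entered.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                links.append((path, os.readlink(path)))
    return sorted(links)

if __name__ == "__main__":
    # "exp" is the folder populated by the download scripts.
    for link, target in list_symlinks("exp"):
        print(f"{link} -> {target}")
```

If the folder does not exist yet, the sketch simply prints nothing.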
To proceed to the remaining content, make sure the following are installed.
- Torch7
  - We used commit bd5e664 (2016-10-17) with CUDA 8.0.27 RC and cuDNN v5.1 (cudnn-8.0-linux-x64-v5.1).
  - All our models were trained on a GeForce GTX TITAN X GPU.
- matio-ffi
- torch-hdf5
- MATLAB
The Human3.6M dataset is used for training and evaluation.
- Download the Human3.6M dataset. Only the Poses RawAngles and Videos files are required:
      Poses_RawAngles_S1.tgz Poses_RawAngles_S5.tgz Poses_RawAngles_S6.tgz Poses_RawAngles_S7.tgz Poses_RawAngles_S8.tgz Poses_RawAngles_S9.tgz Poses_RawAngles_S11.tgz
      Videos_S1.tgz Videos_S5.tgz Videos_S6.tgz Videos_S7.tgz Videos_S8.tgz Videos_S9.tgz Videos_S11.tgz
  Place these files under external/Human3.6M.
- Extract the files:
      for i in external/Human3.6M/*.tgz; do tar zxvf $i -C external/Human3.6M; done
  This will populate the external/Human3.6M folder with S1, S5, S6, S7, S8, S9, and S11.
- Download the Human3.6M dataset code:
      ./h36m_utils/fetch_h36m_code.sh
  This will populate the h36m_utils folder with Release-v1.1.
- Generate meta files. Start MATLAB under skeleton2d3d:
      matlab
  You should see the message "added paths for the experiment!" followed by the MATLAB prompt >>. Run the following command:
      H36MDataBase.instance;
  Set the data path to external/Human3.6M and the config file directory to h36m_utils/Release-v1.1. This will create a new file H36M.conf under skeleton2d3d.
- Preprocess data for training and evaluation:
      matlab -r "generate_data_h36m; quit"
  This will populate the data/h36m folder with frames, train.mat, and val.mat.
- Optional: Visualize 3D pose sequences:
      matlab -r "vis_3d_pose; quit"
  The output will be saved in output/vis_3d_pose.
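Before extracting, it can help to confirm that all 14 archives named above are in place. The following is a minimal Python sketch of such a check. The function names are hypothetical and not part of this repo; the file names and the external/Human3.6M path are taken from the steps above.

```python
import os

# Subjects and archive prefixes as listed in the download step.
SUBJECTS = ["S1", "S5", "S6", "S7", "S8", "S9", "S11"]
PREFIXES = ["Poses_RawAngles", "Videos"]

def expected_archives():
    """Return the 14 archive file names the setup steps ask for."""
    return [f"{prefix}_{subject}.tgz" for prefix in PREFIXES for subject in SUBJECTS]

def missing_archives(folder):
    """Return the expected archives that are not present in folder."""
    present = set(os.listdir(folder)) if os.path.isdir(folder) else set()
    return [name for name in expected_archives() if name not in present]

if __name__ == "__main__":
    missing = missing_archives("external/Human3.6M")
    if missing:
        print("Missing archives: " + ", ".join(missing))
    else:
        print("All expected archives are present.")
```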
The Penn Action dataset is used for running prediction.
- Download the Penn Action dataset to external. external should contain Penn_Action.tar.gz. Extract the files:
      tar zxvf external/Penn_Action.tar.gz -C external
  This will populate the external folder with a folder Penn_Action containing frames, labels, tools, and README.
- Preprocess Penn Action by cropping the images:
      matlab -r "prepare_penn_crop; quit"
  This will populate the data/penn-crop folder with frames and labels.
- Generate validation set and preprocess annotations:
      matlab -r "generate_valid_penn; quit"
      python tools/preprocess.py
  This will populate the data/penn-crop folder with valid_ind.txt, train.h5, val.h5, and test.h5.
We begin by training a 3D skeleton converter on Human3.6M. As a first step, we use ground-truth heatmaps as input to the network.
- Before starting, make sure to remove the symlinks from the download section, if any:
      find exp -type l -delete
- Optional: Visualize training examples. Each example consists of input ground-truth heatmaps and a ground-truth 3D pose. The heatmaps are artificially generated by projecting the 3D pose onto the image plane. This is done in Torch7 each time we load a training sample. We provide a way to visualize this process in MATLAB:
      matlab -r "vis_pose_proj; quit"
  The output will be saved in output/vis_pose_proj.
- Start training:
      ./scripts/h36m/res-64.sh $GPU_ID
  The output will be saved in exp/h36m/res-64.
- Optional: Visualize training loss and accuracy:
      matlab -r "plot_loss_err; quit"
  The output will be saved to output/plot_res-64.pdf.
- Optional: Visualize prediction on a subset of the validation set:
      matlab -r "vis_preds_h36m; quit"
  The output will be saved in output/vis_res-64/h36m_val. The predicted pose is drawn in blue, green, and red, and the ground-truth pose is drawn in black, cyan, and magenta.
- Optional: Run prediction on Penn Action. Given the model trained on Human3.6M, we can run prediction on Penn Action. Again, we use ground-truth heatmaps as input to the network. Note that Penn Action contains unlabeled joints, which introduce empty heatmaps that were not seen during training.
      ./scripts/penn-crop/res-64-pred.sh $GPU_ID
  The output will be saved in exp/penn-crop/res-64.
- Optional: Visualize prediction on a subset of the validation set:
      matlab -r "vis_preds_penn; quit"
  The output will be saved in output/vis_res-64/penn_val.
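The idea of generating ground-truth heatmaps by projecting a 3D pose onto the image plane can be illustrated with the following minimal Python sketch. It is not the repo's Torch7 code. The pinhole camera model, the focal length, the Gaussian width, the 64x64 resolution, and all function names are assumptions made for illustration. An unlabeled joint is rendered as an all-zero map, mirroring the empty heatmaps mentioned for Penn Action.

```python
import math

def project_point(joint, focal, center):
    """Pinhole projection of a 3D point (x, y, z) with z > 0 to pixel coordinates."""
    x, y, z = joint
    return (focal * x / z + center[0], focal * y / z + center[1])

def gaussian_heatmap(u, v, size, sigma):
    """Return a size x size grid holding an unnormalized Gaussian peaked at (u, v)."""
    return [[math.exp(-((c - u) ** 2 + (r - v) ** 2) / (2.0 * sigma ** 2))
             for c in range(size)]
            for r in range(size)]

def pose_to_heatmaps(joints, visible, size=64, focal=100.0, sigma=1.0):
    """Render one heatmap per joint; unlabeled joints give an all-zero map.

    size, focal, and sigma are illustrative values, not the repo's settings.
    """
    center = ((size - 1) / 2.0, (size - 1) / 2.0)
    heatmaps = []
    for joint, is_visible in zip(joints, visible):
        if not is_visible:
            heatmaps.append([[0.0] * size for _ in range(size)])
            continue
        u, v = project_point(joint, focal, center)
        heatmaps.append(gaussian_heatmap(u, v, size, sigma))
    return heatmaps

if __name__ == "__main__":
    # Two joints in camera coordinates; the second one is treated as unlabeled.
    maps = pose_to_heatmaps([(0.025, 0.025, 5.0), (0.5, -0.5, 5.0)], [True, False])
    print(len(maps), "heatmaps of size", len(maps[0]), "x", len(maps[0][0]))
```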
In practice, the 3D skeleton converter is usually expected to take predicted heatmaps as input rather than ground-truth heatmaps. We next fine-tune the pre-trained 3D skeleton converter on heatmaps produced by an hourglass network.
- Obtain a trained hourglass model. This is done with the submodule pose-hg-train.
  Option 1 (recommended): Download pre-computed hourglass models (50M):
      cd pose-hg-train
      ./scripts/fetch_hg_models.sh
      ./scripts/setup_symlinks_models.sh
      cd ..
  This will populate the pose-hg-train/exp folder with precomputed_hg_models and set up a set of symlinks.
  Option 2: Train your own models.
- Start training:
      ./scripts/h36m/hg-256-res-64-hg0-hgfix.sh $GPU_ID
  The output will be saved in exp/h36m/hg-256-res-64-hg0-hgfix.
- Optional: Visualize training loss, error, and accuracy:
      matlab -r "plot_loss_err_acc; quit"
  The output will be saved to output/plot_hg-256-res-64-hg0-hgfix.pdf.
- Optional: Visualize prediction on a subset of the validation set:
      matlab -r "vis_preds_h36m_hg; quit"
  The output will be saved in output/vis_hg-256-res-64-hg0-hgfix/h36m_val. The predicted pose is drawn in blue, green, and red, and the ground-truth pose is drawn in black, cyan, and magenta.
- Optional: Run prediction on Penn Action. Rather than using ground-truth heatmaps as in the previous section, we use predicted heatmaps as input here.
      ./scripts/penn-crop/hg-256-res-64-hg0-hgfix-pred.sh $GPU_ID
  The output will be saved in exp/penn-crop/hg-256-res-64-hg0-hgfix.
- Optional: Visualize prediction on a subset of the validation set:
      matlab -r "vis_preds_penn_hg; quit"
  The output will be saved in output/vis_hg-256-res-64-hg0-hgfix/penn_val.
This demo shows how we compare 3D pose recovery with Zhou et al. [40] in the paper (Sec. 4.2).
- Fine-tune the hourglass network on Human3.6M. We will use the hourglass output as input to Zhou et al.'s method. Our goal is to evaluate the 3D pose output on Human3.6M. Since the hourglass model from pose-hg-train is trained on MPII and Penn Action, we first fine-tune it on Human3.6M:
      ./scripts/h36m/hg-256.sh $GPU_ID
  The output will be saved in exp/h36m/hg-256.
- Run prediction:
      ./scripts/h36m/hg-256-pred.sh $GPU_ID
  The output will be saved in exp/h36m/hg-256.
- Download Zhou et al.'s MATLAB code:
      ./shapeconvex/fetch_shapeconvex.sh
  This will populate the shapeconvex folder with release.
- Learn pose dictionary on Human3.6M:
      matlab -r "shapeconvex_dl; quit"
  The output will be saved to shapeconvex/shapeDict_h36m.mat.
- Run 3D pose estimation:
      matlab -r "shapeconvex_run; quit"
  The output will be saved to shapeconvex/res_hg-256-pred/h36m_val.
- Optional: Visualize prediction on a subset of the validation set:
      matlab -r "shapeconvex_vis; quit"
  The output will be saved to shapeconvex/vis_hg-256-pred/h36m_val.
- Finally, for a fair comparison, we also need to fine-tune our 3D skeleton converter using the fine-tuned hourglass.
      ./scripts/h36m/hg-256-res-64-hg1-hgfix.sh $GPU_ID
  The output will be saved in exp/h36m/hg-256-res-64-hg1-hgfix.
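The steps above feed the hourglass output to a method that works from 2D joint estimates. One common way to turn a heatmap into a 2D joint location is to take the position of its maximum, sketched below in Python. Whether the scripts in this repo do exactly this is an assumption, and the function names are hypothetical.

```python
def heatmap_argmax(heatmap):
    """Return (col, row, score) of the highest-scoring cell of a 2D grid."""
    best_col, best_row, best_score = 0, 0, heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_col, best_row, best_score = c, r, score
    return best_col, best_row, best_score

def heatmaps_to_joints(heatmaps):
    """Convert one heatmap per joint into a list of (col, row) locations."""
    return [heatmap_argmax(h)[:2] for h in heatmaps]

if __name__ == "__main__":
    # A toy 3x4 heatmap whose peak sits at column 2, row 1.
    toy = [[0.0, 0.1, 0.2, 0.0],
           [0.0, 0.3, 0.9, 0.1],
           [0.0, 0.0, 0.2, 0.0]]
    print(heatmaps_to_joints([toy]))
```

Ties are resolved in favor of the first maximum in row-major order, which is a design choice of this sketch.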
This demo runs the MATLAB evaluation script and reproduces our results in the paper (Tab. 2). If you are using the pre-computed prediction and also want to output Zhou et al.'s results, make sure to first run steps 3, 4, and 5 of the previous section (downloading Zhou et al.'s code, learning the pose dictionary, and running 3D pose estimation).
- Compute the mean per joint position error (MPJPE):
      matlab -r "eval_run; quit"
  This will print out the MPJPE values.
- Optional: Visualize Zhou et al.'s and our results:
      matlab -r "vis_run; quit"
  The output will be saved in evaluation/shapeconvex and evaluation/skeleton2d3d.
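For reference, the mean per joint position error is the Euclidean distance between predicted and ground-truth joint positions, averaged over joints (and, for a dataset, over frames). The Python sketch below illustrates that definition only. It is not the repo's MATLAB evaluation code, the function names are hypothetical, and any alignment or unit conversion the actual script may apply is not modeled here.

```python
import math

def joint_error(pred_joint, gt_joint):
    """Euclidean distance between two 3D joint positions."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred_joint, gt_joint)))

def mean_joint_error(pred_pose, gt_pose):
    """Average per-joint Euclidean error for a single pose."""
    if not pred_pose or len(pred_pose) != len(gt_pose):
        raise ValueError("poses must be non-empty and have the same number of joints")
    return sum(joint_error(p, g) for p, g in zip(pred_pose, gt_pose)) / len(pred_pose)

def mean_joint_error_over_frames(pred_poses, gt_poses):
    """Average the per-pose error over a sequence of frames."""
    if not pred_poses or len(pred_poses) != len(gt_poses):
        raise ValueError("sequences must be non-empty and have the same length")
    errors = [mean_joint_error(p, g) for p, g in zip(pred_poses, gt_poses)]
    return sum(errors) / len(errors)

if __name__ == "__main__":
    pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    gt = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
    # Per-joint distances are 0 and 5, so the mean is 2.5.
    print(mean_joint_error(pred, gt))
```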