This is the PyTorch implementation of the ISMAR 2021 paper "SAR: Spatial-Aware Regression for 3D Hand Pose and Mesh Reconstruction from a Monocular RGB Image". We provide our research code for training and testing the proposed method on the FreiHAND dataset.
- Python-3.7.11
- PyTorch-1.7.1
- torchvision-0.8.2
- pyrender-0.1.45 (for rendering meshes; please follow the official installation guide)
We suggest creating a new conda environment and installing all the relevant dependencies.
conda create -n SAR python=3.7
conda activate SAR
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio===0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
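After installation, it can help to confirm that the environment actually picked up the versions listed above. The following is a minimal sketch, not part of the repository: the helper names (`base_version`, `installed_version`, `check_versions`) are hypothetical, and it assumes each package exposes a `__version__` attribute, which `torch` and `torchvision` do.

```python
import importlib

# Versions taken from the requirements list above; keys are import names.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def base_version(version):
    """Drop a local build suffix such as '+cu110' from a version string."""
    return version.split("+", 1)[0]


def installed_version(module_name):
    """Import a module and return its __version__, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_versions(expected=EXPECTED, lookup=installed_version):
    """Return {module: (expected, found)} for every mismatching module."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = check_versions()
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found}")
    if not problems:
        print("torch and torchvision match the versions listed above.")
```

Run inside the activated SAR environment; an empty result from `check_versions()` means both packages match.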
- For the MANO layer, we used manopth. The repo is already included in ./manopth.
- The MANO model file MANO_RIGHT.pkl from this link is already included in ./manopth/mano/models.
- Download the FreiHAND dataset from this link.
- Download the root joint coordinates from I2L-MeshNet from this link.
Arrange the data in the directory structure shown below:
${ROOT}
|-- data
| |-- FreiHAND
| | |-- training
| | | |-- rgb
| | | |-- mask
| | |-- evaluation
| | | |-- rgb
| | |-- evaluation_K.json
| | |-- evaluation_scale.json
| | |-- training_K.json
| | |-- training_scale.json
| | |-- training_mano.json
| | |-- training_xyz.json
| | |-- training_verts.json
| | |-- bbox_root_freihand_output.json
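A quick way to catch a misplaced file before training is to check the layout programmatically. The sketch below is not part of the repository; the function name `missing_entries` is hypothetical, and the paths are copied from the tree above under the assumption that everything sits in `${ROOT}/data/FreiHAND`.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_DIRS = [
    "training/rgb",
    "training/mask",
    "evaluation/rgb",
]
EXPECTED_FILES = [
    "evaluation_K.json",
    "evaluation_scale.json",
    "training_K.json",
    "training_scale.json",
    "training_mano.json",
    "training_xyz.json",
    "training_verts.json",
    "bbox_root_freihand_output.json",
]


def missing_entries(root):
    """Return the expected paths that are missing under <root>/data/FreiHAND."""
    base = Path(root) / "data" / "FreiHAND"
    missing = [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (base / f).is_file()]
    return missing


if __name__ == "__main__":
    # "." assumes the script is run from ${ROOT}.
    for entry in missing_entries("."):
        print("missing:", entry)
```

If nothing is printed, every directory and annotation file from the tree was found.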
- Modify ./config.py to specify the model and parameters for training.
- Run python train.py.
We provide a training log example.
- Modify ./config.py to set "checkpoint" to the path of the trained model's weights and to specify the corresponding model parameters.
- Run python test.py (to visualize the mesh, run PYOPENGL_PLATFORM=osmesa python test.py).
- Zip ./output/pred.json and submit the prediction zip file to the FreiHAND Leaderboard to obtain the evaluation scores.
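The zipping step can also be done from Python. This is a minimal sketch under two assumptions: the archive name `pred.zip` is hypothetical, and the leaderboard is assumed to expect `pred.json` at the root of the archive, so please check the submission instructions if the upload is rejected.

```python
import zipfile
from pathlib import Path


def zip_prediction(pred_path="./output/pred.json", zip_path="./output/pred.zip"):
    """Write pred.json into a zip archive (at the archive root) and return its path."""
    pred_path = Path(pred_path)
    zip_path = Path(zip_path)
    if not pred_path.is_file():
        raise FileNotFoundError(f"{pred_path} not found; run test.py first")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname drops the ./output/ prefix so the file sits at the archive root.
        archive.write(pred_path, arcname=pred_path.name)
    return zip_path


# Usage, from ${ROOT} after test.py has written ./output/pred.json:
# zip_prediction()
```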
To reproduce our results, we provide the pretrained model (using ResNet-34 as the backbone and two stages) and the corresponding prediction file. This pretrained model should generate the following results:
Evaluation 3D KP results:
auc=0.229, mean_kp3d_avg=6.14 cm
Evaluation 3D KP ALIGNED results:
auc=0.871, mean_kp3d_avg=0.65 cm
Evaluation 3D MESH results:
auc=0.228, mean_kp3d_avg=6.14 cm
Evaluation 3D MESH ALIGNED results:
auc=0.866, mean_kp3d_avg=0.67 cm
F-scores
F@5mm = 0.130 F_aligned@5mm = 0.724
F@15mm = 0.392 F_aligned@15mm = 0.981
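For readers unfamiliar with the F-score metric, FreiHAND reports it at distance thresholds of 5 mm and 15 mm. The sketch below shows how an F-score between two point sets is commonly defined (harmonic mean of precision and recall under a distance threshold). It is an illustration only, not the leaderboard's evaluation code: the function name `f_score` is hypothetical, and the brute-force nearest-neighbour search is meant for small inputs.

```python
import math


def f_score(pred_points, gt_points, threshold):
    """F-score between two 3D point sets at a distance threshold (same units)."""

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def fraction_within(sources, targets):
        # Share of source points whose nearest target is closer than the threshold.
        hits = sum(
            1 for s in sources
            if min(distance(s, t) for t in targets) < threshold
        )
        return hits / len(sources)

    precision = fraction_within(pred_points, gt_points)  # predictions near ground truth
    recall = fraction_within(gt_points, pred_points)     # ground truth near predictions
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Usage with coordinates in meters and a 5 mm threshold:
# f_score(predicted_vertices, ground_truth_vertices, 0.005)
```

The "aligned" variants above are computed after the prediction has been rigidly aligned to the ground truth, which is why they are much higher than the unaligned scores.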
We borrowed part of the open-source code of I2L-MeshNet.