SAR: Spatial-Aware Regression for 3D Hand Pose and Mesh Reconstruction from a Monocular RGB Image

Introduction

This is the PyTorch implementation of the ISMAR 2021 paper "SAR: Spatial-Aware Regression for 3D Hand Pose and Mesh Reconstruction from a Monocular RGB Image". We provide our research code for training and testing the proposed method on the FreiHAND dataset.

Installation

Requirements

Python-3.7.11
PyTorch-1.7.1
torchvision-0.8.2
pyrender-0.1.45 (for mesh rendering; please follow the official installation guide)

Setup with Conda

We suggest creating a new conda environment and installing all the relevant dependencies in it.

    conda create -n SAR python=3.7
    conda activate SAR
    pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio===0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
    pip install -r requirements.txt
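
After the installation, a quick check can confirm that the environment matches the pinned versions. The following is a minimal sketch and not part of the repository: the helper name check_requirements is hypothetical, and the version pins are copied from the Requirements list above. It assumes each package exposes a __version__ attribute and treats a package that fails to import as missing.

```python
import importlib

# Versions pinned in the Requirements list of this README.
REQUIRED = {"torch": "1.7.1", "torchvision": "0.8.2", "pyrender": "0.1.45"}


def check_requirements(required):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, pinned in required.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Package is missing, or its import failed for another reason.
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", None)
        # Drop local build tags such as "+cu110" before comparing.
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, base == pinned)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements(REQUIRED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={installed} pinned={REQUIRED[name]} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the versions the authors list.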

PyTorch MANO layer

For the MANO layer, we used manopth. The repo is already included in ./manopth.
The MANO model file MANO_RIGHT.pkl from this link is already included in ./manopth/mano/models.

Dataset

Download the FreiHAND dataset from this link.
Download the root joint coordinates provided by I2L-MeshNet from this link.

Arrange the data in the following directory structure:

${ROOT}  
|-- data  
|   |-- FreiHAND
|   |   |-- training
|   |   |   |-- rgb
|   |   |   |-- mask
|   |   |-- evaluation
|   |   |   |-- rgb
|   |   |-- evaluation_K.json
|   |   |-- evaluation_scale.json
|   |   |-- training_K.json
|   |   |-- training_scale.json
|   |   |-- training_mano.json
|   |   |-- training_xyz.json
|   |   |-- training_verts.json
|   |   |-- bbox_root_freihand_output.json
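
Before training, it can help to verify that the layout above is in place. The following is a minimal sketch and not part of the repository: the names EXPECTED_ENTRIES and missing_entries are hypothetical, and the relative paths are copied from the tree above. It only checks that each entry exists, not that its contents are valid.

```python
from pathlib import Path

# Entries expected under ${ROOT}/data/FreiHAND, copied from the tree above.
EXPECTED_ENTRIES = [
    "training/rgb",
    "training/mask",
    "evaluation/rgb",
    "evaluation_K.json",
    "evaluation_scale.json",
    "training_K.json",
    "training_scale.json",
    "training_mano.json",
    "training_xyz.json",
    "training_verts.json",
    "bbox_root_freihand_output.json",
]


def missing_entries(freihand_dir):
    """Return the expected entries that do not exist under freihand_dir."""
    root = Path(freihand_dir)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from ${ROOT}.
    missing = missing_entries(Path("data") / "FreiHAND")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("FreiHAND directory layout looks complete.")
```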

Training

Modify ./config.py to specify the model and the training parameters.
Run python train.py.

We provide a training log example.

Evaluation

Modify ./config.py to set "checkpoint" to the path of the trained model's weights, and set the corresponding model parameters.
Run python test.py (to also visualize the mesh, run PYOPENGL_PLATFORM=osmesa python test.py).
Zip ./output/pred.json and submit the zip file to the FreiHAND Leaderboard to obtain the evaluation scores.
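
The zipping step can also be done from Python. The following is a minimal sketch and not part of the repository: the function name zip_prediction and the archive name pred.zip are assumptions, and it assumes the leaderboard expects pred.json at the top level of the archive.

```python
import zipfile
from pathlib import Path


def zip_prediction(pred_path="output/pred.json", zip_path="output/pred.zip"):
    """Pack the prediction file into a zip archive and return the archive path."""
    pred_path = Path(pred_path)
    zip_path = Path(zip_path)
    if not pred_path.is_file():
        raise FileNotFoundError(f"{pred_path} not found; run test.py first")
    with zipfile.ZipFile(str(zip_path), "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file under its bare name so it sits at the archive root.
        archive.write(str(pred_path), arcname=pred_path.name)
    return zip_path
```

Calling zip_prediction() with no arguments from the repository root writes ./output/pred.zip next to the prediction file.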

To reproduce our results, we provide the pretrained model (ResNet-34 backbone, two stages) and the corresponding prediction file. This pretrained model should produce the following results:

Evaluation 3D KP results:
auc=0.229, mean_kp3d_avg=6.14 cm
Evaluation 3D KP ALIGNED results:
auc=0.871, mean_kp3d_avg=0.65 cm

Evaluation 3D MESH results:
auc=0.228, mean_kp3d_avg=6.14 cm
Evaluation 3D MESH ALIGNED results:
auc=0.866, mean_kp3d_avg=0.67 cm

F-scores
F@5.0mm = 0.130 	F_aligned@5.0mm = 0.724
F@15.0mm = 0.392 	F_aligned@15.0mm = 0.981

Acknowledgement

We borrowed part of the open-source code of I2L-MeshNet.
