Dress Code Dataset

This repository presents the virtual try-on dataset proposed in:

D. Morelli, M. Fincato, M. Cornia, F. Landi, F. Cesari, R. Cucchiara
Dress Code: High-Resolution Multi-Category Virtual Try-On

[Paper] [Dataset Request Form] [Try-On Demo]

IMPORTANT!

By making any use of the Dress Code Dataset, you accept and agree to comply with the terms and conditions reported here.
The dataset will not be released to private companies.
When filling in the dataset request form, non-institutional email addresses (e.g. gmail.com) are not allowed.
The signed release agreement form is mandatory (see the dataset request form for more details). Incomplete or unsigned release agreement forms are not accepted and will not receive a response. Typed signatures are not allowed.

Please cite with the following BibTeX:

@inproceedings{morelli2022dresscode,
  title={{Dress Code: High-Resolution Multi-Category Virtual Try-On}},
  author={Morelli, Davide and Fincato, Matteo and Cornia, Marcella and Landi, Federico and Cesari, Fabio and Cucchiara, Rita},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2022}
}

Dataset

We collected a new dataset for image-based virtual try-on composed of image pairs coming from different catalogs of YOOX NET-A-PORTER.
The dataset contains more than 50k high-resolution model-garment image pairs divided into three categories (i.e. dresses, upper-body clothes, lower-body clothes).

Summary

53792 garments
107584 images
3 categories
- upper body
- lower body
- dresses
1024 x 768 image resolution
additional info
- keypoints
- skeletons
- human label maps
- human dense poses

Additional Info

Along with each model and garment image pair, we also provide the keypoints, skeleton, human label map, and dense pose.

Keypoints

For all image pairs of the dataset, we stored the joint coordinates of human poses. In particular, we used OpenPose [1] to extract 18 keypoints for each human body.

For each image, we provide a JSON file containing a dictionary with the key keypoints. Its value is a list of 18 elements, one per joint of the human body. Each element is a list of 4 values, of which the first two are the x and y coordinates, respectively.
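The file layout described above can be read with a few lines of Python. The following is a minimal sketch, not the loader shipped with the repository: the function name load_keypoints is hypothetical, the example data is synthetic, and only the first two values of each joint are used, since the meaning of the remaining two is not documented here.

```python
import json


def load_keypoints(json_text):
    """Parse a keypoints JSON document and return a list of (x, y) tuples.

    Hypothetical helper: assumes the top-level dictionary has a
    "keypoints" key holding one 4-value list per joint.
    """
    data = json.loads(json_text)
    joints = data["keypoints"]
    # Only the first two values (x, y) are documented; ignore the rest.
    return [(joint[0], joint[1]) for joint in joints]


# Synthetic stand-in for a real keypoints file: 18 joints, 4 values each.
example = json.dumps(
    {"keypoints": [[float(i), float(2 * i), 0.0, 0.0] for i in range(18)]}
)

points = load_keypoints(example)
print(len(points))  # 18
print(points[3])    # (3.0, 6.0)
```

For a file on disk, the same function can be applied to the text returned by reading the file.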

Skeletons

Skeletons are RGB images obtained by connecting the keypoints with lines.

Human Label Map

We employed a human parser to assign each pixel of the image to a specific category, thus obtaining a segmentation mask for each target model. Specifically, we used the SCHP model [2] trained on the ATR dataset, a large single-person human parsing dataset with 18 classes, focused on fashion images.

The resulting images have a single channel in which each pixel holds its category label value. Categories are mapped as follows:

 0    background
 1    hat
 2    hair
 3    sunglasses
 4    upper_clothes
 5    skirt
 6    pants
 7    dress
 8    belt
 9    left_shoe
10    right_shoe
11    head
12    left_leg
13    right_leg
14    left_arm
15    right_arm
16    bag
17    scarf
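As an illustration of how the mapping above can be used, the sketch below builds a binary mask for one category from a label map. It is an assumption-laden example rather than code from the repository: the names ATR_LABELS and category_mask are hypothetical, and the label map is a small nested list standing in for a real single-channel image, which would normally be loaded with an image library.

```python
# Label values and names copied from the table above.
ATR_LABELS = {
    0: "background", 1: "hat", 2: "hair", 3: "sunglasses",
    4: "upper_clothes", 5: "skirt", 6: "pants", 7: "dress",
    8: "belt", 9: "left_shoe", 10: "right_shoe", 11: "head",
    12: "left_leg", 13: "right_leg", 14: "left_arm", 15: "right_arm",
    16: "bag", 17: "scarf",
}


def category_mask(label_map, category):
    """Return a 0/1 mask marking the pixels that belong to `category`.

    Hypothetical helper: `label_map` is a 2D list of integer labels.
    """
    ids = {value for value, name in ATR_LABELS.items() if name == category}
    if not ids:
        raise ValueError(f"unknown category: {category!r}")
    return [[1 if value in ids else 0 for value in row] for row in label_map]


# Toy 2 x 3 label map: background, upper clothes, dress, and pants pixels.
toy_label_map = [
    [0, 4, 4],
    [0, 7, 6],
]

mask = category_mask(toy_label_map, "upper_clothes")
print(mask)  # [[0, 1, 1], [0, 0, 0]]
```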

Human Dense Pose

We also extracted dense labels and UV mappings from all the model images using DensePose [3].

Experimental Results

Low Resolution 256 x 192

Name	SSIM	FID	KID
CP-VTON [4]	0.803	35.16	2.245
CP-VTON+ [5]	0.902	25.19	1.586
CP-VTON* [4]	0.874	18.99	1.117
PFAFN [6]	0.902	14.38	0.743
VITON-GT [7]	0.899	13.80	0.711
WUTON [8]	0.902	13.28	0.771
ACGPN [9]	0.868	13.79	0.818
OURS	0.906	11.40	0.570

Code

Due to a collaboration with a firm, we cannot release the code. However, we provide an empty PyTorch project for loading the data.

References

[1] Cao, et al. "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields." IEEE TPAMI, 2019.

[2] Li, et al. "Self-Correction for Human Parsing." arXiv, 2019.

[3] Güler, et al. "DensePose: Dense Human Pose Estimation in the Wild." CVPR, 2018.

[4] Wang, et al. "Toward Characteristic-Preserving Image-based Virtual Try-On Network." ECCV, 2018.

[5] Minar, et al. "CP-VTON+: Clothing Shape and Texture Preserving Image-Based Virtual Try-On." CVPR Workshops, 2020.

[6] Ge, et al. "Parser-Free Virtual Try-On via Distilling Appearance Flows." CVPR, 2021.

[7] Fincato, et al. "VITON-GT: An Image-based Virtual Try-On Model with Geometric Transformations." ICPR, 2020.

[8] Issenhuth, et al. "Do Not Mask What You Do Not Need to Mask: a Parser-Free Virtual Try-On." ECCV, 2020.

[9] Yang, et al. "Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content." CVPR, 2020.

Contact

If you have any general questions about our dataset, please use the public issues section of this GitHub repository. Alternatively, send us an e-mail at davide.morelli [at] unimore.it or marcella.cornia [at] unimore.it.
