This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

bad performance on the same wild video #6

Open
bucktoothsir opened this issue Dec 12, 2018 · 14 comments

Comments

@bucktoothsir

bucktoothsir commented Dec 12, 2018

Hello,

  1. I downloaded the same skating video, at 1920x1080 resolution, from YouTube.
  2. I predicted 2D COCO joints for this video with the model you provided in Test in the wild #2.
  3. I made a dataset file and replaced res_w and res_h in h36m_dataset.py.
  4. Then I got the following result with d-pt-243.bin:

[image: my 3D reconstruction]

It is clearly worse than your result:

[image: the author's 3D reconstruction]

I noticed that your video has a higher resolution and much more accurate 2D joints. Could you please release the original skating video and the test-in-the-wild code?

@Godatplay

Godatplay commented Dec 12, 2018

In terms of the output resolution, you set that with --viz-size. I chose 10 and it seems close, the default is 5.

I'm not sure how much difference it'll make, but consider also changing center, since all three are used to renormalize the camera.

How did you build your dataset file?
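For reference, a full render invocation using those flags might look like the following; the checkpoint and dataset names are taken from this thread or are illustrative, so adjust them to your setup:

```shell
# Render a 3D reconstruction over the input video at a larger
# visualization size (--viz-size 10 instead of the default 5).
# Subject/action names follow the custom dataset discussed below.
python run.py -d h36m -k detectron_wild -arc 3,3,3,3,3 -c checkpoint \
    --evaluate d-pt-243.bin --render \
    --viz-subject S0 --viz-action skating --viz-camera 0 \
    --viz-video skating.mp4 --viz-output output.mp4 --viz-size 10
```
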

@dariopavllo
Contributor

Did you follow the instructions mentioned in my last post here?

Also, in this comment I mentioned that we used CPN to extract the 2D keypoints for the videos in the wild, which produces slightly better results. Anyway, if you followed the steps correctly, Detectron poses should be very similar.

We took the video from YouTube as well, in 1080p resolution.

@wishvivek

@bucktoothsir Regarding getting visualizations of in-the-wild videos, in the second step, where you converted the input video to individual frames, how did you preprocess this incoming frame (scale, crop, center, etc.?) before getting the output from the Detectron?

@bucktoothsir bucktoothsir changed the title bad performance in the same wild video bad performance on the same wild video Dec 13, 2018
@bucktoothsir
Author

@Godatplay

> In terms of the output resolution, you set that with --viz-size. I chose 10 and it seems close, the default is 5.
>
> I'm not sure how much difference it'll make, but consider also changing center as well since all 3 are used to renormalize the camera.
>
> How did you build your dataset file?

Your advice worked, thanks. I now get a high-resolution output, but the performance is still bad.

I built a dataset file with the same structure as the original one. Specifically, I built a fake 3D dataset file and a 2D dataset file. The structure is 'S0/skating'; you can rename the subjects and actions, then change the corresponding names in your test scripts.
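A minimal sketch of such a fake 2D dataset file, assuming the .npz layout I believe the repository's Human3.6M loaders expect (the key names, shapes, and metadata fields here are assumptions, not verbatim repository code):

```python
import numpy as np

# Hypothetical example: package per-frame 2D keypoints into an .npz file
# mirroring the repository's Human3.6M 2D dataset layout, using a fake
# subject "S0" and a single action "skating" as described above.
# Keypoint array shape: (num_frames, num_joints, 2) in pixel coordinates.
num_frames, num_joints = 100, 17
keypoints = np.random.rand(num_frames, num_joints, 2).astype('float32')

positions_2d = {
    'S0': {
        'skating': [keypoints],  # one array per camera view
    }
}
# Metadata fields are illustrative; match whatever your loader reads.
metadata = {'num_joints': num_joints, 'layout_name': 'coco'}

np.savez_compressed('data_2d_h36m_detectron_wild.npz',
                    positions_2d=positions_2d, metadata=metadata)
```

Because the top-level values are Python dicts, loading this file back requires `np.load(..., allow_pickle=True)`.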

@bucktoothsir
Author

> @bucktoothsir Regarding getting visualizations of in-the-wild videos, in the second step, where you converted the input video to individual frames, how did you preprocess this incoming frame (scale, crop, center, etc.) before getting the output from Detectron?

I didn't take any preprocessing steps.
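For anyone wanting to reproduce this, a plain frame extraction with no scaling or cropping can be done with ffmpeg; the file and directory names here are illustrative:

```shell
mkdir -p videoframes
# Extract every frame as a high-quality JPEG at the original resolution
ffmpeg -i skating.mp4 -qscale:v 2 videoframes/frame_%06d.jpg
```
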

@bucktoothsir
Author

> In terms of the output resolution, you set that with --viz-size. I chose 10 and it seems close, the default is 5.
>
> I'm not sure how much difference it'll make, but consider also changing center as well since all 3 are used to renormalize the camera.
>
> How did you build your dataset file?

I also wrote the dataset file myself.

@wishvivek

wishvivek commented Dec 18, 2018

@bucktoothsir Thanks for the response. Also, I'm trying to get keypoints on my images using the Detectron model (the R-50-FPN end-to-end keypoint-only Mask R-CNN baseline on this page), with the command:

python Detectron.pytorch/tools/infer_simple.py --dataset coco --cfg Detectron.pytorch/configs/baselines/e2e_keypoint_rcnn_R-50-FPN_1x.yaml --load_detectron Detectron.pytorch/data/pretrained_model/e2e_keypoint_rcnn_R-50-FPN_1x.pkl --image_dir videoframes --output_dir Detectron.pytorch/keypoints

but getting this error:

RuntimeError: The expanded size of the tensor (81) must match the existing size (2) at non-singleton dimension 0

So, it'll be great if you (or anyone else reading this) could provide any hints on how you're obtaining keypoints through this process. Thanks!

@bucktoothsir
Author

@wishvivek Which version of Python are you using?

@wishvivek

wishvivek commented Dec 20, 2018

@dariopavllo I have the 3D predictions from the model for my in-the-wild video, but they're all normalized (i.e., [-1,1]). So,

  1. How do I unnormalize these 3D predictions? (My objective is to visualize the 3D reconstruction, just like the results at the top of this page.)
  2. Usually, we use the mean and std of the dataset to normalize and unnormalize our data (e.g., as is done here). To my understanding, this is done w.r.t. the root joint. So, what is the normalization/unnormalization scheme used here?

Any help will be great, thanks!
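For the 2D inputs, I believe the repository maps pixel coordinates to roughly [-1, 1] as sketched below (based on my reading of common/camera.py; treat this as an assumption rather than verbatim library code). As far as I understand, the 3D outputs are root-relative camera-space coordinates in meters, so there is no dataset mean/std to add back, only the missing root trajectory.

```python
import numpy as np

def normalize_screen_coordinates(X, w, h):
    # Map pixel coordinates so that x spans [-1, 1]; y is scaled by the
    # same factor (2/w) to preserve the aspect ratio.
    return X / w * 2 - np.array([1, h / w])

def image_coordinates(X, w, h):
    # Inverse mapping, back to pixel coordinates.
    return (X + np.array([1, h / w])) * w / 2

pts = np.array([[960.0, 540.0]])   # center of a 1920x1080 frame
norm = normalize_screen_coordinates(pts, 1920, 1080)
back = image_coordinates(norm, 1920, 1080)
```

With these definitions the image center maps to (0, 0) and round-trips exactly back to (960, 540).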

@lxy5513

lxy5513 commented Jan 12, 2019

How do I get keypoints and bounding boxes for 12_2017_baselines/e2e_keypoint_rcnn_R-101-FPN_s1x.yaml?

Is there an already-trained model for getting 2D keypoints and bboxes, like /path/to/e2e_keypoint_rcnn_R-101-FPN_s1x.pkl, or do I need to train one on Detectron myself? Can anyone help me? Thanks a lot.

@bucktoothsir
Author

> How do I get keypoints and bounding boxes for 12_2017_baselines/e2e_keypoint_rcnn_R-101-FPN_s1x.yaml?
>
> Is there an already-trained model for getting 2D keypoints and bboxes, like /path/to/e2e_keypoint_rcnn_R-101-FPN_s1x.pkl, or do I need to train one on Detectron myself?

I used Detectron, as the author advised.

@tobiascz

tobiascz commented Jan 28, 2019

Thanks @bucktoothsir for pointing me to this issue!

As I already mentioned in #2, I was also able to run the code on an in-the-wild example with my own fork of this repository. I also have some notes on Detectron in there for people having difficulties. My 3D results are also much worse than the results created by @dariopavllo. I think my 2D poses are not accurate enough; thanks also to @lxy5513, who suggested that.

So my next step is to run the Detectron poses through CPN to get better 2D results. If someone has another opinion, please share; maybe I did something wrong in my code?

My output:

[image: my 3D reconstruction output]

Author's output:

[image: the author's 3D reconstruction output]

@YCyuchen

@Godatplay @tobiascz I used the inference code to run my own video, taking Detectron's 2D keypoints as input. The buttocks in my output seem fixed, while I think they should move. Have you met a similar problem? Is there any potential solution I can try to improve the result?

My output:

[image: my reconstruction, crop-sport-lift2]

@tobiascz

tobiascz commented Feb 15, 2020

Hey @YCyuchen,

The reason for that is that the 3D skeleton is always visualized relative to the central hip joint (what you called the buttocks). To avoid this, you could use the ankles as the relative center of the visualization.
In your test video you can see that, while the person is crouching, the legs actually go up in the reconstruction.

@dariopavllo already discussed this in #51.
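The ankle-centered visualization described above can be sketched as follows; the function name and joint indices are hypothetical, so map them to your actual skeleton layout:

```python
import numpy as np

def recenter_on_ankles(poses, left_ankle=3, right_ankle=6):
    """Re-center a root-relative pose sequence for visualization.

    poses: (frames, joints, 3) array of 3D joint positions.
    Subtracts the per-frame mean ankle position so the feet stay fixed
    instead of the hip, making crouching motions visible.
    """
    anchor = (poses[:, left_ankle] + poses[:, right_ankle]) / 2
    return poses - anchor[:, None, :]

# Toy check: a skeleton uniformly shifted up in frame 1 should be
# pulled back to the same place once re-centered on the ankles.
poses = np.zeros((2, 17, 3))
poses[1, :, 2] = 1.0
out = recenter_on_ankles(poses)
```
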
