
The reproduction results look poor -- (edit: stacked foreground object due to noisy motion mask) #12

Open
Yuhuoo opened this issue Oct 22, 2024 · 3 comments

Comments


Yuhuoo commented Oct 22, 2024

Why do the results I reproduced look worse than the ones you report?
Here are the results of my reproduction.

2024-10-22.125247.mp4

Junyi42 (Owner) commented Oct 22, 2024

Hi @Yuhuoo,

Thanks for your feedback. As mentioned in #3, our visualization is based on the ground-truth motion mask of DAVIS.

To get a better visualization when the motion mask is noisy, you can simply add the --no_mask flag to the visualization command (i.e., python viser/visualizer_monst3r.py --data path/schoolgirls --no_mask), which will not stack all the background point clouds.

Alternatively, you can add the --use_gt_mask flag to the evaluation script (see https://github.com/Junyi42/monst3r?tab=readme-ov-file#evaluation), which should produce results similar to our online visualization.
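
For anyone hitting the same issue, a minimal sketch of both options (path/schoolgirls is just the example path from this thread; the exact evaluation entry point is the one documented in the README's Evaluation section and is not repeated here):

```bash
# Option 1: skip the (noisy) motion mask at visualization time, so the
# background point clouds are not stacked.
# "path/schoolgirls" is the example from this thread; point it at your own output dir.
python viser/visualizer_monst3r.py --data path/schoolgirls --no_mask

# Option 2: evaluate with the ground-truth DAVIS motion mask by appending
# --use_gt_mask to the evaluation command from the README's Evaluation section
# (entry point omitted here; see the link above).
```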

Thanks.


Junyi42 (Owner) commented Oct 22, 2024

Hi @Yuhuoo,

Here's the result of using --no_mask for the visualization, for reference.

Screen.Recording.2024-10-22.at.12.20.17.AM.mov

I have also updated the README for detailed information. Please feel free to let me know if you have any further questions.

Thanks.

Junyi42 changed the title from "The reproduction results look poor" to "The reproduction results look poor -- (edit: stacked foreground object due to noisy motion mask)" on Oct 22, 2024

koalamind commented Oct 26, 2024

@Yuhuoo a couple of questions about your Screen.Recording.2024-10-22.at.12.20.17.AM.mov: 1) Are the kids moving back and forth a model projection, that is, a simplified 3D prediction of points moving over time? 2) Is it possible to run a sensitivity analysis on the 3D projections (i.e., simulating a kid's initial movement and predicting its 3D motion over time based on historical data, generalizing movements over time)? Dan
