Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move camera optimization out of datamanager and parallelize dataloading #2092

Merged
merged 83 commits into from
Oct 10, 2023

Conversation

kerrj
Copy link
Collaborator

@kerrj kerrj commented Jun 18, 2023

WIP for removing camera optimization from datamanger and moving it to the model. This will potentially allow significant speedups in dataloading as several people have already noted, since the most time consuming step in dataloading is ray generation, which currently depends on gradients from the camera optimizer and thus cannot be parallelized.

Currently this

  • Removes all camera optimization code from datamanager
  • Adds camera optimization to nerfacto.py and refactors param groups, optimizer config accordingly

Things that need to happen:

  • Think about how to merge this PR without breaking everyone's custom method configs which rely on the camera optimizer being inside the datamanager
  • Think about proper inheritance for using camera optimizer (currently it sits in nerfacto.py, but it might be better to move it inside base_model.py)
  • Parallelize all the ray generation code inside a proper dataloader

@ksnzh
Copy link
Contributor

ksnzh commented Aug 28, 2023

https://github.com/nerfstudio-project/nerfstudio/blob/justin/camera_opt_refactor/nerfstudio/pipelines/base_pipeline.py#L363 may be fixed to assert isinstance(self.datamanager, (VanillaDataManager, ParallelDataManager))

kerrj and others added 3 commits August 28, 2023 11:50
…ere the messages are coming from, add default optimizer to optimizers.py for camera_opt param group to avoid crashes
@maturk
Copy link
Collaborator

maturk commented Sep 4, 2023

I think this is ready for merge. Tested with Lerf repository out of the box, and warnings appeared as they should without breaking the pipeline. Users would have to pip install nerfstudio once again after this update, since viser version has been bumped up.

@maturk maturk marked this pull request as ready for review September 4, 2023 12:57
@maturk
Copy link
Collaborator

maturk commented Sep 27, 2023

pytest failed :/

@kerrj kerrj merged commit e694f34 into main Oct 10, 2023
4 checks passed
@kerrj kerrj deleted the justin/camera_opt_refactor branch October 10, 2023 21:52
dmholtz added a commit to dmholtz/nerfstudio that referenced this pull request Nov 29, 2023
The ParallelDataManager (see nerfstudio-project#2092) makes pytorch crash if the cameras instance's fx, fy, cx or cy
tensors are loaded from a common shared tensor
dmholtz added a commit to dmholtz/nerfstudio that referenced this pull request Nov 29, 2023
The ParallelDataManager (see nerfstudio-project#2092) makes pytorch crash if the cameras instance's fx, fy, cx or cy tensors
are loaded from a common shared tensor.

This PR fixes the issue by cloning the respective tensors before passing them to the Cameras(...) constructor.
dmholtz added a commit to dmholtz/nerfstudio that referenced this pull request Nov 29, 2023
The ParallelDataManager (see nerfstudio-project#2092) makes pytorch crash if the cameras instance's fx, fy, cx or cy tensors
are loaded from a common shared tensor.

This PR fixes the issue by cloning the respective tensors before passing them to the Cameras(...) constructor.
dmholtz added a commit to dmholtz/nerfstudio that referenced this pull request Nov 30, 2023
The ParallelDataManager (see nerfstudio-project#2092) makes pytorch crash if the cameras instance's fx, fy, cx or cy tensors
are loaded from a common shared tensor.

This PR fixes the issue by cloning the respective tensors before passing them to the Cameras(...) constructor.
brentyi pushed a commit that referenced this pull request Nov 30, 2023
The ParallelDataManager (see #2092) makes pytorch crash if the cameras instance's fx, fy, cx or cy tensors
are loaded from a common shared tensor.

This PR fixes the issue by cloning the respective tensors before passing them to the Cameras(...) constructor.
This was referenced Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants