Performance ideas / benchmarking #3

Open
hybridherbst opened this issue Oct 10, 2023 · 14 comments

@hybridherbst
Contributor

Some ideas regarding performance. Ultimately it would be nice to get (a subset of) this to work on a Quest 2 / 3; currently it runs at 5-10 fps there and is very choppy. So do the other three.js implementations, though!

  • using compressed data at runtime. Seems you started on this already! There are some ideas regarding compression formats here: https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/
  • using better interleaved data so data fetching on the GPU is more localized (same link above has some info)
  • using alpha hashing instead of transparency, and then rendering front-to-back instead to get some early Z cutoff
  • some kind of LOD system - not sure if splats could be sorted by "importance" (e.g. less transparent ones are more important?) at runtime, or if the calculations would need to be done with fewer splats in the first place.
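
To make the interleaving idea from the list above concrete, here is a minimal sketch that packs all attributes of one splat into a single contiguous 32-byte record, so one fetch brings in everything needed for that splat. The field layout, `BYTES_PER_SPLAT`, `packSplats`, and `unpackSplat` are assumptions for illustration, not the format used by this repository or by Aras's tooling.

```javascript
// Hypothetical 32-byte interleaved record per splat (an assumed layout for illustration):
//   bytes  0-11  position, 3 x float32
//   bytes 12-23  scale,    3 x float32
//   bytes 24-27  color,    RGBA as 4 x uint8
//   bytes 28-31  rotation, quaternion as 4 x uint8 mapped from [-1, 1]
const BYTES_PER_SPLAT = 32;

function packSplats(splats) {
  const buffer = new ArrayBuffer(splats.length * BYTES_PER_SPLAT);
  const view = new DataView(buffer);
  splats.forEach((s, i) => {
    const base = i * BYTES_PER_SPLAT;
    for (let k = 0; k < 3; k++) {
      view.setFloat32(base + 4 * k, s.position[k], true); // little-endian
      view.setFloat32(base + 12 + 4 * k, s.scale[k], true);
    }
    for (let k = 0; k < 4; k++) {
      view.setUint8(base + 24 + k, s.color[k]);
      // Quantize each quaternion component from [-1, 1] to [0, 255].
      const q = Math.round(s.rotation[k] * 127.5 + 127.5);
      view.setUint8(base + 28 + k, Math.min(255, Math.max(0, q)));
    }
  });
  return buffer;
}

function unpackSplat(buffer, i) {
  const view = new DataView(buffer);
  const base = i * BYTES_PER_SPLAT;
  const f32 = (offset) => view.getFloat32(base + offset, true);
  const u8 = (offset) => view.getUint8(base + offset);
  return {
    position: [f32(0), f32(4), f32(8)],
    scale: [f32(12), f32(16), f32(20)],
    color: [u8(24), u8(25), u8(26), u8(27)],
    // Undo the quantization; the result is only accurate to about 1/128.
    rotation: [28, 29, 30, 31].map((o) => (u8(o) - 127.5) / 127.5),
  };
}
```

Whether such a layout actually helps on a given GPU depends on how the shader reads the data, so it would need to be measured rather than assumed.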

Regarding loading behaviour, I've dabbled a bit with creating splats while the data is still loading; I'll see if I can make a PR for parts of that.

It would also be interesting to load compressed data; again, Aras (link above) has some ideas around that, plus tooling to generate byte buffers that are already optimized (10-20x size reduction).

@quadjr quadjr self-assigned this Oct 10, 2023
@quadjr quadjr added the enhancement New feature or request label Oct 10, 2023
@quadjr
Owner

quadjr commented Oct 10, 2023

Thank you for sharing interesting information. Also, thanks for the pull requests. I'll check them later.

I bought the Quest 3 today and made code modifications to support VR mode. It is very intriguing.
The performance still needs improvement, though. I'll look into the information you provided and think it over.

@electrum-bowie

Please, please let me know too!

@quadjr
Owner

quadjr commented Oct 11, 2023

using compressed data at runtime. Seems you started on this already! There are some ideas regarding compression formats here: https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/

I've read Aras's impressive work!
In my current implementation, each splat uses 256 bits. This means a 7.8x size reduction. I haven't evaluated the image quality yet, but I may be able to implement some of Aras's methods.
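
A quick check of that figure, under the assumption (not stated in the comment) that the baseline is the standard 3D Gaussian Splatting `.ply` output, which stores 62 attributes per splat as 32-bit floats:

```javascript
// Assumption: the uncompressed input is a standard 3D Gaussian Splatting .ply,
// which stores every attribute as a 32-bit float.
const floatsPerSplat =
  3 +  // position (x, y, z)
  3 +  // normal (nx, ny, nz)
  3 +  // spherical harmonics, DC term (f_dc_*)
  45 + // spherical harmonics, higher-order terms (f_rest_*)
  1 +  // opacity
  3 +  // scale
  4;   // rotation quaternion

const originalBits = floatsPerSplat * 32; // 62 * 32 = 1984 bits
const compressedBits = 256;               // figure quoted in the comment above
const reduction = originalBits / compressedBits;

console.log(reduction); // 7.75, which rounds to the 7.8x quoted above
```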

using alpha hashing instead of transparency, and then rendering front-to-back instead to get some early Z cutoff

Could you elaborate on this idea?

@quadjr
Owner

quadjr commented Oct 11, 2023

I’ve studied alpha hashing. I’ll test it later. 🤓

@quadjr quadjr mentioned this issue Oct 12, 2023
@quadjr
Owner

quadjr commented Oct 12, 2023

I've tested alpha hashing, but it didn't improve performance.
I think it won't reduce memory traffic, because contiguous pixel values are read in a single operation.
Thus, discarding individual pixels won't affect memory traffic.
Here is the code I tested:
https://github.com/quadjr/aframe-gaussian-splatting/tree/feature/alpha-hashing
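
For readers unfamiliar with the technique, the following is a CPU-side sketch of a hashed alpha test, not the code from the branch above. Instead of blending, each fragment is either kept as fully opaque or discarded, with a keep probability equal to its alpha. The function names and the hash choice are assumptions for illustration; in a fragment shader the same test would typically end in a `discard`.

```javascript
// Integer hash of (pixel x, pixel y, splat id) mapped to [0, 1).
// The mixing steps are the MurmurHash3 32-bit finalizer; any well-mixed hash would do.
function hash01(x, y, splatId) {
  let h = Math.imul(x, 73856093) ^ Math.imul(y, 19349663) ^ Math.imul(splatId, 83492791);
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return (h >>> 0) / 4294967296;
}

// Hashed alpha test: keep the fragment (as opaque) or drop it entirely.
function keepFragment(alpha, x, y, splatId) {
  return hash01(x, y, splatId) < alpha;
}

// Fraction of fragments kept over a size x size block of pixels.
// For a reasonable hash this should land close to alpha.
function coverage(alpha, splatId, size) {
  let kept = 0;
  for (let y = 0; y < size; y++) {
    for (let x = 0; x < size; x++) {
      if (keepFragment(alpha, x, y, splatId)) kept++;
    }
  }
  return kept / (size * size);
}
```

Because kept fragments are opaque, they can write depth, which is what would allow front-to-back rendering with early Z rejection. As the comment above reports, that did not translate into a measurable gain in this viewer.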

@electrum-bowie

@quadjr the alpha-hashing branch is 100% identical to the main branch

@quadjr
Owner

quadjr commented Oct 14, 2023

@hybridherbst
I've made several improvements based on your ideas.

For the LOD system, small, highly transparent splats that are far from the camera are now removed during the sorting process.
This method has significantly improved performance.
The threshold for removal requires further theoretical consideration.
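
As a rough illustration of culling during the sort pass, the sketch below drops splats whose estimated contribution falls under a threshold and depth-sorts the rest. The `importance` heuristic, the splat record shape, and `cullAndSort` are hypothetical; the actual criterion and threshold used in this repository may differ.

```javascript
// Hypothetical splat record: { position: [x, y, z], radius, opacity }.
// cameraForward is assumed to be a unit vector.
function cullAndSort(splats, cameraPos, cameraForward, minImportance) {
  const visible = [];
  for (let i = 0; i < splats.length; i++) {
    const s = splats[i];
    // Depth = distance from the camera along the viewing direction.
    const depth =
      (s.position[0] - cameraPos[0]) * cameraForward[0] +
      (s.position[1] - cameraPos[1]) * cameraForward[1] +
      (s.position[2] - cameraPos[2]) * cameraForward[2];
    if (depth <= 0) continue; // behind the camera
    // Assumed heuristic: opacity times apparent size (radius / depth).
    const importance = (s.opacity * s.radius) / depth;
    if (importance < minImportance) continue; // small, faint, and far away
    visible.push({ index: i, depth });
  }
  // Back-to-front order, as needed for conventional alpha blending.
  visible.sort((a, b) => b.depth - a.depth);
  return visible.map((v) => v.index);
}
```

Culling before sorting also shrinks the array being sorted, which is one plausible reason the change helps.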

Data compression might enhance performance, but I first need to set up tooling to evaluate image quality.
Alpha hashing and data localization might not boost performance.

I've also implemented incremental loading.

I've done almost everything I can think of for now. I'll shift my focus to the generation software.
I believe I can make further improvements to it. 🤓

@hybridherbst
Contributor Author

hybridherbst commented Oct 15, 2023

Thank you, those do sound like great improvements!

The current threshold of -0.001 did have a very noticeable quality impact on my "FH Portrait" dataset though; I've set it to -0.0001 as a quick test which looks fine, but haven't looked for a proper upper bound. I'll do some more testing with your updates.
EDIT: On Quest, -0.001 actually looks fine, so the number may need to be FOV-based.

One question out of curiosity, the sortSplats method currently allocates new arrays on each run – doesn't that have a performance impact and/or would it be better to cache those instead?

@quadjr
Owner

quadjr commented Oct 16, 2023

Thank you for the reports.
The threshold should be determined by the size of the splat on the screen, and it can be calculated using FOV and resolution.
I will work on implementing the threshold calculation later.
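
A minimal sketch of how an on-screen size could be derived from FOV and resolution, using the usual symmetric perspective projection. `projectedRadiusPx`, `isTooSmall`, and the `minPixels` parameter are hypothetical names for illustration, not part of this viewer.

```javascript
// Approximate on-screen radius, in pixels, of a sphere with world-space `radius`
// at view-space distance `depth`, under a symmetric perspective projection.
function projectedRadiusPx(radius, depth, fovYRadians, viewportHeightPx) {
  // Focal length expressed in pixels.
  const focalPx = viewportHeightPx / (2 * Math.tan(fovYRadians / 2));
  return (radius / depth) * focalPx;
}

// Hypothetical cull test: drop splats that would cover less than `minPixels`.
function isTooSmall(radius, depth, fovYRadians, viewportHeightPx, minPixels) {
  return projectedRadiusPx(radius, depth, fovYRadians, viewportHeightPx) < minPixels;
}
```

With a pixel-based criterion like this, the same splat covers fewer pixels under a wider FOV at equal resolution, which would be consistent with a fixed threshold looking different on desktop and on Quest.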

One question out of curiosity, the sortSplats method currently allocates new arrays on each run – doesn't that have a performance impact and/or would it be better to cache those instead?

Yeah, there are some unnecessary allocations during the loading and sorting processes.
I'm currently focused on the generation side of the model and am prioritizing that.
Once I've addressed that, I will optimize the memory usage and allocations of this viewer.
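
One possible shape for the caching suggested in the question: keep the scratch arrays alive between sort calls and reallocate only when the splat count grows. This is a sketch under assumptions, not the repository's `sortSplats`; `SortScratch` and `sortByDepth` are hypothetical names.

```javascript
// Scratch buffers that persist between sort calls and grow only when needed.
class SortScratch {
  constructor() {
    this.depths = new Float32Array(0);
    this.indices = new Uint32Array(0);
  }

  ensure(count) {
    if (count > this.depths.length) {
      this.depths = new Float32Array(count);
      this.indices = new Uint32Array(count);
    }
  }
}

// positions: Float32Array of xyz triples; forward: unit view direction (assumed).
// Returns splat indices ordered back-to-front, as a view into the cached buffer.
function sortByDepth(positions, forward, scratch) {
  const count = positions.length / 3;
  scratch.ensure(count);
  const { depths, indices } = scratch;
  for (let i = 0; i < count; i++) {
    depths[i] =
      positions[3 * i] * forward[0] +
      positions[3 * i + 1] * forward[1] +
      positions[3 * i + 2] * forward[2];
    indices[i] = i;
  }
  // subarray() creates a lightweight view over the same memory, not a copy.
  const order = indices.subarray(0, count);
  order.sort((a, b) => depths[b] - depths[a]);
  return order;
}
```

The returned view is overwritten by the next call, so a caller that needs to keep the order around would have to copy it.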

@JiamingSuen

Maybe consider integrating https://github.com/mkkellogg/GaussianSplats3D, which uses a wasm module for sorting. The author has also done some other interesting optimizations.

@softyoda

softyoda commented Nov 5, 2023

Hi, will the mkkellogg .splat format (which adds further optimizations, see mkkellogg/GaussianSplats3D#28) be compatible with the .splat format of this implementation?

@dlazares

dlazares commented Nov 29, 2023

@quadjr @hybridherbst
We've already done the work on making splats smaller!

I made this repo to share our small splats for renderer testing.
We have these running in our renderer at 90 FPS on Quest 3 in the browser. I'd love to help out so we can get something more shareable.
https://github.com/gmix-tech/small_splats

I tested this branch with our small splat and I'm only getting 45 FPS from A-Frame in VR mode on Quest 3. It's unclear to me whether that is due to A-Frame itself or to this component's implementation.

Feel free to ping me at [email protected] if you wanna chat more about this

@dmarcos

dmarcos commented Mar 8, 2024

@dlazares 90fps sounds great! How can I give your renderer a try? Couldn't find the repo. I'm working on integrating a component on A-Frame core. Thanks so much
