-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
common: Use small buffer optimization for AutoDiffXd #12583
common: Use small buffer optimization for AutoDiffXd #12583
Conversation
@amcastro-tri FYI from our chat a week or two ago. This moves autodiff to the stack, in cases where we have <= 6 partials, or gracefully overflows to the heap for larger sizes. If you have some benchmarks you want to run, I'd be curious to hear if this helps or not. The broader idea is that we could switch all of framework + MbP to use this, and then add helpers that to chunking (#7039). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I think we usually have a lot more than 6 partials so the chunking would be necessary to exploit this. An alternative would be to exploit the fact that the number of partials is typically constant through a large computation so memory could be doled out and repeatedly reused from a fixed-size pool.
Reviewable status: needs platform reviewer assigned, needs at least two assigned reviewers, labeled "do not merge"
Thank you so much @jwnimmer-tri! I'll see to write a quick MBP benchmark with this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 8 files at r1.
Reviewable status: 1 unresolved discussion, needs platform reviewer assigned, needs at least two assigned reviewers, labeled "do not merge" (waiting on @jwnimmer-tri)
common/eigen_autodiff_types.h, line 26 at r1 (raw file):
/* _Cols = */ 1, /* _Options = */ 0, /* _MaxRows = */ internal::kMaxRowsAtCompileTimeThatTriggersInlineStorage,
btw, I love these inline comments!
common/eigen_dense_storage_sbo.h, line 22 at r1 (raw file):
/** The magic MaxRowsAtCompileTime value that invokes SBO. */ constexpr int kMaxRowsAtCompileTimeThatTriggersInlineStorage = 1234567;
could this number be negative? so that there's no even chance someone will trigger this by accident.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 unresolved discussion, needs platform reviewer assigned, needs at least two assigned reviewers, labeled "do not merge" (waiting on @amcastro-tri)
common/eigen_dense_storage_sbo.h, line 22 at r1 (raw file):
Previously, amcastro-tri (Alejandro Castro) wrote…
could this number be negative? so that there's no even chance someone will trigger this by accident.
Eigen actually ends up using this template argument in various computations, so I would be worried that a negative would explode on us, even it we lucked into it working for now.
Yes, in many optimization programs we'd have hundreds or thousands of partials. But @amcastro-tri thought we might have some cases with only a few. The possible win of this approach is that we only have a single autodiff scalar at build-time, to keep build times and bindings sane -- and it adapts to all use cases -- those with only a few partials, those who use chunking, and those who need giant partial arrays on the heap. We could also play with the max here to be something like 15 instead of 6, in case that opens up a useful amount of more speed. Or maybe even 7 is a magical threshold (a single pose?).
Perhaps so. At minimum, this PR shows the trick for how to replace AutoDiff's storage protocol. Someone could rework it to use heap-only arenas or pools, or maybe even add the pool into the current implementation, to have both SBO + pools. |
Github will remember the code, removing the PR for now. |
Thanks @jwnimmer-tri. I am hopping to give this a meaningful spin test in the updated contact solver I am working on right now. Not forgotten! |
\CC @calderpg-tri FYI this is how to inject SBO into Eigen::VectorXd. |
This may be interesting for profiling.
This change is