benchmark: it is built with `/Ob1`, so vector algorithm dispatcher is noticeable #4496
Labels: test (Related to test code)
Noticed while working on #4495. When I decided to use sized `if constexpr` dispatch, instead of using the same version for all element sizes, I observed significant perf degradation for small element sizes. Part of it is due to the dispatcher not being inlined.

The benchmark is built with `/Ob1`. It looks like this is implied by the CMake `RelWithDebInfo` configuration, as opposed to `Release`.

What are our takeaways?
I see the following options:
- Mark the dispatcher `inline`, and consider making other STL functions `inline`.
- Change `RelWithDebInfo` to inline STL, though it would obfuscate the debugger.
- Use `Release` by default, instead of `RelWithDebInfo`.
  - `RelWithDebInfo` is convenient for profiling.
- `if constexpr` [...] `__std_reverse_copy_trivially_copyable...`
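
To make the discussion concrete, here is a minimal sketch of what a sized `if constexpr` dispatcher could look like. Every name in it (`demo_reverse_copy`, `demo_reverse_copy_sized`, `demo_reverse_copy_any`) is hypothetical and is not the STL's actual code; the size-specific routines are plain byte loops standing in for separately compiled, vectorized functions. As far as I understand the MSVC documentation, `/Ob1` only expands functions explicitly marked `inline` (or defined inside a class), which is why the sketch puts `inline` on the dispatcher.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a size-specific routine. Copies `count` elements
// of `Size` bytes each from `first` to `dest` in reverse element order.
template <std::size_t Size>
void demo_reverse_copy_sized(const void* first, std::size_t count, void* dest) noexcept {
    const auto* src = static_cast<const unsigned char*>(first);
    auto* out       = static_cast<unsigned char*>(dest);
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(out + i * Size, src + (count - 1 - i) * Size, Size);
    }
}

// Hypothetical fallback that receives the element size at run time.
inline void demo_reverse_copy_any(
    const void* first, std::size_t count, std::size_t size, void* dest) noexcept {
    const auto* src = static_cast<const unsigned char*>(first);
    auto* out       = static_cast<unsigned char*>(dest);
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(out + i * size, src + (count - 1 - i) * size, size);
    }
}

// Hypothetical dispatcher. The branch is selected at compile time from
// sizeof(T), so once this function is inlined only a single direct call
// remains at the call site. Without `inline`, a build that restricts
// inlining may keep this as an extra out-of-line call per invocation.
template <class T>
inline void demo_reverse_copy(const T* first, std::size_t count, T* dest) noexcept {
    static_assert(std::is_trivially_copyable_v<T>, "byte-wise copy requires a trivially copyable type");
    if constexpr (sizeof(T) == 1) {
        demo_reverse_copy_sized<1>(first, count, dest);
    } else if constexpr (sizeof(T) == 2) {
        demo_reverse_copy_sized<2>(first, count, dest);
    } else if constexpr (sizeof(T) == 4) {
        demo_reverse_copy_sized<4>(first, count, dest);
    } else if constexpr (sizeof(T) == 8) {
        demo_reverse_copy_sized<8>(first, count, dest);
    } else {
        demo_reverse_copy_any(first, count, sizeof(T), dest);
    }
}
```

For small element sizes and short ranges, the cost of one extra non-inlined call is a larger share of the total work, which would be consistent with the degradation described above; this sketch does not measure it.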