Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

benchmark: it is built with /Ob1, so vector algorithm dispatcher is noticeable #4496

Open
AlexGuteniev opened this issue Mar 21, 2024 · 1 comment
Labels
test Related to test code

Comments

@AlexGuteniev
Copy link
Contributor

AlexGuteniev commented Mar 21, 2024

Noticed while working #4495 . When I decided to use sized if constexpr dispatch, instead of using the same version for all element sizes, I observed significant perf degradation for small element sizes. A part of it is due to not inlining the dispatcher.

The benchmark is built with /Ob1. Looks like it is implied due to CMake RelWithDebugInfo configuration, as opposed to Release.

What are our takeaways?


I see the following options:

  • Mark vector algorithms dispatchers inline, consider making other STL functions inline.
    • This helps other projects with RelWithDebugInfo to inline STL, though it would obfuscate the debugger
  • Override the option in the benchmark
  • Make the benchmark Release by default, instead of RelWithDebugInfo
    • I wouldn't like that. RelWithDebugInfo is convenient for profiling
  • Accept the cost of dispatching as a penalty for vector algorithms that use dispatching
    • I don't think this is fair
  • Use specializations instead of if constexpr
    • Throughput?
  • Manually inline the dispatch like for __std_reverse_copy_trivially_copyable...
    • Copypasta
@AlexGuteniev AlexGuteniev added the question Further information is requested label Mar 21, 2024
AlexGuteniev added a commit to AlexGuteniev/STL that referenced this issue Mar 21, 2024
@StephanTLavavej
Copy link
Member

We talked about this at the weekly maintainer meeting and @barcharcraz had a suggestion - build as Release, but then add compiler options (e.g. /Zi) to generate debug info.

set(CMAKE_BUILD_TYPE RelWithDebInfo)
# /utf-8 affects <format>.
add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:/diagnostics:caret;/W4;/WX;/w14265;/w15038;/w15262;/utf-8>")

Charlie was additionally concerned that we shouldn't be trying to force the build type here - it would be better to provide a default that can be overridden when building the benchmarks, if a contributor temporarily wants something different. We don't immediately know the proper incantations for that.

@StephanTLavavej StephanTLavavej added test Related to test code and removed question Further information is requested labels Mar 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Related to test code
Projects
None yet
Development

No branches or pull requests

2 participants