Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<format>: Avoid unnecessary calls to _Getcvt() #1825

Closed
statementreply opened this issue Apr 11, 2021 · 4 comments
Closed

<format>: Avoid unnecessary calls to _Getcvt() #1825

statementreply opened this issue Apr 11, 2021 · 4 comments
Labels
fixed Something works now, yay! format C++20/23 format performance Must go faster

Comments

@statementreply
Copy link
Contributor

The performance of format degraded after #1815. Profiling shows that _Getcvt() is consuming a significant part of the CPU time.

We could investigate:

  • Avoid calling _Getcvt() when unnecessary (when the format string is interpreted as UTF-8 or UTF-16).
  • Reduce the number of calls to _Getcvt().
#include <chrono>
#include <format>
#include <iostream>
#include <iterator>
#include <ratio>

using namespace std;
using namespace std::chrono;

int main() {
    constexpr int iters = 10'000'000;

    string buf;
    buf.reserve(64);
    size_t dummy = 0;

    const auto start_time = steady_clock::now();

    for (int i = 0; i < iters; ++i) {
        buf.clear();
        format_to(back_inserter(buf), "run {0}/{1}", i + 1, iters);
        dummy += buf.size();
    }

    const auto end_time     = steady_clock::now();
    const auto average_time = duration<double, nano>{end_time - start_time} / iters;

    cout << "average time: " << average_time << "\n";
    cout << format("dummy: {}\n", dummy);
}
cl /EHsc /W4 /WX /std:c++latest /O2 /utf-8 temp.cpp

Before #1815: 179 ns
After #1815: 241 ns

@StephanTLavavej StephanTLavavej added format C++20/23 format performance Must go faster labels Apr 11, 2021
@barcharcraz
Copy link
Member

#1834 should have fixed this issue, I'm confirming now.

@CaseyCarter
Copy link
Member

#1834 should have fixed this issue, I'm confirming now.

After #1834 we no longer call __std_get_cvt (which is basically a reimplementation of _Getcvt that takes a code page number as an argument) at all when formatting wide strings or when formatting narrow strings if the execution character set is UTF-8.

We could still probably do a bit better for other cases by caching the result of __std_get_cvt at a higher level to reduce the total number of calls. My attempts to do so in the initial implementation were foiled by needing the info both while parsing format strings and while formatting; there was no good central place to cache and too little time for large-scale restructuring.

@barcharcraz
Copy link
Member

image

that's after #1834, and I've confirmed in benchmarks that no calls to locale getting functions occur (it was a sampling benchmark, but with how long those calls take I don't think that's a problem)

@MikeGitb
Copy link

Any idea what causes the 3x perf differencein back_insert_to_string?

@StephanTLavavej StephanTLavavej added the fixed Something works now, yay! label Apr 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed Something works now, yay! format C++20/23 format performance Must go faster
Projects
None yet
Development

No branches or pull requests

5 participants