<iostream>: Bad performance of wcout operator<< compared to wprintf #605

Open
bernd5 opened this issue Mar 13, 2020 · 3 comments
Labels
performance Must go faster

Comments

@bernd5

bernd5 commented Mar 13, 2020

If I compile the following code:

for(auto idx = 0; idx < 10; idx++){
        const wchar_t* a = L"this is a test\n";      
        std::wcout << a;
}

it takes roughly 18 milliseconds to run.

But if I use C functions like:

for(auto idx = 0; idx < 10; idx++){
        const wchar_t* a = L"this is a test\n";      
        wprintf(L"%s", a);
}

it takes only 5 milliseconds!
Why is std::cout / wcout so slow? I would have assumed that the C function would be slower, because it has to parse the format string, but...

I tried to optimize it by first calling std::ios::sync_with_stdio(false);
but this seems to be completely ignored (I found no reference to _Sync...)?
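
For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the harness used for the numbers above. The helper names (`time_us`, `time_wcout`, `time_wprintf`) are made up for illustration, and it uses the standard `%ls` specifier where the snippet above uses `%s` (MSVC's wprintf treats `%s` as a wide string). Timings depend heavily on whether stdout is a console or redirected.

```cpp
#include <chrono>
#include <cstdio>
#include <cwchar>
#include <iostream>

// Hypothetical helper: calls `body` `iterations` times and returns the
// elapsed wall-clock time in microseconds.
template <class F>
long long time_us(int iterations, F body) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        body();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Times the iostreams variant from the report.
long long time_wcout(int iterations) {
    const wchar_t* a = L"this is a test\n";
    return time_us(iterations, [a] { std::wcout << a; });
}

// Times the C stdio variant from the report (with the portable %ls specifier).
long long time_wprintf(int iterations) {
    const wchar_t* a = L"this is a test\n";
    return time_us(iterations, [a] { std::wprintf(L"%ls", a); });
}
```

Calling `time_wcout(10)` and `time_wprintf(10)` from `main` and printing both results to stderr keeps the measurement output separate from the stream being measured.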

@StephanTLavavej StephanTLavavej changed the title Bad performance: std::wcout operator<< compared to wprintf <iostream>: Bad performance of wcout operator<< compared to wprintf Mar 14, 2020
@StephanTLavavej StephanTLavavej added the performance Must go faster label Mar 14, 2020
@bernd5
Author

bernd5 commented Mar 14, 2020

link to an old article:
Link

@bernd5
Author

bernd5 commented Mar 15, 2020

With this call it becomes fast:
setvbuf(stdout, 0, _IOLBF, 4096);

Why is the default so bad?
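
A slightly fuller sketch of that workaround, assuming it runs before anything has been written to stdout. The function names are hypothetical. It requests `_IOFBF` explicitly because the Microsoft setvbuf documentation states that `_IOLBF` behaves the same as `_IOFBF` on Win32, so the call above ends up enabling full buffering anyway. With full buffering, output only appears when the buffer fills or is flushed.

```cpp
#include <cstddef>
#include <cstdio>
#include <cwchar>

// Hypothetical helper: switches stdout to full buffering with a buffer of
// `size` bytes allocated by the C runtime. Should be called before any output.
// Returns true when setvbuf reports success.
bool enable_full_stdout_buffering(std::size_t size) {
    return std::setvbuf(stdout, nullptr, _IOFBF, size) == 0;
}

// Writes the test line `count` times, then flushes explicitly so the text
// becomes visible even though stdout is now fully buffered.
void print_test_lines(int count) {
    const wchar_t* a = L"this is a test\n";
    for (int idx = 0; idx < count; ++idx) {
        std::wprintf(L"%ls", a);
    }
    std::fflush(stdout);
}
```

The trade-off is that interactive prompts need an explicit `fflush(stdout)` (or `std::flush` on the iostreams side) to show up promptly.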

@StephanTLavavej
Member

> Why is the default so bad?

As mentioned in my decade-old reply to that Connect bug you found, this is a CRT limitation:

(As documented at https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setvbuf?view=msvc-160 [updated link], our C Standard Library implementation isn't capable of line buffering, only full buffering or no buffering. Paragraph 7.19.3/7 of the 1999 C Standard requires that "As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device." Since we don't have line buffering, we must default to no buffering.)
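
The reasoning in that parenthetical can be written out as a small decision function. This is only an illustration of the argument, not CRT code. `BufferMode` and `default_stdout_mode` are invented names.

```cpp
// Hypothetical model of the three stdio buffering modes.
enum class BufferMode { None, Line, Full };

// Picks the initial buffering mode for stdout under the C99 7.19.3/7 rule:
// full buffering if and only if the stream is known not to be interactive.
// For a possibly interactive stream, full buffering is therefore ruled out,
// leaving line buffering if the runtime has it and no buffering otherwise.
constexpr BufferMode default_stdout_mode(bool known_non_interactive,
                                         bool runtime_has_line_buffering) {
    if (known_non_interactive) {
        return BufferMode::Full;
    }
    return runtime_has_line_buffering ? BufferMode::Line : BufferMode::None;
}
```

With `runtime_has_line_buffering` set to false, as described for this CRT, a console stdout ends up unbuffered, which matches the slow default seen in the repro.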

We changed the iostreams code during that decade. Now xsputn attempts to avoid character-by-character processing, but only for 1-byte characters, so your wchar_t repro doesn't benefit from that optimization:

STL/stl/inc/fstream

Lines 634 to 636 in 280347a

} else { // non-chars always get element-by-element processing
return _Mysb::xsputn(_Ptr, _Count);
}

I'll leave this issue active for further investigation; it may be possible to extend this optimization to wchar_t.
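
To illustrate the difference between element-by-element and bulk xsputn, here is a toy pair of stream buffers. This is not the basic_filebuf code from the STL. The class and function names are made up, and the "sink" is just a string with a counter standing in for calls into the underlying file.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical unbuffered sink: with no put area, the inherited xsputn
// hands characters to overflow one at a time.
class PerCharBuf : public std::wstreambuf {
public:
    std::wstring sink;  // stands in for the underlying file
    int calls = 0;      // number of writes reaching the sink

protected:
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            sink.push_back(traits_type::to_char_type(ch));
            ++calls;
        }
        return traits_type::not_eof(ch);
    }
};

// Same sink, but xsputn forwards the whole run in a single write.
class BulkBuf : public PerCharBuf {
protected:
    std::streamsize xsputn(const wchar_t* ptr, std::streamsize count) override {
        sink.append(ptr, static_cast<std::size_t>(count));
        ++calls;
        return count;
    }
};

// Streams `text` through a fresh Buf and reports how many sink writes it took.
template <class Buf>
int sink_calls_for(const wchar_t* text) {
    Buf buf;
    std::wostream os(&buf);
    os << text;
    return buf.calls;
}
```

Comparing `sink_calls_for<PerCharBuf>(L"this is a test\n")` with `sink_calls_for<BulkBuf>(L"this is a test\n")` shows the per-character path making one sink write per character while the bulk path makes one for the whole string. When each sink write is an unbuffered call into the C runtime, that difference dominates the cost.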
