xchar.h wide strings are broken in VS 2022 preview 17.10 #3898

Closed
DmitryKo76 opened this issue Mar 16, 2024 · 9 comments

@DmitryKo76

DmitryKo76 commented Mar 16, 2024

In the most recent MSVC 19.40 (Visual Studio 2022 17.10 Preview 2), calling println() with wchar_t and std::wstring, or print() with std::wstring, results in compiler errors
[Edit] in C++20 mode when you also include "fmt\std.h", <iostream> or <ostream>.

#include <fmt/format.h>
#include <fmt/xchar.h>
//Compile error in MSVC 19.40 (VS 2022 17.10 preview) with /std:c++20 and one of the following headers
#include "fmt\std.h"
#include <ostream>
#include <iostream>
int main() {
	fmt::println(L"Error: φαρδιές χορδές");
	fmt::println(L"Error: {}", L"φαρδιές χορδές");
	fmt::print(L"{}", fmt::format(L"Error: {}", L"φαρδιές χορδές") );
}
fmt\xchar.h(271,41): error C2672: 'fmt::v10::make_wformat_args': no matching overloaded function found
fmt\xchar.h(271,10): error C2661: 'fmt::v10::vprint': no overloaded function takes 1 arguments
fmt\xchar.h(139,47): error C2672: 'fmt::v10::make_wformat_args': no matching overloaded function found
fmt\xchar.h(139,10): error C2661: 'fmt::v10::vformat': no overloaded function takes 1 arguments

It worked fine in previous versions like MSVC 19.38:
https://godbolt.org/z/cqoz8esjn


Now print() with wchar_t kind of works, but by default it raises runtime exceptions for any non-ASCII symbols unless I call _setmode to enable one of the Unicode text translation modes in the CRT - then it works fine and prints correct non-ASCII symbols; in binary mode, it prints some mojibake.

print() and println() seem to work fine with UTF-8 strings - although displaying correct non-ASCII symbols requires /utf-8 compiler option, otherwise these will show as question marks or a different kind of mojibake.

I'm back to using wprintf_s() as I need to print localized COM error messages supplied by Windows API, and std::print only supports UTF-8 but not wchar_t.
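
For the COM error message case, one alternative to wide output is to convert the UTF-16 text to UTF-8 and stay on the narrow print path described above. The following is only a minimal sketch of such a conversion: `utf16_to_utf8` is a hypothetical helper, not part of {fmt} or the Windows API, and it assumes the input is UTF-16 held in a `std::u16string` (on Windows, `wchar_t` code units could be copied into one).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of {fmt}): encodes UTF-16 text as UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
inline std::string utf16_to_utf8(const std::u16string& in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    char32_t cp = in[i];
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: it must be followed by a low surrogate.
      if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i;
      } else {
        cp = 0xFFFD;
      }
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // low surrogate without a preceding high surrogate
    }
    assert(cp <= 0x10FFFF);  // UTF-16 cannot encode anything larger

    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

The result is an ordinary `std::string`, so it could be passed to the narrow `fmt::println("{}", ...)` overload instead of the wide one.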

@vitaut
Contributor

vitaut commented Mar 16, 2024

Your repro (https://godbolt.org/z/EeEEKj1fz) shows that fmt::print is working. If you meant std::print then you need to report the issue to the Microsoft STL maintainers. Otherwise please provide the correct repro.

@vitaut vitaut closed this as completed Mar 16, 2024
@vitaut vitaut added the invalid label Mar 16, 2024
@DmitryKo76
Author

DmitryKo76 commented Mar 17, 2024

It looks like compilation errors in MSVC 19.40 only happen when including "fmt\std.h", <iostream> or <ostream> and enabling c++20 mode; I am sorry for the confusion.

The latest available on godbolt is MSVC 19.38, which works fine.

I've updated the code example above, it compiles successfully in c++14 / c++17 modes, and breaks with /std:c++20 or /std:c++latest. Developer command prompt output is below.

C:\tmp\repro>type repro.cpp
#include "fmt\format.h"
#include "fmt\xchar.h"
//Compile error in MSVC 19.40 (VS 2022 17.10 preview) with /std:c++20 and one of the following headers
#include "fmt\std.h"
#include <ostream>
#include <iostream>
        int main()
        {
                fmt::println(L"Error: φαρδιές χορδές");
                fmt::println(L"Error: {}", L"φαρδιές χορδές");
                fmt::println(L"{}", fmt::format(L"{} : {}", L"Error", L"φαρδιές χορδές"));
        }


C:\tmp\repro>cl /EHsc /std:c++20 repro.cpp format.cc
Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33617.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
C:\tmp\repro\fmt\xchar.h(271): error C2672: 'fmt::v10::make_wformat_args': no matching overloaded function found
C:\tmp\repro\fmt\xchar.h(89): note: could be 'unknown-type fmt::v10::make_wformat_args(T ...)'
C:\tmp\repro\fmt\xchar.h(271): note: Failed to specialize function template 'unknown-type fmt::v10::make_wformat_args(T ...)'
C:\tmp\repro\fmt\xchar.h(271): note: With the following template arguments:
C:\tmp\repro\fmt\xchar.h(271): note: 'T={std::wstring}'
C:\tmp\repro\fmt\xchar.h(90): note: 'std::make_format_args': ambiguous call to overloaded function
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.40.33617\include\format(3732): note: could be 'auto std::make_format_args<fmt::v10::wformat_context,std::wstring>(std::wstring &)' [found using argument-dependent lookup]
C:\tmp\repro\fmt\base.h(2003): note: or       'fmt::v10::detail::format_arg_store<fmt::v10::wformat_context,1,0,13> fmt::v10::make_format_args<fmt::v10::wformat_context,std::wstring,1,0,13,0>(std::wstring &)'
C:\tmp\repro\fmt\xchar.h(90): note: while trying to match the argument list '(std::wstring)'
C:\tmp\repro\fmt\xchar.h(271): note: the template instantiation context (the oldest one first) is
repro.cpp(9): note: see reference to function template instantiation 'void fmt::v10::println<>(fmt::v10::basic_format_string<wchar_t>)' being compiled
C:\tmp\repro\fmt\xchar.h(280): note: see reference to function template instantiation 'void fmt::v10::print<std::wstring>(fmt::v10::basic_format_string<wchar_t,std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t>>>,std::wstring &&)' being compiled
C:\tmp\repro\fmt\xchar.h(271): error C2661: 'fmt::v10::vprint': no overloaded function takes 1 arguments
C:\tmp\repro\fmt\ostream.h(163): note: could be 'void fmt::v10::vprint(std::basic_ostream<_CharT,std::char_traits<_Elem>> &,fmt::v10::basic_string_view<type_identity<T>::type>,detail::vformat_args<Char>::type)'
C:\tmp\repro\fmt\xchar.h(271): note: 'void fmt::v10::vprint(std::basic_ostream<_CharT,std::char_traits<_Elem>> &,fmt::v10::basic_string_view<type_identity<T>::type>,detail::vformat_args<Char>::type)': expects 3 arguments - 1 provided
C:\tmp\repro\fmt\xchar.h(271): note: while trying to match the argument list '(fmt::v10::wstring_view)'
C:\tmp\repro\fmt\xchar.h(139): error C2672: 'fmt::v10::make_wformat_args': no matching overloaded function found
C:\tmp\repro\fmt\xchar.h(89): note: could be 'unknown-type fmt::v10::make_wformat_args(T ...)'
C:\tmp\repro\fmt\xchar.h(139): note: Failed to specialize function template 'unknown-type fmt::v10::make_wformat_args(T ...)'
C:\tmp\repro\fmt\xchar.h(139): note: With the following template arguments:
C:\tmp\repro\fmt\xchar.h(139): note: 'T={_Ty}'
C:\tmp\repro\fmt\xchar.h(90): note: 'std::make_format_args': ambiguous call to overloaded function
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.40.33617\include\format(3732): note: could be 'auto std::make_format_args<fmt::v10::wformat_context,std::wstring>(std::wstring &)' [found using argument-dependent lookup]
C:\tmp\repro\fmt\base.h(2003): note: or       'fmt::v10::detail::format_arg_store<fmt::v10::wformat_context,1,0,13> fmt::v10::make_format_args<fmt::v10::wformat_context,std::wstring,1,0,13,0>(std::wstring &)'
C:\tmp\repro\fmt\xchar.h(90): note: while trying to match the argument list '(_Ty)'
        with
        [
            _Ty=std::wstring
        ]
C:\tmp\repro\fmt\xchar.h(139): note: the template instantiation context (the oldest one first) is
repro.cpp(11): note: see reference to function template instantiation 'void fmt::v10::println<std::wstring>(fmt::v10::basic_format_string<wchar_t,std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t>>>,std::wstring &&)' being compiled
C:\tmp\repro\fmt\xchar.h(280): note: see reference to function template instantiation 'std::wstring fmt::v10::format<_Ty>(fmt::v10::basic_format_string<wchar_t,std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t>>>,_Ty &&)' being compiled
        with
        [
            _Ty=std::wstring
        ]
C:\tmp\repro\fmt\xchar.h(139): error C2661: 'fmt::v10::vformat': no overloaded function takes 1 arguments
C:\tmp\repro\fmt\xchar.h(157): note: could be 'std::basic_string<Char,std::char_traits<Char>,std::allocator<Char>> fmt::v10::vformat(const Locale &,const S &,detail::vformat_args<Char>::type)'
C:\tmp\repro\fmt\xchar.h(139): note: 'std::basic_string<Char,std::char_traits<Char>,std::allocator<Char>> fmt::v10::vformat(const Locale &,const S &,detail::vformat_args<Char>::type)': expects 3 arguments - 1 provided
C:\tmp\repro\fmt\xchar.h(129): note: or       'std::basic_string<_Elem,std::char_traits<_Elem>,std::allocator<_Ty>> fmt::v10::vformat(fmt::v10::basic_string_view<Char>,detail::vformat_args<Char>::type)'
C:\tmp\repro\fmt\xchar.h(139): note: 'std::basic_string<_Elem,std::char_traits<_Elem>,std::allocator<_Ty>> fmt::v10::vformat(fmt::v10::basic_string_view<Char>,detail::vformat_args<Char>::type)': expects 2 arguments - 1 provided
C:\tmp\repro\fmt\format.h(4370): note: or       'std::string fmt::v10::vformat(const Locale &,fmt::v10::string_view,fmt::v10::format_args)'
C:\tmp\repro\fmt\xchar.h(139): note: 'std::string fmt::v10::vformat(const Locale &,fmt::v10::string_view,fmt::v10::format_args)': expects 3 arguments - 1 provided
C:\tmp\repro\fmt\xchar.h(139): note: while trying to match the argument list '(fmt::v10::wstring_view)'
format.cc
Generating Code...

Compiler output from the IDE: msvc1940_fmtlib.txt

@phprus
Contributor

phprus commented Mar 17, 2024

Try replacing lines

fmt/include/fmt/xchar.h

Lines 90 to 91 in c17816c

-> decltype(make_format_args<wformat_context>(args...)) {
return make_format_args<wformat_context>(args...);

with

    -> decltype(fmt::make_format_args<wformat_context>(args...)) {
  return fmt::make_format_args<wformat_context>(args...);
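
The compiler notes above show `std::make_format_args` being found through argument-dependent lookup because the argument is a `std::wstring`, which makes the unqualified call ambiguous. The sketch below reproduces that situation in isolation. All names in it (`origin`, `other`, `lib`, `text`, `make_args`, `make_wargs`) are hypothetical stand-ins, with `other` playing the role of `std` and `lib` the role of `fmt`; it is not the actual {fmt} code.

```cpp
enum class origin { lib, other };

// Stand-in for namespace std, which declares its own make_format_args.
namespace other {
struct text {};  // plays the role of std::wstring

template <typename Context, typename... T>
constexpr origin make_args(const T&...) { return origin::other; }
}  // namespace other

// Stand-in for namespace fmt.
namespace lib {
struct context {};

template <typename Context, typename... T>
constexpr origin make_args(const T&...) { return origin::lib; }

template <typename... T>
constexpr origin make_wargs(const T&... args) {
  // An unqualified make_args<context>(args...) would also consider
  // other::make_args through ADL whenever an argument is an other::text,
  // and the two identical signatures would make the call ambiguous.
  // Qualifying the name turns ADL off for this call.
  return lib::make_args<context>(args...);
}
}  // namespace lib
```

Removing the `lib::` qualifier in `make_wargs` and calling it with an `other::text` argument should produce an ambiguity error of the same kind as the one in the log.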

@DmitryKo76
Author

Yes, specifying the fmt:: namespace did work. Thanks!

@DmitryKo76
Author

DmitryKo76 commented Mar 17, 2024

While we're at it... by default, wide strings in print(L"") raise runtime exceptions unless you explicitly enable a Unicode translation mode in the C runtime with _setmode(), available since Visual Studio 2005 (MSVC 14.0):

#include <io.h>
#include <fcntl.h>
   _setmode(_fileno(stdout), _O_WTEXT);
   _setmode(_fileno(stderr), _O_WTEXT);

I believe this is safe to enable by default. CRT code page translation bugs were ironed out a long time ago, according to posts by the late Michael S. Kaplan, and the Windows 10 console implementation has natively supported UTF-8 in the text buffer since version 1809, replacing the UCS-2 encoding used by the Windows API (aka Win32)...

@vitaut vitaut reopened this Mar 17, 2024
@vitaut
Contributor

vitaut commented Mar 18, 2024

The compilation errors should be fixed by #3899 (thanks to @phprus). For Unicode support make sure to compile with /utf-8.

@vitaut vitaut closed this as completed Mar 18, 2024
@vitaut vitaut removed the invalid label Mar 18, 2024
@DmitryKo76
Author

DmitryKo76 commented Mar 18, 2024

The /utf-8 compiler switch has no effect on how wide strings are printed, though it does enable UTF-8 encoded strings to display correctly.

Unless you enable UTF-16 text translation mode with _setmode(_O_WTEXT), the C runtime will trigger an exception when it encounters UTF-16 symbols not representable by the current ANSI code page:

fmt/include/fmt/format.h

Lines 112 to 116 in 12acd79

template <typename Exception> inline void do_throw(const Exception& x) {
// Silence unreachable code warnings in MSVC and NVCC because these
// are nearly impossible to fix in a generic code.
volatile bool b = true;
if (b) throw x;

Exception thrown at 0x00007FFB555C491C in repro.exe: Microsoft C++ exception: std::system_error at memory location 0x00000058402FE578.
 	KernelBase.dll!RaiseException()
 	vcruntime140d.dll!_CxxThrowException(void * pExceptionObject, const _s__ThrowInfo * pThrowInfo) Line 82
 	repro.exe!fmt::v10::detail::do_throw<std::system_error>(const std::system_error & x) Line 116
>	repro.exe!fmt::v10::vprint(_iobuf * f, fmt::v10::basic_string_view<wchar_t> fmt, fmt::v10::basic_format_args<fmt::v10::generic_context<fmt::v10::basic_appender<wchar_t>,wchar_t>> args) Line 258
 	repro.exe!fmt::v10::vprint(fmt::v10::basic_string_view<wchar_t> fmt, fmt::v10::basic_format_args<fmt::v10::generic_context<fmt::v10::basic_appender<wchar_t>,wchar_t>> args) Line 262
 	repro.exe!fmt::v10::print<std::wstring>(fmt::v10::basic_format_string<wchar_t,std::wstring> fmt, std::wstring && <args_0>) Line 271
 	repro.exe!fmt::v10::println<wchar_t const (&)[23]>(fmt::v10::basic_format_string<wchar_t,wchar_t const (&)[23]> fmt, const wchar_t[23] & <args_0>) Line 280
 	repro.exe!main() Line 19
 	repro.exe!invoke_main() Line 79
 	repro.exe!__scrt_common_main_seh() Line 288
 	repro.exe!__scrt_common_main() Line 331
 	repro.exe!mainCRTStartup(void * __formal) Line 17
 	kernel32.dll!00007ffb563654e0()
 	ntdll.dll!00007ffb57ac485b()	

@DmitryKo76
Author

DmitryKo76 commented Mar 18, 2024

Here is my test case which prints three text strings, two of which contain symbols outside of my current system locale, first encoded as UTF-8 strings and then as UTF-16 wide strings.

C:\tmp\repro>type repro.cpp
#include "fmt\format.h"
#include "fmt\xchar.h"

#include <io.h>
#include <fcntl.h>

        int main()
        {
                fmt::println("Text0: 'regular strings'");
                fmt::println("Text1: {}", "'κανονικές συμβολοσειρά'");
                fmt::println("{}", fmt::format("{}: {}", "Text2", "'通常の文字列'"));

#ifdef SetMode
                //Runtime exception in vcruntime140d.dll if undefined
                _setmode(_fileno(stdout), _O_WTEXT);
#endif

                fmt::println(L"Text3: 'wide strings'");
                fmt::println(L"Text4: {}", L"'φαρδιές συμβολοσειρά'");
                fmt::println(L"{}", fmt::format(L"{}: {}", L"Text5", L"'ワイド文字列'"));
        }

Here is the output from four variants of the executable file, compiled with combinations of /utf-8 and /D switches.
Note that /D SetMode enables a conditional compile block that calls _setmode(_O_WTEXT) - without this call, the output stops at the first non-English character in string 'Text4', throwing the runtime exception described above.

C:\tmp\repro>cl /EHsc repro.cpp format.cc
Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33617.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BA' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B1' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BD' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BF' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B9' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03AD' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C2' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C3' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C5' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BC' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B2' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BB' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B5' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C1' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03AC' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u901A' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u5E38' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u306E' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u6587' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u5B57' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u5217' cannot be represented in the current code page (1251)
format.cc
Generating Code...
Microsoft (R) Incremental Linker Version 14.40.33617.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:repro.exe
repro.obj
format.obj

C:\tmp\repro>repro
Text0: 'regular strings'
Text1: '????????? ????????????'
Text2: '??????'
Text3: 'wide strings'
Text4: '
C:\tmp\repro>cl /EHsc /utf-8 repro.cpp format.cc
Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33617.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
format.cc
Generating Code...
Microsoft (R) Incremental Linker Version 14.40.33617.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:repro.exe
repro.obj
format.obj

C:\tmp\repro>repro
Text0: 'regular strings'
Text1: 'κανονικές συμβολοσειρά'
Text2: '通常の文字列'
Text3: 'wide strings'
Text4: '
C:\tmp\repro>cl /EHsc /D SetMode repro.cpp format.cc
Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33617.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BA' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B1' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BD' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BF' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B9' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03AD' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C2' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C3' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C5' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BC' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B2' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03BB' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03B5' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03C1' cannot be represented in the current code page (1251)
repro.cpp(10): warning C4566: character represented by universal-character-name '\u03AC' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u901A' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u5E38' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u306E' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u6587' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u5B57' cannot be represented in the current code page (1251)
repro.cpp(11): warning C4566: character represented by universal-character-name '\u5217' cannot be represented in the current code page (1251)
format.cc
Generating Code...
Microsoft (R) Incremental Linker Version 14.40.33617.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:repro.exe
repro.obj
format.obj

C:\tmp\repro>repro
Text0: 'regular strings'
Text1: '????????? ????????????'
Text2: '??????'
Text3: 'wide strings'
Text4: 'φαρδιές συμβολοσειρά'
Text5: 'ワイド文字列'
C:\tmp\repro>cl /EHsc /utf-8 /D SetMode repro.cpp format.cc
Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33617.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
format.cc
Generating Code...
Microsoft (R) Incremental Linker Version 14.40.33617.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:repro.exe
repro.obj
format.obj

C:\tmp\repro>repro
Text0: 'regular strings'
Text1: 'κανονικές συμβολοσειρά'
Text2: '通常の文字列'
Text3: 'wide strings'
Text4: 'φαρδιές συμβολοσειρά'
Text5: 'ワイド文字列'

@vitaut
Contributor

vitaut commented Mar 22, 2024

Wide streams are not recommended but if you use them you'll have to call _setmode from the program yourself. The library shouldn't modify global state.
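
If the program does call _setmode itself, one way to keep the change to global state contained is to scope it. The following is a rough sketch, not something {fmt} provides: `wide_mode_guard` and `with_wide_stdout` are hypothetical names, the Windows-specific part assumes the CRT's `_setmode`, `_fileno` and `_O_WTEXT` as used earlier in this thread, and on other platforms the guard only flushes the stream.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _WIN32
#  include <fcntl.h>
#  include <io.h>
#endif

// Hypothetical RAII helper (not part of {fmt}): switches a stream to wide
// text mode for the lifetime of the guard and restores the previous mode
// afterwards. On non-Windows platforms it only flushes the stream.
class wide_mode_guard {
 public:
  explicit wide_mode_guard(std::FILE* file) : file_(file) {
    assert(file != nullptr);
    std::fflush(file_);  // flush pending output before changing the mode
#ifdef _WIN32
    previous_mode_ = _setmode(_fileno(file_), _O_WTEXT);
#endif
  }

  ~wide_mode_guard() {
    std::fflush(file_);
#ifdef _WIN32
    // _setmode returns -1 on failure, in which case nothing is restored.
    if (previous_mode_ != -1) _setmode(_fileno(file_), previous_mode_);
#endif
  }

  wide_mode_guard(const wide_mode_guard&) = delete;
  wide_mode_guard& operator=(const wide_mode_guard&) = delete;

 private:
  std::FILE* file_;
#ifdef _WIN32
  int previous_mode_ = -1;
#endif
};

// Runs `action` while stdout is in wide text mode and returns its result.
template <typename Action>
auto with_wide_stdout(Action&& action) -> decltype(action()) {
  wide_mode_guard guard(stdout);
  return action();
}
```

The wide `fmt::println` calls from the test case above would go inside the callable passed to `with_wide_stdout`. Narrow output to the same stream should stay outside it, since Microsoft's _setmode documentation describes the Unicode modes as unsupported for narrow print functions.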
