Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

P2465R3 Standard Library Modules std And std.compat #3108

Merged
merged 117 commits into from
Sep 19, 2022

Conversation

StephanTLavavej
Copy link
Member

@StephanTLavavej StephanTLavavej commented Sep 16, 2022

Fixes #2930.

Overview

I've added over 3,750 occurrences of _EXPORT_STD after auditing over 148,000 lines of headers. This was a manual process, checking against the Standard to verify that all user-visible machinery is being exported - not an automated search-and-replace.

The tests are passing with VS 2022 17.4 Preview 2 and a number of workarounds in product code. There are some known compiler bugs with no available workarounds (e.g. various ICEs), and @cdacamar is continuing to investigate those. The tracking issue for compiler bugs is #1694. As previously seen with Standard Library Header Units, adding test coverage for Standard Library Modules will help in two ways: (1) preventing compiler changes from introducing regressions, and (2) immediately identifying when library changes or additions have encountered previously-unknown bugs, so we can report and work around them.

Currently, only the MSVC compiler front-end is supported. The EDG front-end that's used for IntelliSense is still working on modules support. As usual, we'll support Clang as a first-class citizen as soon as compiler support is available. I am assuming that __cpp_lib_concepts is defined (i.e. not attempting to mark the non-concepts fallbacks as exported). This can be revisited if EDG gains support for modules significantly before concepts.

This adds two files, std.ixx and std.compat.ixx, in a modules subdirectory that's a sibling of the inc or include subdirectory for headers. These will be used to build IFC and OBJ files (by the user's build system). Notably, we will never ship prebuilt IFCs/OBJs. This allows the Standard Library Modules to handle all of the user's various compiler options and macro settings (e.g. release/debug, static/dynamic, _ITERATOR_DEBUG_LEVEL, warning suppression, restoring removed machinery, disabling Standard machinery, etc.). Building the modules is quite fast, and the std.compat.ixx build is lightweight.

The implementation strategy that I've chosen is minimally intrusive and should be easy to maintain going forward. Top-level Standard classes/functions/variables/etc. need to be marked with _EXPORT_STD, which will expand to export when building the Standard Library Modules, and will expand to nothing otherwise. Due to how modules work, we don't need to mark the contents of classes (e.g. member functions, nested classes, hidden friends), nor do we need to mark partial/explicit specializations. We also don't need to mark namespaces (they will be exported implicitly when anything within is exported). I've merged a bunch of previous cleanup PRs, increasing the STL's use of hidden friends, reducing the number of changes that are necessary here. I'll mention the rare occurrences where we need to export things that don't appear in the Standard, in order to make Standard code work.

As @CaseyCarter noted in a comment, we also don't need to export deduction guides. See [temp.deduct.guide]/1 "Deduction guides are not found by name lookup. Instead, when performing class template argument deduction ([over.match.class.deduct]), all reachable deduction guides declared for the class template are considered." and [module.reach]'s definition of "reachable".

Regarding operators, here's an example showing how we don't need to export hidden friends, but we do need to export ordinary non-members in order for them to be found through ADL:

Click to expand example:
C:\Temp>type meow.ixx
export module meow;

namespace meow {
    export struct point {
        int x;
        int y;

        friend point operator*(const point& l, const point& r) {
            return point{l.x * r.x, l.y * r.y};
        }
    };

    export point operator+(const point& l, const point& r) {
        return point{l.x + r.x, l.y + r.y};
    }

    point operator-(const point& l, const point& r) {
        return point{l.x - r.x, l.y - r.y};
    }
} // namespace meow
C:\Temp>type purr.cpp
import meow;

#include <format>
#include <iostream>

int main() {
    const meow::point a{20, 30};
    const meow::point b{2, 3};

    const meow::point c = a * b;
    std::cout << std::format("c: ({}, {}); hidden friend\n", c.x, c.y);

    const meow::point d = a + b;
    std::cout << std::format("d: ({}, {}); exported\n", d.x, d.y);

#ifdef TEST_NON_EXPORTED
    const meow::point e = a - b;
    std::cout << std::format("e: ({}, {}); non-exported\n", e.x, e.y);
#endif
}
C:\Temp>cl /EHsc /nologo /W4 /std:c++latest meow.ixx purr.cpp /Fepurr.exe && purr
meow.ixx
purr.cpp
Generating Code...
c: (40, 90); hidden friend
d: (22, 33); exported

C:\Temp>cl /EHsc /nologo /W4 /std:c++latest meow.ixx purr.cpp /Fepurr.exe /DTEST_NON_EXPORTED && purr
meow.ixx
purr.cpp
purr.cpp(17): error C2676: binary '-': 'const meow::point' does not define this operator or a conversion to a type acceptable to the predefined operator
purr.cpp(17): error C2737: 'e': const object must be initialized
Generating Code...

The other thing that we need, because we have a "classic" separately compiled build with ABI constraints, is the addition of extern "C++" in various places. This has no effect on non-modules code, so it's being added unconditionally, but in the modules world it has a special meaning.

Going forward, all PRs that are adding or refactoring features will need to mark things as _EXPORT_STD.

I've followed a strict policy of exporting only Standard machinery, with no exceptions for Microsoft extensions (like stdext::checked_array_iterator or <cvt/meow>) or TS machinery <experimental/meow>. Note that deprecated components are still Standard and must be exported. The quasi-exception to this policy is that I'm also exporting previously-Standard-but-removed components, following the philosophy that Standard Library Modules respect all compiler options/settings. (This has no effect for users who are using the default /std:c++latest setting of removing that old machinery. It's just that if users choose to restore it, then Modules won't stand in their way.)

I have chosen to consistently mark all declarations and definitions as exported, even though (I believe) only the first declaration is necessary to mark.

I've updated the wiki's Files To Edit When Adding Or Removing Files.

Changes

  • yvals_core.h
    • List P2465R3 as implemented, define _EXPORT_STD and __cpp_lib_modules.
    • _EXPORT_STD is activated by defined(_BUILD_STD_MODULE) (which is defined by std.ixx, not by users directly) and further guarded by _HAS_CXX23 because we don't want to allow Time-Traveling Standard Library Modules at this time. However, this isn't further guarded for compilers. If someone (including us) wants to build Standard Library Modules with an untested compiler, they can try. Only the feature-test macro is guarded to declare support for the compilers we've tested.
  • VSO_0157762_feature_test_macros/test.compile.pass.cpp
    • Test the feature-test macro.
  • Export headers. Here are the notable changes:
    • csetjmp
    • cstddef
      • Exported nullptr_t. This is special because it was neither defined in the STL nor in VCRuntime. Instead, it was predefined by the compiler (internal src/vctools/Compiler/CxxFE/sl/p1/c/typesrc.c). Now we need to define the alias again in order to export it.
      • Exported byte, its operators, and to_integer().
    • cmath
      • hypot is special: std.compat will need to provide only 2-arg in the global namespace. Therefore, I need an inline namespace _Binary_hypot. (This _Ugly namespace is implicitly exported, which is unavoidable, but also unobservable.)
      • Also exporting 3-arg hypot, lerp, and Special Math.
    • ranges
      • Exported _Pipe::operator|. This is a special case, where we must export something _Ugly in order to make Standard code work (as this will be found through ADL).
    • complex
    • strstream
      • Not exporting 4 non-Standard swap() overloads.
    • exception
      • Added exported declarations of exception and bad_exception. (This is necessary because we get them from VCRuntime.)
      • Note that _HAS_UNEXPECTED is blocked in C++23 mode, so nothing within is exported.
    • typeinfo
      • Added exported declarations of type_info, bad_cast, bad_typeid. (VCRuntime again.)
    • new
      • Added many exported declarations, including the global-scope allocation/deallocation functions. (VCRuntime yet again.) See extern "C++" note below.
      • This is the messiest part of the core exporting changes, because this stuff is separately compiled and normally declared by VCRuntime. To avoid even more verbosity, I'm not copying the SAL annotations (VCRuntime still provides them).
      • The function declarations are guarded by the _BUILD_STD_MODULE control macro to avoid disrupting the internal compiler objsize test.
      • I added a comment about how <exception> exports bad_alloc and bad_array_new_length for _HAS_EXCEPTIONS=0, as this was non-obvious.
  • We need to update the _BITMASK_OPS macro so we can conditionally export the operators that it defines.

extern "C" and extern "C++"

The surprising thing is how much of the STL's separately compiled code works without any additional changes. The compiler knows that extern "C" machinery is attached to the global module. That's why all UCRT headers work, and much of the STL's modern code where we've moved to flat extern "C" interfaces. However, for the older code, we need to add extern "C++". This has always been part of C++'s grammar but was previously near-useless. Now, it attaches declarations to the global module, see N4917 [module.unit]/7.

In two cases, the __std_fs types and the __std_tzdb types, we were defining types outside of extern "C" but using them in the signatures of extern "C" functions. We need to be more disciplined now, so I'm marking these types as extern "C". (However, we need to exclude the _BITMASK_OPS overloads.) This doesn't affect ABI or name mangling for classic code.

Then, we need to mark separately compiled functions/classes/objects as extern "C++". (I found these by searching for occurrences of _CRTIMP2_PURE etc. - the only changes that were semi-automated - and by fixing errors that occurred during testing. For example, this is why various facets are marked.)

For <iostream>, I am making one additional change/cleanup - __PURE_APPDOMAIN_GLOBAL and _CRTDATA2_IMPORT emit __declspecs (sometimes), but previously extern occurred in the middle. It's more consistent to keep them together, and to keep _EXPORT_STD extern "C++" together. This should have no effect on non-modules code.

The Modules Themselves

This adds stl/modules/std.ixx, which is (surprisingly?) simple after all of the above changes. It defines _BUILD_STD_MODULE (with a comment that it assumes that it's going to use classic headers - trying to use header units to build this named module would be weird). It has a global module fragment (the module; section) where the UCRT headers are included - this is not strictly necessary, but recommended by @cdacamar. Then we export module std;, silence a warning because we know what we're doing, and include all Standard and <cmeow> headers.

Then this adds stl/modules/std.compat.ixx, a lightweight module. It's super fast to build because it says export import std; to reuse the std.ifc build, then exports additional names in the global namespace. I looked at each C wrapper header, extracted its exported names, and filtered out names that should only be in std (e.g. byte, lerp, and Special Math are std-only). The names appear in Standard order here, but I'm not reordering the <cmeow> implementations.

I've commented duplicates for clarity (so that comparing <cmeow> to these lists doesn't involve mysteriously missing names).

As previously mentioned, <cmath>'s binary hypot() is special.

For <cstddef>, I'm exporting max_align_t (C11) and nullptr_t (C23) in the global namespace.

Then, we need a few infrastructure changes:

  • In .gitattributes, mark stl/modules as C++.
  • In tools/format/CMakeLists.txt, clang-format stl/modules/*.ixx.
  • Teach stl/CMakeLists.txt how to copy the modules.

Compiler Bug Workarounds

I currently have 3 workarounds which involve marking internal machinery as exported (very targeted and easy to remove later):

  • Work around VSO-1538698 "Better handling for non-exported friend function declarations".
  • Work around VSO-1582358 "Standard Library Modules: chrono formatting emits bogus error C3861: '_Fill_tm': identifier not found"
  • Work around VSO-1592329 "Standard Library Modules: Parallel sort() emits bogus error C2065: '_Atomic_counter_t': undeclared identifier".

One simple workaround for an EDG bug:

  • Work around VSO-1618988 "REPORTED: EDG rejects valid extern "C++" enum class Cats : int;".

There's one additional workaround for strange code in the UCRT:

  • Work around DevCom-1126857 "Visual Studio can't find time() function using modules and std.core".

This workaround is much more elaborate, as it involves defining our own functions instead of just using the UCRT's. (Scenarios that have both import std; and #include <ctime> would experience ambiguity, I believe.) Currently, this is the best known way to get the <ctime> machinery into the std module.

Testing

I'm substantially unifying the Header Units and Modules test coverage. The directories and their scripts remain separate, but the code that exercises each Standard header is shared.

  • Add /D_STL_CALL_ABORT_INSTEAD_OF_INVALID_PARAMETER to P1502R1_standard_library_header_units/env.lst. This was missed by Even more various cleanups #2656.
  • We don't need __cpp_lib_source_location guards.
  • clang-format no longer damages imports.
  • Rename to impl_test_source_location().
  • Sort <memory> before <memory_resource>, <string> before <string_view>.
  • Use separate functions, move dllexport to test_stacktrace().
    • This is better-organized, and will be friendler to code analysis (which doesn't like enormous functions).
  • Guard feature-test macro with TEST_HEADER_UNITS.
  • Generalize messages for both header units and named modules.
  • Move using namespace std; into each function.
    • The rationale is that when this is extracted into test_header_units_and_modules.hpp, we definitely don't want a file-scope using-directive. However, for the Standard headers, we don't really have to worry about Header Units or named Modules emitting vector in the global namespace, so we can just using namespace std;. (We'll be more disciplined about testing std.compat.)
  • Consistently use UDLs.
    • Avoidance was a workaround for compiler bugs that were fixed long ago.
  • Extract test_header_units_and_modules.hpp.
  • Print different messages for <version>.
  • Locally silence deprecation warnings. The headers were previously updated so that merely including/importing them doesn't warn.
  • Add P2465R3_standard_library_modules.
    • This includes changes to tell the test where the modules directory is.
    • This is structured into test.cpp (test Standard headers and the <cmeow> wrappers), test2.cpp (test std.compat), and classic.cpp (isolate force_include.hpp).
    • Compilation is easy - we can use a single command line, as long as we put the .ixx files first. Real usage will use more elaborate command lines to compile the .ixx files separately.
    • /analyze:only /analyze:autolog- isn't taking too long (unlike Header Units) so I am adding configurations here.
    • I am not attempting to exhaustively test every header - just use a bit of code from each header. This can be expanded over time, of course.

Internal Changes, Next Steps

I've prepared internal changes (for MSVC MSBuild, the Perl test harness, and Setup). The remaining things to do are:

  • Check with the Setup devs, as we haven't added a new directory to the STL in about a decade.
  • Notify vctowner as usual.

In followup internal work, we'll need to integrate this with the VS IDE, MSBuild for users, and CMake/Ninja builds.

@microsoft microsoft deleted a comment from CaseyCarter Sep 17, 2022
@microsoft microsoft deleted a comment from CaseyCarter Sep 17, 2022
@microsoft microsoft deleted a comment from CaseyCarter Sep 17, 2022
@microsoft microsoft deleted a comment from CaseyCarter Sep 17, 2022
@StephanTLavavej
Copy link
Member Author

Changelog:

In two cases, the __std_fs types and the __std_tzdb types, we were defining types outside of extern "C" but using them in the signatures of extern "C" functions.

@CaseyCarter:

Why is this a problem? (compiler bug?)

@cdacamar could confirm whether this was a compiler bug or by design. At the time (and a lot has changed in the compiler since I dealt with this), if an extern "C" function mentioned a non-extern "C" (and non-extern "C++") type in its signature, then that type's attachment to the std module would propagate to the function, preventing successful linking.

Even if this was a compiler bug that has been fixed, I still think it is more desirable and consistent for both the types and the functions to be extern "C".

@CaseyCarter
Copy link
Member

Even if this was a compiler bug that has been fixed, I still think it is more desirable and consistent for both the types and the functions to be extern "C".

I should have tagged this comment "No change requested". I'm perfectly happy with the change even if it's no longer necessary, I just wanted to understand why it is/was necessary.

@StephanTLavavej StephanTLavavej self-assigned this Sep 19, 2022
@StephanTLavavej
Copy link
Member Author

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej
Copy link
Member Author

I've pushed a 2-line commit to fix the internal compiler objsize test. I don't fully understand the root cause, but something about repeating <vcruntime_new.h>'s declarations in <new> was causing this "include most/all STL headers and run dumpbin /headers on the resulting object file" to detect 4 new text sections containing these COMDATs:

         COMDAT; sym= "void * __cdecl operator new(unsigned int,void *)" (??2@YAPAXIPAX@Z)
         COMDAT; sym= "void __cdecl operator delete(void *,void *)" (??3@YAXPAX0@Z)
         COMDAT; sym= "void * __cdecl operator new[](unsigned int,void *)" (??_U@YAPAXIPAX@Z)
         COMDAT; sym= "void __cdecl operator delete[](void *,void *)" (??_V@YAXPAX0@Z)

As the intention is to have zero effect on non-modules builds, I've guarded these function declarations with the _BUILD_STD_MODULE control macro. It seemed simplest/nicest to pair this with the #pragma warning push/disable/pop for the SAL annotations. (There are a couple of class declarations below, but since those aren't emitting any COMDATs, I thought it would be more confusing to extend the guard to them.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cxx23 C++23 feature modules C++23 modules, C++20 header units
Projects
None yet
Development

Successfully merging this pull request may close these issues.

P2465R3 Standard Library Modules std And std.compat
3 participants