Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<chrono>: Cache QueryPerformanceFrequency() #646

Merged
merged 16 commits into from
Apr 1, 2020

Conversation

AlexGuteniev
Copy link
Contributor

@AlexGuteniev AlexGuteniev commented Mar 27, 2020

Description

Address #448 by using std::atomic relaxed access

Checklist

Be sure you've read README.md and understand the scope of this repo.

If you're unsure about a box, leave it unchecked. A maintainer will help you.

  • Identifiers in product code changes are properly _Ugly as per
    https://eel.is/c++draft/lex.name#3.1 or there are no product code changes.
  • The STL builds successfully and all tests have passed (must be manually
    verified by an STL maintainer before automated testing is enabled on GitHub,
    leave this unchecked for initial submission).
  • These changes introduce no known ABI breaks (adding members, renaming
    members, adding virtual functions, changing whether a type is an aggregate
    or trivially copyable, etc.).
  • These changes were written from scratch using only this repository,
    the C++ Working Draft (including any cited standards), other WG21 papers
    (excluding reference implementations outside of proposed standard wording),
    and LWG issues as reference material. If they were derived from a project
    that's already listed in NOTICE.txt, that's fine, but please mention it.
    If they were derived from any other project (including Boost and libc++,
    which are not yet listed in NOTICE.txt), you must mention it here,
    so we can determine whether the license is compatible and what else needs
    to be done.

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner March 27, 2020 11:26
@cbezault
Copy link
Contributor

Looks like you're triggering an ICE (Internal Compiler Error) when targeting x86.

@cbezault cbezault linked an issue Mar 27, 2020 that may be closed by this pull request
@AlexGuteniev
Copy link
Contributor Author

Looks like you're triggering an ICE (Internal Compiler Error) when targeting x86.

wow. I think __iso_volatile_store64 for x86, which is something new.
I guess I still should replace __iso_volatile_store64 with _InterlockedCompareExchange for x86.

@AlexGuteniev
Copy link
Contributor Author

AlexGuteniev commented Mar 27, 2020

Actually, for _InterlockedCompareExchange64 can be used every platform. Store is not frequent here.

@cbezault
Copy link
Contributor

Looks like you got some legitimate warnings you need to fix. Leaving the test results here because I need to re-run the CI on your PR because the machine running the x86 tests went down.
https://dev.azure.com/vclibs/STL/_build/results?buildId=2067&view=ms.vss-test-web.build-test-results-tab

@AlexGuteniev
Copy link
Contributor Author

C:\agent_work\1\a\x64\out\inc\chrono(617) : warning C28112: A variable (_Freq_cached) which is accessed via an Interlocked function must always be accessed via an Interlocked function. See line 620: It is not always safe to access a variable which is accessed via the Interlocked* family of functions in any other way.

:-)

I really meant to access the variable once with __iso_volatile_load64 and second time with _InterlockedCompareExchange64.

I think it is a compiler issue with __iso_volatile_load64 not being treated as interlocked access that compiler developers could look into, and for now I can just suppress it.

@cbezault
Copy link
Contributor

Looks like you need to suppress at 617 not 622.

@cbezault cbezault changed the title caching QueryPerformanceFrequency() #448 <chrono>: Cache QueryPerformanceFrequency() Mar 27, 2020
@AlexGuteniev
Copy link
Contributor Author

Now compilation succeeds, but there's some error with vcpkg:

https://dev.azure.com/vclibs/09a07da7-73b5-4bba-8263-2aad1a16c33d/_apis/build/builds/2074/logs/52

2020-03-27T19:48:49.2774039Z ##[error]The HTTP request timed out after 00:01:40.

Any idea?

@cbezault
Copy link
Contributor

That looks like the Azure cache storage falling over. There isn't anything we can do about that but I believe we can treat this run as a green checkmark.

@StephanTLavavej StephanTLavavej added the performance Must go faster label Mar 27, 2020
stl/inc/chrono Outdated Show resolved Hide resolved
stl/inc/chrono Outdated Show resolved Hide resolved
stl/inc/chrono Outdated Show resolved Hide resolved
stl/inc/chrono Outdated Show resolved Hide resolved
@BillyONeal BillyONeal self-assigned this Mar 28, 2020
Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks!

stl/inc/chrono Outdated Show resolved Hide resolved
stl/inc/chrono Outdated Show resolved Hide resolved
stl/inc/chrono Outdated Show resolved Hide resolved
stl/src/xtime.cpp Outdated Show resolved Hide resolved
@AlexGuteniev
Copy link
Contributor Author

AlexGuteniev commented Mar 28, 2020

I have one concern. As on x86 std::atomic<long long>::store is effectively _InterlockedCompareExchange64, it would use CMPXCHG8B.
This instruction is not what is available on any /arch:IA32-compatible CPU.

So if this change would be available as update of old runtime libraries, it needs confirmation, if none of Windows version affected by the update can run on a CPU without CMPXCHG8B.

According to this, any system starting NT 5.1 (i.e. Windows XP / Windows 2003) will fail with UNSUPPORTED_PROCESSOR bugcheck if CMPXCHG8B is not persent, so looks like it is safe.

@BillyONeal
Copy link
Member

This instruction is not what is available on any /arch:IA32-compatible CPU.

It is true that IA32 CPUs exist without that instruction, but none supported by Visual C++. (The instruction was added on the original Pentium) We do support Opterons from 2005 that don't have cmpxchg16b though. (This is why <atomic> uses cmpxchg8b but not cmpxchg16b)

@AlexGuteniev
Copy link
Contributor Author

Sorry about unnecessary change. _CRTIMP2_PURE long long __cdecl _Query_perf_frequency() made me think it was needed.

@BillyONeal
Copy link
Member

Nothing to be sorry about :). I agree using the pure export macro is strange for that ¯\_(ツ)_/¯

@BillyONeal
Copy link
Member

@cbezault Is that TR1 failure the same race you investigated before?

@AlexGuteniev
Copy link
Contributor Author

It is a surprising to have a race in a single-threaded program. Unless it is a race against some (asynchronous) cache. Maybe something simpler, like out of disk space?

@BillyONeal
Copy link
Member

@AlexGuteniev Multiple copies of the same test .exe are running and trying to party on the same temporary file. I think.

@cbezault
Copy link
Contributor

Yes that's the spurious tr1 failure I just resolved in #647.

@cbezault
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

stl/src/xtime.cpp Outdated Show resolved Hide resolved
@BillyONeal BillyONeal merged commit 8e8453c into microsoft:master Apr 1, 2020
@BillyONeal
Copy link
Member

Thanks for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

<chrono>: Consider caching QueryPerformanceFrequency()
5 participants