Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement P0040R3 parallel specialized memory algorithms #3145

Merged

Conversation

frederick-vs-ja
Copy link
Contributor

@frederick-vs-ja frederick-vs-ja commented Oct 8, 2022

Fixes #525.

I decided to parallelize the following algorithms:

  • destroy
  • destroy_n
  • uninitialized_default_construct
  • uninitialized_default_construct_n
  • uninitialized_value_construct
  • uninitialized_value_construct_n

because IMO they are similar enough to for_each and for_each_n.

The following algorithms are currently not parallelized:

  • uninitialized_copy
  • uninitialized_copy_n
  • uninitialized_fill
  • uninitialized_fill_n
  • uninitialized_move
  • uninitialized_move_n

because the normal corresponding algorithms in <algorithm> are also not parallelized.

It is doubtful to me whether copy/move/fill families are really not worth being parallelized. But perhaps we should address this in another issue.

Parallelized algorithms:
* destroy
* destroy_n
* uninitialized_default_construct
* uninitialized_default_construct_n
* uninitialized_value_construct
* uninitialized_value_construct_n

Non-parallelized algorithms:
* uninitialized_copy
* uninitialized_copy_n
* uninitialized_fill
* uninitialized_fill_n
* uninitialized_move
* uninitialized_move_n
@frederick-vs-ja frederick-vs-ja requested a review from a team as a code owner October 8, 2022 04:06
@AlexGuteniev
Copy link
Contributor

Do we want to have some benchmark to prove parallelizing works?

@CaseyCarter CaseyCarter added the performance Must go faster label Oct 10, 2022
@CaseyCarter CaseyCarter added bug Something isn't working and removed performance Must go faster labels Oct 10, 2022
@StephanTLavavej StephanTLavavej self-assigned this Oct 12, 2022
stl/inc/execution Show resolved Hide resolved
stl/inc/execution Outdated Show resolved Hide resolved
stl/inc/execution Outdated Show resolved Hide resolved
stl/inc/execution Outdated Show resolved Hide resolved
stl/inc/execution Outdated Show resolved Hide resolved
stl/inc/memory Outdated Show resolved Hide resolved
@CaseyCarter CaseyCarter self-requested a review December 16, 2022 19:44
struct test_case_uninitialized_value_construct_unwrap_parallel {
template <class ExecutionPolicy>
void operator()(const size_t testSize, const ExecutionPolicy& exec) {
auto vec = vector<int>(testSize, bad_int);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To other reviewers: I'm not complaining about the use of AAA here, because I appreciate how it increases consistency in these test cases. I prefer auto x = f(args...); and auto y = T(args...); to auto x = f(args...); and T y(args...);. YMMV. (Also on 208.) (No change requested.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm (reluctantly) okay with this rationale.

@CaseyCarter CaseyCarter removed their assignment Jan 5, 2023
for (; _Count > 0; --_Count, (void) ++_UFirst) {
_STD _Construct_in_place(*_UFirst);
}
_STD _Seek_wrapped(_First, _UFirst);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change requested: I observe that we could _STD _Seek_wrapped(_First, _Operation._Basis._Populate(_Operation._Team, _UFirst)); above (right line 5186), then exhaust parallelism resources and fall through to here, where we _Seek_wrapped again. However, this is correct because _Seek_wrapped is a "seek to"/assignment, not a "seek by", and there are no other uses of _First that could be damaged by it having been unexpectedly updated.

Comment on lines +66 to +69
unique_ptr<T, deallocating_only_deleter<T>> up{al.allocate(n), deallocating_only_deleter<T>{n}};
for (size_t i = 0; i != n; ++i) {
allocator_traits<allocator<T>>::construct(al, up.get() + i);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change requested: it seems unusual that an array of elements is being stored in unique_ptr<T, DEL> instead of unique_ptr<T[], DEL>. However, I am unsure of the implications of making such a change (some are positive; unique_ptr<T[]> disables conversions that are bogus for arrays). This is only test code, so it's allowed to be somewhat squirrelly. 🐿️

@StephanTLavavej StephanTLavavej removed their assignment Jan 20, 2023
@StephanTLavavej StephanTLavavej dismissed barcharcraz’s stale review January 20, 2023 03:16

Changes requested in Nov 2022 have been made, purr!

@StephanTLavavej StephanTLavavej self-assigned this Jan 21, 2023
@StephanTLavavej
Copy link
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej
Copy link
Member

I pushed a conflict-free merge with main, followed by a commit to disable the test for /clr due to VSO-1664463 "/clr C++20 alignas emits error C3821 'managed type or function cannot be used in an unmanaged function' instead of falling back to native codegen".

@StephanTLavavej StephanTLavavej merged commit c42483d into microsoft:main Jan 22, 2023
@StephanTLavavej
Copy link
Member

Thanks for truly finishing C++17! 😻 💯 🎉

@frederick-vs-ja frederick-vs-ja deleted the parallel-memory-algorithms branch January 22, 2023 06:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

<memory>: Missing C++17 ExecutionPolicy overloads
6 participants