-
Notifications
You must be signed in to change notification settings - Fork 187
Conversation
youyu3
commented
Aug 9, 2022
- Pulls the mdspan reference implementation from branch "stable" of the kokkos repo, https://github.com/kokkos/mdspan, up to PR 172.
- Uglified internal identifiers and made some naming convention updates.
include/cuda/std/detail/libcxx/include/experimental/__p0009_bits/config.hpp
Outdated
Show resolved
Hide resolved
include/cuda/std/detail/libcxx/include/experimental/__p0009_bits/default_accessor.hpp
Outdated
Show resolved
Hide resolved
include/cuda/std/detail/libcxx/include/experimental/__p0009_bits/extents.hpp
Outdated
Show resolved
Hide resolved
include/cuda/std/detail/libcxx/include/experimental/__p0009_bits/extents.hpp
Outdated
Show resolved
Hide resolved
include/cuda/std/detail/libcxx/include/experimental/__p0009_bits/extents.hpp
Outdated
Show resolved
Hide resolved
include/cuda/std/detail/libcxx/include/experimental/__p0009_bits/trait_backports.hpp
Outdated
Show resolved
Hide resolved
int main(int, char**) | ||
{ | ||
// workaround to avoid compilation errors | ||
cuda::std::experimental::detail::__make_dynamic_extent<int>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youyu3 Could you please add details of the compiler version and flags here? This feels very much like a compiler bug. __make_dynamic_extent<T>()
literally just returns T(-1)
. Its definition should be available, so it shouldn't need to be explicitly instantiated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this is due to a compiler bug too. I'll create a reproducer and update here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youyu3 If __make_dynamic_extent
doesn't work, we could try this:
template<class... Integrals>
requires((is_convertible_v<Integrals, size_t> && ...))
explicit extents(Integrals... indices) -> extents<size_t, size_t(indices, -1)...>;
What matters here is on the right side of the ->
. The (C++17) fold expression size_t(indices, -1)...
expands to something like size_t(-1)
, size_t(-1)
, ..., size_t(-1)
, where the number of -1
s is sizeof...(indices)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revision:
template <class... IndexTypes>
extents(IndexTypes...)
-> extents<std::size_t, std::size_t((IndexTypes(), -1))...>;
The comma expression needed an extra (), else it was being interpreted as a cast with two arguments (not a valid cast).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Pushed the workaround.
#include <cuda/std/array> | ||
#ifndef __cuda_std__ | ||
#include <array> | ||
#endif | ||
#ifdef __cpp_lib_span | ||
#include <span> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to protect #include <span>
with #ifndef __cuda_std__
... #endif
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a libcudacxx
version of span
(or I didn't find one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youyu3 Oh, oof : - ( Should we file an issue to add one? We would need it for mdspan just for a few things that take span
as input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we have and it worked for my initial tests. More tests to be added.
@@ -73,21 +73,21 @@ class layout_left::mapping { | |||
struct __rank_count {}; | |||
|
|||
template <::std::size_t _r, ::std::size_t _Rank, class _Ip, class... _Indices> | |||
__MDSPAN_INLINE_FUNCTION | |||
__MDSPAN_HOST_DEVICE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youyu3 I'm a bit confused -- doesn't MDSPAN_INLINE_FUNCTION
include MDSPAN_HOST_DEVICE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. But I'm only adding __host__ __device__
while matching the declaration before the change, where the function wasn't declared as inline
.
|
||
static_assert( m.is_exhaustive() == true ); | ||
|
||
assert( m.extents().rank() == 2 ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[OPTIONAL] rank()
or rank_dynamic()
are always static constexpr
and thus can be checked with static_assert
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to static_assert
.
|
||
stdex::layout_right::mapping<stdex::extents<size_t,dyn, dyn>> m{stdex::dextents<size_t,2>{16, 32}}; | ||
|
||
static_assert( m.is_exhaustive() == true ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_exhaustive()
happens to be static constexpr
for layout_right
, but please note that this is not generally true for all layout mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'll change this to assert
then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's OK here -- I'm just pointing this out in case you want to write generic tests. Thanks!
Generally speaking I am extremely concerned about the implementation complexity here. I only looked at Especially compile times are going to be incredibly difficult here. Also what I am generally wondering is why we are putting this in namespace stdex. It was merged into the standard after all |
Hi @miscco ! We talked a bit about
This is a user library (not a vendor's implementation of the Standard Library). Thus, it's not allowed to put things in namespace mdspan was merged into C++23. The reference mdspan implementation works with C++14, 17, and 20 as well. For those back-porting use cases, it's important that we don't break applications' builds, in case those applications adopt a later compiler version that provides an mdspan implementation. Thus, we don't want to collide with |
@miscco I wonder if some of the complexity of the reference mdspan implementation comes from an earlier requirement to back-port to C++11 (with its much more restrictive requirements for |
I think the comment about extents could be simpler are fair. Part of the problem was trying to work around a number of issues with empty base optimization in MSVC and Intel compiler. MSVC did get much better though in the last couple years. That said: looking at some assembly it does look like the reference implementation compiles cleaner: |
Thinking more about this: there was a lot of work on extents, to essentially make the .extent(i) function optimize really well. Without the complication I think you end up with something which goes through some loop structures to figure out which element to access. I.e. you kinda rebuild the mapping for index to "which element from the dynamic extents array do I need to access" every time or so? But am not 100% sure. |
Yeah I believe my |
Note: with -O3 and GCC 11 it does compile into the same instruction sequence. I just told Mark, I am totally open to entertaining a whole sale replacement on our side. I think we should underpin this with some data down the line, for example we are working on using mdspan in Kokkos which then would make a whole lot of applications use it. If we collect data from that side regarding compile time/binary size/performance and from your applications we should have a good basis to make decisions. |
Can we defer any significant refactoring discussion to future work/PR? I don't want to delay @youyu3's PR any more than necessary. |
@jrhemstad Don't worry! These investigations should not affect the current release. Discussions continue on the reference mdspan implementation issue kokkos/mdspan#190 . |
// 1. Redistributions of source code must retain the above copyright | ||
// notice, this list of conditions and the following disclaimer. | ||
// | ||
// 2. Redistributions in binary form must reproduce the above copyright |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure, is that license compatible with the usual one.
Do we have SWIPAT clearance for adding. Binary distribution clauses scare me a lot
mdspan
reference implementationmdspan
reference implementation
#include <experimental/mdspan> | ||
#include <cassert> | ||
#include <array> | ||
#include <span> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include <span> | |
#ifdef __cpp_lib_span | |
#include <span> | |
#endif |
I don't think you can even include <span>
unless that feature test macro is defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Mark, you are right. However, I don't think an inclusion of <span>
is even needed here, since it's already included by <mdspan>
. Tests ran fine without this line.
Actually, #include <array>
is not needed here either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youyu3 That's reasonable, though please note that mdspan only includes it if __cpp_lib_span
is defined. For example: https://github.com/kokkos/mdspan/blob/9e2b68a3be59c867587cdfbb641153199c813ff5/include/experimental/__p0009_bits/standard_layout_static_array.hpp#L56
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm aware of that. And any test that uses span
should be guarded by #ifdef __cpp_lib_span
as well (which has been fixed).
Hey @youyu3 sorry for the last minute change, but based on some internal conversation, I think we can drop the |
Okay. Will do. Then I suppose that we should move the files out of the |
@@ -52,9 +52,6 @@ | |||
#endif | |||
|
|||
_LIBCUDACXX_BEGIN_NAMESPACE_STD | |||
namespace experimental { | |||
|
|||
__MDSPAN_INLINE_VARIABLE constexpr auto dynamic_extent = ::std::numeric_limits<::std::size_t>::max(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed when span
is not available. Fixed in a later commit...
@@ -0,0 +1,64 @@ | |||
//===----------------------------------------------------------------------===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent test!
It's far too much for me to review in detail, but I'm really impressed and happy with all the new unit tests! Thank you also for addressing those code comments we talked about earlier today! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not fully through with the product code.
That said, I want to reiterate, that I do not believe this is code is maintainable for us. I accept that we currently do not have capacity to write it from scratch and working code is definitely better than no code.
Nevertheless, I would want a commitment that we actually get the time to clean this up once we get the big features we are working on right now out of the window.
Before we can merge it we would need to remove all remnants of std::
and make sure we are using the proper namespaces throughout.
} | ||
|
||
__MDSPAN_INLINE_FUNCTION_DEFAULTED | ||
constexpr __compressed_pair() noexcept = default; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conventionally, we do not decorate defaulted or delete functions, so this would be
__compressed_pair() = default;
That said this is definitely something we can do afterwards on our own, so no change requested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@miscco wrote:
Conventionally, we do not decorate defaulted or delete functions....
Does this mean that =default
ed functions do not need the decoration? Historically, we've had issues with that.
include/cuda/std/detail/libcxx/include/__mdspan/trait_backports.hpp
Outdated
Show resolved
Hide resolved
include/cuda/std/detail/libcxx/include/__mdspan/trait_backports.hpp
Outdated
Show resolved
Hide resolved
include/cuda/std/detail/libcxx/include/__mdspan/trait_backports.hpp
Outdated
Show resolved
Hide resolved
* @tparam _ValsSeq | ||
* @tparam __sentinal | ||
*/ | ||
template <class _Tag, class _Tp, class _static_t, class _ValsSeq, _static_t __sentinal = static_cast<_static_t>(dynamic_extent), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole file is pain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@miscco I can never get a certain someone to spell "sentinel" right ; - ) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit history appears to be broken. I'll try to resolve this locally and possibly issue a correction, but AFAICT there seems to be a replay of several commits.
I pulled commits from the
|
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tests will likely need //UNSUPPORTED: c++11
below the license and above the includes.
This letds lit
know that the tests are not designed for pre-C++14.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Results are crunchy, but it looks like there is maybe only a few failure points.
I made a PR on gitlab that fixes the history. It drops the <span>
PR and gets rid of some strange re-merge, but I think the content is identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Wesley. I can add the //UNSUPPORTED: c++11
to the github PR. Would you be able to merge that to the gitlab PR for CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what CUDA versions do our CI tests cover?
@@ -534,17 +534,9 @@ struct __standard_layout_psa<_Tag, _Tp, _static_t, _CUDA_VSTD::integer_sequence< | |||
//-------------------------------------------------------------------------- | |||
|
|||
__MDSPAN_INLINE_FUNCTION_DEFAULTED | |||
constexpr __standard_layout_psa() noexcept | |||
#if defined(__clang__) || defined(__MDSPAN_DEFAULTED_CONSTRUCTORS_INHERITANCE_WORKAROUND) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youyu3
Did the above change fix this issue, making this Clang work-around no longer needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my tests with clang, yes the work-around is no longer needed.
@@ -19,7 +21,7 @@ void test_span_con() | |||
using TestFixture = TestExtents<T>; | |||
TestFixture t; | |||
|
|||
auto s = std::span( t.dyn_sizes ); | |||
auto s = std::span<const size_t, t.dyn_sizes.size()>( t.dyn_sizes ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youyu3 If you're using std::span
, you'll need C++20. Would you consider using cuda::std::span
(if it exists) instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is for testing libcudacxx in the libcxx mode. std::span
is implemented in this mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah excellent, thank you for explaning!
@@ -49,18 +49,18 @@ int main(int, char**) | |||
test_array_con< cuda::std::tuple_element_t< 4, extents_test_types > >(); | |||
test_array_con< cuda::std::tuple_element_t< 5, extents_test_types > >(); | |||
|
|||
static_assert( is_array_cons_avail_v< cuda::std::dextents<int,2>, int , 2 > == true ); | |||
static_assert( is_array_cons_avail_v< cuda::std::dextents<int,2>, int , 2 > == true , "" ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI, these are changes to support C++14 -- C++17 implements one-argument static_assert
.
Also make it available in C++14 onwards