Move definitions of execution space macros into cccl #1199

Merged (5 commits) on Dec 14, 2023

Collaborator

@miscco miscco commented Dec 8, 2023

We want to avoid redefining or undefining `__host__`, `__device__`, or `__forceinline__`. To make this error-proof, we define our own macros when we know they are available.

Fixes #1173
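
As a rough illustration of the approach described above, here is a minimal sketch of library-owned wrapper macros. The `EXAMPLE_*` names and the non-CUDA fallbacks are assumptions made for illustration only; they are not the definitions this PR adds. The idea is that the library never touches `__host__`, `__device__`, or `__forceinline__` themselves and only expands to them when a CUDA compiler is known to provide them.

```cpp
// Hypothetical wrapper macros (EXAMPLE_* names are illustrative only,
// not the macros defined by this PR).
#if defined(__CUDACC__)
// A CUDA compiler is active, so the execution space keywords exist.
#  define EXAMPLE_HOST        __host__
#  define EXAMPLE_DEVICE      __device__
#  define EXAMPLE_HOST_DEVICE __host__ __device__
#  define EXAMPLE_FORCEINLINE __forceinline__
#else
// Plain host compiler: expand to nothing instead of defining __host__ etc.
#  define EXAMPLE_HOST
#  define EXAMPLE_DEVICE
#  define EXAMPLE_HOST_DEVICE
#  if defined(_MSC_VER)
#    define EXAMPLE_FORCEINLINE __forceinline
#  elif defined(__GNUC__) || defined(__clang__)
#    define EXAMPLE_FORCEINLINE inline __attribute__((always_inline))
#  else
#    define EXAMPLE_FORCEINLINE inline
#  endif
#endif

// Library code is annotated with the wrappers only.
EXAMPLE_HOST_DEVICE EXAMPLE_FORCEINLINE int add_one(int x)
{
  return x + 1;
}
```

Because the wrappers are the only names the library defines, user code that includes the headers sees the compiler's own `__host__` and `__device__` untouched.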

Collaborator

@gevtushenko gevtushenko left a comment

Please make sure that no internal macros appear in docs and comments. I can also still see `__host__ __device__` in the barrier header.

cub/cub/block/block_adjacent_difference.cuh (outdated, resolved)
cub/cub/block/block_adjacent_difference.cuh (resolved)
cub/cub/device/dispatch/dispatch_adjacent_difference.cuh (outdated, resolved)
thrust/thrust/detail/config/forceinline.h (outdated, resolved)
thrust/CHANGELOG.md (outdated, resolved)
cub/cub/device/device_merge_sort.cuh (outdated, resolved)
thrust/thrust/copy.h (outdated, resolved)
@miscco miscco merged commit a51b1f8 into NVIDIA:main Dec 14, 2023
538 checks passed
@miscco miscco deleted the unify_execution_space_macros branch December 14, 2023 07:23
dkolsen-pgi added a commit to dkolsen-pgi/cccl that referenced this pull request Dec 16, 2023
NVC++ stdpar mode enables CUDA support but does not define the macro
`__CUDACC__`.  Any code in CCCL (which is used by NVC++ stdpar under the
covers) needs to check for either `__CUDACC__` or `_NVHPC_CUDA` being
set.

NVIDIA#1199 only checks for `__CUDACC__`
when defining `_CCCL_HOST` and `_CCCL_DEVICE` and some other macros.
This change completely broke `nvc++ -stdpar` because all functions
became unannotated host functions.

Fix this by changing several files in
libcudacxx/include/cuda/std/detail/libcxx/include to check for either
`__CUDACC__` or `_NVHPC_CUDA`.
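
The kind of check described in this commit message can be sketched as follows. The `SKETCH_*` macro names and the helper functions are hypothetical and exist only for this example; the actual changes live in the libcudacxx headers named above. The point is that CUDA support is detected through either `__CUDACC__` or `_NVHPC_CUDA`, so `nvc++ -stdpar` still gets annotated functions.

```cpp
// Hypothetical detection macro (SKETCH_* names are illustrative only).
// CUDA support counts as enabled if either compiler macro is defined.
#if defined(__CUDACC__) || defined(_NVHPC_CUDA)
#  define SKETCH_CUDA_ENABLED 1
#  define SKETCH_HOST_DEVICE  __host__ __device__
#else
#  define SKETCH_CUDA_ENABLED 0
#  define SKETCH_HOST_DEVICE
#endif

// Annotated for both execution spaces when CUDA is enabled; otherwise this
// is an ordinary host function.
SKETCH_HOST_DEVICE constexpr int twice(int x)
{
  return 2 * x;
}

// Reports which branch of the check was taken at compile time.
constexpr bool cuda_enabled()
{
  return SKETCH_CUDA_ENABLED != 0;
}
```

Checking only `__CUDACC__` would take the `#else` branch under `nvc++ -stdpar`, which matches the failure described above: every function silently becomes an unannotated host function.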
@dkolsen-pgi dkolsen-pgi mentioned this pull request Dec 16, 2023
miscco pushed a commit that referenced this pull request Dec 16, 2023
Successfully merging this pull request may close these issues.

[BUG]: Thrust should not be #define-ing __host__ and __device__ for the NVHPC compiler