This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Add __forceinline__ to thrust::detail::wrapped_function::operator() #1647

Merged
merged 1 commit on May 5, 2022

Conversation

mkuron (Contributor) commented Mar 26, 2022

This pull request adds forced inlining to thrust::detail::wrapped_function::operator(). When a function marked __forceinline__ is wrapped (e.g. when it is passed to thrust::for_each), this ensures that it is actually inlined into the caller (e.g. thrust::for_each) and not merely into the wrapper, which the compiler may or may not have inlined into the caller on its own.
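The effect described above can be illustrated with a minimal sketch. It is not Thrust's actual implementation: wrapped_function_sketch, for_each_sketch, add_one, run_forceinline_sketch and the SKETCH_* macros are hypothetical names introduced here, and the macros only exist so that the same file builds with nvcc or with a plain host compiler (where GCC/Clang's always_inline attribute is assumed to be a reasonable stand-in for __forceinline__).

```cpp
#include <cassert>

// Portable stand-ins so the sketch builds with nvcc or a plain host compiler.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__
#  define SKETCH_FORCEINLINE __forceinline__
#elif defined(__GNUC__)
#  define SKETCH_HOST_DEVICE
#  define SKETCH_FORCEINLINE inline __attribute__((always_inline))
#else
#  define SKETCH_HOST_DEVICE
#  define SKETCH_FORCEINLINE inline
#endif

// Hypothetical, simplified wrapper (not Thrust's actual implementation).
template <typename Function, typename Result>
struct wrapped_function_sketch
{
  Function f;

  SKETCH_HOST_DEVICE explicit wrapped_function_sketch(Function fn) : f(fn) {}

  // Forcing the wrapper's call operator inline removes the wrapper as a
  // possible call boundary between the caller and the user's function.
  template <typename Argument>
  SKETCH_HOST_DEVICE SKETCH_FORCEINLINE Result operator()(Argument& x) const
  {
    return static_cast<Result>(f(x));
  }
};

// User functor that itself requests forced inlining.
struct add_one
{
  SKETCH_HOST_DEVICE SKETCH_FORCEINLINE void operator()(int& x) const { x += 1; }
};

// Hypothetical caller standing in for an algorithm such as thrust::for_each.
template <typename Function>
SKETCH_HOST_DEVICE void for_each_sketch(int* first, int* last, Function fn)
{
  wrapped_function_sketch<Function, void> wrapped(fn);
  for (; first != last; ++first)
  {
    wrapped(*first);
  }
}

// Applies add_one to {1, 2, 3} and returns the sum of the results.
inline int run_forceinline_sketch()
{
  int data[3] = {1, 2, 3};
  for_each_sketch(data, data + 3, add_one{});
  assert(data[0] == 2 && data[1] == 3 && data[2] == 4);
  return data[0] + data[1] + data[2];
}
```

Because both call operators request forced inlining, the body of add_one should end up directly inside the loop of for_each_sketch. If the wrapper lacked the annotation, add_one would still be inlined into the wrapper, but the wrapper itself could remain a separate call.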

When the wrapped function is not marked __forceinline__, this pull request has only a minor effect. Previously, the compiler could choose to inline the wrapper into the caller or the function into the wrapper (it would always do one or the other because the wrapper is so simple). Now the wrapper is always inlined into the caller, so the only remaining choice is whether to inline the function into the wrapper.
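The case of a function without __forceinline__ can be sketched in the same way. The example below is an illustration under stated assumptions, not code from this pull request: forwarding_wrapper_sketch, scale_by_two, run_plain_functor_sketch and the SKETCH2_* macros are hypothetical. The functor is explicitly marked noinline to stand in for the situation where the compiler decides not to inline it.

```cpp
#include <cassert>

// Portable stand-ins so the sketch builds with nvcc or a plain host compiler.
#if defined(__CUDACC__)
#  define SKETCH2_HOST_DEVICE __host__ __device__
#  define SKETCH2_FORCEINLINE __forceinline__
#  define SKETCH2_NOINLINE __noinline__
#elif defined(__GNUC__)
#  define SKETCH2_HOST_DEVICE
#  define SKETCH2_FORCEINLINE inline __attribute__((always_inline))
#  define SKETCH2_NOINLINE __attribute__((noinline))
#else
#  define SKETCH2_HOST_DEVICE
#  define SKETCH2_FORCEINLINE inline
#  define SKETCH2_NOINLINE
#endif

// Hypothetical forwarding wrapper whose call operator is always inlined.
template <typename Function>
struct forwarding_wrapper_sketch
{
  Function f;

  template <typename Argument>
  SKETCH2_HOST_DEVICE SKETCH2_FORCEINLINE void operator()(Argument& x) const
  {
    f(x);
  }
};

// Functor without forced inlining; noinline is an assumption used here to
// mimic a compiler that chooses to keep this function as a real call.
struct scale_by_two
{
  SKETCH2_HOST_DEVICE SKETCH2_NOINLINE void operator()(int& x) const { x *= 2; }
};

// The wrapper disappears into this function; only the functor call may remain.
inline int run_plain_functor_sketch()
{
  int value = 21;
  forwarding_wrapper_sketch<scale_by_two> wrapped{scale_by_two{}};
  wrapped(value);
  return value;
}
```

In this sketch the observable behavior is the same with or without the annotation on the wrapper. Only the generated code differs: the wrapper is never a separate call, so any remaining call goes straight from the caller to the functor.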

GPUtester (Collaborator) commented

Can one of the admins verify this patch?

ericniebler added the labels "type: enhancement" (New feature or request.) and "P2: nice to have" (Desired, but not necessary.) on Mar 28, 2022

ericniebler (Collaborator) commented

run tests

alliepiper (Collaborator) commented

run tests

alliepiper added this to the 1.17.0 milestone on May 5, 2022
alliepiper merged commit 8d45932 into NVIDIA:main on May 5, 2022