Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hiding symbols from shared object libraries #1723

Closed
wants to merge 10 commits into from

Conversation

cjnolet
Copy link
Member

@cjnolet cjnolet commented Aug 7, 2023

Partially addresses #1722.

@cjnolet cjnolet added feature request New feature or request non-breaking Non-breaking change labels Aug 7, 2023
@cjnolet cjnolet self-assigned this Aug 7, 2023
@cjnolet cjnolet requested a review from a team as a code owner August 7, 2023 21:29
@github-actions github-actions bot added the cpp label Aug 7, 2023
@ahendriksen
Copy link
Contributor

We should check the binary size of libraft after this change. We have a lot of duplicate kernels. When they are static they will be duplicated in each TU and linked into libraft.so. The impact might also be smaller than expected because the linker seems not to deduplicate kernels with weak linkage anyway(?)

@jrhemstad
Copy link

Strictly speaking, this is only half of the solution. The other half is to annotate all non-kernel template functions that aren't part of the API of any pre-built DLLs with __attribute__((visibility("hidden"))).

@cjnolet
Copy link
Member Author

cjnolet commented Aug 8, 2023

Strictly speaking, this is only half of the solution

@jrhemstad yep, that's right. The first half of the solution was a simple search and replace. I realized the second half is much more involved and is going to require scraping through the entire codebase and adding a macro before every host template function that launches a kernel.

@jrhemstad
Copy link

jrhemstad commented Aug 8, 2023

After discussing internally with @ahendriksen, allow me to contradict myself (again) and suggest that the macro you want is actually:

#if defined(__CUDACC__)
#   if !defined(__CUDACC_RDC__)
#       define _RAFT_KERNEL \
            static __global__
#   else
#       define _RAFT_KERNEL \
            __attribute__ ((visibility ("hidden"))) __global__
#   endif
#else
#   define _RAFT_KERNEL
#endif

This is because when compiling with -rdc=true then the device linker will deduplicate all the identical kernels in the same shared object library and you likely want to preserve that behavior by not marking the kernels as static. You still want to hide the symbols from the resulting .so though, so instead we annotate it with the hidden attribute.

For more info, see: NVIDIA/cccl#166 (comment)

@copy-pr-bot
Copy link

copy-pr-bot bot commented Aug 30, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cjnolet
Copy link
Member Author

cjnolet commented Aug 30, 2023

/ok to test

@cjnolet
Copy link
Member Author

cjnolet commented Aug 30, 2023

/ok to test

@jrhemstad
Copy link

fyi, here's the solution we ended up using in Thrust/CUB: NVIDIA/cccl#443

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpp feature request New feature or request non-breaking Non-breaking change
Projects
Development

Successfully merging this pull request may close these issues.

4 participants