Add cuda::ptx:mbarrier_{try/test}_wait{_parity} #674

Merged
merged 5 commits into from
Nov 15, 2023


ahendriksen
Contributor

@ahendriksen ahendriksen commented Nov 8, 2023

Description

Expose the PTX instructions mbarrier.test_wait and mbarrier.try_wait, along with their .parity variants, as cuda::ptx functions.
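
For readers unfamiliar with the instruction, here is a minimal sketch of how the test_wait wrapper might be used to poll a block-scoped barrier. The kernel name `test_wait_sketch` is hypothetical, and the sketch assumes the wrappers follow the usual cuda::ptx shape: `mbarrier_arrive(addr)` returns a 64-bit state token and `mbarrier_test_wait(addr, state)` returns a bool. It also assumes compilation for SM_80 or newer. Check the signatures against the docs added in this PR before relying on them.

```cuda
#include <cuda/barrier>
#include <cuda/ptx>
#include <cuda/std/cstdint>

// Hypothetical kernel for illustration; not part of this PR.
// Assumes compilation for SM_80 or newer and a single-block launch.
__global__ void test_wait_sketch()
{
  using barrier_t = cuda::barrier<cuda::thread_scope_block>;

// Silence the warning about a __shared__ variable with a non-trivial constructor.
#pragma nv_diag_suppress static_var_with_dynamic_init
  __shared__ barrier_t bar;

  // One thread initializes the barrier to expect every thread in the block.
  if (threadIdx.x == 0) {
    init(&bar, blockDim.x);
  }
  __syncthreads();

  // Raw pointer to the underlying mbarrier object in shared memory.
  cuda::std::uint64_t* addr = cuda::device::barrier_native_handle(bar);

  // Arrive and keep the returned state token for the current phase.
  // (Assumed signature: uint64_t mbarrier_arrive(uint64_t*).)
  cuda::std::uint64_t state = cuda::ptx::mbarrier_arrive(addr);

  // Non-blocking test; spin until the phase identified by `state` completes.
  // (Assumed signature: bool mbarrier_test_wait(uint64_t*, const uint64_t&).)
  while (!cuda::ptx::mbarrier_test_wait(addr, state)) {
  }
}
```

In PTX, mbarrier.try_wait differs from mbarrier.test_wait in that it may block for a bounded time before returning, and it requires SM_90. The .parity variants take the phase parity rather than a state token.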

closes #673

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@ahendriksen ahendriksen requested review from a team as code owners November 8, 2023 16:32
@ahendriksen ahendriksen requested review from griwes and miscco and removed request for a team November 8, 2023 16:32
Contributor Author

@ahendriksen ahendriksen left a comment

Docs and source should be done and in reviewable shape. The test causes ptxas to segfault on CTK 12.2 (but the generated PTX seems to be okay). Still trying to find a way to prevent this.


#if __cccl_ptx_isa >= 700
NV_IF_TARGET(NV_PROVIDES_SM_80, (
if (threadIdx.x > thread_filter++) {
Contributor Author

These weird if statements have worked in the past to keep ptxas from segfaulting, but they no longer do. I am still working on a fix.

Collaborator

Would it help to offload the statement into a separate function?

@jrhemstad jrhemstad added the "backport branch/2.3.x" label (For backporting to the 2.3.x release branch) Nov 8, 2023


@ahendriksen ahendriksen changed the title from "[FEA] Add PTX mbarrier.test_wait/try_wait exposure" to "Add cuda::ptx:mbarrier_{try/test}_wait{_parity}" Nov 9, 2023
@jrhemstad jrhemstad merged commit 2776423 into NVIDIA:main Nov 15, 2023
516 checks passed
Contributor

Successfully created backport PR for branch/2.3.x:

@jrhemstad jrhemstad removed the "backport branch/2.3.x" label (For backporting to the 2.3.x release branch) Nov 15, 2023
Successfully merging this pull request may close these issues.

[FEA]: Add PTX mbarrier.test_wait/mbarrier.try_wait .cluster scope