Update developer guide to recommend no default stream parameter. #10136

14 changes: 8 additions & 6 deletions cpp/docs/DEVELOPER_GUIDE.md
@@ -347,7 +347,9 @@ implemented using asynchronous APIs on the default stream (e.g., stream 0).

The recommended pattern for doing this is to make the definition of the external API invoke an
internal API in the `detail` namespace. The internal `detail` API has the same parameters as the
-public API, plus a `rmm::cuda_stream_view` parameter at the end defaulted to
+public API, plus a `rmm::cuda_stream_view` parameter at the end with no default value. If the
+detail API also accepts a memory resource parameter, the stream parameter should be ideally placed
+just *before* the memory resource. The public API will call the detail API and provide
`rmm::cuda_stream_default`. The implementation should be wholly contained in the `detail` API
definition and use only asynchronous versions of CUDA APIs with the stream parameter.

@@ -362,14 +364,14 @@ void external_function(...);

// cpp/include/cudf/detail/header.hpp
namespace detail{
-void external_function(..., rmm::cuda_stream_view stream = rmm::cuda_stream_default)
+void external_function(..., rmm::cuda_stream_view stream)
} // namespace detail

// cudf/src/implementation.cpp
namespace detail{
-// defaulted stream parameter
+// Use the stream parameter in the detail implementation.
void external_function(..., rmm::cuda_stream_view stream){
-// implementation uses stream w/ async APIs
+// Implementation uses the stream with async APIs.
rmm::device_buffer buff(...,stream);
CUDA_TRY(cudaMemcpyAsync(...,stream.value()));
kernel<<<..., stream>>>(...);
@@ -378,8 +380,8 @@ namespace detail{
} // namespace detail

void external_function(...){
-CUDF_FUNC_RANGE(); // Auto generates NVTX range for lifetime of this function
-detail::external_function(...);
+CUDF_FUNC_RANGE(); // Generates an NVTX range for the lifetime of this function.
+detail::external_function(..., rmm::cuda_stream_default);
}
```
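
For readers who want to try the pattern without a CUDA toolchain, the following is a minimal host-only sketch of the convention this change recommends. Everything in it is a hypothetical stand-in: `stream_view`, `memory_resource`, `default_stream`, and `make_sequence` are illustrative names, not RMM or libcudf APIs. It only shows the shape of the signatures: the `detail` function takes the stream with no default and places it just before the memory resource, and the public function passes the default stream explicitly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for rmm::cuda_stream_view (assumption, not the real type).
struct stream_view {
  int id;
};

// Hypothetical stand-in for a stream-ordered memory resource. It only records
// how much was requested and on which stream, so the call path can be inspected.
struct memory_resource {
  std::size_t bytes_allocated = 0;
  int last_stream_id          = -1;

  void allocate(std::size_t bytes, stream_view stream)
  {
    bytes_allocated += bytes;
    last_stream_id = stream.id;
  }
};

// Hypothetical stand-in for rmm::cuda_stream_default.
constexpr stream_view default_stream{0};

namespace detail {
// Same parameters as the public API, plus a stream with NO default value,
// placed just before the memory resource.
inline std::vector<int> make_sequence(std::size_t n, stream_view stream, memory_resource* mr)
{
  mr->allocate(n * sizeof(int), stream);
  std::vector<int> out(n);
  for (std::size_t i = 0; i < n; ++i) { out[i] = static_cast<int>(i); }
  return out;
}
}  // namespace detail

// Public API: no stream parameter. It forwards to the detail API and
// provides the default stream explicitly.
inline std::vector<int> make_sequence(std::size_t n, memory_resource* mr)
{
  return detail::make_sequence(n, default_stream, mr);
}
```

Because the `detail` overload has no default, a call such as `detail::make_sequence(4, &mr)` fails to compile, which is the point of the change: internal callers cannot silently fall back to the default stream.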
