Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NCCL RDMA expects fi_cq_data_entry, but OPX provider fills CQ with fi_cq_tagged_entry #609

Open
lsavers opened this issue Sep 18, 2024 · 1 comment

Comments

@lsavers
Copy link

lsavers commented Sep 18, 2024

Commit rdma: switch to untagged send/recv (12ed337) removed the use of tagged entries for NCCL protocol RDMA. Even though RDMA uses untagged send/recv operations, the CQ format attribute is initialized based on the capability flag from the provider.

if (info->caps & FI_TAGGED) {
cq_attr.format = FI_CQ_FORMAT_TAGGED;
} else {
cq_attr.format = FI_CQ_FORMAT_DATA;
}

OPX has FI_TAGGED set for the provider's capabilities, so the CQ format is being set to FI_CQ_FORMAT_TAGGED and then the CQ is filled with fi_cq_tagged_entry's by OPX. Should the spots that changed fi_cq_tagged_entry to fi_cq_data_entry still be fi_cq_tagged_entry to handle the CQ filling with that type?

static inline int handle_bounce_recv(nccl_net_ofi_rdma_device_t *device, int rail_id, struct fi_cq_data_entry *cq_entry,
static inline int handle_write_comp(struct fi_cq_data_entry *cq_entry, nccl_net_ofi_rdma_device_t *device, int rail_id)
static inline int process_completions(struct fi_cq_data_entry *cq_entry, uint64_t num_cqes, nccl_net_ofi_rdma_device_t *device,
struct fi_cq_data_entry cqe_buffers[cq_read_count];

Or should the cq_attr.format be set based on a different condition?

if (info->caps & FI_TAGGED) {

@rauteric
Copy link
Contributor

Hello,

Interestingly enough, this exact problem came up during the review for this code: #361 (comment). The conclusion was that FI_TAGGED is a "primary capability", which means it should not be enabled unless specifically requested by the application.

As Raghu noted, here is the relevant text from the Libfabric spec (https://ofiwg.github.io/libfabric/v1.22.0/man/fi_getinfo.3.html):

Capabilities may be grouped into three general categories: primary, secondary, and primary modifiers. Primary capabilities must explicitly be requested by an application, and a provider must enable support for only those primary capabilities which were selected. Primary modifiers are used to limit a primary capability, such as restricting an endpoint to being send-only. If no modifiers are specified for an applicable capability, all relevant modifiers are assumed. See above definitions for details.

My reading is that, since the plugin does not request FI_TAGGED capability, the OPX provider should not enable it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants