[BUG]: Unfavourable interaction between thrust::sort and thrust::sort_by_key #655

mvieth · 2023-11-03T11:04:53Z

Is this a duplicate?

I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct

Type of Bug

Silent Failure

Component

Not sure

Describe the bug

I noticed that thrust::sort_by_key silently gives a wrong (unsorted) result under a very specific combination of circumstances:

thrust::sort_by_key is used in a shared library, let's say library A
thrust::sort is used in a second shared library, let's say library B
both libraries are linked to the main program
thrust::sort_by_key is called with more than 4864 elements (the threshold at which a different sorting algorithm is selected)

With git bisect, I determined that this problem was introduced by NVIDIA/cub@c4299c4, meaning that all thrust/cub 2.x.y versions are affected, while 1.x.y versions are fine.
The problem occurs with GCC under Linux, but not with MSVC under Windows.
When I run it with compute-sanitizer, it reports the following for histogram_kernel and exclusive_sum_kernel: Program hit cudaErrorMissingConfiguration (error 52) due to "__global__ function call is not configured" on CUDA API call to cudaGetLastError.
My best guess as to what happens: some symbols in library A and library B get confused during linking, possibly because some functions (like DeviceRadixSortExclusiveSumKernel) don't have ValueT in their template parameter list (ValueT is cub::NullType for thrust::sort and something else for thrust::sort_by_key).
The same might happen with other thrust functions (unconfirmed but possible, I think).
This bug was first noticed in PointCloudLibrary/pcl#5846
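
The symbol-collision guess above can be modelled in plain C++, without CUDA. This is only a sketch of the hypothesis, not CUB's code: NullType, Dispatch, sum_kernel_without_value and sum_kernel_with_value are hypothetical stand-ins. It shows that a dispatch layer which knows ValueT, but launches a function template that does not mention ValueT, ends up naming the same specialization for a keys-only sort and a key-value sort.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cub::NullType (the value type of a keys-only sort).
struct NullType {};

// Variant 1: ValueT is NOT a template parameter. Every caller that uses the
// same KeyT names the same specialization, and therefore the same symbol.
template <typename KeyT>
std::size_t sum_kernel_without_value() { return sizeof(KeyT); }

// Variant 2: ValueT IS a template parameter, so each (KeyT, ValueT)
// combination is a distinct specialization with its own symbol.
template <typename KeyT, typename ValueT>
std::size_t sum_kernel_with_value() { return sizeof(KeyT) + sizeof(ValueT); }

using KernelPtr = std::size_t (*)();

// Hypothetical dispatch layer: it is parameterized on ValueT, but the
// function it selects may or may not carry ValueT in its own name.
template <typename KeyT, typename ValueT>
struct Dispatch {
  static KernelPtr without_value() { return &sum_kernel_without_value<KeyT>; }
  static KernelPtr with_value() { return &sum_kernel_with_value<KeyT, ValueT>; }
};

// Compares what a keys-only sort (ValueT = NullType) and a key-value sort
// (ValueT = float) would each select.
void check_hypothesis_model() {
  // Without ValueT in the parameter list: both dispatchers pick one function.
  assert((Dispatch<int, NullType>::without_value() ==
          Dispatch<int, float>::without_value()));
  // With ValueT in the parameter list: the two dispatchers pick different ones.
  assert((Dispatch<int, NullType>::with_value() !=
          Dispatch<int, float>::with_value()));
}
```

Inside a single program this sharing is harmless. The guess is that it becomes a problem once library A and library B each carry their own copy of such a symbol and the dynamic linker resolves both to one of them.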

How to Reproduce

Here is a minimal reproducible example: thrust_test.zip

Expected behavior

thrust::sort_by_key should always give the correct (sorted) result.
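
Because the failure is silent, a host-side check of the result is useful. The following is a minimal sketch, not taken from thrust_test.zip; the function name is_valid_sort_by_key and the use of int keys and values are assumptions. It expects the data to have been copied back to the host, and verifies that the output keys are in non-decreasing order and that every key is still paired with its original value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Pairs = std::vector<std::pair<int, int>>;

// Zips keys and values into pairs and sorts them, giving a canonical form
// that can be compared regardless of the order of the input.
static Pairs zip_sorted(const std::vector<int>& keys,
                        const std::vector<int>& values) {
  assert(keys.size() == values.size());
  Pairs pairs;
  pairs.reserve(keys.size());
  for (std::size_t i = 0; i < keys.size(); ++i) {
    pairs.emplace_back(keys[i], values[i]);
  }
  std::sort(pairs.begin(), pairs.end());
  return pairs;
}

// Returns true if (keys_out, values_out) is a valid sort-by-key result for
// (keys_in, values_in): same size, keys sorted, key-value pairs preserved.
bool is_valid_sort_by_key(const std::vector<int>& keys_in,
                          const std::vector<int>& values_in,
                          const std::vector<int>& keys_out,
                          const std::vector<int>& values_out) {
  if (keys_in.size() != values_in.size() ||
      keys_out.size() != values_out.size() ||
      keys_in.size() != keys_out.size()) {
    return false;
  }
  if (!std::is_sorted(keys_out.begin(), keys_out.end())) {
    return false;
  }
  return zip_sorted(keys_in, values_in) == zip_sorted(keys_out, values_out);
}
```

To exercise the code path described above, the input would need more than 4864 elements.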

Reproduction link

No response

Operating System

Linux (exact version or distro does not matter)

nvidia-smi output

Not relevant for the problem, as far as I can tell

NVCC version

Not relevant, but thrust/cub must be version 2.0.0 or newer (as described above)


github-actions · 2023-11-03T11:05:08Z

Hi @mvieth!

Thanks for submitting this issue - the CCCL team has been notified and we'll get back to you as soon as we can!
In the meantime, feel free to add any relevant information to this issue.

gevtushenko · 2023-11-03T15:21:25Z

@mvieth thank you for reporting the issue! It should've been addressed by #443. Could you please try CCCL/main to see if you still can reproduce it?

mvieth · 2023-11-04T10:26:55Z

@gevtushenko Yes, that solves it. Thanks!

mvieth added the bug Something isn't working right. label Nov 3, 2023

github-actions bot assigned jrhemstad Nov 3, 2023

github-actions bot added the needs triage Issues that require the team's attention label Nov 3, 2023

mvieth closed this as completed Nov 4, 2023
