[BUG] Unexpected outliers in TSNE results #3057
Comments
In addition to isolating the cause of these outliers, I think this exposes a larger problem: we need a better way to test for potential issues like this. Something to keep in mind: I wonder if we could create a test harness on some real-world datasets and find a good density-based or graph clustering method to use for validation, in addition to trustworthiness.
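To make the harness idea concrete, here is a minimal sketch of an automated outlier check on a 2-D embedding. It is not part of cuML's test suite. The function name `flag_embedding_outliers` and the parameters `k` and `factor` are hypothetical, the data is synthetic, and a k-nearest-neighbor distance test stands in for a full density-based clustering. It does not compute trustworthiness.

```python
import math
import random
from statistics import median


def flag_embedding_outliers(points, k=5, factor=5.0):
    """Return indices of points that sit in unusually sparse regions.

    A point is flagged when the distance to its k-th nearest neighbor is
    more than `factor` times the median k-th-neighbor distance over all
    points. Both thresholds are illustrative assumptions, not tuned values.
    Brute force, O(n^2): fine for a sketch, too slow for 1M cells.
    """
    kth_distances = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth_distances.append(dists[k - 1])
    cutoff = factor * median(kth_distances)
    return [i for i, d in enumerate(kth_distances) if d > cutoff]


# Synthetic stand-in for a TSNE projection: two dense blobs plus two
# stray points far from everything else (indices 200 and 201).
rng = random.Random(0)
blob_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
blob_b = [(rng.gauss(10, 1), rng.gauss(0, 1)) for _ in range(100)]
strays = [(200.0, 200.0), (-300.0, 150.0)]
embedding = blob_a + blob_b + strays

outliers = flag_embedding_outliers(embedding)
print("flagged indices:", outliers)

# A regular grid has no sparse regions, so nothing should be flagged.
grid = [(float(x), float(y)) for x in range(10) for y in range(10)]
print("flagged on grid:", flag_embedding_outliers(grid))
```

In a real harness the embedding would come from the TSNE implementation under test, and a check like this would run alongside a trustworthiness score so that a handful of far-flung points cannot hide behind a good average.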
Bisected to 6a93762 with ~95% confidence. This is difficult to assess because
Note that even the best plots before that commit are still not quite as "pretty" as I'd hope (compared to exact, CannyLab's FFT or #3058).
Scratch that: that commit aimed to fix a deadlock; I was thinking of a different commit. Now I'm even more curious how changing the cache preference for the summarization kernel between Shared and L1 can cause a deadlock or change the output. Seems like it has to be a timing/synchronization bug, right?
@zbjornson, here's the TSNE projection from our single-cell examples with your PR: And here's the TSNE projection pre-0.16: At first glance, it appears your PR fixes the problem.
Fixes #3057 Co-authored-by: Corey J. Nolet <[email protected]>
With cuML version 0.16, I've suddenly started seeing strange outliers in the rapids-single-cell example notebooks. Please refer to the notebook in the repository for the expected output. Below is the output I am getting when running the notebook with 0.16:
The same issue is happening on the 1M cells notebook, though the outliers look much more extreme:
I installed the 0.16 environment using the following yaml file (CUDA toolkit version is 10.2 and driver version is 11.0):
Since there have been recent changes to TSNE, it would probably be best to bisect through the commit history in 0.16 to find where this started.