[REVIEW] Add slow high-precision mode to KNN #3304

Merged
merged 10 commits into from
Jan 26, 2021

@wphicks wphicks commented Dec 14, 2020

Provide mode to perform a second high-precision pass over results returned from brute-force KNN searches which make use of L2-derived metrics. This provides a workaround for issues with numerical instability in L2 distance calculations in FAISS when a query vector is quite close to multiple retrieved samples relative to the typical inter-sample distance.

Resolve #3195.
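
To illustrate the idea behind the second pass, here is a minimal NumPy sketch, not the cuML or FAISS implementation. It assumes the instability comes from the expanded form of squared L2 distance evaluated in float32, and it assumes the second pass recomputes distances for the returned candidates in float64 and re-sorts them. The function names `expanded_sq_l2` and `refine_neighbors` are hypothetical and exist only for this example.

```python
import numpy as np


def expanded_sq_l2(queries, data):
    """Squared L2 via ||q||^2 - 2 q.x + ||x||^2 (prone to cancellation)."""
    q_norms = (queries * queries).sum(axis=1)[:, None]
    x_norms = (data * data).sum(axis=1)[None, :]
    return q_norms - 2.0 * (queries @ data.T) + x_norms


def refine_neighbors(queries, data, candidate_idx):
    """Hypothetical second pass: recompute candidate distances in float64
    from direct differences, then re-sort each row of candidates."""
    q64 = queries.astype(np.float64)
    x64 = data.astype(np.float64)
    diffs = x64[candidate_idx] - q64[:, None, :]  # (n_queries, k, n_features)
    exact = np.einsum("qkf,qkf->qk", diffs, diffs)
    order = np.argsort(exact, axis=1, kind="stable")
    idx = np.take_along_axis(candidate_idx, order, axis=1)
    dist = np.sqrt(np.take_along_axis(exact, order, axis=1))
    return dist, idx


# Samples far from the origin but very close to each other: the regime in
# which the float32 expanded form loses most of its significant digits.
rng = np.random.default_rng(0)
data = (100.0 + 1e-2 * rng.standard_normal((200, 16))).astype(np.float32)
queries = data[:5].copy()
k = 10

# First pass: low-precision candidate selection.
approx = expanded_sq_l2(queries, data)
candidate_idx = np.argsort(approx, axis=1, kind="stable")[:, :k]

# Second pass: high-precision distances and ordering for those candidates.
dist, idx = refine_neighbors(queries, data, candidate_idx)
```

In this sketch the second pass only corrects the distances and ordering of the candidates it is given. It cannot recover a true neighbor that the first pass failed to return.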

@wphicks wphicks added feature request New feature or request 2 - In Progress Currenty a work in progress non-breaking Non-breaking change labels Dec 14, 2020
@wphicks wphicks added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currenty a work in progress labels Dec 15, 2020
@wphicks wphicks marked this pull request as ready for review December 15, 2020 16:57
@wphicks wphicks requested a review from a team as a code owner December 15, 2020 16:57
@wphicks wphicks changed the title [WIP] Add slow high-precision mode to KNN [REVIEW] Add slow high-precision mode to KNN Dec 15, 2020
@wphicks wphicks requested a review from cjnolet December 15, 2020 16:57
wphicks commented Dec 15, 2020

Additional detail on the data that first demonstrated the necessity of this new flag is available here: #3195 (comment).

@mdemoret-nv mdemoret-nv left a comment


Minor suggestions on comments and array output_type handling. One thing I would like to see is some before/after testing added. For example, running once with two_pass_precision=False, then again with two_pass_precision=True and comparing that the output changed. This will help prove that the fix is working as intended.

python/cuml/neighbors/nearest_neighbors.pyx Outdated
python/cuml/neighbors/nearest_neighbors.pyx Outdated
python/cuml/neighbors/nearest_neighbors.pyx Outdated
python/cuml/neighbors/nearest_neighbors.pyx Outdated
python/cuml/neighbors/nearest_neighbors.pyx Outdated
python/cuml/test/test_nearest_neighbors.py Outdated
@wphicks wphicks added 4 - Waiting on Reviewer Waiting for reviewer to review or respond and removed 3 - Ready for Review Ready for review by team labels Dec 17, 2020
wphicks commented Dec 17, 2020

One thing I would like to see is some before/after testing added. For example, running once with two_pass_precision=False, then again with two_pass_precision=True and comparing that the output changed. This will help prove that the fix is working as intended.

Sadly, this is not possible to write in an environment-neutral way. Because of how the errors propagate (or not) in the distance approximations, we've seen environments where this issue never comes up and environments where it occurs every time. Poor @cjnolet slogged away at this one for a while but was unlucky (lucky?) enough to be on a system where it never came up. Even more illustratively, the PR that I used to test for this issue in CI passed with no problem before the fix was in, even though I was consistently seeing local failures.

codecov-io commented Dec 17, 2020

Codecov Report

Merging #3304 (fc479b7) into branch-0.18 (550121b) will increase coverage by 0.17%.
The diff coverage is 85.11%.

@@               Coverage Diff               @@
##           branch-0.18    #3304      +/-   ##
===============================================
+ Coverage        71.48%   71.66%   +0.17%     
===============================================
  Files              207      210       +3     
  Lines            16748    16945     +197     
===============================================
+ Hits             11973    12144     +171     
- Misses            4775     4801      +26     
Impacted Files Coverage Δ
python/cuml/decomposition/incremental_pca.py 94.70% <ø> (ø)
python/cuml/dask/ensemble/base.py 19.69% <30.43%> (+0.36%) ⬆️
python/cuml/dask/cluster/kmeans.py 54.00% <33.33%> (ø)
python/cuml/ensemble/randomforestregressor.pyx 70.83% <44.44%> (ø)
python/cuml/dask/decomposition/base.py 39.53% <50.00%> (ø)
...ython/cuml/dask/ensemble/randomforestclassifier.py 30.00% <50.00%> (+0.51%) ⬆️
python/cuml/dask/ensemble/randomforestregressor.py 35.08% <50.00%> (+0.54%) ⬆️
python/cuml/dask/linear_model/linear_regression.py 59.09% <50.00%> (ø)
python/cuml/dask/linear_model/ridge.py 50.00% <50.00%> (ø)
...ython/cuml/dask/neighbors/kneighbors_classifier.py 22.33% <50.00%> (ø)
... and 26 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6b5e7ff...fc479b7.

wphicks commented Jan 7, 2021

To provide a little more clarity on my last comment, the unit tests introduced here consistently failed in my local environment before the included fix was provided and consistently passed after the fix was introduced. On CI and in other environments, those same unit tests would consistently pass even before the fix was introduced.
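
As a rough illustration of what an environment-neutral check can look like, the sketch below validates a KNN result against a float64 brute-force reference instead of asserting that the output changes between modes. It is plain NumPy and is not the test added in this PR; `reference_knn` and `check_knn_result` are hypothetical names, and the tolerances are arbitrary assumptions.

```python
import numpy as np


def reference_knn(queries, data, k):
    """Float64 brute-force reference built from direct differences."""
    q = queries.astype(np.float64)[:, None, :]
    x = data.astype(np.float64)[None, :, :]
    dist = np.sqrt(((q - x) ** 2).sum(axis=2))
    idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return np.take_along_axis(dist, idx, axis=1), idx


def check_knn_result(dist, idx, queries, data, rtol=1e-6, atol=1e-6):
    """Hypothetical check: reported distances must match float64
    recomputation for the reported indices and be sorted ascending."""
    q = queries.astype(np.float64)[:, None, :]
    recomputed = np.linalg.norm(data.astype(np.float64)[idx] - q, axis=2)
    reported_ok = np.allclose(dist, recomputed, rtol=rtol, atol=atol)
    sorted_ok = bool(np.all(np.diff(dist, axis=1) >= 0))
    return reported_ok and sorted_ok


rng = np.random.default_rng(42)
data = rng.standard_normal((50, 8)).astype(np.float32)
queries = data[:3]

ref_dist, ref_idx = reference_knn(queries, data, k=5)

# A correct result passes; a result with mis-ordered distances does not.
good = check_knn_result(ref_dist, ref_idx, queries, data)
bad = check_knn_result(ref_dist[:, ::-1], ref_idx, queries, data)
```

A check of this shape passes or fails on the correctness of the returned distances alone, so it does not depend on whether a given environment happens to trigger the instability in the single-pass path.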

@JohnZed JohnZed left a comment


Looks good - just one question/suggestion

python/cuml/neighbors/nearest_neighbors.pyx Outdated
wphicks commented Jan 12, 2021

rerun tests

@github-actions github-actions bot added the Cython / Python Cython or Python issue label Jan 14, 2021
wphicks commented Jan 14, 2021

Merging in latest mainline to see if that will fix seemingly unrelated CI errors

@JohnZed JohnZed removed the 4 - Waiting on Reviewer Waiting for reviewer to review or respond label Jan 14, 2021
wphicks commented Jan 19, 2021

rerun tests

wphicks commented Jan 20, 2021

Seems to have been an unrelated error in FAISS. Rerunning tests and will check on specifics.

wphicks commented Jan 20, 2021

rerun tests

@ajschmidt8

@JohnZed, I'll be removing the auto-merge labels from all repos shortly. Please make sure to use the new merge comment, @gpucibot merge, when you're ready to merge this PR.

JohnZed commented Jan 21, 2021

rerun tests

wphicks commented Jan 22, 2021

Just updated copyright headers

wphicks commented Jan 23, 2021

Merged in branch-0.18 to deal with FAISS error

@wphicks wphicks added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Jan 26, 2021
@mdemoret-nv mdemoret-nv left a comment


Everything LGTM

@dantegd dantegd merged commit 546abad into rapidsai:branch-0.18 Jan 26, 2021
Successfully merging this pull request may close these issues.

Anomalous behavior in NearestNeighbors