-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Approximate Nearest Neighbors algorithms #73
Comments
Makes sense to have it here in my opinion. |
Would be nice to run that benchmark here as well. |
They should surely be connected and share as much of the same interface as possible. |
Hm, I didn't realize there was any thought of adding any binary dependencies to this repo. |
Oh, maybe it was just me. |
A wrapper should definitely be in a separate package. A native version can live here. |
I can attest that nmslib is extremely fast - takes on the order of a few milliseconds to perform a search in millions of samples for a dimensionality >100. That's a couple of orders of magnitude faster than this implementation, which is fast only for a small number of dimensions. |
I am working on an implementation of the NN Descent algorithm from this paper: https://github.com/dillondaudert/NNDescent.jl . If it's something that people think is appropriate for NearestNeighbors.jl, I can look to merge it into this repo once I have a reasonably efficient implementation done. |
There seems to be many pieces of work happening in parallel; I think it is more fitting for a |
booooooooooooooooooooooooooring. Why don't we name it |
@JonasIsensee great, I can help with testing and further optimization if needed. |
For now I think this package will be kept to exact searches to not bloat its scope too much. |
Hello!
I am guessing @KristofferC already knows this, but there is a "new" approach to nearest neighbors called "approximate nearest neigbhors" for the kNN problem. For many applications these approaches are just as good as the "exact" methods. JuliaDynamics is very interested on this!
From the discussions that lead to #69 , I was told that the "state of the art"/"fastest" algorithm (faster than the one proposed in #69), is the one described here: https://arxiv.org/abs/1603.09320 that uses "Hierarchical Navigable Small World" graphs (
hnsw
). So that if I wanted to implement a faster kNN approach, I should do this one instead of the one in #69 .In addition it is very interesting to share these benchmarks: https://github.com/erikbern/ann-benchmarks which are in python but they have a lot of different algortithms. This gives a nice perspective.
What is the consensus on having the
hnsw
method in this repo? Do we agree on that (because of the "approximate")? I think it fits, but a new repo in JuliaDynamics can also be made if putting it here is a problem.By the way, there is already some code for this method: (C++, header only https://github.com/nmslib/hnsw )
@andyferris I am tagging you as well because you were interested in the discussion in #69 . @JonasIsensee was also mentioned in the proposal of this method.
p.s.: I intend to use some funding I got to get someone contribute a faster kNN method
The text was updated successfully, but these errors were encountered: