UMAP & T-SNE to pass user-configured metrics to KNN #1653

Closed
ntr34g opened this issue Feb 9, 2020 · 12 comments · Fixed by #4779
Labels
feature request New feature or request

Comments

@ntr34g

ntr34g commented Feb 9, 2020

Is there any plan to implement the metric feature that the classic umap-learn package provides?
For example, from its documentation:
"The metric to use to compute distances in high dimensional space. If a string is passed it must match a valid predefined metric. If a general metric is required a function that takes two 1d arrays and returns a float can be provided. For performance purposes it is required that this be a numba jit’d function. Valid string metrics include:

    euclidean
    manhattan
    chebyshev
    minkowski
    canberra
    braycurtis..."
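For readers unfamiliar with the metrics named in the quoted documentation, here is a minimal pure-Python sketch of each one. The formulas follow the commonly used definitions (the same conventions SciPy documents); the function names and the zero-denominator handling are choices made for this illustration and are not taken from umap-learn or cuML.

```python
import math

def euclidean(x, y):
    # Square root of the summed squared coordinate differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # Sum of absolute coordinate differences
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    # Largest absolute coordinate difference
    return max(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, p=2):
    # Generalizes manhattan (p=1) and euclidean (p=2)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def canberra(x, y):
    # Weighted manhattan distance; terms with a zero denominator
    # are skipped (an assumption made for this sketch)
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total

def braycurtis(x, y):
    # Sum of |a - b| divided by sum of |a + b|; returns 0.0 when the
    # denominator is zero (an assumption made for this sketch)
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(abs(a + b) for a, b in zip(x, y))
    return num / den if den > 0 else 0.0
```

Each function takes two equal-length sequences and returns a number, which matches the "two 1d arrays in, float out" shape the quoted text describes for custom metrics.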
@ntr34g ntr34g added the "? - Needs Triage" and "question" labels Feb 9, 2020
@cjnolet
Member

cjnolet commented Feb 10, 2020

Hi @Yerne. We do have plans to implement this functionality, though it's not on the planned roadmap yet and we don't have a planned time frame.

That being said, we currently use FAISS for fast KNN, and they are going to integrate these distances very soon. Once those are exposed, the change should be fairly easy to make on our end.

Here's the FAISS issue for reference: facebookresearch/faiss#848

@JohnZed JohnZed added the "feature request" label and removed the "? - Needs Triage" and "question" labels Feb 12, 2020
@cjnolet
Member

cjnolet commented Jul 22, 2020

Our KNN now supports additional distance metrics, but these still need to be wired up in both UMAP and T-SNE. Changing the title to reflect this.

@cjnolet cjnolet changed the title Umap-classic like metrics support? UMAP & T-SNE to pass user-configured metrics to KNN Jul 22, 2020
@P3ngLiu

P3ngLiu commented Jan 20, 2021

@cjnolet Hi! Just curious: are you going to implement this feature in the 0.18 release?

@P3ngLiu

P3ngLiu commented Jan 22, 2021

@JohnZed @fondaing Can someone please answer this? Thanks!

@snakeztc

Same question here! When will this feature be released? It's crucial for visualizing word2vec embeddings with cosine distance. @cjnolet

@cjnolet
Member

cjnolet commented Jan 22, 2021

Our UMAP API does accept a knn_graph argument in the fit(), transform() and fit_transform() functions. This should allow UMAP to embed data using a different metric in the meantime:

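from cuml.neighbors import NearestNeighbors
from cuml.manifold import UMAP
import numpy as np

# Toy data: 100 samples with 10 features each
a = np.random.random((100, 10))

# Build the k-nearest-neighbors graph with the desired metric (cosine here)
m = NearestNeighbors(n_neighbors=10, metric='cosine')
m.fit(a)
knn_graph = m.kneighbors_graph(a, mode='distance')

# Pass the precomputed graph to UMAP via the knn_graph argument
u = UMAP(n_components=2)
embedding = u.fit_transform(a, knn_graph=knn_graph)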

@snakeztc

@cjnolet this is great. thanks!

@siegrikw

siegrikw commented Mar 27, 2021

@cjnolet ,

Yes it does work. Thank you for the quick response.

@gandroz

gandroz commented Mar 28, 2021

Is it possible to use the same temporary workaround with TSNE, or even with KMeans?

@cjnolet
Member

cjnolet commented May 5, 2022

@gandroz, unfortunately k-means is limited to L2/Euclidean distance, though I suppose you could L2-normalize your vectors to get something close to spherical k-means with angular distance. The knn_graph argument is supported in T-SNE: https://docs.rapids.ai/api/cuml/stable/api.html#cuml.TSNE.fit_transform.
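The L2-normalization suggestion rests on a standard identity: for unit-length vectors, the squared Euclidean distance equals 2 * (1 - cosine similarity), so ranking points by Euclidean distance after normalization agrees with ranking them by cosine distance. The sketch below checks that identity in plain Python; the helper names are made up for this illustration and are not part of cuML.

```python
import math

def l2_normalize(v):
    # Scale v to unit Euclidean length
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in v]

def cosine_similarity(x, y):
    # Dot product divided by the product of the vector norms
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def squared_euclidean(x, y):
    # Sum of squared coordinate differences
    return sum((a - b) ** 2 for a, b in zip(x, y))

x = [3.0, 4.0, 0.0]
y = [1.0, 0.0, 2.0]
xn, yn = l2_normalize(x), l2_normalize(y)

# Squared Euclidean distance between the normalized vectors ...
lhs = squared_euclidean(xn, yn)
# ... matches 2 * (1 - cosine similarity) of the original vectors
rhs = 2.0 * (1.0 - cosine_similarity(x, y))
print(lhs, rhs)
```

This is only "close to" spherical k-means because the mean of several unit vectors is generally not itself unit length, so the centroids drift off the sphere unless they are renormalized after each update.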

@rapids-bot rapids-bot bot closed this as completed in #4779 Aug 9, 2022
rapids-bot bot pushed a commit that referenced this issue Aug 9, 2022
- [x] TSNE allow different distance metrics to be passed to KNN
- [x] TSNE distance metric pytests
- [x] UMAP allow different distance metrics to be passed to KNN
- [x] UMAP distance metric pytests
closes #1653

Authors:
  - Tarang Jain (https://github.com/tarang-jain)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #4779
jakirkham pushed a commit to jakirkham/cuml that referenced this issue Feb 27, 2023