Census cell similarity search: experimental Python API for searching given AnnData #1114

Open
mlin opened this issue Apr 25, 2024 · 2 comments
Labels
P0 Priority 0 - Critical, fix ASAP!

Comments

mlin commented Apr 25, 2024

Now that the vector indexes on Census embeddings are built (#694, #1113), develop a Python API in `cellxgene_census.experimental` that takes an AnnData as input and identifies the most similar Census cells.

This requires running a forward pass of the embedding model (starting with scVI) on the given AnnData. If that causes a lot of complications, we can first build a demo that searches existing Census cells against the index, and break out the forward pass into a separate issue. It might even end up involving a Docker image or a web service of some sort.

mlin commented Apr 29, 2024

Per 4/29 discussion:

For now (i) the API assumes the given AnnData will include a layer with suitable embeddings and (ii) we'll informally provide a notebook/docker showing how to do the forward pass to add them. To be revisited in H2.
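
To make the assumption in (i) concrete, here is a minimal sketch of what "search given embeddings against reference embeddings" means. It is a brute-force, pure-Python illustration and not the `cellxgene_census` API: the function name `nearest_cells`, the toy vectors, and the choice of Euclidean distance are all assumptions made for illustration. The actual search would go through the vector indexes mentioned above rather than an exhaustive scan.

```python
import heapq
import math


def nearest_cells(query, reference, k=2):
    """Hypothetical helper: for each query embedding, return the k closest
    reference embeddings as (distance, reference_index) pairs, nearest first.

    Exhaustive scan with Euclidean distance, for illustration only.
    """
    results = []
    for q in query:
        scored = ((math.dist(q, ref), i) for i, ref in enumerate(reference))
        results.append(heapq.nsmallest(k, scored))
    return results


# Toy stand-ins: rows of `reference` play the role of Census cell embeddings,
# rows of `query` play the role of embeddings already stored in the AnnData.
reference = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
query = [[0.9, 0.1], [4.0, 4.0]]

neighbors = nearest_cells(query, reference, k=2)
for row, hits in enumerate(neighbors):
    print(row, [(round(d, 3), i) for d, i in hits])
```

The query vectors must come from the same embedding model as the reference vectors for the distances to be meaningful, which is why the forward pass matters.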

mlin commented May 30, 2024

Draft PR: #1164

(finalization pending #1116)

mlin added a commit that referenced this issue Aug 8, 2024
Adds two new functions to `cellxgene_census.experimental`:

1. `find_nearest_obs` uses TileDB-Vector-Search indexes of Census embeddings to find nearest neighbors of given embedding vectors (in an AnnData obsm layer). #1114
2. `predict_obs_metadata` uses the nearest neighbors to predict metadata attributes like cell_type and tissue_general for the query cells. The naive implementation is just a starting point for experimentation. #1115
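
As a rough illustration of the second step, the sketch below predicts a label for each query cell by majority vote over its nearest neighbors. This is an assumption about what a naive approach could look like, not the code in `predict_obs_metadata`: the helper name `predict_labels`, the toy labels, and the vote-fraction "confidence" are hypothetical.

```python
from collections import Counter


def predict_labels(neighbor_ids, reference_labels):
    """Hypothetical helper: for each query cell, pick the most common label
    among its neighbors and report the fraction of neighbors that agree."""
    predictions = []
    for ids in neighbor_ids:
        votes = Counter(reference_labels[i] for i in ids)
        label, count = votes.most_common(1)[0]
        predictions.append((label, count / len(ids)))
    return predictions


# Toy stand-ins: one cell_type label per reference cell, and for each query
# cell the indices of its nearest reference cells.
reference_cell_type = ["T cell", "T cell", "B cell", "neuron", "neuron"]
neighbor_ids = [[0, 1, 2], [3, 4, 2]]

predictions = predict_labels(neighbor_ids, reference_cell_type)
for label, agreement in predictions:
    print(label, round(agreement, 2))
```

An unweighted vote ignores neighbor distance; weighting votes by distance is one obvious refinement to experiment with.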