Using cell ontology with SingleR() #199

Closed
phoebee-h opened this issue Aug 24, 2021 · 4 comments

Comments

phoebee-h commented Aug 24, 2021

Hi,
I've read the SingleR book and the discussion about ontology (#68), but I didn't see a tutorial on applying the cell ontology to query single-cell data. Can I pass label.ont as the labels argument? I am not familiar with the cell ontology, so I'm not sure whether this is a correct approach for interpretation. How do you integrate the cell ontology with celldex references?

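library(Seurat)   # provides as.SingleCellExperiment() for Seurat objects
library(SingleR)

db <- celldex::MouseRNAseqData()
obj <- readRDS("Seurat.rds")
obj_SCE <- as.SingleCellExperiment(obj)
# Predict using the reference's ontology terms as the labels
obj_singler <- SingleR(test = obj_SCE, ref = db, assay.type.test = 1, labels = db$label.ont)
# Store the predicted ontology terms in the Seurat metadata
obj[["MouseRNA_ont.labels"]] <- obj_singler$labels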

## to get the hierarchical figure of annotated ontology
cl <- ontoProc::getCellOnto()
ontoProc::onto_plot2(cl, obj$MouseRNA_ont.labels)

P.S. I am asking the last question because I don't see a clear relationship between the ontology results and the results I get with SingleR(..., labels = db$label.main) or SingleR(..., labels = db$label.fine). For example, the first two cells were both labelled Neurons with "main" and NPCs with "fine", so why would they be assigned different ontology terms?
image
image

Thank you.

@phoebee-h phoebee-h reopened this Aug 24, 2021
LTLA (Collaborator) commented Aug 26, 2021

The ontology terms and the fine/broad labels don't have a 1:1 mapping. So the prediction results will change depending on whether you use one set of labels or the other. Specifically, consider these two strategies:

  1. Map labels to ontology terms and use the ontology-labelled reference data for prediction.
  2. Use the original reference data for prediction and map the resulting predictions to the ontology terms.

These will not yield the same results, as the grouping of cells in the reference data will change (remember, it's not 1:1). Which in turn changes the detected marker genes, correlation scores, etc., etc.
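To make the "not 1:1" point concrete, here is a toy base-R sketch. The sample labels and the ontology term names are made up for illustration and are not taken from any celldex reference; the only point is that relabelling the same reference samples regroups them.

```r
# Hypothetical reference samples with main and fine labels (made up).
main <- c("A", "A", "B", "B", "B")
fine <- c("A1", "A2", "B1", "B2", "B3")

# Hypothetical many-to-one mapping from fine labels to ontology terms.
# The term names are placeholders, not real Cell Ontology IDs.
fine2ont <- c(A1 = "term_X", A2 = "term_Y", B1 = "term_Y",
              B2 = "term_Z", B3 = "term_Z")
ont <- unname(fine2ont[fine])

# Cross-tabulate the groupings of the same reference samples.
grouping <- table(fine = fine, ont = ont)
print(grouping)
print(table(main = main, ont = ont))

# Number of reference groups under each labelling scheme.
n_groups <- c(main = length(unique(main)),
              fine = length(unique(fine)),
              ont = length(unique(ont)))
print(n_groups)
```

In this toy setup term_Y pools samples from two different fine (and main) labels, so a reference grouped by ontology term would be compared against different sets of samples than one grouped by fine label.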

phoebee-h (Author) commented

Thank you for the suggestion!
I think I realize what causes the differences now. However, I cannot figure out how the cell ontology labels were assigned in the SingleR reference sets. In the old version of SingleR, there were four pre-built reference sets, HPCA/BE/Immgen/MouseRNAseq, none of which included cell ontology labels. If I understand correctly, each reference set consisted of a "sample vs. gene" matrix.

image

And the list of DEGs between cell types.
image

Also, the corresponding cell types to samples.
image

As above, I think these explain how you defined the cell type of each sample.
The new version of SingleR is indeed more powerful. The first strategy you mentioned is easy to carry out with SingleR (thanks to the development team). But what defined the cell ontology of the pre-built references? If I consider the second strategy, how could I "map the resulting predictions"?

  2. Use the original reference data for prediction and map the resulting predictions to the ontology terms.

Sorry for asking this naive question. :-P

LTLA (Collaborator) commented Sep 7, 2021

> But what defined the cell ontology of the pre-built references?

@j-andrews7 and I went through all the labels and mapped them to the Cell Ontology manually. We literally went through the labels and looked them up on https://www.ebi.ac.uk/ols/ontologies/cl to figure out the best matching term.

> If I consider the second strategy, how could I "map the resulting predictions"?

The celldex package has some mappings between labels in its installation directory, accessible from:

system.file("mapping", "hpca.tsv", package="celldex")

You can have a look at them here.
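As a rough sketch of the second strategy, predictions could be translated with a simple lookup against such a mapping table. The helper map_to_ontology() and the toy table below are hypothetical and use placeholder IDs rather than real Cell Ontology terms. The layout of the real file (assumed here to be a headerless two-column TSV of label and ontology ID) and which label column it is keyed on are assumptions to check by inspecting the file.

```r
# Map predicted labels to ontology terms via a two-column lookup table.
# `mapping` is assumed to hold labels in column 1 and ontology IDs in column 2.
map_to_ontology <- function(predicted, mapping) {
    idx <- match(predicted, mapping[[1]])
    mapping[[2]][idx]  # NA where a label has no entry in the mapping
}

# Toy mapping with placeholder IDs (not real Cell Ontology terms).
toy_mapping <- data.frame(
    label = c("Neurons", "NPCs"),
    ont = c("CL:placeholder_1", "CL:placeholder_2"),
    stringsAsFactors = FALSE
)
toy_pred <- c("NPCs", "Neurons", "Unknown")
mapped <- map_to_ontology(toy_pred, toy_mapping)
print(mapped)

# With celldex and SingleR installed, the real table might be used like this
# (header = FALSE is an assumption; `test_sce` and `ref` are placeholders for
# the query data and the matching celldex reference):
# path <- system.file("mapping", "hpca.tsv", package = "celldex")
# mapping <- read.delim(path, header = FALSE, stringsAsFactors = FALSE)
# pred <- SingleR(test = test_sce, ref = ref, labels = ref$label.fine)
# pred_ont <- map_to_ontology(pred$labels, mapping)
```

Labels without an entry in the table come back as NA, which makes unmapped predictions easy to spot before plotting or summarising.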

phoebee-h (Author) commented

I see.
Thank you for your detailed and clear explanation. Thank you for your time.
