choosing the optimal number of "Ks" #40
Comments
Hi @cathalgking, Thanks again for using STdeconvolve and for your questions! To provide some context, I'll point you towards a previous GitHub response: In the example, a max K of 9 was chosen for speed; in practice, however, a higher K could be used if you suspect more than 9 cell types in the data. Regarding your R session crashing, what kind of errors are you seeing, if any? In terms of compute resources, are you possibly running out of memory? This has sometimes happened to me with very large datasets when fitting multiple models. I believe there is a way to change the max memory limit of R. Let me know if you still have follow-up questions. Hope this helps.
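As a rough sketch of the advice above (not the maintainers' prescribed workflow), one way to probe a higher upper limit while keeping memory use down is to fit a coarse grid of Ks first and refine afterwards. The upper bound of 20 and the step of 2 are arbitrary values chosen for illustration, and `cd` is the counts matrix from the question. The `ncores`, `plot`, and `opt = "min"` arguments are assumed from the STdeconvolve documentation, so check `?fitLDA` and `?optimalModel` before relying on them.

```r
# Coarse grid: 10 models (K = 2, 4, ..., 20) rather than one model per K.
# Fewer fitted models means less memory held at once.
# The range is an illustrative assumption, not a recommendation.
Ks_coarse <- seq(2, 20, by = 2)

# Run only if STdeconvolve is installed and the counts matrix `cd` exists.
if (requireNamespace("STdeconvolve", quietly = TRUE) && exists("cd")) {
  # Same call as in the question, with a wider but coarser range of Ks.
  ldas <- STdeconvolve::fitLDA(t(as.matrix(cd)), Ks = Ks_coarse,
                               ncores = 1, plot = TRUE)

  # opt = "min" is assumed to select the K with the lowest perplexity.
  optLDA <- STdeconvolve::optimalModel(models = ldas, opt = "min")

  gc()  # ask R to release unused memory before any further fitting
}
```

If the plot suggests the best K lies between two grid points, a second call with a narrower range, for example `Ks = seq(14, 18, by = 1)`, refines the choice without fitting every K from 2 upward.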
I solved this, thanks @bmill3r.
@bmill3r I notice that the My 4 samples seem to vary a lot in terms of what K to choose. While sample B seems to have an optimal K at around 16? Other than this plot, is there any other way to ascertain the best K per sample?
Hi @cathalgking, I believe that
but none of these currently take alpha into account, and whether they truly identify the optimal K can be dataset-dependent. So really, I would recommend using the plots to help guide selection of K. Hope this helps.
Ok, thanks @bmill3r. So would you say (just from looking at the plots) that the best K would be ~6 for the first plot and ~16 for the bottom plot?
Hi @cathalgking, Yes, looking at those plots I would say those are reasonable choices of K. Brendan
What is the best way to choose the optimal number of K's or cell-types for a dataset?
Is it just by inspecting the plot produced by the code below? How does one know what to set the upper limit to? For example, in the code below, there could be more than 9 cell types present.
ldas <- fitLDA(t(as.matrix(cd)), Ks = seq(2, 9, by = 1))
Also, my R session often crashes when running the above code.