Error comparing PyMC3 models with multiple observed variables #1614
Comments
Luckily, this is actually only a matter of me messing up and not exposing it: ArviZ does already support model comparison when there are multiple variables stored in the log likelihood group, hence the error. It simply can't be done automatically; it needs the `var_name` argument.
@OriolAbril Thanks for that explanation. I will see about making an MR for this. BTW, do you know if I'm correct in my conjecture that, for this special case (where the variables in question are essentially sub-ranges of a single vector), I could simply add together the individual log-likelihood terms?
I don't have enough information about the model to say whether they should be added or concatenated; adding does sound like it makes more sense, but I am not sure. The notebook I linked above, however, covers this exact case with the rugby data, and it also has some cool diagrams. After reading it you should have no doubts about which one makes sense for your model and question; if it is not clear enough, let me know and we'll see how to improve the notebook.
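For reference, the `var_name` route already works in the per-variable ELPD functions such as `az.loo`. A minimal sketch with synthetic data (the variable names, shapes, and values here are invented, not taken from the model in this issue):

```python
import numpy as np
import arviz as az

# Build an InferenceData whose log_likelihood group holds two variables,
# mimicking a model with multiple observed variables.
rng = np.random.default_rng(0)
posterior = {"mu": rng.normal(size=(2, 200))}
log_likelihood = {
    "AND obs": rng.normal(-1.0, 0.1, size=(2, 200, 10)),
    "OR obs": rng.normal(-1.0, 0.1, size=(2, 200, 10)),
}
idata = az.from_dict(posterior=posterior, log_likelihood=log_likelihood)

# loo accepts var_name, so each observed variable can be scored on its own.
loo_and = az.loo(idata, var_name="AND obs")
print(loo_and)
```

Exposing the same `var_name` selection at the `compare` interface would resolve the ambiguity the error complains about.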
Describe the bug
I have a set of PyMC3 models with multiple observed variables that I would like to compare with ArviZ. When I try to invoke the comparison as follows:
I get this error:
This is unfortunate for at least two reasons:
1. There is no `var_name` argument for `compare`.
2. IIRC, the real error message should be that ArviZ can only compare models with a single output variable, which could be checked at the `compare` interface, instead of there being an error in `get_log_likelihood`.
Request for Help
There's a usage question hidden in here: the different observed variables in the model, `['AND obs', 'NAND obs', 'NOR obs', 'OR obs', 'XNOR obs', 'XOR obs']`, are essentially the same variable, but with six different combinations of parameters upstream. Instead of combining all these variables into one observation variable with a complex structure of selectors to turn parameters on and off, they are separated into subsets.
So there's really a "meta variable" that is the concatenation of these six variables. IIUC, the log-likelihood of the full model should be the sum of the log-likelihood of each of these variables (each variable is conditionally independent of the others given the parameters).
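That bookkeeping can be sketched with plain xarray (array names, shapes, and values are invented). For the information criteria you want to keep the pointwise terms, so the six arrays would be concatenated along the observation dimension; summing over that dimension then recovers the full-model log-likelihood per draw, consistent with the conditional-independence argument above:

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(1)
dims = ("chain", "draw", "obs_id")
# Pointwise log-likelihoods of two (of the six) conditionally
# independent observed variables, over the same posterior draws.
ll_and = xr.DataArray(rng.normal(-1.0, 0.1, (2, 50, 8)), dims=dims)
ll_or = xr.DataArray(rng.normal(-1.0, 0.1, (2, 50, 8)), dims=dims)

# Concatenating keeps each observation's own pointwise term,
# which is what LOO/WAIC need.
ll_all = xr.concat([ll_and, ll_or], dim="obs_id")

# The joint log-likelihood per draw is the sum of the pointwise terms.
total = ll_all.sum("obs_id")
assert np.allclose(total, ll_and.sum("obs_id") + ll_or.sum("obs_id"))
```

So "add" and "concatenate" are two views of the same thing: concatenate to keep pointwise terms for the IC machinery, sum the concatenated array to get the total.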
If there's a way to massage the model and/or the `InferenceData` to reflect this, so that the IC can be evaluated, please let me know!
To Reproduce
I don't have a small case that replicates this behavior yet.
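A synthetic stand-in (not the real models; names, shapes, and values are invented) should trigger the same failure mode, since only the shape of the `log_likelihood` group matters:

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)

def fake_idata():
    # Two variables in the log_likelihood group are enough to trip the error.
    log_likelihood = {
        "AND obs": rng.normal(-1.0, 0.1, size=(2, 100, 6)),
        "OR obs": rng.normal(-1.0, 0.1, size=(2, 100, 6)),
    }
    return az.from_dict(
        posterior={"mu": rng.normal(size=(2, 100))},
        log_likelihood=log_likelihood,
    )

try:
    az.compare({"model_1": fake_idata(), "model_2": fake_idata()})
except TypeError as err:
    # compare cannot pick a log likelihood variable automatically
    print(err)
```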
Expected behavior
Either a successful comparison or an error message that reflects the API. IIRC, the API requires that there be only a single output variable; maybe we could check the input data for this and raise an error if there's more than one.
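A hypothetical sketch of such a check (the function name and error wording are invented here, not ArviZ's actual implementation):

```python
def check_single_log_likelihood(idata, var_name=None):
    """Fail fast at the compare() boundary when the variable choice is ambiguous.

    idata is expected to have a log_likelihood group (an xarray Dataset).
    """
    ll_vars = list(idata.log_likelihood.data_vars)
    if var_name is None and len(ll_vars) > 1:
        raise TypeError(
            f"Found several log likelihood arrays {ll_vars}; "
            "pass var_name to choose one."
        )
    return var_name if var_name is not None else ll_vars[0]
```

`compare` could run this once per model before any expensive computation, instead of failing deep inside `get_log_likelihood`.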
Additional context
Arviz == 0.11.2
PyMC3 == 3.11.1
Theano-PyMC == 1.1.2