New predictions group! #1030

Closed · 6 of 8 tasks
OriolAbril opened this issue Jan 26, 2020 · 27 comments · Fixed by #1125
Labels: Enhancement (Improvements to ArviZ)

Comments

@OriolAbril (Member) commented Jan 26, 2020

Tell us about it

Two new groups were recently added to the InferenceData schema: predictions and predictions_constant_data. #983 added them for from_pymc3; support in the other converters is still pending.
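For context, a minimal sketch of the PyMC3 side of this workflow (the toy model and variable names are made up for illustration, and the exact from_pymc3_predictions signature should be checked against #983):

```python
import arviz as az
import numpy as np
import pymc3 as pm

x = np.linspace(0, 1, 10)
y = 2 * x + np.random.normal(0, 0.1, size=10)

with pm.Model() as model:
    x_shared = pm.Data("x", x)
    slope = pm.Normal("slope", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y", mu=slope * x_shared, sigma=sigma, observed=y)
    trace = pm.sample(500)
    idata = az.from_pymc3(trace)

# out-of-sample posterior predictive: swap in new data and resample "y"
with model:
    pm.set_data({"x": np.linspace(1, 2, 10)})
    preds = pm.sample_posterior_predictive(trace)

# store the result in the predictions group and the current pm.Data values
# in predictions_constant_data, merging into the original InferenceData
idata = az.from_pymc3_predictions(
    preds, posterior_trace=trace, model=model, idata_orig=idata
)
```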

Current status:

  • from_pymc3
  • from_pystan
  • from_pyro
  • from_cmdstan
  • from_cmdstanpy
  • (from_emcee)
  • from_numpyro
  • from_dict

Thoughts on implementation

PyStan

@ahartikainen what do you think would be better? Mimicking posterior_predictive or log_likelihood and adding the dict option to rename variables on creation?

@OriolAbril added the Enhancement label on Jan 26, 2020
@ahartikainen (Contributor)

I think following posterior_predictive would be the correct approach.

@ahartikainen (Contributor)

For the future, we need to think about how we handle post-run generated_quantities.

@nitishp25 (Contributor)

Can I work on this?

@OriolAbril (Member, Author)

Yes! #1032 added predictions and predictions_constant_data to from_pystan (we forgot to link it here), but other converters are still missing them.

@nitishp25 (Contributor)

@OriolAbril which converters are you referring to? predictions_to_xarray and predictions_constant_data_to_xarray are already present.

@OriolAbril (Member, Author)

Other converters such as from_pyro and from_cmdstan, and those for other inference libraries that allow out-of-sample posterior predictive sampling.

@nitishp25 (Contributor)

Should the predictions implementation in from_pyro also follow the posterior_predictive implementation?

Also, it doesn't seem to have constant_data, so should both constant_data and predictions_constant_data be added?

@OriolAbril (Member, Author)

I don't know the details of Pyro well enough; I don't even know whether this is possible or whether this data is stored by Pyro. A good start would be to replicate the model and workflow from the examples used to illustrate the schema specification. There is a version in PyStan and one in PyMC3 implementing the same model and populating all the groups in InferenceData (with the exception of sample_stats_prior in PyMC3).

@ahartikainen (Contributor)

from_cmdstan and from_cmdstanpy are implemented in #1064

@nitishp25 (Contributor)

Hi, I was trying to implement this example in Pyro. I have come up with this so far, but as you can see the dimensions in the InferenceData are incorrect. Can you please help me understand what the problem is?

@OriolAbril (Member, Author)

The dimensions look OK (the extra dims have length 1, and the data contained there looks correct). Maybe it is an issue with a missing squeeze?

Also, it looks like you have no log_likelihood group; are you using the latest Pyro release and ArviZ master?

@nitishp25 (Contributor) commented Feb 14, 2020

> The dimensions look OK (the extra dims have length 1, and the data contained there looks correct). Maybe it is an issue with a missing squeeze?

If you compare it with the PyMC3 example, the prior and log_likelihood have different dimensions, right? I don't understand how the parameters have entered the prior dimensions. Does this require a squeeze?

> Also, it looks like you have no log_likelihood group; are you using the latest Pyro release and ArviZ master?

I had not updated ArviZ, but I have now; you can review again.

@OriolAbril (Member, Author)

In the case of log_likelihood it looks like an error in dimension handling in io_pyro. In the case of the prior, all extra dimensions (b0_dim_0, c1_dim_0, ...) have length 1, which is why I believe a squeeze may solve the "problem": the data is the same for a shape (2,) array [3, 4] and a shape (2, 1) array [[3], [4]].
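A tiny numpy illustration of that last point (the values are taken from the comment above):

```python
import numpy as np

a = np.array([3, 4])        # shape (2,)
b = np.array([[3], [4]])    # shape (2, 1), a column vector
print(np.array_equal(a, b.squeeze()))  # True: same data, only the length-1 dim differs
```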

@OriolAbril (Member, Author)

I was reading the code on my phone on the train, so double-check, but it looks like the log likelihood to xarray conversion uses dims instead of self.dims; it should be an easy fix.

I am a little bit puzzled by the extra dims in the prior; a squeeze (before adding the chain dim) should solve the issue, but they probably should not be there to begin with.

@nitishp25 (Contributor)

> I was reading the code on my phone on the train, so double-check, but it looks like the log likelihood to xarray conversion uses dims instead of self.dims; it should be an easy fix.

Got it.

> I am a little bit puzzled by the extra dims in the prior; a squeeze (before adding the chain dim) should solve the issue, but they probably should not be there to begin with.

Does it have anything to do with my implementation? I don't have much experience with Pyro.
I'll keep looking for something in io_pyro in the meantime.

@ahartikainen (Contributor)

Is the posterior input squeezed in io_pyro?

In Pyro, 1D arrays are actually 2D with the last dimension being 1, i.e. column vectors.

@nitishp25 (Contributor)

> Is the posterior input squeezed in io_pyro?

There seems to be no squeeze for posterior.

> In Pyro, 1D arrays are actually 2D with the last dimension being 1, i.e. column vectors.

Yes, I just checked the shape of the prior that is passed to from_pyro and it is (400, 1). Calling expand_dims on it makes its shape (1, 400, 1). I think this is what causes the extra dim, right?
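For illustration, a minimal numpy sketch of what a squeeze before adding the chain dim would change (shapes taken from this comment, not the actual io_pyro code):

```python
import numpy as np

prior = np.random.randn(400, 1)                  # Pyro-style column vector: (draw, 1)
print(np.expand_dims(prior, 0).shape)            # (1, 400, 1): the trailing dim survives

print(np.expand_dims(prior.squeeze(), 0).shape)  # (1, 400): (chain, draw) as expected
```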

@nitishp25 (Contributor)

Hi, I am now trying to get predictions on new data, but the output has the shape of posterior_predictive even though the new data has a different shape than the data passed to get the posterior. I have tried removing pyro.plate, using a for loop instead, and using a guide, but the results were the same.

It seems like every time you need new predictions you generate a new trace in Pyro? But there must be some way to get the right dims on new data with the same trace. Here's the example @OriolAbril @ahartikainen.

@OriolAbril (Member, Author)

@nitishp25 I don't know how to fix this; as I said above, I don't even know if it is possible to get out-of-sample posterior predictive samples (what we shortened to predictions) in Pyro. If you want to keep working on this, their forum is probably a good place to get some help.

Side comment: I was looking at the posterior_predictive and prior_predictive groups and it looks like Pyro is only repeating the observed data over and over. I can't explain why, though; it could be either an error in the notebook or a bug in Pyro.

@nitishp25 (Contributor)

I have asked these questions on their forum. I'll wait for the reply.

@nitishp25 (Contributor)

Thanks to the Pyro forum I was able to figure out how to make out-of-sample predictions. Here's the notebook.
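For reference, here is a minimal sketch of the out-of-sample workflow with pyro.infer.Predictive (the model and data are made up; see the linked notebook for the real example):

```python
import pyro
import pyro.distributions as dist
import torch
from pyro.infer import MCMC, NUTS, Predictive

def model(x, y=None):
    slope = pyro.sample("slope", dist.Normal(0.0, 1.0))
    sigma = pyro.sample("sigma", dist.HalfNormal(1.0))
    with pyro.plate("obs", x.shape[0]):
        pyro.sample("y", dist.Normal(slope * x, sigma), obs=y)

x = torch.linspace(0, 1, 10)
y = 2 * x + 0.1 * torch.randn(10)

mcmc = MCMC(NUTS(model), num_samples=400, warmup_steps=200)
mcmc.run(x, y)

# out-of-sample predictions: rerun the model on new data, conditioning on the posterior draws
x_new = torch.linspace(1, 2, 5)
predictive = Predictive(model, posterior_samples=mcmc.get_samples())
predictions = predictive(x_new)  # dict of site name -> tensor, one draw per posterior sample
```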

I'm thinking of implementing this similarly to PyMC3 by creating a from_pyro_predictions API.
For constant_data, I'm thinking of adding a constant_data argument to from_pyro and a predictions_constant_data argument to from_pyro_predictions. What do you think @OriolAbril @ahartikainen?

@OriolAbril (Member, Author)

I am not sure about following the PyMC3 or the PyStan approach. PyMC3 needs a separate function because the information stored in the model is needed for the conversion, and to obtain predictions the model changes, so a single function would not be able to access both the old and the new instances of the model.

It is probably better to follow the PyStan converter, because then both options become possible: users can merge the current InferenceData object with a new InferenceData containing only predictions (the PyMC3-like workflow) or generate the InferenceData with everything at once. In that case it would be important to test that from_pyro works with only the predictions and predictions_constant_data arguments.
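To make the comparison concrete, a sketch of what the PyStan-like interface could look like for from_pyro, reusing the names from the Pyro sketch a few comments above (the predictions-related argument names are assumptions here; #1125 settles the actual API):

```python
import arviz as az

# Option 1: build everything at once
idata = az.from_pyro(
    mcmc,
    constant_data={"x": x.numpy()},
    predictions=predictions,
    predictions_constant_data={"x_new": x_new.numpy()},
)

# Option 2 (PyMC3-like): convert the predictions on their own and merge afterwards
idata = az.from_pyro(mcmc, constant_data={"x": x.numpy()})
idata_pred = az.from_pyro(
    predictions=predictions,
    predictions_constant_data={"x_new": x_new.numpy()},
)
idata_full = az.concat(idata, idata_pred)
```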

@nitishp25 (Contributor) commented Feb 26, 2020

Yes that makes more sense.

Also note that there currently seems to be no way to store constant data directly in Pyro; pyro.deterministic exists, but it samples the constant data along with the rest of the data. So the only way for the user is to manually pass dicts for constant_data and predictions_constant_data mapping variable names to data values, and we basically just use dict_to_dataset. Is it worth adding these groups right now?

@OriolAbril (Member, Author)

I'd say it is useful to include the group because it lets users keep track of how the predictions were generated. For example, in the model you are working with, if you only get the InferenceData without the constant data groups, you are missing quite a bit of information, as you would not have the times used to generate the posterior predictive samples.

@ahartikainen (Contributor)

Why can't it (the group input) be a dictionary that the user provides?

@OriolAbril (Member, Author)

> Why can't it (the group input) be a dictionary that the user provides?

I think this was the idea? @nitishp25

@nitishp25 (Contributor) commented Feb 26, 2020

> Why can't it (the group input) be a dictionary that the user provides?

It can be. I was just trying to find an easier way from the user's perspective: if Pyro had something like pymc3.Data or pymc3.Deterministic, then all the user would have to do is declare the constant data in the model, and ArviZ would convert it to InferenceData automatically (like in PyMC3) without having to pass a separate argument for it.

But since this is not yet possible in Pyro, a dictionary can definitely be the desired input type. I was just looking for a better way and wondering whether waiting for such a solution is worth it or not :)
