
draw_values draws from marginal distributions #3210

Closed
lucianopaz opened this issue Sep 25, 2018 · 6 comments
@lucianopaz
Contributor

Description of your problem

draw_values draws from the marginal distributions of the requested variables instead of the joint distribution.

Please provide a minimal, self-contained, and reproducible example.

import numpy as np
import pymc3 as pm
import seaborn
from matplotlib import pyplot as plt


with pm.Model():
    a = pm.Normal('a', mu=0, sd=100)
    b = pm.Normal('b', mu=a, sd=1e-6)
    c = pm.Normal('c', mu=a, sd=1e-6)

params = [a, b, c]
N = 10000
np.random.seed(1)
# Draw N samples of (a, b, c) with draw_values
A, B, C = list(zip(*[pm.distributions.distribution.draw_values(params)
                     for i in range(N)]))
A = np.array(A)
B = np.array(B)
C = np.array(C)

# Reference: ancestral sampling from the joint distribution
np.random.seed(1)
eA = np.random.randn(N) * 100.
eB = eA + np.random.randn(N) * 1e-6
eC = eA + np.random.randn(N) * 1e-6

seaborn.jointplot(B, C)
plt.suptitle('draw_values')
plt.savefig('draw_values_output.png')
seaborn.jointplot(eB, eC)
plt.suptitle('Expected')
plt.savefig('expected_output.png')

This model is a rather simple hierarchical model. The expected joint distribution of variables b and c is:

[figure: expected_output.png]

However, the values drawn with draw_values are conditionally independent of one another:

[figure: draw_values_output.png]

The marginals match but the joint distribution does not. Given that #2983 relies heavily on draw_values to make sample_prior_predictive work, this problem is really important.

I've given this issue some thought before submitting it here, and I think I found the origin of the problem and also a possible solution, which I'll submit in a PR later. What seems to be happening is:

  1. draw_values sees that it must sample from a, b and c.
  2. None of these nodes are Apply nodes with owners, so no dependency tree is searched and nothing is added to givens or point.
  3. draw_values arrives at its final stage, where it makes three separate calls to _draw_value, one for each variable.
  4. Inside _draw_value, it is decided that the param's random method must be called.
  5. random in turn issues a new call to draw_values to get the Normal's mu and sd. The values returned by this nested call to draw_values are never made available to the other calls from point 3. Thus, effectively, each variable ends up being drawn from its marginal distribution.

As it stands now, the only way the nested draw_values calls from point 5 could be made aware of a value drawn for a is to modify the point dictionary with the values drawn for a before drawing the conditionally dependent b and c.
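The failure mode can be reproduced without PyMC3 at all. Here is a minimal NumPy sketch (all names are illustrative, not PyMC3 internals) contrasting per-variable marginal draws, where each variable redraws its own fresh copy of a, with proper ancestral sampling from the joint:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10000

# Marginal-style draws (what draw_values effectively does): each variable
# gets its own independent redraw of a, so b and c become independent.
a1 = rng.normal(0, 100, N)
a2 = rng.normal(0, 100, N)
b_marg = a1 + rng.normal(0, 1e-6, N)
c_marg = a2 + rng.normal(0, 1e-6, N)

# Joint (ancestral) draws: a is drawn once and shared, as in the model.
a = rng.normal(0, 100, N)
b_joint = a + rng.normal(0, 1e-6, N)
c_joint = a + rng.normal(0, 1e-6, N)

corr_marg = np.corrcoef(b_marg, c_marg)[0, 1]    # near 0
corr_joint = np.corrcoef(b_joint, c_joint)[0, 1]  # near 1
```

The near-zero correlation in the first case is exactly what the draw_values joint plot above shows.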

My proposal is as follows. Each distribution should store a list of the theano.Variables it is conditionally dependent on.

For example, Normal should have an attribute called conditional_on, which would be the list [self.mu, self.tau]. If this attribute is adopted as a convention across all distributions (in the end, every distribution already uses it implicitly when calling draw_values from its random method), we could build a dependency graph (a DAG, presumably) between all the variables used by draw_values.

This DAG could be built behind the scenes when random variables are added to a model, to save some unnecessary computation. It should also take deterministic relations into account, given by theano owner input nodes, something that is already partially in place in draw_values. Knowing the full DAG, we can identify the nodes that are independent of everything else, draw their values first, put them into the point dictionary, and then move on to the next nodes in the hierarchy.
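A minimal, purely illustrative sketch of this scheme (the conditional_on mapping and draw_joint helper are hypothetical names, not existing PyMC3 API): draw variables in dependency order, threading the drawn values through a shared point dictionary so children condition on the same parent draw:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical conditional_on lists for the a -> b, a -> c model above.
graph = {'a': [], 'b': ['a'], 'c': ['a']}
samplers = {
    'a': lambda point: rng.normal(0, 100),
    'b': lambda point: rng.normal(point['a'], 1e-6),
    'c': lambda point: rng.normal(point['a'], 1e-6),
}

def draw_joint(graph, samplers):
    """Draw every variable once, in topological order over the DAG."""
    point, done = {}, set()
    while len(done) < len(graph):
        for name, parents in graph.items():
            # A node is ready once all its parents have been drawn.
            if name not in done and all(p in done for p in parents):
                point[name] = samplers[name](point)
                done.add(name)
    return point

point = draw_joint(graph, samplers)
# b and c now share the single draw of a, so they are tightly coupled.
```

This is the forward pass on the DAG in its simplest form; the real implementation would need to handle vectorized sizes and deterministic nodes as well.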

Versions and main components

  • PyMC3 Version: 3.4
  • Theano Version: 1.03
  • Python Version: 3.6
  • Operating system: Ubuntu 16.04
  • How did you install PyMC3: pip install -e
@junpenglao
Member

Yes, I think I also had a similar issue (I cannot find which issue it is now): if you have y ~ Normal(X*b, ...) with X being a latent variable with observed values, prior prediction is not conditioned correctly.

I think the current implementation is to sample the observed variable, which necessarily produces samples of all its parents as a by-product. When there is no observed variable, or multiple observed variables, things do not work well. Fixing it might be difficult in terms of finding a fast implementation (a forward pass on the DAG would surely work; we had an earlier implementation that did this, but it was much slower).

@ColCarroll
Member

Looking at this now - a few observations:

  • The graphviz representation, which was developed at the same time using the same DAG introspection, knows about the relationship
  • I can't quickly reproduce @junpenglao's note about conditioning (either by adding an observed node D, or by adding an observation to B or C). Such behavior wouldn't surprise me, though!
  • My first guess was to use prior = pm.sample_prior_predictive(samples=N), but that plot looks just the same.

[image]

@ColCarroll
Member

Another note:

with pm.Model() as model:
    a = pm.Normal('a', mu=0, sd=100, observed=0)
    b = pm.Normal('b', mu=a, sd=1e-4, observed=1e-6)
    c = pm.Normal('c', mu=a, sd=1e-4, observed=1e-6)
    d = pm.Normal('d')
    trace = pm.sample()
    posterior = pm.sample_posterior_predictive(trace)

[image]

  • The bug does not show up in sample, which uses MCMC, but I had to play with the scales and ignore some warnings:
with pm.Model() as model:
    a = pm.Normal('a', mu=0, sd=10)
    b = pm.Normal('b', mu=a, sd=1e-2)
    c = pm.Normal('c', mu=a, sd=1e-2)
    trace = pm.sample()

[image]


@miclegr

miclegr commented Oct 10, 2018

I'm posting here a related error I get when I try to sample from the posterior distribution of a copula model.

Versions and main components

  • PyMC3 Version: tried with 3.5 and dev branch
  • Theano Version: 1.03
  • Python Version: 3.6
  • Operating system: Win7
  • How did you install PyMC3: pip install

Reproducible example

import numpy as np
import pymc3 as pm
import scipy.special as ss
import matplotlib.pyplot as plt
import seaborn as sns
import scipy as sp
import theano.tensor as tt
from theano import shared

corr=0.6
_sigma=np.array([[1,corr],[corr,1]])
_mus=np.array([-1,2])
_sigmas=np.array([.5,.5])
N=10000

u=np.random.multivariate_normal((0,0),_sigma,size=N)
c=sp.stats.norm.cdf(u)
generated=sp.stats.lognorm.ppf(c,_sigmas,scale=np.exp(_mus))

def lognormal_cdf(values,mu,sigma):
    return pm.math.invprobit((tt.log(values)-mu)/sigma)

def jacobian_det(f_inv_x, x):
    grad = tt.reshape(pm.theanof.gradient(tt.sum(f_inv_x), [x]), x.shape)
    return tt.log(tt.abs_(grad))

with pm.Model() as model:
    
    data=shared(generated)
    
    #hackish way to create cholesky mat. that yields a covariance mat. with 1s on the diagonal and corr anywhere else
    corr = pm.LKJCorr('corr',n=2, eta=2)[0]
    chol = tt.set_subtensor(tt.eye(2,2)[([1,1], [0,1])],[corr,tt.sqrt(1-tt.pow(corr,2))]) 

#     r =  pm.Uniform('r',lower=-1, upper=1)
#     cov = pm.Deterministic('cov', 
#                            tt.stacklists([[1., r],
#                                           [r, 1.]]))
    
    mus=pm.Normal('mus',0,1,shape=2)
    sigmas=pm.HalfCauchy('sigmas',beta=2,shape=2)
    
    ps=lognormal_cdf(data,mus,sigmas)
    cs=pm.math.probit(ps)
    
    us_dist = pm.MvNormal('copula',mu=np.zeros(2), chol=chol,observed=cs)
    pm.Potential('jacob_det', jacobian_det(cs, data))

    trace=pm.sample(1000)


sample=pm.sample_posterior_predictive(trace,model=model,vars=[us_dist,mus,sigmas],samples=10000)


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-14dbe60ba96f> in <module>()
----> 1 sample=pm.sample_posterior_predictive(trace,model=model,vars=[us_dist,mus,sigmas],samples=10000)
      2 
      3 simulated_c=sp.stats.norm.cdf(sample['copula'])
      4 simulated=np.exp(sp.stats.norm.ppf(simulated_c)*sample['sigmas']+sample['mus'])

~\AppData\Local\Continuum\Anaconda3\envs\newroot\lib\site-packages\pymc3\sampling.py in sample_posterior_predictive(trace, samples, model, vars, size, random_seed, progressbar)
   1118     # draw once to inspect the shape
   1119     var_values = list(zip(varnames,
-> 1120                           draw_values(vars, point=model.test_point, size=size)))
   1121     ppc_trace = defaultdict(list)
   1122     for varname, value in var_values:

~\AppData\Local\Continuum\Anaconda3\envs\newroot\lib\site-packages\pymc3\distributions\distribution.py in draw_values(params, point, size)
    310     while to_eval or missing_inputs:
    311         if to_eval == missing_inputs:
--> 312             raise ValueError('Cannot resolve inputs for {}'.format([str(params[j]) for j in to_eval]))
    313         to_eval = set(missing_inputs)
    314         missing_inputs = set()

ValueError: Cannot resolve inputs for ['copula']

related discourse thread

FYI, with version 3.4 I was able to successfully sample from the copula, but the samples of mus and sigmas were wrong (coming from the prior, I guess). With version 3.5 and the dev branch, the samples of mus and sigmas seem to be correct, but this error appeared.

lucianopaz added a commit to lucianopaz/pymc that referenced this issue Nov 27, 2018
Fix for pymc-devs#3210 which uses a completely different approach than PR pymc-devs#3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariables` or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling. Ed2 tensors fix a `_values` attribute after they first call `sample` and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object oriented paradigm cannot set a fixed `_values`; it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.
junpenglao pushed a commit that referenced this issue Dec 3, 2018
* Fix for #3225. Made Triangular `c` attribute be handled consistently with scipy.stats. Added test and updated example code.

* Fix for #3210 which uses a completely different approach than PR #3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariables` or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling. Ed2 tensors fix a `_values` attribute after they first call `sample` and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object oriented paradigm cannot set a fixed _values, it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.

* Removed leftover print statement

* Added release notes and draw values context managers to mixture and multivariate distributions that make many calls to draw_values or other distributions random methods within their own random.
@junpenglao
Member

Closed by #3273.
