Make transforms stateless #4551

brandonwillard · 2021-03-17T05:53:58Z

This PR addresses a few more transform changes/issues.

The primary change is that transforms are now stateless (i.e. they no longer carry their own parameters). Stateful transforms make it very easy to accidentally introduce old and/or irrelevant parameters into a graph, and they are a source of extremely confusing, hard-to-diagnose bugs.

Now, transforms only take a "parameter extraction function" that, when applied to a random variable, will extract the required transform parameters.

In other words, transform objects are no longer random variable instance-specific, but random variable class-specific.
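To illustrate the idea outside of PyMC3, here is a minimal sketch of a stateless transform in plain Python. `ToyRV`, `StatelessInterval`, and `param_extract_fn` are hypothetical names made up for this example; they are not the classes or signatures introduced by this PR. The transform stores only an extraction function and reads the bounds from whichever variable it is applied to.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ToyRV:
    """Hypothetical stand-in for a random variable that exposes its parameters."""

    lower: float
    upper: float


class StatelessInterval:
    """Toy interval transform that stores no bounds of its own."""

    def __init__(self, param_extract_fn: Callable[[ToyRV], Tuple[float, float]]):
        # Only the extraction function is stored, never the parameter values.
        self.param_extract_fn = param_extract_fn

    def forward(self, rv: ToyRV, value: float) -> float:
        # Map a value in (lower, upper) to the real line.
        lower, upper = self.param_extract_fn(rv)
        return math.log(value - lower) - math.log(upper - value)

    def backward(self, rv: ToyRV, value: float) -> float:
        # Map a real number back into (lower, upper).
        lower, upper = self.param_extract_fn(rv)
        return lower + (upper - lower) / (1.0 + math.exp(-value))


# One transform object can serve every variable of the same class.
interval = StatelessInterval(lambda rv: (rv.lower, rv.upper))
a = ToyRV(0.0, 1.0)
b = ToyRV(-5.0, 5.0)
```

Because the bounds are looked up at call time, `interval` can be shared by `a` and `b` without either one leaking its parameters into the other's computation.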

michaelosthege

After reading the entire diff I'm now quite sure I got the purposes of the rv_var and rv_value args wrong.

michaelosthege · 2021-03-17T22:29:45Z

pymc3/distributions/__init__.py

+
+        if transform is not None and rv_var is None:
+            warnings.warn(
+                f"A transform was found for {measure_var}" " but no corresponding random variable"


String is a bit messed up.
More importantly: The sentence is a bit incomplete - no variable corresponding to what?

That measure_var doesn't have a random variable associated with it, so there's really nothing else to print or say. If anything, this should probably be an error condition.

It might actually make more sense to associate the transform object with the rv_var (i.e. the random variable). I'll have to think about that.
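For reference, the "messed up" string in the quoted diff relies on Python's implicit concatenation of adjacent string literals, presumably left behind by an auto-formatter joining two lines. The sketch below uses a made-up value for `measure_var` and shows that the split form and a single f-string produce the same text, so the issue is readability rather than behavior.

```python
measure_var = "x_log__"  # hypothetical variable name, for illustration only

# Adjacent literals are concatenated by the compiler; only the first is an f-string.
split_message = f"A transform was found for {measure_var}" " but no corresponding random variable"

# The same text as a single literal, which is easier to read in a diff.
single_message = f"A transform was found for {measure_var} but no corresponding random variable"
```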

michaelosthege · 2021-03-17T22:39:38Z

pymc3/distributions/transforms.py

+        rv_var
+            The random variable being transformed
+        rv_value
+            The parameters required for the transform.


rv_value doesn't sound very intuitive for something that holds the transform parameters. (I was confused by this above already.)

How about rv and transform_params?
Or rv and params?

We need to be clear about the rv_var and rv_value[_var] distinctions.

rv_vars are the "sample-space" variables that are produced by RandomVariable Ops.
rv_value[_var]s are the "measure-space" (or log-likelihood) variables that correspond to a specific value of an rv_var.

These are the same two types of variables described here, where the sloppy P(X = x) or x ~ X notation denotes the rv_var with X (i.e. the random variable), and the value variable with rv_value[_var].

These transform methods are getting those two variables, so any new name that involves "params" would be inaccurate, because the rv_value variable does not provide parameters. The first argument, rv_var, does provide access to a random variable's parameters via rv_var.owner.inputs, and—again—rv_value is a value that's compatible with the random variable rv_var (i.e. a value that could've been a sample from it).
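A small plain-Python sketch of that distinction follows. `ToyOwner` and `ToyNormalRV` are hypothetical stand-ins, not Aesara classes, and the toy `inputs` tuple holds only `(mu, sigma)`, whereas the inputs of a real `RandomVariable` node may contain additional entries. The point is only that the parameters are read from the random variable, while the value is a separate argument.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyOwner:
    """Hypothetical stand-in for the node that produced a random variable."""

    inputs: Tuple[float, ...]


@dataclass(frozen=True)
class ToyNormalRV:
    """Hypothetical sample-space variable; its parameters live on `owner.inputs`."""

    owner: ToyOwner


def normal_logp(rv_var: ToyNormalRV, rv_value: float) -> float:
    # Parameters come from the random variable, never from the value.
    mu, sigma = rv_var.owner.inputs
    z = (rv_value - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


X = ToyNormalRV(owner=ToyOwner(inputs=(0.0, 1.0)))  # plays the role of rv_var
x = 0.0  # plays the role of rv_value: a value X could have produced
logp = normal_logp(X, x)
```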

So rv_var is the tensor of the user-provided, observed values? (A TensorConstant?)

We might still want to copy parts of your explanation into the docstring.

michaelosthege

There are a few threads still open.
Nevertheless I'll say LGTM, but don't count too much on my judgement. Most of my trust comes from the facts that Brandon did this and that the CI Tests are now ✔.

michaelosthege · 2021-03-18T22:09:04Z

pymc3/tests/test_transforms.py

-    with pytest.warns(
-        DeprecationWarning, match="The argument `eps` is deprecated and will not be used."
-    ):
-        tr.StickBreaking(eps=1e-9)


(Where) do we keep a list of these changes? We should mention them in the release notes. The alternative is to keep raising the DeprecationWarning, which saves users from complicated digging.
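As a rough sketch of the "keep the warning" alternative, the snippet below uses only the standard `warnings` module. `ToyStickBreaking` is a hypothetical stand-in rather than the real `tr.StickBreaking`; the message text is the one matched by the removed test.

```python
import warnings


class ToyStickBreaking:
    """Hypothetical stand-in for a transform that used to accept `eps`."""

    def __init__(self, eps=None):
        if eps is not None:
            # Accept the old argument, ignore it, and tell the caller.
            warnings.warn(
                "The argument `eps` is deprecated and will not be used.",
                DeprecationWarning,
                stacklevel=2,
            )


# Record warnings explicitly so the check does not depend on global filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ToyStickBreaking(eps=1e-9)

messages = [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]
```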

brandonwillard · 2021-03-18T23:04:55Z

Sorry, been pretty busy, but I have another commit to push, and it's a big refactor that should address most/all of the open logp-related issues.

This makes `aesara.graph.basic.clone_replace` work correctly when `Scan`s are included in a graph.

michaelosthege · 2021-03-22T09:34:26Z

pymc3/distributions/__init__.py

@@ -161,80 +157,119 @@ def rv_log_likelihood_args(
    variable).

    """
+    if not var.owner:
+        return None, None


Doesn't match with the return type hints and docstring.

Can you explain (maybe in the docstring) why and under what circumstances None, None is returned?
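One way the hints and docstring could be brought in line with the early return is sketched below. `ToyVariable` and `toy_log_likelihood_args` are hypothetical and do not reproduce the actual signature of `rv_log_likelihood_args`; the stated reason for returning `(None, None)` is an assumption for this sketch only.

```python
from typing import Optional, Tuple


class ToyVariable:
    """Hypothetical stand-in for a graph variable; `owner` is None for inputs/constants."""

    def __init__(self, name: str, owner: Optional[object] = None):
        self.name = name
        self.owner = owner


def toy_log_likelihood_args(
    var: ToyVariable,
) -> Tuple[Optional[ToyVariable], Optional[ToyVariable]]:
    """Return ``(rv_var, rv_value)``, or ``(None, None)`` when `var` has no owner.

    A variable without an owner was not produced by any Op, so in this sketch it
    cannot be a random variable and has no log-likelihood arguments.
    """
    if not var.owner:
        return None, None
    value = ToyVariable(f"{var.name}_value")
    return var, value
```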

michaelosthege · 2021-03-22T09:56:00Z

pymc3/distributions/__init__.py

+        rv_value = rv_var.type.filter_variable(rv_value.astype(rv_var.dtype))
+
+        if rv_value_var is None:
+            rv_value_var = rv_value


That's the case when rv_value has no observations, right?

michaelosthege · 2021-03-22T15:13:30Z

pymc3/distributions/continuous.py

-        mean = alpha / (alpha + beta)
-        variance = (alpha * beta) / ((alpha + beta) ** 2 * (alpha + beta + 1))
+        # mean = alpha / (alpha + beta)
+        # variance = (alpha * beta) / ((alpha + beta) ** 2 * (alpha + beta + 1))


Can these commented-out lines be removed?

michaelosthege · 2021-03-22T15:15:10Z

pymc3/distributions/distribution.py

+            #
+            #     @logp_transform.register(rv_type)
+            #     def transform(op, *args, **kwargs):
+            #         return class_transform(*args, **kwargs)


michaelosthege · 2021-03-22T15:17:56Z

pymc3/distributions/distribution.py

-        super().__init__(shape, dtype, defaults=defaults, *args, **kwargs)
+        if kwargs.get("transform", None):
+            raise ValueError("Transformations for discrete distributions")
+


Shouldn't we keep the dtype checks? (Based on intX.)

Those are done at the Aesara Op-level now (i.e. within RandomVariable.make_node); although I'm not sure if float-to-int conversion is part of that. It might only raise an exception for the wrong dtype. If it's not, then we might need to add that at this level.

michaelosthege · 2021-03-22T15:53:15Z

The failing test looks like the non-deterministic logpt that @ricardoV94 noticed a few days ago?

brandonwillard added the v4 label Mar 17, 2021

brandonwillard self-assigned this Mar 17, 2021

michaelosthege reviewed Mar 17, 2021

brandonwillard force-pushed the more-transform-updates branch 3 times, most recently from b97dc37 to 0762608 · March 17, 2021 23:56

Make transform objects stateless

43bd711

brandonwillard force-pushed the more-transform-updates branch from 0762608 to 43bd711 · March 17, 2021 23:57

Add tests for two important open logpt and Model issues

ec02513

ricardoV94 reviewed Mar 18, 2021

michaelosthege previously approved these changes Mar 18, 2021

brandonwillard mentioned this pull request Mar 20, 2021

Implement shape and dims for v4 #4552

Closed

brandonwillard added 4 commits March 20, 2021 13:10

Add non_sequences to uses of Scan Op

199ef56

This makes `aesara.graph.basic.clone_replace` work correctly when `Scan`s are included in a graph.

Replace Observed Op with tag.observations

d1c79bf

Add missing imports to pymc3.step_methods.gibbs

74935fa

Comment out unused moments

e2c42b3

brandonwillard dismissed michaelosthege’s stale review via 036353b March 21, 2021 01:19

brandonwillard force-pushed the more-transform-updates branch 2 times, most recently from 6d8c136 to 05d4e19 · March 22, 2021 05:36

michaelosthege reviewed Mar 22, 2021

brandonwillard added 3 commits March 22, 2021 19:31

Make logpt work correctly for nested models and transforms

02e170a

Disable use of Arviz in pymc3.tests.test_data_container

58b2f11

Set model seed correctly in pymc3.tests.test_ndarray_backend

80d189f

brandonwillard force-pushed the more-transform-updates branch from 05d4e19 to 00dcfad · March 23, 2021 00:32

Prevent Model from turning on test value computations

049c5f8

brandonwillard force-pushed the more-transform-updates branch from 00dcfad to 049c5f8 · March 23, 2021 00:58

brandonwillard merged commit 4b07810 into pymc-devs:v4 Mar 23, 2021

brandonwillard deleted the more-transform-updates branch March 23, 2021 03:01
