
Proposal to remove "syntactic sugar" that overloads * and / for univariate distributions #1438

Closed
ablaom opened this issue Nov 24, 2021 · 10 comments


@ablaom
Contributor

ablaom commented Nov 24, 2021

I propose that the "syntactic sugar" added here be removed in the next breaking release.

A probability distribution is just a special case of a measure. The transformations implied by the current implementations of *, +, -, /, and so forth do not generalise to arbitrary measures, which, moreover, already have well-established meanings for these operations. E.g., for the product of a scalar $x$ with a measure $\mu$ we have $(x\mu)(A) = x \mu(A)$. These definitions are also useful within probability; for example:

  • When constructing a mixture of probability distributions, such as a finite average: the average is a probability measure even though the partial sums are merely measures. That is, it is frequently convenient to leave the affine subspace of probability measures using the standard definitions of + and *.

  • When one wants to avoid normalising the measure representing a probability measure, because normalisation is not needed (e.g. for generating samples).
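For concreteness, the mixture construction in the first bullet reads (my notation):

```latex
% Mixture of probability measures \mu_1, \dots, \mu_n with weights \lambda_i:
\mu = \sum_{i=1}^{n} \lambda_i \mu_i,
\qquad \lambda_i \ge 0, \qquad \sum_{i=1}^{n} \lambda_i = 1 .
% Each partial sum \sum_{i=1}^{k} \lambda_i \mu_i with k < n is a measure of
% total mass at most 1; only the full sum is again a probability measure.
% Both + and scalar * here are the standard measure operations:
% (\lambda \mu)(A) = \lambda\, \mu(A), \qquad (\mu + \nu)(A) = \mu(A) + \nu(A).
```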

My own use case is in ensemble models, where I am averaging the probabilistic predictions of multiple classifiers. Computing averages naively does not work smoothly because the aforementioned syntactic sugar conflicts with the "usual" operations, preventing me from simply calling mean(...).
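To make the use case concrete, here is a minimal sketch (MixtureModel and the operator sugar are existing Distributions.jl APIs; the specific distributions are illustrative):

```julia
using Distributions, Statistics

# Two "predictions" from an ensemble:
d1 = Normal(0, 1)
d2 = Normal(1, 2)

# What I would like mean([d1, d2]) to produce: the equal-weights mixture,
# i.e. the average of the two measures. Distributions spells this as:
m = MixtureModel([d1, d2], [0.5, 0.5])
mean(m)  # 0.5

# But mean([d1, d2]) would need d1 + d2 and division by 2 to act
# measure-wise, whereas d1 / 2 already means "the distribution of
# X / 2 for X ~ d1":
std(d1 / 2)  # 0.5
```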

Thoughts anyone?

@cscherrer

@devmotion
Member

This was added in #1217. Affine transformations were already defined for MvNormal, and the LocationScale definitions could be improved for different univariate distributions as well (#1407).

At first glance, these transformations seem completely fine to me: they return the distribution of the transformed random variable. I guess the main concern here is that the use of +, *, etc. might indicate that the measures or densities are added, multiplied, etc.?

@cscherrer

It might help to consider: when we write 2 + Normal() or 3 * Normal(), what kind of objects are 2 and 3? In the current Distributions notation, these seem to be implicitly converted to Dirac measures, so the operations can be interpreted as convolution. I'm not saying this is good or bad, but it does seem helpful to pin down the semantics.

In MeasureTheory, we've had some discussion of changing to ⊙ for weighted measures. See
JuliaMath/MeasureTheory.jl#170

This is a little more awkward, but it has the advantage of not getting in the way of the current Distributions syntactic sugar. Our usual use for ⊙ is for a likelihood "acting on" a measure through a pointwise product (hence the pointy notation). So this is really yet another syntactic sugar, interpreting 3 ⊙ Normal() as something like Returns(3) ⊙ Normal().

@cscherrer

Addition is a little trickier, since + is used for superposition. But that always takes two measures, never a measure and a likelihood.

@devmotion
Member

devmotion commented Nov 24, 2021

It might help to consider: when we write 2 + Normal() or 3 * Normal(), what kind of objects are 2 and 3? In the current Distributions notation, these seem to be implicitly converted to Dirac measures, so the operations can be interpreted as convolution.

No, it's not; it's much simpler. It is an affine transformation of a random variable X with distribution Normal(). So 2 + Normal() and 3 * Normal() just mean "give me the distribution of 2 + X" and "give me the distribution of 3 * X", respectively. I.e., 2 and 3 really are just two real numbers.

BTW convolutions can be computed with Distributions.convolve.
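Concretely, the behaviour being described looks like this (a sketch; exact return types may vary across Distributions versions, but the moments below follow from the stated semantics):

```julia
using Distributions, Statistics

X = Normal(0, 1)

# Affine transformations act on the random variable, not the measure:
mean(2 + X)  # 2.0 — the distribution of 2 + x for x ~ X
std(3 * X)   # 3.0 — the distribution of 3x

# Convolution (the distribution of a sum of independent variables)
# is spelled explicitly via Distributions.convolve:
S = convolve(Normal(0, 1), Normal(0, 2))
std(S)       # sqrt(1 + 4) ≈ 2.236
```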

@ablaom
Contributor Author

ablaom commented Nov 24, 2021

I guess the main concern here is that the use of +, *, etc. might indicate that the measures or densities are added, multiplied, and so forth?

Yes, that's my only concern. I have no objections to the transformations that were added, just the usurping of +, * to represent them.

@devmotion
Member

Affine transformations are very useful and I think we should definitely support them, make them easy to use, and define more optimized versions whenever possible (e.g. as in #1407).

They were added initially for MvNormals, and this fixed a very old issue: #307

I am not completely convinced yet that the current behaviour of +, *, etc. is surprising, since Distributions does not perform any computations with measures and was not designed from a measure-theoretic viewpoint. This different interpretation as a transformation of measures was also discussed in #307. There seemed to be general agreement in the end that the syntax was fine: @mschauer wrote

But +(A::Vector, B::MvNormal) and *(A::Matrix, B::MvNormal) can only mean one thing imho.

and (a bit similar to my comment) @simonbyrne said

I mean, the only other interpretation it could mean would be transforming it as a measure, but (a) that isn't very useful (since it would no longer be a probability measure), and (b) we don't treat it as a measure in other contexts, e.g. defining (d::Distribution)(x::Interval) to get the probability of an interval.

even though it seems he was a bit reluctant initially:

It would be good to have some way to do this, at least for constants. Mathematically, I'm a bit reluctant to overload +/* directly: the objects are intended to be distributions, not random variables. But maybe that isn't such a big deal?

@ablaom
Contributor Author

ablaom commented Nov 24, 2021

Okay, I understood one thing wrong: + is only overloaded for distr + scalar and not for distr + distr. Sorry, I should have checked that more carefully. So it is only the scalar product scalar * distr and the quotient distr / scalar that I am wondering about.

After these further clarifications I better understand the arguments for the status quo in those cases. I guess this is a tricky one.

@ablaom
Contributor Author

ablaom commented Nov 25, 2021

It seems @mschauer did express essentially the same misgiving that I have in #307:

One thing, for scalar λ the expression λ*D1 + (1-λ)*D2 could denote a mixture distribution.

@ablaom ablaom changed the title Proposal to remove "syntactic sugar" that overloads + and * for univariate distributions Proposal to remove "syntactic sugar" that overloads * and / for univariate distributions Nov 25, 2021
@devmotion
Member

I think it's not quite the same concern, though: the comment only mentions one very specific expression. I can see that one could expect a mixture distribution for an expression of this form, but that requires

  • that the coefficients are non-negative and sum to 1,
  • that operations are based on measures (even though generally we don't use this view in Distributions),
  • that one can sum scaled Distribution objects (which is not possible, we don't define +(::Distribution, ::Distribution); e.g. for convolutions one has to use convolve).

Therefore I'm not sure the example is actually an argument against the current use of a * D; it seems to indicate mainly that for such an expression one might expect a mixture distribution. However, the current implementation does not support this syntax, and hence this confusion can't arise.
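A quick check of the last point (a sketch; it assumes, as stated above, that no +(::Distribution, ::Distribution) method exists, so the "mixture-looking" expression fails rather than silently doing something unexpected):

```julia
using Distributions

λ = 0.3
D1, D2 = Normal(0, 1), Normal(2, 1)

# λ * D1 is the distribution of λ * X for X ~ D1 — a valid object:
scaled = λ * D1

# ...but summing two Distribution objects is not defined, so the
# mixture-looking expression raises a MethodError:
try
    λ * D1 + (1 - λ) * D2
catch err
    @show typeof(err)  # MethodError
end
```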

@ablaom
Contributor Author

ablaom commented Nov 25, 2021

@devmotion Thanks for taking my suggestion seriously and for the helpful explanations. I stand by my objection but can see that this boat has sailed. Even MeasureTheory.jl seems resigned to avoiding Base.* so as not to confuse users of Distributions.jl. These calls are always difficult.
