
Alternative clamping for NegativeBinomialMeanClust #237

Closed
wants to merge 5 commits

Conversation

@damonbayer commented May 29, 2024

As requested, porting the clamping from https://github.com/damonbayer/immunity_semi_parametric_model/blob/d54162e1bda24950efdb5ce60da686cc47e2b36b/src/immunity_semi_parametric_model.jl#L9.

Some of the math in the existing code seemed a bit odd to me, so I made some simplifications:

`_μ^2 / ex_σ² = _μ^2 / (_α * _μ^2) = 1 / _α`

Perhaps I am missing something.
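For concreteness, a minimal before/after sketch of the cancellation (assuming, as in the existing code, that `ex_σ²` is computed as `_α * _μ^2`; the values are illustrative):

```julia
_μ, _α = 10.0, 0.5  # illustrative values

# before: compute the excess variance explicitly, then divide it back out
ex_σ² = _α * _μ^2
r = _μ^2 / ex_σ²

# after: the _μ^2 factors cancel algebraically
r = 1 / _α
```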

@SamuelBrand1 (Collaborator)

Hey @damonbayer !

Thanks for the contribution!

The underlying parametrisation we're using is determined by the variance-to-mean relationship:

$$\sigma^2 = \mu + \alpha \mu^2$$

And I was taught the $\alpha \mu^2$ term as the "excess" variance compared to a Poisson, so the code basically just reflects my year 1 probability lecturer...
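For the record, matching the moments of $\text{NegativeBinomial}(r, p)$ (mean $r(1-p)/p$, variance $r(1-p)/p^2$) against this relationship gives the parametrisation used in the code (a routine derivation, not quoted from the package docs):

$$p = \frac{\mu}{\sigma^2} = \frac{1}{1 + \alpha \mu}, \qquad r = \frac{\mu^2}{\sigma^2 - \mu} = \frac{1}{\alpha},$$

which is exactly the cancellation in the PR description.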

Your commit looks good and is maths-equivalent so just waiting on the CI.

@SamuelBrand1 self-requested a review May 29, 2024 09:15
@SamuelBrand1 (Collaborator) left a comment

Could you check the failure of the unit test here?

This could just be because the committed approach is better/auto-safe!

```
@@ -33,11 +33,8 @@ function NegativeBinomialMeanClust(μ, α)
    if isnan(μ) || isnan(α)
```
@SamuelBrand1 (Collaborator):

Out of interest, does `nextfloat` etc. mean we can get rid of the logic branching here? That would be good from the PoV of a compilable tape for reverse diff.

@damonbayer (Author):

Is your desire to replace the DiscreteUniform(0, 1_000_000) with a very wide NegativeBinomial? I'm not sure of the right way to do that, but the clamp method will not play nice with NaNs.
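A quick illustration of why the NaN branch can't simply be dropped (standard Julia behaviour, easy to check in a REPL): `clamp` passes NaN straight through, because NaN comparisons are always false.

```julia
julia> clamp(NaN, nextfloat(0.0), 1.0)  # NaN is neither < lo nor > hi
NaN
```

So a NaN mean would propagate into `NegativeBinomial` rather than being caught by the clamp.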

@seabbs (Collaborator) commented May 29, 2024

❤️

I think prior to merging we should add @damonbayer to the authors list (

```toml
authors = ["Samuel Abbott <[email protected]>", "Samuel Brand <[email protected]>", "Zachary Susswein <[email protected]>"]
```

plus anywhere else we have one).

Longer term we should have some kind of contributing policy for this, but the above practice is my usual default, so I think we should follow it until we have said policy in place.

@damonbayer (Author)

> Could you check the failure of the unit test here?
>
> This could just be because the committed approach is better/auto-safe!

Actually, it seems that the existing approach is safer. For the values in the test, the `clamp` does nothing, but the `rand` produces an `InexactError`.

```julia
julia> using Distributions  # needed for NegativeBinomial

julia> big_mu = 1e30
       alpha = 0.5
       α = alpha
       μ = big_mu
       r = clamp(1 / α, nextfloat(zero(α)), prevfloat(typemax(α)))
       p = clamp(1 / (1 + α * μ), nextfloat(zero(μ)), one(μ))
       rand(NegativeBinomial(r, p))
ERROR: InexactError: trunc(Int64, 1.3967586116300517e30)
```

Maybe it's just better to keep it as is.

@seabbs (Collaborator) commented May 29, 2024

Just noting I am really surprised that this is the case (and slightly sad). It's not entirely clear to me what action we should take here. Promote to an issue for investigation, or is there anything more to be done here?

@damonbayer (Author)

I have a related, longstanding issue in Distributions.jl: JuliaStats/Distributions.jl#1512

@SamuelBrand1 (Collaborator) commented May 30, 2024

So, burrowing into this: the place that throws the error is the call to `rand`, e.g.

```julia
using Distributions

big_mu = 1e30
alpha = 0.5

p = 1 / (1 + alpha * big_mu)  # mean-cluster parametrisation: p = 1 / (1 + αμ)
r = 1 / alpha
nb = NegativeBinomial(r, p)

rand(nb)
# ERROR: InexactError: trunc(Int64, 1.0211481865897263e30)
```

This is because Distributions generates a negative binomial draw as a Gamma–Poisson mixture (sketched below):

  1. Draw from a Gamma distribution with mean 1.
  2. Multiply the negative binomial mean by that Gamma draw.
  3. Draw from a Poisson with the resulting compound mean.
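A minimal sketch of that mixture representation, using only the textbook equivalence (the identity, not Distributions.jl's literal internal code path):

```julia
using Distributions

# NegativeBinomial(r, p) is a Gamma–Poisson mixture:
# λ ~ Gamma(r, (1 - p) / p) has mean r(1 - p)/p, the NB mean;
# Y | λ ~ Poisson(λ) then has the NegativeBinomial(r, p) marginal.
r, p = 2.0, 0.3
λ = rand(Gamma(r, (1 - p) / p))  # compound mean: huge whenever the NB mean is huge
y = rand(Poisson(λ))             # the draw that can hit trunc(Int, ...) and overflow
```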

Poisson sampling branches into a bunch of special cases for efficiency; in particular, for μ sufficiently large (i.e. >= 10.0), it uses:

J.H. Ahrens, U. Dieter (1982). "Computer Generation of Poisson Deviates from Modified Normal Distributions". ACM Transactions on Mathematical Software, 8(2):163-179.

This algorithm requires finding a value

`L = floor(Int, μ - 1.1484)`

and it's that `floor` which has the problem, because it uses `trunc` rather than `unsafe_trunc` or indeed `BigInt`.
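To see the failure mode in isolation (plain Julia, no Distributions involved; note that `1e30 - 1.1484` rounds back to `1e30` in Float64):

```julia
julia> floor(Int, 1e30 - 1.1484)     # trunc to Int64 overflows
ERROR: InexactError: trunc(Int64, 1.0e30)

julia> floor(BigInt, 1e30 - 1.1484)  # arbitrary precision: floors without error
1000000000000000019884624838656
```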

@SamuelBrand1 (Collaborator)

Note that `logpdf` is not affected, e.g.

```julia
logpdf(nb, 1000)
# -129.86005643920763
```

@SamuelBrand1 (Collaborator)

So the options here are:

  • Overload the rand function in some way (e.g. the sketch below),
  • Raise an issue with Distributions.jl,
  • Keep as is.
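A hypothetical sketch of the first option (the name `SafeNegativeBinomial` and the saturation threshold are illustrative, not existing EpiAware or Distributions.jl code):

```julia
using Distributions, Random

# Hypothetical wrapper: sample via the Gamma–Poisson mixture, saturating the
# compound mean before it can overflow trunc(Int, ...) in the large-μ Poisson sampler.
struct SafeNegativeBinomial{T <: Real} <: DiscreteUnivariateDistribution
    r::T
    p::T
end

function Random.rand(rng::Random.AbstractRNG, d::SafeNegativeBinomial)
    λ = rand(rng, Gamma(d.r, (1 - d.p) / d.p))  # compound Poisson mean
    return rand(rng, Poisson(min(λ, 1.0e15)))   # illustrative cap, well below typemax(Int64)
end

rand(SafeNegativeBinomial(2.0, 1e-18))  # no InexactError; extreme draws saturate instead
```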

@SamuelBrand1 (Collaborator) commented May 30, 2024

Another more radical possibility is that we accept that models that can sample > 1e17 infections (which is easier to hit in unbounded populations than one might expect, due to the RW $\log R_t$ and the magic of exponential growth) are bad, and enforce a max population size. Then these warm-up-phase problems should disappear in a more principled way.
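A hypothetical sketch of what that enforcement could look like (the bound `N_max` and all names here are illustrative, not EpiAware code):

```julia
# A random walk on log R_t compounds multiplicatively, so warm-up trajectories
# can reach absurd infection counts; a population bound stops this at the source.
N_max = 1e9                  # illustrative maximum population size
log_I = 10.0 + 100 * 0.4     # latent log-infections after a runaway RW warm-up
I_unbounded = exp(log_I)     # ≈ 5e21 — already past safe Int64 sampling territory
I = min(I_unbounded, N_max)  # bounded: the downstream NB mean can never explode
```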

@seabbs (Collaborator) commented May 30, 2024

> Another more radical possibility is that we accept that models that can sample > 1e17 infections (which is easier to hit in unbounded populations than one might expect, due to the RW $\log R_t$ and the magic of exponential growth) are bad, and enforce a max population size. Then these warm-up-phase problems should disappear in a more principled way.

So for this, see my comment in the issue; I think generally we should move discussion there.

Nice investigation.

> Raise an issue with Distributions.jl

This is my preference, I think, and for now we work around it?

@seabbs (Collaborator) commented Jun 11, 2024

My read of this is that we plan not to merge, so closing. @damonbayer it would be a joy to have another contribution from you :)

@seabbs closed this Jun 11, 2024