-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
first draft of Binomial.icdf issue #6612 - PyData 2024 Hackathon #7362
Draft
niknow
wants to merge
1
commit into
pymc-devs:main
Choose a base branch
from
niknow:binomial_icdf
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is clever but widely inefficient for large
n
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also would probably need to pass axis=-1 to argmax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are totally right, it is slow for large
n
as it is in O(n). I did actually flag this point at the hackathon and would agree that for that reason this is an ad-hoc implementation, but was encouraged to submit it anyway. ;) If you feel it is too slow for production I'm totally fine if you reject this PR.I would note though that for very large
n
, one is usually better off using a normal approximation instead anyway.And on the upside: This trick can be used to quickly get an implementation for the icdf of any discrete distribution, so one idea we had at the event was to use that as a first draft and make it quicker later if needed. One might be able to get the complexity down to O(log(n)) one replaces this with some sort of bisection.
Having a fast icdf just for Binomial might require quite a bit more effort, e.g. some numerical root finder of the CDF (which for Binomial is the regularized incomplete beta function). So I think the options are:
Of course open to ideas if you see another option? ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
I think we should try 3/4, and we can use this approach to obtain and independent reference for testing?
Perhaps the binary search wouldn't be too hard to implement with a ScalarLoop (so it is trivial to vectorize as an Elemwise operation)?
https://github.com/pymc-devs/pytensor/blob/main/pytensor/scalar/loop.py
The requirement is that the CDF expression be composed only of Scalar (or Elemwise in the batched version) operations.
The code may be a bit daunting but here are some cases where we use it: https://github.com/pymc-devs/pytensor/blob/efa845a3484915e4e15a928918fa97d081886d50/pytensor/scalar/math.py#L870
Or tests that may be more readable: https://github.com/pymc-devs/pytensor/blob/main/tests/scalar/test_loop.py