-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
uniform_mersenne is using memory proportional to the range of numbers it generates #2
Comments
Good catch. That's a very poor way to do this. I'm replacing it with:
This supports cases until 10^20 at which point int64 overflows. So I guess it's nice if we can say we support permutation spaces until 10^20. |
It seems the implementation could be replaced by using pythons built in random generation rather than numpys as python has optimized its implementation for ranges import random Are there ay problems with making a change like that? |
Does the randint allows duplicates or not? |
It indeed does. So let's use what you have proposed. EDIT: correction to before >> here the constrain with overflow is at 10^18 so that's the supported limit. |
That will have to do |
If 10^18 proves limiting it would probably work well to add a fallback where if the requested number is bigger than that it selects random numbers in a loop until it has enough without duplicates. The neat thing is that in this case with huge numbers the risk of selecting the same number several times in a row is vanishingly small as the range of the numbers are huge compared to the number of numbers requested. This will make the fallback take linear time in the common case instead of running the risk of not terminating which that approach can do with large fill factors. |
This limit ended up being a problem for me, pull request #3 |
The CI failed, it seems to be complaining about unit test coverage, I will try to resolve that. |
CI still fails but the error seems unrelated to this and looks more like a dependency problem, can you look and see if you see a cause? |
uniform_mersenne is using memory proportional to the range of numbers it generates
If asked to generate 3 numbers between 0 and 1000000000 the memmory allocated will be for an array 1000000000 big.
the culprint is the line
out = list(range(self.len))
which happily allocates gigabytes of memmory even when only a few numbers are requested
The text was updated successfully, but these errors were encountered: