uniform_mersenne is using memory proportional to the range of numbers it generates #2

JohanMollevik · 2019-02-20T13:17:07Z

uniform_mersenne is using memory proportional to the range of numbers it generates

If asked to generate 3 numbers between 0 and 1000000000, the memory allocated will be for an array of 1000000000 elements.

The culprit is the line

out = list(range(self.len))

which happily allocates gigabytes of memory even when only a few numbers are requested.
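To make the cost concrete, here is a small sketch comparing a lazy `range` with the materialized list. The size `n` is kept deliberately small so the demonstration itself stays cheap; the byte counts assume CPython and vary by build.

```python
import sys

n = 10**6  # illustrative size only; the issue concerns ranges around 10**9

lazy = range(n)         # stores only start, stop and step
eager = list(range(n))  # stores one pointer per element, plus the int objects

print(sys.getsizeof(lazy))   # small and independent of n
print(sys.getsizeof(eager))  # grows linearly with n
```

`sys.getsizeof` reports only the list's own pointer array, so the real footprint of the materialized list is larger still.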

mikkokotila · 2019-02-20T13:55:05Z

Good catch. That's a very poor way to do this. I'm replacing it with:

return np.random.randint(0, self.len, self.n).tolist()

This supports cases up to 10^20, at which point int64 overflows. So I guess it's nice if we can say we support permutation spaces up to 10^20.

JohanMollevik · 2019-02-20T13:55:10Z

It seems the implementation could be replaced by using Python's built-in random generation rather than NumPy's, as Python has optimized its implementation for ranges:

import random
...
return random.sample(range(self.len), k=self.n)

Are there any problems with making a change like that?

JohanMollevik · 2019-02-20T13:55:45Z

Does randint allow duplicates or not?

mikkokotila · 2019-02-20T13:59:41Z

It indeed does. So let's use what you have proposed.
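The difference can be sketched with the standard library alone. `random.randrange` is used here as a stand-in for `np.random.randint`, since both draw with replacement; the tiny space is an assumption chosen so that repeats are guaranteed.

```python
import random

space = 10  # deliberately tiny so that duplicates are forced

# Drawing with replacement: 20 draws from 10 values must repeat something.
with_replacement = [random.randrange(space) for _ in range(20)]

# random.sample draws without replacement, so every value is distinct.
without_replacement = random.sample(range(space), k=space)

print(len(set(with_replacement)) < len(with_replacement))        # True
print(len(set(without_replacement)) == len(without_replacement))  # True
```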

EDIT: correction to the above: here the overflow constraint is at 10^18, so that's the supported limit.
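A quick way to see where this limit comes from, assuming a 64-bit CPython build: `random.sample` needs `len()` of the range, and `len()` cannot exceed `sys.maxsize`.

```python
import random
import sys

print(sys.maxsize)  # 2**63 - 1 on a typical 64-bit build, about 9.2 * 10**18

# Below the limit, sampling works without materializing the range.
ok = random.sample(range(10**18), k=3)

# Above the limit, len(range(...)) overflows inside random.sample.
try:
    random.sample(range(10**20), k=3)
    overflowed = False
except OverflowError:
    overflowed = True

print(overflowed)
```

On a 32-bit build `sys.maxsize` is much smaller, so the first call would fail there as well.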

JohanMollevik · 2019-02-20T14:06:45Z

That will have to do

JohanMollevik · 2019-02-22T07:51:28Z

If 10^18 proves limiting, it would probably work well to add a fallback: if the requested range is bigger than that, select random numbers in a loop until there are enough without duplicates.

The neat thing is that with huge ranges the risk of selecting the same number more than once is vanishingly small, as the range is huge compared to the number of values requested.

This makes the fallback take linear time in the common case. With large fill factors, by contrast, the same approach risks not terminating in reasonable time.
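The fallback described above could look roughly like this. It is a minimal sketch, not the code from the pull request; the function name `sample_unique` and its parameters are made up for illustration.

```python
import random


def sample_unique(space, n):
    """Draw n distinct integers from [0, space) without building the range.

    Hypothetical helper: it redraws on collision, so it is only efficient
    when n is small compared to space.
    """
    if n > space:
        raise ValueError("cannot draw more unique values than the range holds")
    seen = set()
    out = []
    while len(out) < n:
        candidate = random.randrange(space)  # accepts arbitrarily large ints
        if candidate not in seen:            # skip duplicates and draw again
            seen.add(candidate)
            out.append(candidate)
    return out


picks = sample_unique(10**30, 5)
print(picks)
```

Memory use is proportional to `n` rather than to `space`. When `n` approaches `space`, most draws collide and the loop slows down sharply, which is the large fill factor case mentioned above.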

JohanMollevik · 2019-03-07T10:07:39Z

This limit ended up being a problem for me; see pull request #3.

JohanMollevik · 2019-03-07T10:12:49Z

The CI failed; it seems to be complaining about unit test coverage. I will try to resolve that.

JohanMollevik · 2019-03-07T10:35:36Z

CI still fails, but the error seems unrelated to this and looks more like a dependency problem. Can you take a look and see if you can spot a cause?

JohanMollevik mentioned this issue Feb 20, 2019

Support working with huge parameter spaces autonomio/talos#201

Closed
