
Implement HighPrecision01 distribution #320

Closed

Conversation

@pitdicker (Contributor)

Separated out from #308.

dhardy and others added 30 commits (from January 10, 2018), including:

  • Implement Rng for Box with alloc feature
  • CloudABI has a special system call, random_get(), that can be used to obtain random data from the environment (the kernel, emulator, etc.). We can invoke this system call by making use of the cloudabi crate.
  • Copied from dhardy/master, minus single-read functions, with extra tests
  • The planned switch to 256-bit byte array seeds does not allow such long seeds.
  • It's not being used anyway unless the std feature is enabled
@dhardy (Member) commented Mar 22, 2018

@tspiteri can I interest you in reviewing?

@dhardy mentioned this pull request Mar 22, 2018

@dhardy (Member) commented Mar 22, 2018

I had a thought: we could do extra testing with biased generators. E.g. if an RNG generated four numbers and combined them with AND, each output bit would be set with probability 2^-4 = 1/16, and I think the mean output of this algorithm should then also be 1/16. This might let us better test extremely unlikely outputs?
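
A test-only generator along those lines might look like the following sketch (hypothetical, not part of this PR). It ANDs four outputs of an inner RNG so that each bit is set with probability 1/16, assuming the rand_core RngCore trait:

use rand_core::{impls, Error, RngCore};

// Hypothetical biased test generator: each output bit is set with
// probability (1/2)^4 = 1/16.
struct Biased<R: RngCore>(R);

impl<R: RngCore> RngCore for Biased<R> {
    fn next_u32(&mut self) -> u32 {
        self.0.next_u32() & self.0.next_u32() & self.0.next_u32() & self.0.next_u32()
    }
    fn next_u64(&mut self) -> u64 {
        self.0.next_u64() & self.0.next_u64() & self.0.next_u64() & self.0.next_u64()
    }
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        impls::fill_bytes_via_next(self, dest)
    }
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
        self.fill_bytes(dest);
        Ok(())
    }
}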

@dhardy (Member) commented Mar 23, 2018

Continuing our discussion about using a wrapper type...

Range is currently parameterised only over the return type; it then uses Uniform internally to sample values before mapping them to the target range. We could in theory parametrise Range over a distribution as well, but while tempting, that distribution can't simply replace Uniform (the widening-multiply method would make no sense for other distributions), so in practice this would just be Range<T, Uniform> most of the time and not very useful.

So using unique types for high-precision stuff is probably the best approach. In which case we might have to choose between something like one of the following:

// generic wrapper type:
let range = Range::new(HighPrecision(-1.0), HighPrecision(1.0));
let range = Range::new(HP(-1.0), HP(1.0));
// specific wrapper types:
let range = Range::new(f64(-1.0), f64(1.0));

I think possibly f32 / f64 is the best option?

Also @burdges perhaps you have a better idea of how to parametrise Range to use a different, high-precision, method internally?

@pitdicker (Contributor, Author)

Continuing from #326 (comment):

I am slowly coming around to the idea of using a wrapper type. Would then something like this work?

let value = rng.gen::<HighPrecision<f32>>().0;
let value_in_range = rng.gen_range::<HighPrecision<f32>>(-10.0, 10.0).0;

Still slightly ugly, but no extra distributions.
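
For illustration, that wrapper approach might look something like this sketch (the Distribution impl and paths are assumptions, not the PR's code; HighPrecision01 is the distribution this PR adds):

use rand::Rng;
use rand::distributions::{Distribution, HighPrecision01, Standard};

pub struct HighPrecision<T>(pub T);

impl Distribution<HighPrecision<f32>> for Standard {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> HighPrecision<f32> {
        // delegate to the high-precision [0, 1) sampler, keeping the wrapper
        HighPrecision(rng.sample(HighPrecision01))
    }
}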

@pitdicker (Contributor, Author)

let range = Range::new(HighPrecision(-1.0), HighPrecision(1.0));

Ah, that is how it would look. Again seems ugly to me. That wrapper type has no meaning for the input values.

@pitdicker (Contributor, Author)

I had a thought: we could do extra testing with biased generators.

I don't really follow what you are suggesting.
But I don't think testing using averages etc. is going to be all that useful. I did it before submitting this PR, and it did help catch a typo. It seems better to me to just test against a couple of known-good values, and leave it at that.

@pitdicker (Contributor, Author)

One other thing I am thinking about (directly copied from the old open_closed01 method in the experimental branch):

This method is similar to closed_open01, with one extra step: If the fractional part ends up as zero, add 1 to the exponent in 50% of the cases. This changes 0.5 to 1.0, 0.25 to 0.5, etc.

The chance of selecting these numbers that straddle the boundary between exponents is 33% too high in closed_open01 (e.g. 33%*2^-23 for 0.5f32). The reason is that, with 0.5 as the example, the distance to the next number is 2^-23, but to the previous number 2^-24. With the adjustment they have exactly the right chance of being sampled.

So our HighPrecision01 is not 'perfect' yet; it would have to sample from [0, 1] to be able to represent every value with exactly the right chance.

But I am not sure this is something we want to fix. It would require an extra branch and taking one bit from the exponent, which seems like it would complicate the function a lot. The worst case is 0.5f32, which will turn up too often by roughly one extra occurrence in 2^24 samples.
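
For reference, the asymmetric spacing is easy to observe by stepping through the raw bit representation; this standalone snippet (not from the PR) prints the gaps around 1.0, and the same effect occurs at every power of two:

fn main() {
    let x = 1.0f32;
    let up = f32::from_bits(x.to_bits() + 1);   // next representable value above
    let down = f32::from_bits(x.to_bits() - 1); // next representable value below
    println!("gap above 1.0: {:e}", up - x);    // 2^-23
    println!("gap below 1.0: {:e}", x - down);  // 2^-24
}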

@dhardy mentioned this pull request Mar 26, 2018
@dhardy added the F-new-int (Functionality: new, within Rand) and P-high (Priority: high) labels Mar 26, 2018
@dhardy (Member) commented Mar 26, 2018

So "0.5" represents the space between 0.5 and 0.5+ε/2, whereas the next smallest number represents the space between 0.5-ε/4 and 0.5; indeed we always round down instead of rounding up 50% of the time. I guess a "fix" would be to have a 50% chance of adding 1 to the fractional part, and dealing with the case it wraps to zero properly (increase exponent).

But should we?

  1. The significance is tiny — 1 in 2^{fraction bits} — so arguably if precision matters you should already be using a higher-precision floating-point type.
  2. For certain operations, such as the sample < p test, the current behaviour is correct anyway.

So I'd say no.
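
For concreteness, the fix described above could look roughly like this hypothetical helper (not proposed for inclusion), operating on the raw fraction bits before the float is assembled:

// With probability 1/2, round the 23-bit f32 fraction up by one ulp;
// a carry out of the fraction bits bumps the exponent instead.
fn round_up_half_the_time(fraction: u32, exp: i32, coin_flip: bool) -> (u32, i32) {
    const FRACTION_BITS: u32 = 23;
    if !coin_flip {
        return (fraction, exp);
    }
    let rounded = fraction + 1;
    if rounded >> FRACTION_BITS != 0 {
        (0, exp + 1) // wrapped to zero: the value rounds up to the next power of two
    } else {
        (rounded, exp)
    }
}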

@dhardy (Member) commented Mar 26, 2018

Ah, that is how it would look. Again seems ugly to me. That wrapper type has no meaning for the input values.

You are right that the wrapper type solution is ugly, but the alternatives are:

  • whole new version of Range for high-precision float stuff (but identical on other types, if implemented) — more ugly
  • a new RangeImpl (e.g. RangeFloatHP) accessed via a different constructor, Range::new_hp; I think this needs a separate helper trait SampleRangeHP

The latter actually sounds reasonable, don't you think? Obviously this means it would never be used by Rng::gen_range, but that's not a big deal.

I am wondering though if we should offer two versions of gen_bool, or at least document that rng.sample(HighPrecision01) < p offers the same performance when p is variable, and higher precision when p is not a multiple of 2^-32.
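
As a usage sketch of that documentation note (assuming HighPrecision01 ends up under rand::distributions):

use rand::Rng;
use rand::distributions::HighPrecision01; // as added by this PR

// Bernoulli trial with full f64 precision in p; a 32-bit gen_bool
// effectively rounds p to a multiple of 2^-32.
fn coin_flip<R: Rng>(rng: &mut R, p: f64) -> bool {
    rng.sample(HighPrecision01) < p
}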

@pitdicker (Contributor, Author)

In my mind having two distributions, HighPrecision01 and HighPrecisionRange, is the simplest solution.

I wonder if we could even make do with a distribution HighPrecisionRange. If we write things right, the compiler should see that HighPrecisionRange(0.0, 1.0) doesn't do much more than val * 1.0 + 0.0, and optimize things out.

We don't need to support all methods from Range: range_inclusive and sample_single would not change anything here or make things faster. So this can be a simple distribution, a bit like how Binomial takes two values.
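
A minimal sketch of that shape (names and internals illustrative only; a real implementation would need the careful rounding discussed later in this thread):

use rand::Rng;
use rand::distributions::{Distribution, HighPrecision01};

pub struct HighPrecisionRange {
    offset: f64,
    scale: f64,
}

impl HighPrecisionRange {
    pub fn new(low: f64, high: f64) -> Self {
        HighPrecisionRange { offset: low, scale: high - low }
    }
}

impl Distribution<f64> for HighPrecisionRange {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> f64 {
        // for new(0.0, 1.0) this is val * 1.0 + 0.0, which the
        // optimizer can remove entirely
        rng.sample(HighPrecision01) * self.scale + self.offset
    }
}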

@dhardy (Member) commented Mar 26, 2018

It's not going to be HighPrecisionRange(0.0, 1.0) because IIRC there are four cases to handle differently, so a constructor is needed; thus I see no reason not to use Range<HighPrecisionFloatRange> or similar.

exp += bits.trailing_zeros() as i32;
// If the RNG were guaranteed unbiased we could skip the
// check against exp; unfortunately it may be biased.
// Worst case ("zeros" RNG) has recursion depth 16.
Review comment from a Contributor:

The recursion can be removed easily. The contents of fallback can look like the code below, which I also find a little easier to understand.

// Worst case ("zeros" RNG) will have 16 iterations (f64 exp bias / 64).
loop {
    let bits = rng.gen::<$uty>();
    exp += bits.trailing_zeros() as i32;
    
    // if exp reaches bias, return a subnormal that can be zero
    if exp >= $exponent_bias {
        return fraction.into_float_with_exponent(-$exponent_bias);
    }
    if bits != 0 {
        return fraction.into_float_with_exponent(-exp);
    }
}

Reply from a Member:

Actually I just thought the recursion was neater since @pitdicker recommended this bit be in a separate function anyway (not inlined so that the common code is small). Since it will almost never recurse I don't see a problem with that strategy. But using a loop also works (was used in an older commit).

@pitdicker (Contributor, Author)

I have been thinking about how to do a range implementation that can have precision like HighPrecision01.

Take for example a range like [0.0 .. 3.5]. If we generate a value in [0.0 .. 1.0] and multiply by 3.5, we get a (small) rounding mess. And a range like [-3.0 .. 7.0] is going to need some trickery to bring part of the asymmetrical range over to the other sign.

When we used only 32/64 bits to generate the float this seemed acceptable. But if we call this 'as much precision as the floating-point format can represent', it gets messy. Is it instead possible to generate values directly inside the target range? I think it is, with only a very small portion of values having to be rejected.

What is needed (for f32):

  • 1 bit for the sign.
  • 23 bits for the fraction.
  • an unknown number of bits for the exponent.

Let's assume the [-3.0 .. 7.0] range again.

  • This would need an exponent between 3 and -9, where each smaller exponent has 50% of the chance of the next larger one.
  • We have two zones to reject from: [-8.0 .. -3.0] and [7.0 .. 8.0].
  • We can improve on the last zone: if we have the largest exponent (3), we can zero out as many most-significant bits of the fraction as are zero in 7.0. The same is true in the zone [-4.0 .. -3.0].
  • We can also improve a bit on the first zone. Basically we generate the largest (asymmetrical) exponent, 3 in this case, twice too often. We use only some of the exponent bits to pick this exponent: just the first bit for the highest exponent, two bits for the second highest, etc. If we fall in this zone, we pick another bit, and based on it we reject the exponent in 50% of the cases. So we are only rejecting a couple of bits, not the whole value. And as we already have a mechanism in place to generate more bits just for the exponent, this is relatively cheap.

I think this would give us full precision, without the rounding problems.
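
To make the shape of this concrete, here is a much-simplified standalone sketch (illustrative only, not the PR's code). It drops the fraction-masking and partial-bit-rejection refinements above and simply rejects whole candidates, and it ignores zero and subnormals:

use rand::Rng;

// Build an f32 candidate from a random sign, a geometrically distributed
// exponent and 23 uniform fraction bits, then reject values outside the
// target range. max_exp is hard-coded for the example range [-3.0, 7.0]:
// all magnitudes there are below 2^3, so the largest IEEE exponent is 2.
fn sample_in_range<R: Rng>(rng: &mut R, low: f32, high: f32) -> f32 {
    let max_exp: i32 = 2;
    loop {
        let sign_bit = (rng.gen::<u32>() & 1) << 31;
        // each smaller exponent is half as likely as the one above it
        let mut exp = max_exp;
        while exp > -126 && rng.gen::<bool>() {
            exp -= 1;
        }
        let bits = sign_bit
            | (((exp + 127) as u32) << 23) // IEEE 754 biased exponent
            | (rng.gen::<u32>() >> 9);     // 23 fraction bits
        let candidate = f32::from_bits(bits);
        if low <= candidate && candidate <= high {
            return candidate;
        }
    }
}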

@dhardy (Member) commented Apr 3, 2018

Sounds very good!

Can we also eliminate the sign bit if not needed? If so I suspect this can be as fast as the [0, 1) implementation with those bounds (when statically known); even without eliminating the sign it may be close enough that we don't need both.

@dhardy (Member) commented Apr 4, 2018

On second thoughts, it sounds complicated, and it doesn't avoid large rejection zones (if you consider a range like [-1 .. 70] you effectively need to sample from (-128, 128), and with rejection you then need to throw out approx 5/7ths of all possible samples).

I don't think your zeroing trick can work without bias. Numbers from [7, 8) should be rejected; to get there you need a positive sign, exponent 3, and the top two bits of the fraction set, but you can't just zero all the other fraction bits in this case because that introduces bias, and you need those bits in most cases. You could possibly sample only enough bits of the fraction to know you are in unusable territory (i.e. sign + exponent + top two bits, where the next 1 bit pushes you over 7), but then you have to sample the rest of the fraction from however many bits are left, and it gets complicated (and likely slow).

What's your second idea — sample the sign bit and the first exponent bit, and if both are 1 (i.e. the range (-8, -4], which is unusable) then skip those two bits? That is unbiased, but I suspect it will be a bit slow because there's extra looping. Note that you do need to resample the sign bit too to avoid bias, and you can't just shift in 1 bit, because you know the next bit is 1 in this case, which introduces bias.

This gives some improvement to the above range [-1, 70], but still rejects values from (70, 128) (assuming we don't do complex partial fraction rejection) and rejects bits from (-128, -1).

So it sounds to me like any perfect generation method would be slow.

There are also closed ranges to consider: ideally we would sample from [0, 1] by sampling from [0, 2) and then rejecting anything in (1, 2), but this is (just a bit) ridiculous.

All FP arithmetic is designed to be lossy anyway once the maximum representable precision is exceeded. Perfectly producing random values seems like a lot of work to save one extra bit of precision — and if users care about precision they will already be using f64 which very likely has more precision than actually needed. So I think the multiplication method is the better option.

@tspiteri (Contributor) commented Apr 4, 2018

This seems to be settled, but just a small note: I agree that one extra bit of precision is not worth the extra complexity for ranges that are not from 0 to 1; using mul_add(range_width, range_low) will result in only one rounding error after all.

@tspiteri (Contributor) commented Apr 4, 2018

Replying to my own comment, mul_add does result in one rounding error, but it can result in more than one bit of lost precision. If for example the range 0 to 1 is to be converted to the range -1 to 1, the result of rand01.mul_add(2.0, -1.0) will not have high precision around zero, as resulting numbers close to zero will come from something like close_to_half.mul_add(2.0, -1.0), which will have errors of the order of f32::EPSILON, not of the order of f32::MIN_POSITIVE.

But handling such cases should probably be left to the user of the library. In that particular example, if the library user really wants high precision in the range -1 to 1, they can just generate a number from 0.0 to 1.0 and then generate the sign using a random bool. The most that can be expected from the rand crate, I think, is to mention something in the docs.
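
That user-side workaround might look like this sketch (assuming HighPrecision01 under rand::distributions):

use rand::Rng;
use rand::distributions::HighPrecision01; // as added by this PR

// Full precision around zero on [-1.0, 1.0]: sample magnitude and sign
// separately instead of rescaling a [0, 1) sample.
fn symmetric_high_precision<R: Rng>(rng: &mut R) -> f64 {
    let magnitude: f64 = rng.sample(HighPrecision01);
    if rng.gen::<bool>() { magnitude } else { -magnitude }
}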

@dhardy (Member) commented Apr 5, 2018

No, this isn't settled, just recreated (#372) due to history cleanup.

@pitdicker already implemented a method to maintain (some) precision close to 0; see https://github.com/dhardy/rand/blob/experimental/src/distributions/range.rs#L432. I guess we will use that or similar code.
