Implement rand(::Range{BigInt}) #9122

rfourquet · 2014-11-23T10:47:49Z

This is a rebase of #8255, with minor modifications.
The travis test doesn't pass on linux because functions only available in GMP 6 are used, but isn't version 6 a requirement now? (cf. #7957).

ivarne · 2014-11-23T14:06:07Z

No, currently we require GMP (>= 5.0) in the README. Not sure about the situation in other parts of Julia, but from the discussion in #7957 it seems likely that we can bump the requirement. Packagers on Linux systems seem to have problems with updating dependencies, so we should probably be carefull.

rfourquet · 2014-11-23T14:13:36Z

OK, so GMP 6 may not be a requirement for Julia 0.4 in the end? in this case I would see if I can implement this using only GMP 5.

nalimilan · 2014-11-23T14:35:13Z

If you need it I guess we'll deal with that dependency when building packages, but if that's not really needed it would be easier for packagers not to use it. :-)

ViralBShah · 2014-11-23T15:27:55Z

I think that given julia 0.4 is going to release after a few months, we can go with GMP 6 as a dependency. All the distros will have it by then.

ivarne · 2014-11-23T16:33:20Z

I changed the title to WIP, so that it is clear that it can't be merged before we update GMP requirement.

ViralBShah · 2014-11-23T16:35:20Z

@nalimilan Would it be crazy to require GMP 6 for julia 0.4?

nalimilan · 2014-11-23T17:39:59Z

Absolutely not, it's already in the upcoming Fedora 21 (but generally speaking, raising requirements should only happen with relatively strong reasons).

rfourquet · 2014-11-24T07:16:34Z

So there was a solution in GMP5, which is no less efficient despite having to copy data instead of generating it directly into the BigInt.
Removing WIP as there are no more GMP requirements.

rfourquet · 2014-11-26T06:47:27Z

Here is the link to the doc of mpz_import, if someone wanted to check the wrapping code.
The Travis failure seems to be unrelated (it happens in parallel.jl/sharedarray.jl).

ViralBShah · 2014-11-29T07:29:05Z

@rfourquet Can you squash these commits?

ViralBShah · 2014-11-29T07:30:30Z

Also, in the squashed commits, can you give a slightly more detailed explanation of the commits?

tkelman · 2014-11-29T09:38:19Z

opened #9176 for the travis failure - it actually looks like it's in bitarray

rfourquet · 2014-11-30T04:24:42Z

This is squashed now, with some explanations added in the commit.

ViralBShah · 2014-11-30T18:28:25Z

A few points as I look through this and other related code:

Let's have the GMP 6 code in there, but with a version check. If there is no easy way to check the GMP version, let's have it in there, but commented out. Is there a significant reason to use the new API - performance or memory?
The names could be a little more consistent so that the relationships are clearer. How about renaming RangeGenerator to AbstractRangeGenerator, RandIntGen to RangeGeneratorInt, and RandIntGenBigInt to RangeGeneratorBigInt? Could inrange then just be rand_int_gen?
I notice that the range length is an unsigned integer, and we specialize for both UInt32 and UInt64. What if we just use Int, which is essentially what we use in most places? I don't think we need to have every possible combination and the full range - if we can keep things simpler, unless there is a good reason for it.

rfourquet · 2014-12-01T06:22:30Z

There is a way to get the GMP version, I didn't thought of that. For small ranges, I couldn't detect significative speed differences between the two versions, but on bigger ranges the GMP5 version is visibly slower (e.g. 30% slower for rand(1:big(2)^10000)).
I mostly agree with your changes, but I don't feel the need (yet) of "Abstract" in AbstractRangeGenerator. I was unsure for the name inrange, but it reads nicely in rand(mt, inrange(r)), where r is a range. But when squashing I wanted to change it to rangegen. I would prefer that the name has "range" in it, unlike rand_int_gen.
I started a refactoring to simplify this (a bit similar to rand of AbstractArray #8649). But there was opposition to removing the full range. The UInt32 version could maybe be removed if that doesn't make things slower for small types.

rfourquet · 2014-12-01T14:38:31Z

Updated:

the code for the two versions of GMP is there. I named a constant GMP_VERSION in gmp.jl, but if I call it simply VERSION , this message is issued during compilation "Warning: imported binding for VERSION overwritten in module GMP"... aren't they in a different namespace?
I kept the name RangeGenerator, and used also RangeGenerator as the constructor (instead of inrange).

ViralBShah · 2014-12-01T16:15:15Z

It seems that the Windows 64-bit has failed. The worker running the random test is the one that never came back - but there are no error messages.

https://ci.appveyor.com/project/StefanKarpinski/julia/build/1.0.133/job/yhkll194gvedibs9

@tkelman How do we figure this one out? Can you try this branch?

ViralBShah · 2014-12-01T16:16:56Z

@rfourquet The PR looks good otherwise.

tkelman · 2014-12-01T18:35:43Z

@ViralBShah sure I'm building this branch now, will let you know what I see

tkelman · 2014-12-01T19:31:50Z

This does seem to freeze or go into an infinite loop of some kind:

  | | |_| | | | (_| |  |  Version 0.4.0-dev+1904 (2014-12-01 14:28 UTC)
 _/ |\__'_|_|_|\__'_|  |  rf/rand-bigrange/e209872* (fork: 1 commits, 1 day)
|__/                   |  x86_64-w64-mingw32

julia> T = UInt32;

julia> s = big(typemax(T)-1000) : big(typemax(T)) + 10000
4294966295:4294977295

julia> using Base.Test

julia> @test rand(s) != rand(s)

tkelman · 2014-12-01T19:57:22Z

On master at fd531ed that same sequence of inputs gives ERROR: stack overflow

rfourquet · 2014-12-02T03:23:31Z

Thanks @tkelman. The stack overflow on master is expected, but I don't see anything that could cause the freeze, I believe it happens in the while true in rand(rng::AbstractRNG, g::RangeGeneratorBigInt).
I'm not familiar with endianness, so I may have put the wrong endianness argument to mpz_import: is the processor big endian? Could you show the value of Base.GMP.GMP_VERSION, and of Base.Random.RangeGenerator(s) in the example above?

tkelman · 2014-12-02T03:46:15Z

julia> Base.GMP.GMP_VERSION
v"6.0.0"

julia> Base.Random.RangeGenerator(s)
Base.Random.RangeGeneratorBigInt(4294966295,11000,1,0x00003fff)

tkelman · 2014-12-02T03:50:10Z

I'm not sure about endianness (I don't think it's that different on Windows?) but I have seen in the past that GMP and MPFR tend to use C longs in many of their API's, and I do know that long is only 32 bits on Win64. But I'll defer to you on whether that would make a difference to this code.

rfourquet · 2014-12-02T06:14:29Z

The other AppVeyor build seems to succeed on a pentium 4 32-bits machine... I read again the code without finding any bug. Would you accept to make another test after adding prints around line 457:

print(limbs, " ")
ccall((:__gmpz_limbs_finish, :libgmp), Void, (Ptr{BigInt}, Clong), &x, g.nlimbs)
println(x)

and then run

T = UInt32; s = big(typemax(T)-1000) : big(typemax(T)) + 10000;  println(rand(s)); rand(s)

?

tkelman · 2014-12-02T08:12:57Z

https://gist.github.com/2e22f8b016fda3d35c76

That was terminated by me with ctrl-C, didn't show any signs of stopping. Make sure you're not assuming Int and Clong are the same anywhere, on Win64 they're different.

ViralBShah · 2014-12-02T08:22:04Z

The code seems to use Clong everywhere in ccall, and I assume that Ints passed are converted appropriately by ccall.

rfourquet · 2014-12-02T09:20:40Z

Thanks a lot! The manual says indeed that ccall converts arguments to what is declared in ccall argument types.
So the first call to rand works (0x00000722 == UInt32(1826)), but then in the second call we find that 0x000006ac == 326417516204 % UInt32 but with 326417516204 = UInt64(0x4c)<<32 + 0x000006ac. So my guess is that for some reason, mpz_limbs_finish assumes that limbs are not of type Culong but UInt64, and that when reading the first limb, it reads beyond the passed array (UInt32[0x000006ac] of size 1), which contains 0x4c after the end position, interpreting those two UInt32 as one UInt64 limb.
In gmp.jl, the data field is Ptr{Culong} so I had assumed that it's enforced that limbs are Culong (default value in gmp.h), but if a local GMP installation is used, it can be different, and I think that this particular installation has configured GMP with #define _LONG_LONG_LIMB. I will see if there is a way to detect the size of the limbs.

rfourquet · 2014-12-02T10:09:23Z

I pushed a new commit replacing the use of Culong (for limbs) by an auto-detected Limb type. If this fixes the problem above, I'll squash it.

ViralBShah · 2014-12-02T12:44:49Z

Since you are squashing, could you also update the comments with a bit about the type of Limb? Should be good to merge after that.

* Previously, calling e.g. rand(big(1:4)) caused a stack overflow, because rand(mt::MersenneTwister, r::AbstractArray) was calling itself recursively. This is fixed here, but a mecanism handling not-implemented types should be added. * RandIntGen is renamed to RangeGeneratorInt. * A type RangeGeneratorBigInt similar to RangeGeneratorInt (both subtypes of RangeGenerator) is introduced to handle rand on BigInt ranges. RangeGenerator is the generic constructor of such objects, taking a range as parameter. Note: two different versions are implemented depending on the GMP version (it is faster with version 6 on big ranges). * BigInt tests from commit bf8c452 are re-added.

rfourquet · 2014-12-02T13:25:59Z

@ViralBShah I added some comment above the Limb definition in gmp.jl, is this fine?

rfourquet · 2014-12-02T13:34:01Z

Referencing squashed commit to record discussion.

Implement rand(::Range{BigInt})

ViralBShah · 2014-12-02T16:06:15Z

Thanks - this got done nicely.

tkelman · 2014-12-02T19:33:25Z

Good catch @rfourquet. I don't think we're specifically asking for this configuration item but it's probably buried somewhere in config.log.

rfourquet · 2014-12-03T04:36:05Z

@tkelman and it was a good configuration as far as tests are concerned!
I fixed the limbs field and put the check suggested in 8b0921c#commitcomment-8799280 at https://github.com/JuliaLang/julia/compare/rf/rand-bigrange-fixup. I thougth it didn't deserve a new PR but wanted someone to check before meging.

@JeffBezanson

Now the runtime version of GMP is checked against the compile time version, as suggested by @JeffBezanson. He also noted that the limbs field of RangeGeneratorBigInt was not specific enough.

rfourquet added the randomness Random number generation and the Random stdlib label Nov 23, 2014

rfourquet mentioned this pull request Nov 23, 2014

fix infinite loop in rand(::Range(BigInt)) #8255

Closed

ivarne changed the title ~~Implement rand(::Range{BigInt})~~ WIP: Implement rand(::Range{BigInt}) Nov 23, 2014

rfourquet changed the title ~~WIP: Implement rand(::Range{BigInt})~~ Implement rand(::Range{BigInt}) Nov 24, 2014

rfourquet force-pushed the rf/rand-bigrange branch 2 times, most recently from 3f19fc1 to beff573 Compare November 30, 2014 04:20

rfourquet force-pushed the rf/rand-bigrange branch 2 times, most recently from 622c530 to e209872 Compare December 1, 2014 14:28

rfourquet force-pushed the rf/rand-bigrange branch from 8b0921c to 79e4104 Compare December 2, 2014 13:21

ViralBShah added a commit that referenced this pull request Dec 2, 2014

Merge pull request #9122 from JuliaLang/rf/rand-bigrange

7eb9a14

Implement rand(::Range{BigInt})

ViralBShah merged commit 7eb9a14 into master Dec 2, 2014

rfourquet deleted the rf/rand-bigrange branch December 3, 2014 03:03

rfourquet added the bignums BigInt and BigFloat label May 23, 2017

Implement rand(::Range{BigInt}) #9122

Implement rand(::Range{BigInt}) #9122

Conversation

rfourquet commented Nov 23, 2014

ivarne commented Nov 23, 2014

rfourquet commented Nov 23, 2014

nalimilan commented Nov 23, 2014

ViralBShah commented Nov 23, 2014

ivarne commented Nov 23, 2014

ViralBShah commented Nov 23, 2014

nalimilan commented Nov 23, 2014

rfourquet commented Nov 24, 2014

rfourquet commented Nov 26, 2014

ViralBShah commented Nov 29, 2014

ViralBShah commented Nov 29, 2014

tkelman commented Nov 29, 2014

rfourquet commented Nov 30, 2014

ViralBShah commented Nov 30, 2014

rfourquet commented Dec 1, 2014

rfourquet commented Dec 1, 2014

ViralBShah commented Dec 1, 2014

ViralBShah commented Dec 1, 2014

tkelman commented Dec 1, 2014

tkelman commented Dec 1, 2014

tkelman commented Dec 1, 2014

rfourquet commented Dec 2, 2014

tkelman commented Dec 2, 2014

tkelman commented Dec 2, 2014

rfourquet commented Dec 2, 2014

tkelman commented Dec 2, 2014

ViralBShah commented Dec 2, 2014

rfourquet commented Dec 2, 2014

rfourquet commented Dec 2, 2014

ViralBShah commented Dec 2, 2014

rfourquet commented Dec 2, 2014

rfourquet commented Dec 2, 2014

ViralBShah commented Dec 2, 2014

tkelman commented Dec 2, 2014

rfourquet commented Dec 3, 2014