A prime sieve #197

czurnieden · 2019-04-04T20:36:02Z

The actual sieve from #190 plus the two functions mp_next_small_prime and mp_prec_small_prime

minad · 2019-04-25T09:42:56Z

tommath.h

+#   define LTM_SIEVE_PR_UINT            PRIu32
+#   define LTM_SIEVE_UINT_MAX           0xFFFFFFFFlu
+#   define LTM_SIEVE_UINT_MAX_SQRT      0xFFFFlu
+#endif


Could you move those definitions to the private header with a MP_* prefix? Maybe just use size_t instead of LTM_SIEVE_UINT?

> Could you move those definitions to the private header with a MP_* prefix

It already needs rebase'ing, so: no problem.
(Apropos rebase'ing: should I wait a day or two until your stuff gets merged or is there much more to come?)

> Maybe just use size_t instead of LTM_SIEVE_UINT

Mmh…no, I need to know the exact sizes here and size_t is most likely LTM_SIEVE_UINT but I cannot be completely sure and I need to.

@czurnieden From me there is not much more to come for now. I would rather like to reduce the backlog a bit.

> Mmh…no, I need to know the exact sizes here and size_t is most likely LTM_SIEVE_UINT but I cannot be completely sure and I need to.

We already include limits.h?

(Sorry for the delay, everybody seems to have pushed their urgent things to "after Easter" which didn't make it better *sigh*)

> Could you move those definitions to the private header with a MP_* prefix?

The MP_ prefix I can do. Was already planned when I watched you harmonizing the style.

Making (some of) the macros private, not so much.

The sieve exists in three sizes, one for 8-bit, one extra-large, and one for the whole rest.

#ifdef MP_8BIT
#   define LTM_SIEVE_BIGGEST_PRIME      65521lu
#   define LTM_SIEVE_UINT               uint16_t
#   define LTM_SIEVE_PR_UINT            PRIu16
#   define LTM_SIEVE_UINT_MAX           0xFFFFlu
#   define LTM_SIEVE_UINT_MAX_SQRT      0xFFlu
#elif ( (defined MP_64BIT) && (defined LTM_SIEVE_USE_LARGE_SIEVE) )
#   define LTM_SIEVE_BIGGEST_PRIME      18446744073709551557llu
#   define LTM_SIEVE_UINT               uint64_t
#   define LTM_SIEVE_PR_UINT            PRIu64
#   define LTM_SIEVE_UINT_MAX           0xFFFFFFFFFFFFFFFFllu
#   define LTM_SIEVE_UINT_MAX_SQRT      0xFFFFFFFFllu
#else
#   define LTM_SIEVE_BIGGEST_PRIME      4294967291lu
#   define LTM_SIEVE_UINT               uint32_t
#   define LTM_SIEVE_PR_UINT            PRIu32
#   define LTM_SIEVE_UINT_MAX           0xFFFFFFFFlu
#   define LTM_SIEVE_UINT_MAX_SQRT      0xFFFFlu
#endif

The macro LTM_SIEVE_UINT is used in the definition of the functions, a replacement is not easy:

> Maybe just use size_t instead of LTM_SIEVE_UINT?

According to the standard (ISO/IEC 9899:2011 7.20.3) the limit of size_t is SIZE_MAX and has only a minimum (65535) not a fixed size. I would need to use a large type (mp_word would do it, I think) for it as a replacement and that would make it slower for high-mp architectures and is also a waste of memory.

I could also do something in the line of typedef TYPE mp_small_prime which looks a bit more elegant.
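A minimal sketch of what such a typedef could look like for the default 32-bit case. The names `mp_small_prime`, `MP_SMALL_PRIME_FMT`, `MP_SMALL_PRIME_MAX` and `small_prime_to_string` are hypothetical and only serve the illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names, chosen for this sketch only. */
typedef uint32_t mp_small_prime;                  /* exactly 32 bits, unlike size_t */
#define MP_SMALL_PRIME_FMT PRIu32                 /* matching printf conversion */
#define MP_SMALL_PRIME_MAX UINT32_C(4294967291)   /* largest prime below 2^32 */

/* Write p in decimal into buf; returns the number of characters written. */
static int small_prime_to_string(mp_small_prime p, char *buf, size_t len)
{
   assert(buf != NULL && len > 0u);
   return snprintf(buf, len, "%" MP_SMALL_PRIME_FMT, p);
}
```

With a fixed-width type the format macro and the limit follow directly from the typedef, so a user would not have to guess how to print the values.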

LTM_SIEVE_BIGGEST_PRIME can be set private, although it comes in quite handy in loops.

LTM_SIEVE_PR_UINT can be set private, although it would be a bit of a work for the user to find out the correct way to print the values.

LTM_SIEVE_UINT_MAX can be set private. It is convenient but LTM_SIEVE_BIGGEST_PRIME should suffice.

LTM_SIEVE_UINT_MAX_SQRT can be set private, but it is the size of the base, which might be useful to know for low-memory architectures. On the other hand: it is not directly the size, it needs a bit of work to get the actual size from this information. This is the macro I would be able to kick out with the cleanest conscience.

Tell me what you think while I run the rebase and the replace LTM_SIEVE_ MP_SIEVE_ -- *.[ch]

czurnieden · 2019-05-22T16:31:50Z

@minad rebased and tried to adapt to the new API. I'm pretty sure that some slipped through, please check if I missed some.

Oh, and this is not really a "work in progress", I don't plan to add anything in this PR (re. logic parts and all that) this one is complete.
Being complete doesn't mean that it is fully polished, of course, and free of errors ;-)

minad · 2020-02-20T22:07:13Z

Close, see #160 for the reason

czurnieden · 2023-03-13T22:02:04Z

Rebased to actual (see date) develop branch.

Did a bit of a cleanup (no featuritis anymore, at least not that much) and added some documentation.
The base sieve has a size of 4096 bytes (that is fixed) as do the segments, but that is not fixed and can go down to 670 bits (largest prime-gap < 2^32 is 335). Default is 4096 bits, 512 bytes, which gives a good random access time (about 100 microseconds on my machine) and is not bad sequentially, especially in the base sieve. Warm-up time of the base sieve is about 50 microseconds.
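To illustrate where the 4096 bytes come from: a sieve over the odd numbers below 2^16 needs 32768 bits. The following is a rough sketch of such a base sieve, not the code from this PR; all names (`base_sieve`, `base_sieve_init`, `base_is_prime`, `base_count_primes`) are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BASE_LIMIT 65536u                   /* sieve covers [0, 2^16) */
#define BASE_BYTES (BASE_LIMIT / 2u / 8u)   /* odd numbers only: 4096 bytes */

static uint8_t base_sieve[BASE_BYTES];      /* bit i set => 2*i + 1 is not prime */

static void base_sieve_init(void)
{
   uint32_t i, j;
   memset(base_sieve, 0, sizeof(base_sieve));
   base_sieve[0] |= 1u;                     /* 1 is not a prime */
   for (i = 3u; i * i < BASE_LIMIT; i += 2u) {
      if ((base_sieve[(i >> 1) >> 3] >> ((i >> 1) & 7u)) & 1u) continue;
      /* cross out the odd multiples of i, starting at i*i */
      for (j = i * i; j < BASE_LIMIT; j += 2u * i) {
         base_sieve[(j >> 1) >> 3] |= (uint8_t)(1u << ((j >> 1) & 7u));
      }
   }
}

/* Random access: one table lookup once the sieve is built. */
static int base_is_prime(uint32_t n)
{
   assert(n < BASE_LIMIT);
   if (n == 2u) return 1;
   if ((n & 1u) == 0u) return 0;            /* 0 and all other even numbers */
   return !((base_sieve[(n >> 1) >> 3] >> ((n >> 1) & 7u)) & 1u);
}

/* Sequential access: count all primes below 2^16. */
static uint32_t base_count_primes(void)
{
   uint32_t n, count = 0u;
   for (n = 0u; n < BASE_LIMIT; n++) count += (uint32_t)base_is_prime(n);
   return count;
}
```

After `base_sieve_init()` a lookup costs a shift and a mask, which is why random access inside the base sieve is so much cheaper than in a segment that has to be built first.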

Binary size (stripped) is a combined 6k, the stripped s_mp_prime_tab.o has 2.8k (all compiled for 64-bit) so about double the size.

This sieve is useful to add a large-ish random prime to the Miller-Rabin test. It is quite simple (although computationally intense) to generate strong pseudoprimes in cryptographically relevant sizes if the tests use small primes only. In our case two and three in the first run and one random large one (could be composite with very small factors). Just checking the whole table in s_mp_prime_tab does not do it, there are known pseudoprimes that pass all 255 tests.

It would be better to combine the 2,3 test with a random large-ish prime from the sieve. We could generate that small prime with the deterministic version of MR but that costs; the little space and negligible runtime needed to do it with this sieve would pay off.
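A small native sketch of why bases 2 and 3 alone are weak: 1373653 = 829 * 1657 is the smallest composite that is a strong pseudoprime to both bases, and it is only exposed by a further base such as 5. The function names are hypothetical, and the sketch uses a 64-bit intermediate for brevity, which is an assumption that does not hold for every target LTM supports.

```c
#include <assert.h>
#include <stdint.h>

/* b^e mod m, with a 64-bit intermediate (assumption: uint64_t is available) */
static uint32_t powmod32(uint32_t b, uint32_t e, uint32_t m)
{
   uint64_t r = 1u, x = b % m;
   while (e != 0u) {
      if (e & 1u) r = (r * x) % m;
      x = (x * x) % m;
      e >>= 1;
   }
   return (uint32_t)r;
}

/* Returns 1 if the odd number n > 2 is a strong probable prime to base a. */
static int is_sprp32(uint32_t n, uint32_t a)
{
   uint32_t d = n - 1u, x;
   int s = 0, i;
   assert(n > 2u && (n & 1u) == 1u);
   while ((d & 1u) == 0u) {                 /* n - 1 = d * 2^s with d odd */
      d >>= 1;
      s++;
   }
   x = powmod32(a, d, n);
   if (x == 1u || x == n - 1u) return 1;
   for (i = 1; i < s; i++) {
      x = (uint32_t)(((uint64_t)x * x) % n);
      if (x == n - 1u) return 1;
      if (x == 1u) return 0;                /* non-trivial square root of 1 */
   }
   return 0;
}
```

Here `is_sprp32(1373653, 2)` and `is_sprp32(1373653, 3)` both report "probably prime" while `is_sprp32(1373653, 5)` does not, so a first round that uses only 2 and 3 passes this composite on to the later tests.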

czurnieden · 2023-03-16T01:19:18Z

I don't know what is on with the VS tests, maybe a caching problem?
9dabf48 complained about a comparison it never complained about, before.

18beae6 tried some changes, same error, same line-number.

b21bef4 tested my guess by changing the line of the error. VS threw the same error with the same line number.

czurnieden · 2023-03-16T22:15:40Z

So, VS, it was just coincidence that the numbers 1630 and later 1634 could be taken as line numbers because they were inside newly added code?
Yes, it was a problem with signedness, but those kinds of error messages are not very helpful!

czurnieden · 2023-03-16T22:46:18Z

It is just the sieve itself, optimized for quick random access for the primes up to 2^32 - 5. Sequential access above the prime 2^16 - 15 is rather slow but not unusable.

The time to add all primes up to 2^32 is 2:40 min with the default segment-size and 0:55 min with a slightly larger segment size.

Warm-up time (building the base sieve) is about 60 microseconds. Random access in the base sieve is <1 microsecond, random access in a segment is about 100 microseconds including the building of the segment, but that timing depends on the size of the segment. See documentation in bn.tex for more details.
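The random access into a segment can be sketched roughly like this: build a window of 4096 numbers starting at the requested value, cross out the multiples of the small primes, and return the first survivor. This is a simplified illustration under assumptions, not the PR's code: the base table is one byte per number instead of a bitset, the segment holds even numbers too, and the names `small_init` and `next_prime_segmented` are invented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEG_BITS 4096u                     /* default segment size: 512 bytes */

static uint8_t small_composite[65536];     /* byte-per-number base table, for brevity */

static void small_init(void)
{
   uint32_t i, j;
   memset(small_composite, 0, sizeof(small_composite));
   small_composite[0] = small_composite[1] = 1u;
   for (i = 2u; i < 256u; i++) {
      if (small_composite[i]) continue;
      for (j = i * i; j < 65536u; j += i) small_composite[j] = 1u;
   }
}

/* Smallest prime p >= n for 2^16 <= n <= 2^32 - 5; call small_init() first. */
static uint32_t next_prime_segmented(uint32_t n)
{
   uint8_t seg[SEG_BITS / 8u];             /* bit i set => lo + i is composite */
   uint32_t lo = n, p, i;
   assert(n >= 65536u && n <= UINT32_C(4294967291));
   for (;;) {
      uint32_t len = SEG_BITS;
      /* clamp the last segment so lo + len - 1 stays below 2^32 */
      if (UINT32_MAX - lo < len - 1u) len = UINT32_MAX - lo + 1u;
      memset(seg, 0, sizeof(seg));
      for (p = 2u; p < 65536u; p++) {
         uint32_t r, off;
         if (small_composite[p]) continue;
         if (p * p > lo + (len - 1u)) break;   /* p < 2^16, so p*p fits in 32 bits */
         r = lo % p;
         off = (r == 0u) ? 0u : p - r;         /* first multiple of p in the segment */
         for (i = off; i < len; i += p) seg[i >> 3] |= (uint8_t)(1u << (i & 7u));
      }
      for (i = 0u; i < len; i++) {
         if (!((seg[i >> 3] >> (i & 7u)) & 1u)) return lo + i;
      }
      lo += len;                           /* no prime in this window, try the next */
   }
}
```

Most of the time goes into building the window, which matches the observation that the timing depends on the segment size. Sequential access would reuse one segment for many primes instead of rebuilding it for every call.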

Fun fact: Pari/gp s = 0;forprime(n=0, 2^32, s+=n);s needs 1:24 min on the same machine.

MasterDuke17 · 2023-03-17T02:18:34Z

Do you find this faster than https://ceur-ws.org/Vol-1326/020-Forisek.pdf ?


czurnieden · 2023-03-17T04:48:47Z

@MasterDuke17 It is in the base sieve but it is not in the segments.

It is rarely as simple in real life as it is in the papers ;-)

We support 16-bit architectures. That means that the largest type we can use is a 32-bit integer (which is already a bigint in 16-bit archs) so we have to avoid anything larger than that. Forišek's implementation of the Miller-Rabin test needs 64-bit variables. It can be avoided but for a cost in runtime. If that would still be faster is a good question. And we already have a Miller-Rabin test. Do we need another?
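For illustration, this is roughly what "avoiding 64-bit variables at a cost in runtime" can look like: a modular multiplication that uses only 32-bit variables by replacing the single wide multiply with up to 32 shift-and-add steps. It is a sketch with hypothetical names (`addmod32`, `mulmod32`), not code from LTM or from Forišek's paper.

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) mod m without overflow; requires a < m and b < m */
static uint32_t addmod32(uint32_t a, uint32_t b, uint32_t m)
{
   return (a >= m - b) ? a - (m - b) : a + b;
}

/* (a * b) mod m using 32-bit variables only (binary shift-and-add) */
static uint32_t mulmod32(uint32_t a, uint32_t b, uint32_t m)
{
   uint32_t r = 0u;
   assert(m != 0u);
   a %= m;
   b %= m;
   while (b != 0u) {
      if (b & 1u) r = addmod32(r, a, m);   /* add the current multiple of a */
      a = addmod32(a, a, m);               /* double a modulo m */
      b >>= 1;
   }
   return r;
}
```

Every modular multiplication in the Miller-Rabin loop then costs a loop over the bits of `b` instead of one multiply and one division, which is the runtime price mentioned above.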

But it is much smaller in code, admitted, even with such a large table.

My code has some possibilities to optimize, I was just happy that I got VS to finally accept it ;-)

- The segments are normal sieves; they should be restricted to odd numbers, too. Less memory used and speedier.
- Extend the wheel from 2 to 210 = 2*3*5*7 or maybe even higher: more speed (to an extent) but also more code.
- Get rid of the segments altogether and compute the primes directly (complicated but prob. the fastest).
- Minimize the segments. The largest prime-gap < 2^32 is 335. If we only keep the odd numbers, we need 21 bytes, that is 6 32-bit integers, to find the next prime without recomputing that little segment. And that memory is on the stack, not on the heap (which might be a disadvantage if stack-space is tight). A lot of the code managing the memory can go, so we have more speed with less code and less memory.

But further optimization depends heavily on the success of the actual use: adding a random small prime to the first M-R tests in mp_prime_is_prime. The base of our M-R test is a bigint, so we can multiply the three together 2*3*random_prime and have a better first test without much more overhead (how much? t.b.d.) to reduce the calls to the quite costly Lucas and/or Frobenius-Underwood tests. LTM offers a way to avoid them and just rely on M-R only. That always gave me the…uhm…heebie-jeebies; it made me a bit uncomfortable.

It is quite easy to construct strong pseudoprimes to several small bases. Even as many as all of the bases in LTM's prime-table[1]. The algorithm is highly parallelizable, a medium botnet makes quick work of it. I put a quick&dirty example in this gist here. The example call at the end produces a strong pseudoprime to the bases 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 in a couple of seconds (378 milliseconds when I tried it now but the trials are random, so it might last longer sometimes) or the first 15 bases with parameters 50, 1, 61, 173, 64, 100000. Or all 25 prime bases smaller than 100 with parameters 97, 1, 101, 173, 64, 100000. For smaller pseudoprimes use 97, 1, 101, 113, 16, 1000 which is quite quick, spits out 100 spsp's in under 2 minutes.

We could use a normal random base as we already do in the final step which is composite in about n - n/log(n) cases and could be composed out of all small primes. And counting the latter is a problem I'm now stuck with ;-)

A risk that can be minimized to some extent with another large-ish prime in the first M-R round.
The obvious question: is it worth it?
I'll implement it and we can take a look at it. If it is too much hassle for too little gain: dump it, if not: take it.

[1] Albrecht, Martin R., et al. "Prime and prejudice: primality testing under adversarial conditions." Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2018.
PDF: https://eprint.iacr.org/2018/749.pdf

MasterDuke17 · 2023-03-17T11:57:27Z

Ah, we have it easier in MoarVM where we don’t support 16-bit architectures.

But this may be a perfect time to ask if you have any comments on https://stackoverflow.com/a/75724718/1077672?

czurnieden · 2023-04-05T14:03:03Z

@MasterDuke17 I ran some tests and found the method using BigInts about ten times slower than the native versions. The native version, able to generate primes up to 32 bit, is about the same speed as the sieve for random access, sometimes even faster. I have not tested sequential generation but it should be in the same ballpark.

If we use the whole mp_prime_is_prime() shebang it averages out at about three times slower for generating random primes up to 64 bit than the native version.

I have not run the tests against #541 which should shave another usec or two off in that range.

So: the whole work for nothing? Well, that's life ;-)

czurnieden force-pushed the bn_sieve branch 4 times, most recently from 85d0276 to c9c37f2 Compare April 8, 2019 02:21

minad reviewed Apr 25, 2019

czurnieden force-pushed the bn_sieve branch 4 times, most recently from ad19a9a to 880fdf0 Compare May 19, 2019 20:13

minad added the work in progress label May 21, 2019

czurnieden force-pushed the bn_sieve branch 2 times, most recently from 6659295 to 8a82665 Compare May 22, 2019 16:26

czurnieden force-pushed the bn_sieve branch 5 times, most recently from fdb02b0 to 52402ab Compare June 3, 2019 13:57

minad closed this Feb 20, 2020

czurnieden reopened this Mar 13, 2023

czurnieden force-pushed the bn_sieve branch 2 times, most recently from e7a3b41 to 8ae39ca Compare March 14, 2023 05:11

A prime sieve

c3216ae

czurnieden force-pushed the bn_sieve branch from 02a671e to c3216ae Compare March 16, 2023 22:20

czurnieden added feedback required and removed work in progress labels Mar 16, 2023

czurnieden added this to the v2.0.0 milestone Mar 16, 2023
