Simplifications #434
@sjaeckel @czurnieden Since this is a huge PR touching a lot of functions, it might make sense to split it into smaller pieces that are easier to consume, if that is desired.
had only a quick look for now, basically looks good
Generally, regarding this PR my proposal would be the following. I split it in three parts:
@sjaeckel What do you think?
That was my first thought when I saw this huge PR.
I split the more complicated parts of this PR into multiple commits, one per logical unit of functions. These can also be extracted into separate PRs if desired. This PR is rebased on #435.
@czurnieden @sjaeckel If you want to open the renaming bottle, I would propose to rename
If you want to include the proposed changes (inclusion of the subfunctions like Ah, I forgot one: the data type for the shift amount is still
These are private functions, what you call them should not matter at all.
But one of the many parental duties ;-)
One could provide a single shift function that decides the direction based on the sign, but I think it makes sense to keep it as is, or rather to return an error for negative shifts, which is what I did in this PR. mp_lshd and mp_rshd are special, however: they ignore negative shifts, since mp_rshd doesn't return an error code.
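To illustrate the two policies for negative shift counts discussed here, the following is a minimal sketch on a toy fixed-capacity integer. All names (`toy_int`, `toy_shift_left`, `toy_rshd`, the `TOY_*` codes) are hypothetical stand-ins and not libtommath's real types or signatures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not libtommath's real types or error codes. */
typedef enum { TOY_OKAY = 0, TOY_VAL = -1, TOY_OVF = -2 } toy_err;
#define TOY_CAP 16

typedef struct {
    unsigned used;           /* number of digits in use */
    unsigned dp[TOY_CAP];    /* digits, least significant first */
} toy_int;

/* Policy 1: the function returns an error code, so a negative
 * shift count is reported to the caller instead of being guessed at. */
toy_err toy_shift_left(toy_int *a, int b)
{
    if (b < 0) {
        return TOY_VAL;
    }
    if (b == 0 || a->used == 0) {
        return TOY_OKAY;
    }
    if ((unsigned)b > TOY_CAP - a->used) {
        return TOY_OVF;
    }
    memmove(a->dp + b, a->dp, a->used * sizeof(a->dp[0]));
    memset(a->dp, 0, (size_t)b * sizeof(a->dp[0]));
    a->used += (unsigned)b;
    return TOY_OKAY;
}

/* Policy 2: the function returns void, so it has no way to report
 * an error and silently ignores a negative (or zero) shift count. */
void toy_rshd(toy_int *a, int b)
{
    unsigned keep;
    if (b <= 0) {
        return;
    }
    if ((unsigned)b >= a->used) {
        memset(a->dp, 0, sizeof(a->dp));
        a->used = 0;
        return;
    }
    keep = a->used - (unsigned)b;
    memmove(a->dp, a->dp + b, keep * sizeof(a->dp[0]));
    memset(a->dp + keep, 0, (size_t)b * sizeof(a->dp[0]));
    a->used = keep;
}
```

The asymmetry falls out of the return types: only a function that can return an error code can reject bad input, which is why the void-returning variant has to pick a silent behavior.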
@czurnieden If you have time in the next few days, I would highly appreciate it if you reviewed the things I did here, since this could easily lead to bugs. I did things step by step and ran the test suite quite often, but things could have happened. Feel free to also push additional commits if something comes up.
@czurnieden One other thing regarding comba, since you asked about performance. Actually I don't expect there to be a huge difference between the asm and the C version. In ltm we only have the generic comba and nothing unrolled. I think in tfm the performance difference comes mainly from loop unrolling in combination with the custom assembly. Probably also from vectorization.
Took a short first glimpse, and can say with confidence: it needs a second glimpse and a much longer one, too. BTW: I found that the cut-off values in
Computed with the abomination below
So far, so good.
@sjaeckel this is good to merge!
Originally I made those macros. However, we have many other small functions like mp_clamp and mp_exch which are also not implemented as macros right now. If we used C99, I would implement them as private static inline functions, and mp_exch would be a public static inline function. But since we are bound to C89, we simply use normal functions. To achieve optimal performance, one should use either link-time optimization or amalgamation.
* These double checks are not necessary.
* The compiler will move the early return outside of the called function; basically, the function is partially inlined.
* However, LTO/amalgamation is needed for this optimization.
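As an illustration of the kind of double check meant here, the sketch below assumes it refers to a caller-side guard that repeats a callee's own early-return condition. The names `toy_buf`, `toy_grow`, `toy_reserve_checked` and `toy_reserve` are hypothetical and only model the pattern.

```c
#include <stdlib.h>

/* Hypothetical growable digit buffer; not libtommath's real type. */
typedef struct {
    int alloc;       /* number of digits allocated */
    unsigned *dp;    /* digit storage */
} toy_buf;

/* Grow to at least `size` digits. Returns 0 on success, -1 on failure. */
int toy_grow(toy_buf *a, int size)
{
    unsigned *tmp;
    int i;
    if (a->alloc >= size) {
        return 0;    /* early return: already large enough */
    }
    tmp = (unsigned *)realloc(a->dp, (size_t)size * sizeof(*tmp));
    if (tmp == NULL) {
        return -1;
    }
    for (i = a->alloc; i < size; i++) {
        tmp[i] = 0;  /* zero the newly added digits */
    }
    a->dp = tmp;
    a->alloc = size;
    return 0;
}

/* Before: the caller repeats the check that toy_grow performs anyway. */
int toy_reserve_checked(toy_buf *a, int size)
{
    if (a->alloc < size) {
        return toy_grow(a, size);
    }
    return 0;
}

/* After: the caller relies on the early return inside toy_grow. */
int toy_reserve(toy_buf *a, int size)
{
    return toy_grow(a, size);
}
```

Both callers behave identically. Whether the second form is as fast depends on the compiler seeing `toy_grow` at the call site, which across translation units typically requires link-time optimization or an amalgamated build.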
* This is the final commit of a series of simplifications, containing only the regenerated files and the explanation in the commit message.
* This is in preparation for the size_t change/a potential representation change to use the full width as in tfm, if a (partial?) merge with tfm is desired. These changes have their own merits, however.
* Remove obfuscating tmpx digit pointers (fewer variables; it is more obvious what is being manipulated).
* Reduce the scope of variables where possible.
* Stricter error handling/checking (for example, handling in karatsuba was broken).
* In some cases the result was written even in the case of an error (e.g. s_mp_is_divisible). This will hide bugs, since the user should check the return value (enforced by MP_WUR). Furthermore, if the user accesses the non-initialized result, valgrind will complain, for example. Global static analysis like coverity will also detect the issue. Therefore this improves the status quo.
* Introduce a generic, private MP_EXCH macro which can be used to swap values.
* Introduce s_mp_copy_digs/s_mp_zero_digs/s_mp_zero_buf.
* Some control flow simplifications, e.g., loops instead of goto.
* Renamings of variables/labels for consistency.
* Renamings of mul/sqr functions for more consistency, e.g., comba instead of the fast suffix.
* I didn't read through some very complex functions. They are so complex, I am too afraid and lazy to touch them. Maybe someone responsible wants to simplify them if possible. Hint... Hint...
  - mp_prime_strong_lucas_selfridge.c
  - s_mp_exptmod.c
  - s_mp_exptmod_fast.c
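A generic swap macro and small digit helpers of the kind listed above could look roughly like the following sketch. The names `TOY_EXCH`, `toy_digit`, `toy_copy_digs` and `toy_zero_digs` are hypothetical, and the actual definitions in the PR may differ.

```c
#include <string.h>

/* Hypothetical generic swap. C89 has no typeof, so the type is
 * passed explicitly as the first argument. */
#define TOY_EXCH(t, a, b) do { t tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical digit type. */
typedef unsigned long toy_digit;

/* Copy n digits from s to d; the two ranges must not overlap. */
void toy_copy_digs(toy_digit *d, const toy_digit *s, int n)
{
    if (n > 0) {
        memcpy(d, s, (size_t)n * sizeof(toy_digit));
    }
}

/* Set n digits starting at d to zero. */
void toy_zero_digs(toy_digit *d, int n)
{
    while (n-- > 0) {
        *d++ = 0;
    }
}
```

Helpers like these replace hand-written pointer loops at each call site, which fits the stated goal of fewer temporary digit pointers and more obvious data flow.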
Sorry for the delay, I was sick for the last week :-\
This PR contains two things:

* Renaming some mp_(root|expt|log)_u32 functions back to the suffix-less version. Using uint32_t for those functions was a mistake, I have to admit that now. In particular, using a suffix is not a good idea, since it is now a slight obstacle to changing the types :( These functions should either take/return bitcounts (int) or mp_digits. For example, mp_log is defined in terms of count_bits, so this is the natural type to use. mp_root calls mp_mul_d, so mp_digit would be a good fit. For now I am simply using int.
* Many simplifications. I hope that those will be helpful when we decide to convert the bitcount type to size_t or something else. Furthermore, these simplifications are helpful if we want to change the digit representation such that the full width is used.