Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of fast division (recursive divrem only) #370

Merged
merged 1 commit into from
Oct 23, 2019

Conversation

czurnieden
Copy link
Contributor

Direct implementation of algorithms 1.8 "RecursiveDivRem" and 1.9 "UnbalancedDivision"
from:

Brent, Richard P., and Paul Zimmermann. "Modern computer arithmetic"
Vol. 18. Cambridge University Press, 2010
Available online

pages 19ff. in the above online document

@MasterDuke17
Copy link
Collaborator

Should this change anything about my in-progress implementation of the faster to_radix?

@czurnieden
Copy link
Contributor Author

@MasterDuke17

Should this change anything about my in-progress implementation of the faster to_radix?

No.
I'm just offering it, it may or may not find it's way into LTM eventually.

@minad minad added this to the v2.0.0 milestone Oct 14, 2019
@czurnieden czurnieden force-pushed the recursive_division branch 2 times, most recently from b9d6e5a to ff6759a Compare October 15, 2019 19:15
@minad
Copy link
Member

minad commented Oct 16, 2019

I would like to have this 👍

@minad minad self-requested a review October 16, 2019 07:53
Copy link
Member

@minad minad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you refactor this a bit, introducing a s_mp_div_small function? Then disabling s_mp_div_recursive/school would result in the small function being used.

   if (MP_HAS(S_MP_DIV_RECURSIVE)
       && (b->used > MP_KARATSUBA_MUL_CUTOFF)
       && (b->used <= ((a->used)/3*2))) {
      err = s_mp_div_recursive(a, b, c, d);
   } else if (MP_HAS(S_MP_DIV_SCHOOL)) {
      err = s_mp_div_school(a, b, c, d);
   } else {
      err = s_mp_div_small(a, b, c, d);
   }

It seems BN_MP_DIV_SMALL is the only macro we have which enables an alternative version being built, circumventing the MP_HAS configuration system.

@czurnieden
Copy link
Contributor Author

Could you refactor this a bit, introducing a s_mp_div_small function?

Yepp, no prob'.

It seems BN_MP_DIV_SMALL is the only macro we have which enables an alternative version being built, circumventing the MP_HAS configuration system.

As we are working towards 2.0.0 now (Ignoring the bugfix-versions 1.2.x): we could change it and include it in the MP_HAS configuration system?

bn_s_mp_div_small.c Outdated Show resolved Hide resolved
bn_s_mp_div_small.c Outdated Show resolved Hide resolved
@czurnieden
Copy link
Contributor Author

The snippet

   if (MP_HAS(S_MP_DIV_RECURSIVE)
       && (b->used > MP_KARATSUBA_MUL_CUTOFF)
       && (b->used <= ((a->used)/3*2))) {
      err = s_mp_div_recursive(a, b, c, d);
   } else if (MP_HAS(S_MP_DIV_SCHOOL)) {
      err = s_mp_div_school(a, b, c, d);
   } else {
      err = s_mp_div_small(a, b, c, d);
   }

Does not seem to work as intended (with LTM_NOTHING), any suggestions besides bracketing them off the old preprocessor way?

tommath_superclass.h Outdated Show resolved Hide resolved
bn_s_mp_div_school.c Outdated Show resolved Hide resolved
bn_s_mp_div_recursive.c Outdated Show resolved Hide resolved
@minad
Copy link
Member

minad commented Oct 17, 2019

What do you mean - does not work as intended? If you compile with LTM_NOTHING, only s_mp_div_small will be used as I see it. And yes, this is 2.0 - we can do breaking changes.

@czurnieden
Copy link
Contributor Author

If you compile with LTM_NOTHING, only s_mp_div_small will be used as I see it.

Only if I remove the guards and that will compile s_mp_school and s_mp_recursive, too, which goes against the intent of s_mp_div_small to decrease the size of the actual lib.

The way our configuration system works now results in the following entry in `tommath_class.h

#if defined(BN_MP_DIV_C)
#   define BN_MP_CMP_MAG_C
#   define BN_MP_COPY_C
#   define BN_MP_ZERO_C
#   define BN_S_MP_DIV_RECURSIVE_C
#   define BN_S_MP_DIV_SCHOOL_C
#   define BN_S_MP_DIV_SMALL_C
#endif

So all three are defined and hence all three get compiled if I remove the guards.

If I don't remove the guards, the functions are not included and the linker complains.

If I add extra guards around the branching like e.g.:

#ifndef  BN_S_MP_DIV_SMALL
   if (MP_HAS(S_MP_DIV_RECURSIVE)
       && (b->used > MP_KARATSUBA_MUL_CUTOFF)
       && (b->used <= ((a->used)/3*2))) {
      err = s_mp_div_recursive(a, b, c, d);
   } else {
      err = s_mp_div_school(a, b, c, d);
   }
#else 
      err = s_mp_div_small(a, b, c, d);
#endif 

it works but it looks ugly and defies the reason you introduced MP_HAS(x) for: to get rid of all of these preprocessors branches.

Any suggestions?

@minad
Copy link
Member

minad commented Oct 17, 2019

Yes, we should rework the configuration system. Optional dependencies (guarded by MP_HAS) should not be required automatically in tommath_class.h. I made such suggestions already in #301. But this is independent of this PR and we can address this later.

@czurnieden
Copy link
Contributor Author

we can address this later.

OK.
So I removed all BN_S_MP_DIV_SMALL guards (including the definition in tommath_superclass.h).

[...] in #301.

It is still on the TODO list, no worry.
I think I'll boldly add the 2.0.0 milestone.

@minad
Copy link
Member

minad commented Oct 20, 2019

@czurnieden this PR seems almost ready. Can you rebase it?

@czurnieden
Copy link
Contributor Author

this PR seems almost ready.

Bugfixes not withstanding, of course, but yes.
Changed label accordingly.

demo/test.c Outdated Show resolved Hide resolved
demo/test.c Outdated Show resolved Hide resolved
demo/test.c Outdated
@@ -2325,6 +2325,140 @@ static int test_mp_radix_size(void)
return EXIT_FAILURE;
}

#ifndef S_MP_DIV_SMALL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the guard and test both functions test_s_mp_div_recursive, trst_s_mp_div_small. Additional guards are rarely necessary. MP_HAS takes care of it.

@minad minad self-requested a review October 21, 2019 14:53
@minad
Copy link
Member

minad commented Oct 21, 2019

@czurnieden please squash @sjaeckel this looks ready from my side

@minad
Copy link
Member

minad commented Oct 21, 2019

I forgot - does it make sense to add an additional cut off here? Instead of using karatsuba?

@czurnieden
Copy link
Contributor Author

" out-of-date with the base branch"?
You are really busy the last couple of days!

I forgot - does it make sense to add an additional cut off here? Instead of using karatsuba?

Found no differences above Karatsuba only below (it needs fast multiplication to function). The larger difference is in the relation numerator/denominator where it stops being faster approaching ~2/3 and even starts to get slower above 0.8.

I don't think that any kind of tuning would make sense.

@minad
Copy link
Member

minad commented Oct 21, 2019

Hmm ok, but I mean is there a theoretical reason why the cutoff should be the same? If there is none, I would prefer if you introduce another constant (even if the value is the same).

@czurnieden
Copy link
Contributor Author

Hmm ok, but I mean is there a theoretical reason why the cutoff should be the same?

As I said: it needs fast multiplication to function and the lowest cutoff for that is the Karatsuba cutoff; there is a direct connection, not just coincidence.

We can ignore Comba here because the most likely reason for not wanting the Comba algorithm is lack of memory which means there is no other fast multiplication in that case and I'm pretty sure no space for fast division either.

@minad
Copy link
Member

minad commented Oct 21, 2019

Ok!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants