alg/dict: SlidingWindow: max zeroes and shortening #110

Nik-U · 2021-06-08T03:37:29Z

As discussed in #56. I finally got around to submitting these improvements.

This change adds new variations of SlidingWindow and modifies
Hybrid to use an arbitrary Decomposer after runs are removed.
SlidingWindowRTL and SlidingWindowShortRTL construct the windows
from least to most significant bit (right-to-left) instead of the
default left-to-right approach. SlidingWindowShort and
SlidingWindowShortRTL incorporate a "shortening" heuristic that
sometimes cuts windows short. The Z parameter restricts the
maximum number of zeroes that may appear in a window. If a window
is maximum length, contains at least one zero, and the bit
following the window is a one, then the window is shortened in
order to yield all trailing ones to the next window.
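
To make the description above concrete, here is a minimal left-to-right sketch of one reading of the Z limit and the shortening rule. It is not the Go code in this PR: the function name `sliding_windows` and the parameters `k`, `max_zeros`, and `shorten` are hypothetical, and the exact tie-breaking in the real decomposers may differ.

```python
def sliding_windows(n, k, max_zeros=None, shorten=False):
    """Split n into odd windows of at most k bits, scanning from the MSB.

    Returns a list of (value, shift) pairs with sum(value << shift) == n.
    max_zeros and shorten are illustrative stand-ins for the Z parameter
    and the shortening heuristic described in the PR text.
    """
    bits = bin(n)[2:]
    total = len(bits)
    windows = []
    i = 0
    while i < total:
        if bits[i] == "0":
            i += 1
            continue
        end = i + 1          # exclusive end; the window always ends on a one
        zeros = 0
        j = i + 1
        while j < total and j - i < k:
            if bits[j] == "0":
                zeros += 1
                if max_zeros is not None and zeros > max_zeros:
                    break    # taking the next one would exceed the zero limit
            else:
                end = j + 1
            j += 1
        window = bits[i:end]
        # Shortening: a full-length window containing a zero, followed by a
        # one, gives its trailing run of ones to the next window.
        if (shorten and len(window) == k and "0" in window
                and end < total and bits[end] == "1"):
            window = window[:window.rindex("0")].rstrip("0")
            end = i + len(window)
        windows.append((int(window, 2), total - end))
        i = end
    return windows


def recombine(windows):
    """Rebuild the integer from (value, shift) pairs."""
    return sum(value << shift for value, shift in windows)
```

For `0b10111` with `k=4`, the plain variant yields the windows `1011` and `1`, while the shortened variant yields `1` and `111`, so the run of ones stays together.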

This new behavior was inspired by the windowing technique used for
the upper half of smooth isogeny primes in [isogenychains]. This
update also adds p512-2 from [isogenychains] to the result set.

Improvements with the new Ensemble:

- p256_scalar improved from +2 to -1
- p384_scalar improved from +1 to +0
- isop512_field (new) is -3
- p2519_field improved from 263 to 261

Notably, the isop512_field results are better than [isogenychains]
when using their weighting metric (square = 0.8 * multiply).
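
As a rough illustration of that metric, the sketch below scores a chain by its operation counts. It assumes that doublings map to field squarings and additions to field multiplications; the function name `weighted_cost` and the sample counts are hypothetical and are not taken from the result set.

```python
def weighted_cost(doublings, additions, square_weight=0.8):
    """Cost in units of one multiplication, with square = 0.8 * multiply."""
    return square_weight * doublings + additions


# Hypothetical counts, for illustration only: under this metric a chain
# can be longer in raw operations and still be cheaper.
chain_a = weighted_cost(doublings=250, additions=40)
chain_b = weighted_cost(doublings=256, additions=35)
print(chain_a, chain_b)
```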


mmcloughlin · 2021-08-30T18:24:09Z

Really sorry I missed this! I'll need to take some time to go through this in more detail, but based on a quick skim it's looking great, and the results are exciting too, so I expect we can land this soon :)

briansmith

With these tweaks, the addition chain would be 289 = 253 doubles + 36 additions, saving one doubling and two additions compared to my previous result.

Analogous comments apply to the other chains.

This is very nice work!

briansmith · 2024-10-10T18:33:54Z

doc/results.md

-i286      = ((i257 << 7 + _111111) << 10 + _1100011) << 10
-return      (_10010101 + i286) << 6 + _1111
+_111      = _10 + _101
+_1000     = 1 + _111


There should be an optimization pass that replaces additions that yield an even number with doublings. This should become _1000 = 2 * _100 to replace one addition with a doubling.
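
A minimal sketch of such a pass, under assumptions: each step is a hypothetical `(result, a, b)` tuple with `result == a + b`, and the rewrite only fires when half of the result is already in the chain. The sample chain assumes `_100` and `_101` were computed earlier as shown, which the diff excerpt does not confirm.

```python
def prefer_doublings(steps):
    """Rewrite additions with an even result as doublings where possible.

    steps: list of (result, a, b) with result == a + b, where a and b are
    1 or results of earlier steps. The chain length is unchanged; only
    the operand choice changes.
    """
    seen = {1}
    rewritten = []
    for result, a, b in steps:
        if a not in seen or b not in seen or a + b != result:
            raise ValueError(f"invalid step: {result} = {a} + {b}")
        half = result // 2
        if a != b and result % 2 == 0 and half in seen:
            a = b = half     # result = 2 * half is a doubling
        rewritten.append((result, a, b))
        seen.add(result)
    return rewritten


# Assumed prefix of the chain, ending in the steps from the diff.
chain = [
    (0b10, 1, 1),
    (0b100, 0b10, 0b10),
    (0b101, 1, 0b100),
    (0b111, 0b10, 0b101),
    (0b1000, 1, 0b111),       # candidate: 2 * _100
    (0b1110, 0b111, 0b111),
    (0b10000, 0b10, 0b1110),  # candidate: 2 * _1000
]
optimized = prefer_doublings(chain)
```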

briansmith · 2024-10-10T18:34:11Z

doc/results.md

+_111      = _10 + _101
+_1000     = 1 + _111
+_1110     = 2*_111
+_10000    = _10 + _1110


Similarly, this can be replaced with a doubling.

briansmith · 2024-10-10T18:39:15Z

doc/results.md

+x16       = _11111111 + i28
+i37       = i28 << 8
+x24       = x16 + i37
+x32       = i37 << 8 + x24


It is strange that this changed from x32 = x16 << 16 + x16 to instead compute x24 unnecessarily (IIUC). If we fixed this then the addition chain would be two shorter than my previously-published one.
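
A quick check of the two routes, assuming `xN` denotes a run of N ones and that `i28 = _11111111 << 8` (which follows from `x16 = _11111111 + i28` if `x16` is sixteen ones). Note that Python needs explicit parentheses, because `<<` binds looser than `+` there, unlike in the chain notation.

```python
def ones(n):
    """Return the integer whose binary form is n consecutive ones."""
    return (1 << n) - 1


x8 = ones(8)
i28 = x8 << 8                     # 8 doublings
x16 = x8 + i28                    # 1 addition

# Route in the new output: goes through x24.
i37 = i28 << 8                    # 8 doublings
x24 = x16 + i37                   # 1 addition
x32_via_x24 = (i37 << 8) + x24    # 8 doublings + 1 addition

# Previous route: x32 = x16 << 16 + x16.
x32_direct = (x16 << 16) + x16    # 16 doublings + 1 addition

# Starting from x16 and i28, both routes use 16 doublings, but the x24
# route needs 2 additions where the direct route needs 1.
```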

briansmith · 2024-10-10T18:59:17Z

doc/results.md

+i190      = ((i169 << 4 + _101) << 8 + _1011011) << 7
+i210      = ((_100111 + i190) << 9 + _101111) << 8 + _101111
+i229      = ((_1110 + i210) << 11 + _1001111) << 5 + _111
+i249      = (i229 << 9 + _11011111 + _1000) << 8 + _101011


I am not sure why this addition of _11100111 = _11011111 + _1000 gets inlined here whereas none of the others seem to. It makes it harder to follow. Regardless, computing _11100111 doesn't help; I avoided computing it and then redid the middle of this chain with the remaining windows, which saved an additional addition.

briansmith · 2024-10-10T19:47:01Z

doc/results.md

+_10111    = _1000 + _1111
+_11001    = _10 + _10111
+_11011    = _10 + _11001
+_11111    = _1000 + _10111


Assuming it makes sense to calculate 11111:

At this point you have 11111 and 111, so you can compute x8 = 11111 << 3 + 111, or x10 = 11111 << 5 + 11111.

We need 6 * 8 * 4 + 2 = 192 + 2 = 194 consecutive ones to start.

x20 = x10 << 10 + x10
x40 = x20 << 20 + x20
x80 = x40 << 40 + x40
x160 = x80 << 80 + x80
x180 = x160 << 20 + x20
x190 = x180 << 10 + x10
x194 = x190 << 4 + _1111

EDIT: This would save one addition:

x20 = x10 << 10 + x10
x24 = x20 << 4 + _1111
x48 = x24 << 24 + x24
x96 = x48 << 48 + x48
x192 = x96 << 96 + x96

To get to x194 we could do:

x194 = x192 << 2 + _11

But as these two bits are in the "random" part of the addition chain, doing something else is likely better.
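
The two routes to x194 can be checked and counted with a small sketch. It assumes `xN` means N consecutive ones and that each shift amount matches the length of the operand being appended (so x180 is built with a shift of 20 and x194 from x190). The `Counter` helper and function names are hypothetical.

```python
def ones(n):
    """Return the integer whose binary form is n consecutive ones."""
    return (1 << n) - 1


class Counter:
    """Tallies doublings and additions for steps of the form x << k + y."""

    def __init__(self):
        self.doublings = 0
        self.additions = 0

    def step(self, x, k, y):
        self.doublings += k
        self.additions += 1
        return (x << k) + y


def first_route(x10, x4):
    c = Counter()
    x20 = c.step(x10, 10, x10)
    x40 = c.step(x20, 20, x20)
    x80 = c.step(x40, 40, x40)
    x160 = c.step(x80, 80, x80)
    x180 = c.step(x160, 20, x20)
    x190 = c.step(x180, 10, x10)
    x194 = c.step(x190, 4, x4)
    return x194, c


def second_route(x10, x4, x2):
    c = Counter()
    x20 = c.step(x10, 10, x10)
    x24 = c.step(x20, 4, x4)
    x48 = c.step(x24, 24, x24)
    x96 = c.step(x48, 48, x48)
    x192 = c.step(x96, 96, x96)
    x194 = c.step(x192, 2, x2)
    return x194, c


a, count_a = first_route(ones(10), ones(4))
b, count_b = second_route(ones(10), ones(4), ones(2))
```

Under these assumptions both routes reach x194 with the same number of doublings, and the second uses one addition fewer, consistent with the EDIT above.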

briansmith · 2024-10-10T19:47:47Z

doc/results.md

+i23       = i17 << 5 + i17
+i34       = i23 << 10 + i23
+i61       = (i34 << 4 + _11111000) << 21 + i34
+i113      = (i61 << 3 + _1111100) << 47 + i61


It is confusing as to why we're adding even numbers where the least significant bits don't contribute anything. I wonder if this indicates a bug in the new windowing algorithm where it doesn't realize that trailing zeros are worthless.
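
To illustrate why the trailing zeros add nothing, this sketch moves them out of the window and into the following shift. The helper names are hypothetical; the identity itself is plain integer arithmetic and does not depend on how the windowing code is implemented.

```python
def split_trailing_zeros(window):
    """Return (odd_part, zeros) with window == odd_part << zeros."""
    zeros = 0
    while window and window % 2 == 0:
        window >>= 1
        zeros += 1
    return window, zeros


def normalize_step(s, window, t):
    """Rewrite (x << s + window) << t so that the window is odd.

    Returns (s2, odd, t2) such that, for every x,
    ((x << s) + window) << t == ((x << s2) + odd) << t2.
    The doubling count s + t is unchanged.
    """
    odd, zeros = split_trailing_zeros(window)
    if zeros > s:
        raise ValueError("window has more trailing zeros than the shift")
    return s - zeros, odd, t + zeros


# i61 = (i34 << 4 + _11111000) << 21 + i34 becomes
#       (i34 << 1 + _11111) << 24 + i34
print(normalize_step(4, 0b11111000, 21))
# i113 = (i61 << 3 + _1111100) << 47 + i61 becomes
#        (i61 << 1 + _11111) << 49 + i61
print(normalize_step(3, 0b1111100, 47))
```

With odd windows, both steps reuse the same dictionary entry `_11111` instead of needing `_11111000` and `_1111100` separately.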
