Signed-digit multi-comb for ecmult_gen #546

peterdettman · 2018-08-04T10:27:56Z

See section 3.3 of https://eprint.iacr.org/2012/309 for a description of the algorithm. Briefly, the scalar is recoded into signed-binary form, then divided into several blocks. A separate precomputed table is prepared for each block, and a multiplication is performed using one comb per block, with the combs interleaved.
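To make the "signed-binary form" concrete, here is a minimal sketch of one common way to write an odd integer using only the digits +1 and -1. It is a toy over 32-bit integers, not the recoding code from this PR: the function names are hypothetical, it only handles odd inputs, and how the PR deals with arbitrary 256-bit scalars is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Recode an odd value k < 2^n into n digits, each +1 or -1, such that
 * k == sum(digits[i] * 2^i). Toy sizes only: n must be in [1, 31]. */
static void signed_digit_recode(uint32_t k, int n, int digits[]) {
    int i;
    assert(n >= 1 && n <= 31);
    assert((k & 1) == 1 && (k >> n) == 0);
    for (i = 0; i < n - 1; ++i) {
        /* Bit i+1 of k decides the sign of digit i. */
        digits[i] = ((k >> (i + 1)) & 1) ? 1 : -1;
    }
    digits[n - 1] = 1; /* the top digit is always +1 */
}

/* Evaluate a signed-digit string back to an integer (Horner form). */
static int64_t signed_digit_value(const int digits[], int n) {
    int64_t v = 0;
    int i;
    for (i = n - 1; i >= 0; --i) {
        v = 2 * v + digits[i];
    }
    return v;
}
```

For example, recoding 5 with n = 3 gives the digits (-1, +1, +1) from least to most significant, and 4 + 2 - 1 = 5. Because no digit is ever zero, every comb window selects a real table entry.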

This implementation is constant-time and preserves the existing scalar blinding. The NUMS group element is not yet used, and is perhaps not really useful here (there are no zeroes in the signed-digit recoding). ~~Static precomputation is not yet implemented.~~ Settings are overridden to let the exhaustive tests work.

You can play with the comb parameters in ecmult_gen.h.

Compared to the existing approach, this gives improved performance/memory tradeoffs, and allows considerable flexibility in the parameters depending on platform details.

The following table gives an idea of the sort of tradeoffs available (bench_sign results - best "min" of 3, asm=no, 64bit field and scalar, -O3, Haswell):

Blocks	Teeth	Spacing	Memory (KiB)	Time (us)
43	6	1	86	39.2
22	6	2	44	39.7
11	6	4	22	40.1
4	6	11	8	41.0
4	5	13	4	42.6
2	5	26	2	44.6
2	4	32	1	48.4
1	4	64	0.5	53.3
1	3	86	0.25	63.1
1	2	128	0.125	82.3
1	1	256	0.0625	140

For the existing approach: 44.6 us (64 KiB of precomputed data).
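The memory column in the table above matches a simple formula, sketched below. Two things are assumptions inferred from the table rather than taken from the PR code: each block stores 2^(teeth-1) points, and each stored point takes 64 bytes. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one stored point (secp256k1_ge_storage) occupies 64 bytes. */
#define ASSUMED_GE_STORAGE_BYTES 64

/* Table size in bytes, assuming 2^(teeth-1) points per block. */
static size_t comb_table_bytes(unsigned blocks, unsigned teeth) {
    assert(teeth >= 1 && teeth <= 16);
    return (size_t)blocks * ((size_t)1 << (teeth - 1)) * ASSUMED_GE_STORAGE_BYTES;
}

/* Number of scalar bits covered by the comb (COMB_BITS in the PR). */
static unsigned comb_bits(unsigned blocks, unsigned teeth, unsigned spacing) {
    return blocks * teeth * spacing;
}
```

With these assumptions, the 43/6/1 row gives 43 * 32 * 64 = 88064 bytes, which is 86 KiB, and covers 258 bits.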

gmaxwell · 2018-08-11T00:22:08Z

Oh wow! I wasn't expecting to see a big (relative to what we normally get) speedup any time soon.

apoelstra · 2018-09-23T00:44:30Z

src/ecmult_gen_impl.h

+#if USE_COMB
+    ctx->prec = (secp256k1_ge_storage (*)[COMB_BLOCKS][COMB_POINTS])secp256k1_ecmult_gen_ctx_prec;
+#if COMB_OFFSET
+    secp256k1_ge_from_storage(&ctx->offset, &secp256k1_ecmult_gen_ctx_offset);


secp256k1_ecmult_gen_ctx_offset is not declared except in gen_context.c, so this line doesn't compile for me when I set the parameters to 4/4/16.

peterdettman: This is a build system issue; there's no dependency of gen_context on ecmult_gen.h (and presumably others). At the moment, after changing comb parameters in ecmult_gen.h, you'd need to touch gen_context.c (or just make clean).

apoelstra: This has burned me multiple times while testing. @sipa, can you advise how to fix this?

apoelstra · 2018-09-23T00:46:02Z

src/ecmult_gen.h

+/* The remaining COMB_* parameters are derived values, don't modify these. */
+#define COMB_BITS (COMB_BLOCKS * COMB_TEETH * COMB_SPACING)
+#define COMB_GROUPED ((COMB_SPACING == 1) && ((32 % COMB_TEETH) == 0))
+#define COMB_OFFSET (COMB_BITS == 256)


Can you add some documentation somewhere about what the offset does?

peterdettman: Done: e8beef9.

apoelstra: Great, thanks!

peterdettman · 2018-09-23T05:52:22Z

The CI test failure (1407.7) seems spurious, might need investigation.

apoelstra · 2018-09-23T13:54:28Z

Agreed, it seems spurious. I had the same issue on #557. I kicked Travis to see if it works this time.

apoelstra · 2018-09-23T16:27:26Z

src/ecmult_gen.h

+
+  /* COMB_BLOCKS, COMB_TEETH, COMB_SPACING must all be positive and the product of the three (COMB_BITS)
+   * must evaluate to a value in the range [256, 288]. The resulting memory usage for precomputation
+   * will be COMB_POINTS_TOTAL * sizeof(secp256k1_ge_storage). */


Should comment that COMB_SPACING should not exceed 32 or else the bit_pos logic in secp256k1_ecmult_gen stops working.

peterdettman: I don’t see why, and all cases in the table above passed the tests before benchmarking.

apoelstra: Oh, I think I forgot to make clean after recompiling again. The tests do pass now.

But my reasoning was that you're only looking at one word of recoded at once, so if the bit_pos index jumps forward by more than one word, eventually bit_pos >> 5 will increment by more than one per iteration and you'll have skipped an entire word. But I'm re-reading the code with fresher eyes and now I see that this is exactly the intended behaviour.
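The word/bit indexing discussed here can be sketched as follows. This is an illustration only, with hypothetical helper names, and it assumes (as the thread suggests) that bit_pos advances by the spacing between successive teeth. Each read recomputes the word index from the absolute bit position, so a spacing larger than 32 simply skips over words.

```c
#include <assert.h>
#include <stdint.h>

/* Read the single bit at absolute position bit_pos from 32-bit words. */
static uint32_t read_bit(const uint32_t *recoded, unsigned bit_pos) {
    return (recoded[bit_pos >> 5] >> (bit_pos & 0x1F)) & 1;
}

/* Gather one comb window: `teeth` bits starting at `start`, each
 * `spacing` bits apart. The spacing may exceed 32. */
static uint32_t gather_window(const uint32_t *recoded, unsigned start,
                              unsigned teeth, unsigned spacing) {
    uint32_t bits = 0;
    unsigned tooth, bit_pos = start;
    assert(teeth >= 1 && teeth <= 32);
    for (tooth = 0; tooth < teeth; ++tooth) {
        bits |= read_bit(recoded, bit_pos) << tooth;
        bit_pos += spacing; /* may jump past one or more whole words */
    }
    return bits;
}
```

With 4 teeth and a spacing of 64 over a 256-bit value, a window starting at bit 0 reads bits 0, 64, 128 and 192, which live in words 0, 2, 4 and 6.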

apoelstra · 2018-09-23T16:30:26Z

src/ecmult_gen.h

+
+/* The remaining COMB_* parameters are derived values, don't modify these. */
+#define COMB_BITS (COMB_BLOCKS * COMB_TEETH * COMB_SPACING)
+#define COMB_GROUPED ((COMB_SPACING == 1) && ((32 % COMB_TEETH) == 0))


I think we should drop COMB_GROUPED because it's impossible for 32 % COMB_TEETH to be 0. (If COMB_TEETH is zero, things obviously won't work...but if it's ≥ 32, then several other things break: COMB_MASK overflows; the bits = expression in ecmult_gen_impl.h:228 left-shifts too far; I also get this bizarre error about the size of prec being negative in ecmult_gen.h:71.)

In fact, once COMB_TEETH gets much past 10 we start running into trouble in secp256k1_ecmult_gen_context_build because we put COMB_POINTS_TOTAL-many gejs on the stack.

Given that this is a ridiculous thing to do I don't think we should try to support it, e.g. by using heap allocations rather than stack allocations in gen_context_build. We should just assume it doesn't happen and drop COMB_GROUPED.

peterdettman: I assume you are misreading 32%T, which is 0 when T is a small power of 2. Still, practical values of COMB_TEETH probably peak at 8, and there is the stack concern as you say, so we should probably have the precompiler constrain it.

apoelstra: You're right, I misread 32 % COMB_TEETH (which is 0 for 1,2,4,8) as COMB_TEETH % 32 (which can't be 0). Let me reassess COMB_GROUPED in that light.
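The difference between the two readings is easy to check. The sketch below mirrors the COMB_GROUPED condition quoted above as a function; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* The condition from the COMB_GROUPED macro, as a function of the parameters. */
static int comb_grouped(unsigned teeth, unsigned spacing) {
    assert(teeth != 0); /* avoid a zero divisor in the modulus */
    return (spacing == 1) && ((32 % teeth) == 0);
}

/* The misreading: teeth % 32, which is 0 only for multiples of 32. */
static int misread_condition(unsigned teeth) {
    return (teeth % 32) == 0;
}
```

For teeth values of 1, 2, 4 and 8 with a spacing of 1, comb_grouped returns 1, while misread_condition returns 0 for all of them. Presumably the point of the real condition is that a window of consecutive bits then never straddles a 32-bit word boundary.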

peterdettman: 4c8ff9b adds precompiler guards on the comb constants.
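The contents of 4c8ff9b are not shown in this thread, so the following is only a sketch of what such guards could look like. The [256, 288] range for COMB_BITS comes from the comment in ecmult_gen.h quoted earlier; the upper bound of 8 on COMB_TEETH is an assumption based on the discussion above, and the parameter values are just one row of the benchmark table.

```c
/* Example parameters: the 4/5/13 row of the benchmark table. */
#define COMB_BLOCKS 4
#define COMB_TEETH 5
#define COMB_SPACING 13

#if COMB_BLOCKS < 1
#error "COMB_BLOCKS must be positive"
#endif
/* The upper bound of 8 is an assumption, not taken from the commit. */
#if COMB_TEETH < 1 || COMB_TEETH > 8
#error "COMB_TEETH must be in the range [1, 8]"
#endif
#if COMB_SPACING < 1
#error "COMB_SPACING must be positive"
#endif

/* Derived value, checked against the documented range. */
#define COMB_BITS (COMB_BLOCKS * COMB_TEETH * COMB_SPACING)
#if COMB_BITS < 256 || COMB_BITS > 288
#error "COMB_BITS must be in the range [256, 288]"
#endif
```

Checks of this kind turn a bad parameter choice into a clear compile-time error instead of a confusing failure such as a negative array size.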

apoelstra · 2018-09-23T16:57:46Z

Since USE_COMB is now always set, I think it's worth adding another commit that removes the old precomp code.

Other than that, and my nits about extreme parameter settings, ACK.

apoelstra · 2018-09-24T17:34:36Z

src/ecmult_gen_impl.h

-                bit = (recoded[bit_pos >> 5] >> (bit_pos & 0x1F)) & 1;
-                bits |= bit << tooth;
+                bit = recoded[bit_pos >> 5] >> (bit_pos & 0x1F);
+                bits &= ~(1 << tooth);


I think this bit-clearing line isn't actually needed, because bits starts at 0 and 1 << tooth is unique on each iteration.

peterdettman: Note that ‘bit’ may now contain junk (or noise) bits beyond bit 0. Iterations after the first therefore need to clear the target bit (and we try to leave noise in the high bits). The change is in response to [1], which I think @gmaxwell mentioned on IRC a while back.

[1] https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-alam.pdf

apoelstra: Ah! I understand, I was misreading bit << tooth as (bit & 1) << tooth.

apoelstra: Because the variable is named bit :P
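The exchange above can be illustrated by comparing the two ways of building a window. This is a sketch under stated assumptions: the diff hunk only shows the read and the bit-clearing line, so the XOR merge step and the single final mask below are my guesses at the surrounding code, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Old style: mask every read down to one bit before merging it. */
static uint32_t window_masked(const uint32_t *recoded, unsigned start,
                              unsigned teeth, unsigned spacing) {
    uint32_t bits = 0;
    unsigned tooth, bit_pos = start;
    assert(teeth >= 1 && teeth <= 31);
    for (tooth = 0; tooth < teeth; ++tooth) {
        uint32_t bit = (recoded[bit_pos >> 5] >> (bit_pos & 0x1F)) & 1;
        bits |= bit << tooth;
        bit_pos += spacing;
    }
    return bits;
}

/* New style: each read keeps junk above bit 0, so the target bit is
 * cleared first and the result is masked only once at the end. */
static uint32_t window_noisy(const uint32_t *recoded, unsigned start,
                             unsigned teeth, unsigned spacing) {
    uint32_t bits = 0;
    unsigned tooth, bit_pos = start;
    assert(teeth >= 1 && teeth <= 31);
    for (tooth = 0; tooth < teeth; ++tooth) {
        uint32_t bit = recoded[bit_pos >> 5] >> (bit_pos & 0x1F);
        bits &= ~((uint32_t)1 << tooth); /* drop junk left by earlier reads */
        bits ^= bit << tooth;            /* assumed merge step */
        bit_pos += spacing;
    }
    return bits & (((uint32_t)1 << teeth) - 1); /* assumed final mask */
}
```

Both functions return the same window. In the noisy variant, bit position `tooth` is exact after its iteration because it was cleared and then XORed with bit 0 of the read, and later iterations only touch higher positions.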

apoelstra · 2018-09-24T18:34:17Z

ACK except that I'd really like to fix the make clean issue.

apoelstra · 2018-10-02T14:01:00Z

src/ecmult_gen_impl.h

+        bit_pos = comb_off;
+        for (block = 0; block < COMB_BLOCKS; ++block) {
+#if COMB_GROUPED
+            bits = recoded[bit_pos >> 5] >> (bit_pos & 0x1F);


Should this be done as part of a cmov ladder, like the old algorithm, to avoid cache-timing attacks?

peterdettman: This line is reading a window from the scalar. IIUC, the cmov ladder you mean is the _ge_storage_cmov loop at lines 255-257.

apoelstra: Oh, yes! I missed that - you do indeed have a cmov ladder at the point where the secret data is actually used. Thanks.
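For readers unfamiliar with the term, a cmov ladder selects a table entry by touching every entry, so the memory access pattern does not depend on the secret index. The sketch below uses plain word arrays and hypothetical names; it is not the library's _ge_storage_cmov code, and a compiler may still introduce branches, so real constant-time code needs to be checked at the assembly level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_WORDS 16 /* hypothetical entry size: 16 x 32 bits = 64 bytes */

/* Copy src over dst when flag is 1; leave dst unchanged when flag is 0. */
static void entry_cmov(uint32_t *dst, const uint32_t *src, uint32_t flag) {
    uint32_t mask = (uint32_t)0 - flag; /* 0x00000000 or 0xFFFFFFFF */
    size_t i;
    for (i = 0; i < ENTRY_WORDS; ++i) {
        dst[i] = (dst[i] & ~mask) | (src[i] & mask);
    }
}

/* Select entry `index` from a flat table of n entries by visiting all of them. */
static void table_lookup(uint32_t *out, const uint32_t *table,
                         size_t n, size_t index) {
    size_t i;
    assert(n > 0); /* n is public, so checking it leaks nothing */
    for (i = 0; i < n; ++i) {
        entry_cmov(out, table + i * ENTRY_WORDS, (uint32_t)(i == index));
    }
}
```

Reading the window bits from the scalar, as in the hunk above, is a separate step; the ladder matters at the point where those bits choose which precomputed point to use.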

sipa · 2019-11-11T22:46:07Z

I've opened a rebased PR with somewhat better integration added in #693.

real-or-random · 2020-02-25T13:50:20Z

Closing this in favor of #693.

sipa · 2021-12-29T21:08:01Z

Another iteration: #1058.

peterdettman added 2 commits August 4, 2018 16:43

Signed-digit multi-comb for ecmult_gen

75d9e97

- see section 3.3 of https://eprint.iacr.org/2012/309

Support static precomputation with multi-comb

09ca146

apoelstra reviewed Sep 23, 2018

Add comments for context offset

e8beef9

apoelstra reviewed Sep 23, 2018

peterdettman added 5 commits September 24, 2018 20:06

Reduce side-channels from single-bit reads

183a7ff

Avoid unnecessary doublings in precomputation

e848342

Add precompiler guards on comb constants

4c8ff9b

Make use of negation optional via COMB_NEGATION

9156619

Add missing COMB_NEGATION for exhaustive tests

91179c1

apoelstra reviewed Sep 24, 2018

apoelstra reviewed Oct 2, 2018

gmaxwell mentioned this pull request May 25, 2019

Use a static constant table for small ecmult WINDOW_G sizes. #614

Closed

apoelstra mentioned this pull request Nov 9, 2019

Don't put an absurd amount of data onto the stack in some configs #692

Closed

sipa mentioned this pull request Nov 11, 2019

Signed-digit multi-comb for ecmult_gen (by peterdettman) #693

Closed

real-or-random closed this Feb 25, 2020

This was referenced Dec 29, 2021

Signed-digit multi-comb ecmult_gen algorithm #1057

Closed

Signed-digit multi-comb ecmult_gen algorithm #1058

Merged
