Perf/tensor commitment #341

@AlexandreBelling AlexandreBelling commented Feb 13, 2023

  • Fix the tests in utils_test.go so that they use NaiveMulMod instead of naiveMulMod
  • Implement a reference benchmark test for sis

Use case for the SIS hash function:

The function instance.Sum(...) is called in "bulk".

Namely, in the prover we want to hash many vectors at the same time. Thus, there is no need to parallelize within the instance.Sum function. For this reason, we only care about the single-threaded performance of the Sum function.
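
The bulk usage described above could look roughly like the following sketch. It is not taken from the prover: `hashAll`, `toySum`, and the worker count are hypothetical names introduced for illustration, and `toySum` merely stands in for a single-threaded call such as instance.Sum(...).

```go
package main

import (
	"fmt"
	"sync"
)

// hashAll applies a single-threaded hash function to every vector.
// Parallelism is across vectors, never inside one call to sum.
// (Hypothetical helper, not part of the library.)
func hashAll(vectors [][]uint64, workers int, sum func([]uint64) uint64) []uint64 {
	out := make([]uint64, len(vectors))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one worker, so no lock is needed.
				out[i] = sum(vectors[i])
			}
		}()
	}
	for i := range vectors {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

// toySum is a placeholder for the real hash; it is NOT the SIS hash.
func toySum(v []uint64) uint64 {
	var s uint64
	for _, e := range v {
		s = (s*31 + e) % 251
	}
	return s
}

func main() {
	vectors := [][]uint64{{172, 201, 0, 0}, {1, 2, 3, 4}, {0, 0, 0, 0}}
	fmt.Println(hashAll(vectors, 2, toySum))
}
```

With this layout, throughput depends only on how fast one core can run the hash, which is why the single-threaded cost is the figure of interest.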

In the previous version (pure-SIS), we noticed an issue with the memory bandwidth footprint of the function: on 96 cores, the memory bus was the bottleneck. We believe this will not be the case with the ring-SIS approach that we wish to optimize.

The function is importantly used

The vectors we intend to hash (seen as slices of field elements) typically contain plenty of successive zeroes.

The SIS hash function works by splitting the field elements into limbs, interpreting the limbs as the coefficients of polynomials, and then computing a scalar product of polynomials.

Let us illustrate this with an example (where we assume q = 251, log2beta = 2, n = 2). Let $x = (172, 201, 0, 0, 0, 0)$ be an input vector where each entry is understood to be in the field modulo 251.

  • First, we need to split $x$ into limbs. Since log2beta = 2, each limb stores 2 bits of the input, and we need a total of 4 limbs for each field element (an entry modulo 251 fits in 8 bits). For 172, we have the decomposition $172 = 2 * 64 + 2 * 16 + 3 * 4 + 0 * 1$ and for 201, $201 = 3 * 64 + 0 * 16 + 2 * 4 + 1 * 1$ (equivalent to the base 4 decomposition). Thus, we obtain the following limb decomposition for $x$ (6 entries, 4 limbs each, 24 limbs in total).
x' = (2, 2, 3, 0, 3, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
  • Then, the limb decomposition $x'$ is split into chunks of size $n = 2$. Each chunk is interpreted as the coefficients of a polynomial modulo X^n + 1, with the highest-degree coefficient first. Thus, we have the following 12 polynomials.
x'' = (2X + 2, 3X, 3X, 2X + 1, 0, 0, 0, 0, 0, 0, 0, 0) = (P0, P1, ..., P11)

Note that P4 through P11 are the zero polynomial.

  • Thereafter, the result of the hashing is obtained by computing

H = P0A0 + P1A1 + ... + P11A11

where the Ai are the polynomials of the SIS key. The optimization is to notice that since P4 through P11 are the zero polynomial, we can just "skip" the corresponding terms in the scalar product.

While it looks simple, it is a crucial optimization for the prover.
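
As a rough illustration, the steps of the example can be sketched in Go for the input x = (172, 201, 0, 0, 0, 0) with q = 251, log2beta = 2, n = 2. This is a toy sketch and not the code from this PR: the helper names (`limbs`, `mulNegacyclic`, `hash`, `toyKey`) and the key values are assumptions made up for the illustration, and the schoolbook multiplication is chosen for readability only.

```go
package main

import "fmt"

const (
	q        = 251 // toy modulus from the example
	log2Beta = 2   // bits per limb
	n        = 2   // polynomials are taken modulo X^n + 1
	limbsPer = 4   // limbs per field element (8 bits / log2Beta)
)

// limbs splits each entry into base-2^log2Beta digits, most significant first.
func limbs(x []uint64) []uint64 {
	out := make([]uint64, 0, len(x)*limbsPer)
	mask := uint64(1)<<log2Beta - 1
	for _, v := range x {
		for i := limbsPer - 1; i >= 0; i-- {
			out = append(out, (v>>(uint(i)*log2Beta))&mask)
		}
	}
	return out
}

// mulNegacyclic returns a*b modulo (X^n + 1, q); a[k] is the coefficient of X^k.
func mulNegacyclic(a, b []uint64) []uint64 {
	res := make([]uint64, n)
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			prod := a[i] * b[j] % q
			if k := i + j; k < n {
				res[k] = (res[k] + prod) % q
			} else {
				// X^n = -1, so the term wraps around with a sign flip.
				res[k-n] = (res[k-n] + q - prod) % q
			}
		}
	}
	return res
}

func isZero(c []uint64) bool {
	for _, v := range c {
		if v != 0 {
			return false
		}
	}
	return true
}

// hash computes the sum of P_i * A_i. With skipZeros set, all-zero chunks are
// skipped, since they contribute nothing to the sum.
func hash(x []uint64, key [][]uint64, skipZeros bool) []uint64 {
	l := limbs(x)
	res := make([]uint64, n)
	for i := 0; i*n < len(l); i++ {
		chunk := l[i*n : (i+1)*n]
		if skipZeros && isZero(chunk) {
			continue
		}
		// The example lists the highest-degree coefficient first, so reverse.
		p := make([]uint64, n)
		for k := range p {
			p[k] = chunk[n-1-k]
		}
		term := mulNegacyclic(p, key[i])
		for k := range res {
			res[k] = (res[k] + term[k]) % q
		}
	}
	return res
}

// toyKey builds arbitrary placeholder key polynomials (made-up values).
func toyKey(size int) [][]uint64 {
	key := make([][]uint64, size)
	for i := range key {
		key[i] = make([]uint64, n)
		for k := range key[i] {
			key[i][k] = uint64(3*i+5*k+1) % q
		}
	}
	return key
}

func main() {
	x := []uint64{172, 201, 0, 0, 0, 0}
	key := toyKey(len(x) * limbsPer / n)
	fmt.Println(limbs(x)[:8]) // limbs of 172 followed by limbs of 201
	fmt.Println(hash(x, key, true), hash(x, key, false))
}
```

Both calls to `hash` return the same value. The skipping variant only avoids the polynomial multiplications for the all-zero chunks, which is where the savings come from when inputs contain long runs of zeroes.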

@gbotrel gbotrel merged commit 66b538e into Consensys:perf/tensor-commitment Feb 13, 2023