Use PRNG on get_indices_from_sponge #69

Open
wants to merge 6 commits into
base: main

@Cesar199999 Cesar199999 commented Jun 17, 2024

@WizardOfMenlo thanks for the help 💯

commit Ligero<Fr>/12    time:   [1.4471 ms 1.4631 ms 1.4801 ms]
                        change: [-21.307% -18.580% -15.978%] (p = 0.00 < 0.05)
                        Performance has improved.
                        
commit Ligero<Fr>/14    time:   [4.4859 ms 4.5037 ms 4.5233 ms]
                        change: [+0.2918% +0.8145% +1.3480%] (p = 0.00 < 0.05)
                        Change within noise threshold.

commit Ligero<Fr>/16    time:   [15.467 ms 15.509 ms 15.555 ms]
                        change: [-0.7112% -0.2467% +0.2200%] (p = 0.32 > 0.05)
                        No change in performance detected.

commit Ligero<Fr>/18    time:   [58.271 ms 58.679 ms 59.097 ms]
                        change: [-0.0904% +0.8977% +1.8792%] (p = 0.08 > 0.05)
                        No change in performance detected.
                        
commit Ligero<Fr>/20    time:   [262.17 ms 262.48 ms 262.85 ms]
                        change: [-0.3453% -0.1839% -0.0186%] (p = 0.03 < 0.05)
                        Change within noise threshold.

open Ligero<Fr>/12      time:   [13.670 ms 13.709 ms 13.747 ms]
                        change: [-52.005% -51.770% -51.535%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/14      time:   [27.383 ms 27.465 ms 27.547 ms]
                        change: [-34.929% -34.636% -34.370%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/16      time:   [56.308 ms 56.559 ms 56.810 ms]
                        change: [-19.442% -19.056% -18.703%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/18      time:   [118.89 ms 119.32 ms 119.75 ms]
                        change: [-10.087% -9.6919% -9.3017%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/20      time:   [251.84 ms 252.60 ms 253.38 ms]
                        change: [-6.5656% -6.1312% -5.7239%] (p = 0.00 < 0.05)
                        Performance has improved.

verify Ligero<Fr>/12    time:   [17.168 ms 17.283 ms 17.428 ms]
                        change: [-47.372% -46.938% -46.459%] (p = 0.00 < 0.05)
                        Performance has improved.

verify Ligero<Fr>/14    time:   [31.243 ms 31.535 ms 31.879 ms]
                        change: [-32.390% -31.661% -30.855%] (p = 0.00 < 0.05)
                        Performance has improved.

verify Ligero<Fr>/16    time:   [57.835 ms 58.025 ms 58.227 ms]
                        change: [-20.233% -19.842% -19.466%] (p = 0.00 < 0.05)
                        Performance has improved.

verify Ligero<Fr>/18    time:   [110.80 ms 111.07 ms 111.35 ms]
                        change: [-12.179% -11.850% -11.526%] (p = 0.00 < 0.05)
                        Performance has improved.

verify Ligero<Fr>/20    time:   [218.17 ms 219.09 ms 220.19 ms]
                        change: [-7.1621% -6.6065% -6.0154%] (p = 0.00 < 0.05)
                        Performance has improved.

@mmagician

That's a decent improvement. Could you add some links/research about the security analysis of this? I recall you mentioned this technique during one of the calls

@Antonio95

As far as I know, there is no published research on that approach, just as there is none on the approach in our current code. The task required by the FS transform here is: "generate t (not necessarily distinct) indices between 0 and e - 1 in a deterministic, random-looking way using the current state of the sponge S".
a) Currently, we do: squeeze t times from the sponge taking the mod-e residue class each time.
b) In the PR, we do: seed a ChaCha20 PRNG P with enough bytes squeezed from S, then generate t random numbers using P, taking the mod-e residue class each time.
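A minimal sketch of the two options, to make the difference concrete. Everything here is a stand-in and an assumption: `ToySponge` and `ToyPrng` are hypothetical, non-cryptographic generators (SplitMix64 and xorshift64*) used only so the example runs with the standard library; the actual code uses a real sponge and, in the PR, ChaCha20 seeded with more than one squeezed word.

```rust
/// Toy stand-in for a cryptographic sponge (NOT secure): a SplitMix64 stream.
/// Only the "squeeze" interface matters for this illustration.
struct ToySponge {
    state: u64,
}

impl ToySponge {
    fn new(transcript: u64) -> Self {
        Self { state: transcript }
    }

    /// Squeeze one 64-bit word (one SplitMix64 step).
    fn squeeze(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Toy stand-in for ChaCha20 (NOT secure): xorshift64*.
struct ToyPrng {
    state: u64,
}

impl ToyPrng {
    fn from_seed(seed: u64) -> Self {
        // xorshift must not start from the all-zero state.
        Self { state: if seed == 0 { 1 } else { seed } }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.state = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }
}

/// a) Squeeze t times from the sponge, reducing each output mod e.
fn indices_by_squeezing(sponge: &mut ToySponge, t: usize, e: u64) -> Vec<u64> {
    (0..t).map(|_| sponge.squeeze() % e).collect()
}

/// b) Squeeze once to seed a PRNG, then draw t values from it, reducing each mod e.
fn indices_by_prng(sponge: &mut ToySponge, t: usize, e: u64) -> Vec<u64> {
    let mut prng = ToyPrng::from_seed(sponge.squeeze());
    (0..t).map(|_| prng.next_u64() % e).collect()
}

fn main() {
    let (t, e) = (8, 1 << 10);
    // Same transcript state in both cases, so both runs are deterministic.
    let a = indices_by_squeezing(&mut ToySponge::new(42), t, e);
    let b = indices_by_prng(&mut ToySponge::new(42), t, e);
    println!("a) {:?}", a);
    println!("b) {:?}", b);
}
```

In a), the number of sponge squeezes grows with t; in b), it is fixed by the seed length, and the remaining work is done by the PRNG.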

What I want to stress is that both a) and b) are realisations of the FS transform (that is, instantiations of the ROM), and I'm not aware of published security analysis specific to either of them. At the same time, both seem to be well known in the community; it's just that we are more used to working with a). For instance, Giacomo's STIR repo uses b), and he explained some other advantages of this approach in our last meeting (such as being potentially less biased).
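The bias remark is not elaborated in this thread. As general background only, one well-known source of bias is the mod-e reduction itself: when a uniform w-bit value is reduced mod e and e does not divide 2^w, some residues have one more preimage than others. The hypothetical helper below (not part of the codebase) counts this for small parameters.

```rust
/// For a uniform `bits`-bit value x, report how many x map to each residue x % e.
/// Returns (min_count, max_count) over all residues; the two differ by at most 1.
fn residue_count_range(bits: u32, e: u64) -> (u64, u64) {
    assert!(bits < 64 && e > 0);
    let n = 1u64 << bits; // number of possible inputs
    let q = n / e; // every residue has at least q preimages
    let r = n % e; // residues 0..r have one extra preimage
    if r == 0 { (q, q) } else { (q, q + 1) }
}

fn main() {
    // 256 inputs, e = 100: residues 0..56 are hit 3 times, the rest 2 times.
    println!("{:?}", residue_count_range(8, 100)); // prints (2, 3)
    // e divides 2^8, so the reduction is exactly uniform.
    println!("{:?}", residue_count_range(8, 64)); // prints (4, 4)
}
```

The relative gap shrinks as the input width grows compared with e, so how much this matters depends on how many bits are reduced in each approach.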

@autquis
Collaborator

autquis commented Jun 20, 2024

Did you consider Fisher-Yates as I mentioned? Because we need t columns anyway for the soundness. Right?

@WizardOfMenlo

Only t columns, not necessarily distinct. So you can just sample and deduplicate.
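A small sketch of "sample and deduplicate", assuming the sampled indices are already available as a slice. The function name `dedup_indices` is hypothetical, and whether the actual code deduplicates this way is an assumption.

```rust
use std::collections::BTreeSet;

/// Deduplicate sampled indices; going through a BTreeSet also sorts them.
fn dedup_indices(samples: &[usize]) -> Vec<usize> {
    samples.iter().copied().collect::<BTreeSet<_>>().into_iter().collect()
}

fn main() {
    // Hypothetical sampler output for t = 6, with repeats.
    let samples = [5, 2, 5, 9, 2, 7];
    let distinct = dedup_indices(&samples);
    println!("{:?}", distinct); // prints [2, 5, 7, 9]
}
```

Only the distinct columns then need to be opened, which is why no shuffle such as Fisher-Yates is required here.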

@Antonio95

A written reference for @autquis about the indices not necessarily being different: Thaler's book, p. 166, first paragraph.

@mmagician

Hmm just thinking about these benches: the underlying sponge being tested is Poseidon. So it makes sense that there is quite a large improvement in speed.
I wonder, could we also have some tests where the sponge is based on a CPU-friendly hash function, rather than a SNARK-recursion-friendly hash?

@mmagician

e.g. something like this: arkworks-rs/crypto-primitives#136
or maybe we could add e.g. a sha3 based sponge?

@Antonio95

Hey @mmagician, I'll look at that with César 👌

@Cesar199999 changed the title from "Use PRNG on generate_indices_from_sponge" to "Use PRNG on get_indices_from_sponge" on Jun 28, 2024
@Cesar199999
Author

> Hmm just thinking about these benches: the underlying sponge being tested is Poseidon. So it makes sense that there is quite a large improvement in speed. I wonder, could we also have some tests where the sponge is based on a CPU-friendly hash function, rather than a SNARK-recursion-friendly hash?

Sure, in general arkworks-rs/crypto-primitives#136 seems to be much faster than Poseidon. Modifying get_indices_from_sponge still yields an efficiency improvement on small instances:

open Ligero<Fr>/12      time:   [1.2908 ms 1.3031 ms 1.3179 ms]
                        change: [-10.290% -9.2510% -8.0990%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/14      time:   [2.8603 ms 2.8718 ms 2.8840 ms]
                        change: [-5.1916% -4.4438% -3.7515%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/16      time:   [6.7570 ms 6.7723 ms 6.7879 ms]
                        change: [-1.6735% -1.3768% -1.0707%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/18      time:   [17.841 ms 17.912 ms 17.985 ms]
                        change: [-2.4678% -1.9374% -1.4061%] (p = 0.00 < 0.05)
                        Performance has improved.

open Ligero<Fr>/20      time:   [51.022 ms 51.440 ms 51.914 ms]
                        change: [-0.1053% +0.8190% +1.8021%] (p = 0.11 > 0.05)
                        No change in performance detected.

verify Ligero<Fr>/12    time:   [5.1639 ms 5.1891 ms 5.2167 ms]
                        change: [-13.281% -12.363% -11.604%] (p = 0.00 < 0.05)
                        Performance has improved.

verify Ligero<Fr>/14    time:   [6.6277 ms 6.6588 ms 6.6899 ms]
                        change: [-7.1938% -6.4604% -5.7624%] (p = 0.00 < 0.05)
                        Performance has improved.
                        
verify Ligero<Fr>/16    time:   [8.6742 ms 8.6998 ms 8.7265 ms]
                        change: [-5.2843% -4.8678% -4.4566%] (p = 0.00 < 0.05)
                        Performance has improved.

verify Ligero<Fr>/18    time:   [12.233 ms 12.317 ms 12.415 ms]
                        change: [-0.6065% +0.1825% +1.0576%] (p = 0.67 > 0.05)
                        No change in performance detected.

verify Ligero<Fr>/20    time:   [18.014 ms 18.455 ms 19.081 ms]
                        change: [+1.3535% +3.8517% +7.7403%] (p = 0.01 < 0.05)
                        Performance has regressed.

@mmagician

Thanks for running the benches @Cesar199999! The results are what I feared.

In practice, we want to reason about massive circuits, like 2^28. While we don't have benches for these, it seems like the performance would be degraded for these with the approach in this PR.

@Antonio95

@mmagician Let's put merging this on hold. @Cesar199999 and I would still like to run a set of four benchmarks to really understand what's going on here (there are small inconsistencies in what we have benched above). Should be ready very soon :)

@WizardOfMenlo

As a comment:

  • If one is to prefer a proof system that is more recursion friendly, the strategy to go for is to just squeeze indices directly from the PoseidonSponge, as the PRNG is likely to be less friendly.
  • If instead fast native verification is desired, whether this is worth it depends on the relative cost of native sponge squeezing versus the PRNG. For example, I think in a SHA3 and ChaCha20 configuration this might be worthwhile, since SHA3 is (relatively) slower. If SHA3 is replaced by Blake3, things are much less clear and it might be better to squeeze the sponge.
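The native-verification trade-off can be put as a back-of-the-envelope comparison. This is only a sketch under assumptions: the names `Strategy` and `cheaper_strategy` are hypothetical, the cost figures in `main` are made up, and the model covers index generation only, ignoring in-circuit (recursion) cost.

```rust
/// Which index-sampling strategy to use (names are illustrative).
#[derive(Debug, PartialEq)]
enum Strategy {
    SqueezeSponge,
    SeedPrng,
}

/// Rough native-cost comparison for drawing `t` indices.
/// All costs share one arbitrary unit (e.g. nanoseconds) and would have to be measured.
fn cheaper_strategy(t: u64, squeeze_cost: u64, seed_cost: u64, prng_draw_cost: u64) -> Strategy {
    let direct = t * squeeze_cost; // t sponge squeezes
    let via_prng = seed_cost + t * prng_draw_cost; // one seeding step, then t PRNG draws
    if via_prng < direct {
        Strategy::SeedPrng
    } else {
        Strategy::SqueezeSponge
    }
}

fn main() {
    // Made-up numbers: with a slow sponge, the PRNG route is cheaper...
    println!("{:?}", cheaper_strategy(100, 500, 2_000, 5)); // prints SeedPrng
    // ...with a fast sponge, the seeding overhead may not pay off.
    println!("{:?}", cheaper_strategy(100, 5, 2_000, 5)); // prints SqueezeSponge
}
```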

5 participants