add specification for TIP-0007 – Run-Time Permutation Check

TritonVM · Aug 9, 2023 · 82f81f3 · 82f81f3
2 parents c9907e2 + 3d83b8a
commit 82f81f3
Show file tree

Hide file tree

Showing 2 changed files with 69 additions and 11 deletions.
diff --git a/tips/Makefile b/tips/Makefile
diff --git a/tips/tip-0007/tip-0007.md b/tips/tip-0007/tip-0007.md
@@ -0,0 +1,69 @@
+# TIP 0007: Run-Time Permutation Check
+
+| TIP            | 0007                                               |
+|:---------------|:---------------------------------------------------|
+| authors:       | Alan Szepieniec and Ferdinand Sauer                |
+| title:         | Run-Time Permutation Check                         |
+| status:        | draft                                              |
+| created:       | 2023-05-31                                         |
+| issue tracker: | <https://github.com/TritonVM/triton-vm/issues/202> |
+
+**Abstract.**
+This note describes an architectural change to Triton VM that allows programs to check at run-time whether two lists are equal up to permutation. It requires 0 extra base columns, 1 extra extension column, and 3 extra instructions.
+
+## Introduction
+
+In blockchain programming especially, one often needs to assert that two lists are equal up to permutation. For instance, a natural way to engineer privacy is to show in zero knowledge that one list of commitments is equivalent to another list of commitments up to a secret shuffle and re-randomization.
+
+A standard solution to this problem is to sort the lists and then test for equality. It costs $O(N \log N)$ operations to sort, and another $O(N)$ to compare. We can do much better in a STARK VM by using non-determinism.
+
+## Construction
+
+This note proposes 3 new instructions: `push_perm`, `pop_perm`, and `assert_perm`. The first two instructions push data items to and pop them back from a data structure that is neither a stack (first-in-last-out) nor a queue (first-in-first-out) but somewhere in between. *The items that go in, come out in a nondeterministic order.* One could call this a *stueue* both because it is a portmanteau and because it captures intuition of mixing.
+
+Specifically:
+
+ - `push_perm` takes the top 5 elements of the stack, computes the inner product $p$ with a vector of Fiat-Shamir weights, and multiplies the product $(\alpha - p)$ into a running product column.
+ - `pop_perm` takes the top 5 elements of the stack, computes the inner product $p$ with the same vector of Fiat-Shamir weights, and divides the product $(\alpha - p)$ out of the running product column.
+ - `assert_perm` asserts that the running product column in the given row equals 1.
+
+In this expression, the inner product is $p = \sum_{i=0}^4 \mathtt{st}_i \cdot a_i$ and $\alpha$ is another Fiat-Shamir challenge used as indeterminate for evaluating the permutation check.
+
+The initial condition, as well as the terminal condition, of the running product column `permrp` stipulate that it is one. Furthermore, the following instruction-specific constraints limit its evolution:
+
+ - If the current instruction is `push_perm`, multiply the offset inner product in: `permrp* - permrp ⋅ (p - α)`.
+ - If the current instruction is `pop_perm`, divide the offset inner product out: `permrp* ⋅ (p - α) - permrp`.
+ - If the current instruction is `assert_perm`, equate the running product to one: `permrp - 1`.
+ - If the current instruction is neither `push_perm` nor `pop_perm`, the running product does not change: `permrp* - permrp`.
+
+## Use
+
+To use the run-time permutation check to assert that two lists, A and B are the same, the program specifies as follows. First, iterate over all the elements of A, hash them, and apply `push_perm`. Next, iterate over all elements of B, hash them, and apply `pop_perm`. Then assert the two lists are permutations of one another by invoking `assert_perm`.
+
+The cost of this task is $O(N)$. Note that at no point is any list being sorted. In particular, the host that computes the correct guess in order for that value nondeterminstically divined by the guest does not even need to sort.
+
+## Alternatives
+
+While brainstorming we considered some alternative approaches. We decided against them in light of the attached critiques.
+
+### 1. Hash-Accumulator
+
+Without changing the VM, the program can hash all the elements of both lists, sum the hashes of each list together, and then compare these sums-of-hashes. The intuition for security is that finding a pair of lists whose sums-of-hashes are equal should be hard for a secure hash function. This task generalizes one-wayness.
+
+Unfortunately, it is also significantly easier than finding preimages (breaking one-wayness). Let's analyze the following attack strategy.
+
+ - Maintain a list of preimages $P$ starting with $2N-1$ random preimages.
+ - In each iteration, sample a new random preimage $p$ and compute its hash $d = H(p)$.
+ - For each selection of one list of length $N$ and one list of length $N-1$, extend the second list with $d$ and check if the sums-of-hashes match. If yes, win and quit. If not, add $p$ to the expanding list.
+
+How big does the list have to be before the probability of halting in the next iteration is significant? Suppose the list contains $Q$ elements. There are ${Q \choose N}$ ways to select the first list. The sum-of-hashes for each such list is fixed and distinct from that of all others by assumption (since otherwise we would have terminated already). There are ${Q-N \choose N-1}$ ways to select the second list. The sum-of-hashes for each such list is fixed and distinct from that of all others by the same assumption. The probability that the element $p$ with hash $d$ makes the sums-of-hashes of two given such lists is $2^{-\lambda}$. But this event is not independent for different pairs of lists because their *difference* might be the same even though their individual sums are different. Nevertheless we can ignore the contribution from the second list and count $p$ as independent for each different *first* list only.
+
+By this logic, the probability of success is at least ${Q \choose N} \cdot 2^{-\lambda}$, which is already far greater than 1 if $Q = 2N = 200$. This analysis ignores the complexity of running through every combination of lists, but the point is that this complexity captures a specific attack whereas the success probability expressed in terms of $Q$ applies to every possible attack. In other words, we cannot rule out the existence of feasible attacks that take only $200$ hash queries.
+
+### 2. Ad-Hoc Fiat-Shamir
+
+This trick is due to Jeremy Bruestle. Hash list A, then hash list B, and then hash the concatenation of these digests. Use the resulting digest $X$ as the indeterminate for a permutation check: $\prod (a_i - X) \stackrel{?}{=} \prod (b_i - X)$.
+
+This works for the same reason Fiat-Shamir works: the attacker cannot predict the indeterminate ahead of time. Finding a pair of unequal lists that pass this check corresponds to finding a preimage giving a suitable digest.
+
+The downside of this approach is that it requires two passes over each list -- one for hashing, and the other for computing the running product.