-
Notifications
You must be signed in to change notification settings - Fork 479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kwxm/bls12-381/prototype (PLT-192, PLT-1557, PLT-1554, etc). #5231
Conversation
…/plutus into kwxm/BLS12_381/prototype
Thanks for the comments @effectfully, especially the one about has collisions. I think I've addressed all of them except for the one about explicit dictionaries: I might come back to that later. |
@angerman I think this PR is blocked because you requested changes because of the Windows build problem. I believe we decided that you'd look into that separately, so is it OK to merge this now? I had some trouble merging the nix files with those in |
@kwxm here's the haskell.nix bump PR: #5366 I've also tried to just bump haskell.nix on this PR, but that resulted in
after I pulled the latest changes 😕 |
So this fails on windows with
too. I don't think the Darwin -> Windows pipeline works, and I'm not sure why this is in CI. |
Notable I don't see that failure on master (with bumped haskell.nix): #5366 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should revert this back to the original for now.
That's weird: |
@kwxm yes, I'm also a bit surprised by this. Even more so that I don't see this in master 😅 |
Looks like it succeeded but the status isn't getting reported for some reason. |
I think this is good to merge. IDK why hydra complains. We could forcepush. I'll force push once. |
643900a
to
646b7e6
Compare
Slightly premature, since I was just about to push some more changes! I'll open a new branch for that. |
Introduction
This PR adds new built-in types and functions to Plutus Core and PlutusTx for pairings over BLS12-381, as specified in CIP-0381. These depend on Haskell bindings to the blst C library which are currently the subject of an open PR in
cardano-base
. In order not to depend on a branch, the code for these has been coped intoplutus-core/cbits
andplutus-core/src/plutus-core/Crypto/External
. We should be able to merge this PR as-is, but ideally we'd wait until thecardano-base
PR is merged and remove the copied files fromplutus-core
.Details
The PR introduces three new PLC types:
bls12_381_G1_element
bls12_381_G2_element
bls12_381_MlResult
and seventeen new built-in functions:
The basic definitions of all of these things are in
plutus-core/plutus-core/src/PlutusCore/Crypto/BLS12_381/
, in the filesG1.hs
,G2.hs
, andPairing.hs
. The following sections will attempt to give an idea of what all of these do, but some mathematical background will be required first.Background
Abelian groups
An abelian group is a commutative monoid in which every element has an inverse. These are usually written in an additive style where the operation is
+
, the identity element is0
, and the inverse of an elementa
is-a
, but they are sometimes written in a multiplicative style where the operation is*
, the identity element is1
, and the inverse of an elementa
isa⁻¹
. The types G1 and G2 above represent finite additive abelian groups andMlResult
represents a multiplicative one.For any abelian group one can define a notion of scalar multiplication by integers. Given an element
a
of an abelian group and an integer n,n*a
is defined to be 0 if n=0,a+...+a
n
times ifn>0
, and(-a) + ... + (-a)
-n
times ifn<0
. In multiplicative notation this scalar "multiple" is written as a poweraⁿ
.The
add
,neg
, andscalarMul
operations above represent these operations in the groups G1 and G2; in addition there is an equality operation, serialisation (compress
) and deserialisation (uncompress
) operations, and ahashToGroup
operation which takes an arbitrary bytestring and a domain separation tag and produces a hash in the form of an element of G1 or G2.This leaves three functions:
mulMlResult
, which performs group multiplication in the groupMlResult
;millerLoop
, which takes an element ofG1
and an element ofG2
and produces an element ofMlResult
(this is the actual pairing operation), andfinalVerify
, which essentially compares two elements ofMlResult
for equality. The last two operations (millerLoop
andfinalVerify
) are very expensive and are typically only expected to be called a few times in on-chain code.To explain more of the details of the new functions we need some extra background.
Finite fields
A field
F
is a set equipped with two operations+
and*
and elements0
and1
such that(F, 0, +)
is an abelian group(F, 1, *)
is a commutative monoid such that every nonzero element ofF
has a multiplicative inverse (soF\{0}
is an abelian group under multiplication).a*(b+c) = a*b + a*c
for all a,b,c in F.Familiar examples of fields are
ℚ
,ℝ
, andℂ
(but notℤ
since not every element has a multiplicative inverse inℤ
).There are also finite fields, and these can be completely classified. It can be shown that for every prime number
p
and every positive integern
there is a field of orderpⁿ
(the order of a field or group is its cardinality, the number of elements it contains), and these account for all of the finite fields up to isomorphism. These fields are usually denoted by𝔽_{pⁿ}
orGF(pⁿ)
orGF(p,n)
or something similar; I'll useGF(pⁿ)
because it's easier to type.For
n=1
,GF(p)
is justℤₚ
with the usual operations, the important point being that every nonzero element there has a multiplicative inverse modulo p, unlikeℤₙ
ifn
is not prime (what would be the inverse of 6 inℤ₂₄
, for example?).For
n>1
, we can regardGF(p,n)
as consisting of n-tuples of elements ofℤₚ
; addition is usual coordinatewise addition, but multiplication will be something much more convoluted (and expensive), where the i-th entry in a product will be a complicated combination of all of the elements of the two tuples being multiplied.Elliptic curves
For the purposes of this PR, an elliptic curve
E
over a fieldF
can be regarded as an equation of the formy² = x³+ax+b
, wherea
andb
are constants fromF
. The set of all pairs(x,y) ∈ F×F
which satisfy the equation is also referred to as "the curve". When the field isℝ
an elliptic curve is what you'd usually think of as a curve (see Google), but when F is a finite field the so-called curve is an apparently random set of pairs of field elements. However, although the geometric structure has vanished, there is an extremely rich algebraic structure that still makes sense.A fundamental property of elliptic curves is that they are also abelian groups: given two points
P
andQ
on an elliptic curveE
one can define another pointP+Q
which also lies on the curve. There's a geometric picture for this overℝ
(see Google again), but also an algebraic form which applies over any field. The coordinates ofP+Q
are complicated algebraic functions of the coordinates ofP
andQ
. In particular, if you're given a pointP
and the pointnP
(=P+...+P
) for some integer n, it's not easy to work out whatn
is: this is the discrete logarithm problem on E and is believed to be computationally intractable in general, which is one reason why elliptic curves are useful in cryptography.Pairings
A pairing is a function
e: G₁ × G₂ → G_T
whereG₁
,G₂
andG_T
are abelian groups (the T's supposed to be a subscript, but I can't find a Unicode character for that). In our case,G₁
andG₂
are written additively, but to complicate matters,G_T
is written multiplicatively. A pairing is require to be bilinear: forp, p' ∈ G₁
,q, q' ∈ G₂
, andn ∈ ℤ
we should havee(p+p',q) = e(p,q) * e(p',q)
(remember that the operation inG_T
is multiplication)e(p,q+q') = e(p,q) * e(p,q')
e(n*p,q) = e(p,q)ⁿ = e(p,n*q)
The final property here is one reason that the concept of pairing useful in cryptography: it allows one person to perform a computation in
G₁
and another to perform a computation inG₂
and then the results can be compared inG_T
without having to share information (or at least that's my simplistic interpretation: the details are complicated).Back to Plutus
The situation in BLS12-381 is as follows:
q
where is a 381-bit prime (approximately 10^115)G₁
is a subgroup of an elliptic curveE₁
defined overGF(q)
G₂
is a subgroup of an elliptic curveE₂
defined overGF(q²)
G_T
is a subgroup of the multiplicative group of nonzero elements ofGF(q¹²)
(the elements of this field are huge: you need 12x48 = 576 bytes to hold a single element)G₁
,G₂
, andG_T
are all isomorphic to each other and are of orderp
wherep
is a 255-bit prime (about 10^77). In fact, all of these groups are isomorphic to(ℤₚ, +)
, but in very obfuscated ways.p*a = 0
for alla ∈ G₁
, and similarly forG₂
, so scalar multiplication is periodic modulop
. This means that the efficiency of computing a scalar multiplen*a
can be improved by reducingn
modulop
before performing that multiplication (although scalar multiplication is still quite expensive since we've got up to 255 bits to multiply by). The reduction is performed automatically by thescalarMul
functions.e: G₁ × G₂ → G_T
, and combiningmillerLoop
andfinalVerify
allows one to check whethere(a,b) = e(a',b')
for elementsa,a' ∈ G₁
andb,b' ∈ G₂
. The fact that such a pairing exists is not obvious and depends on some rather complicated mathematics, which itself has undergone a lot of refinement to enable efficient computation.Serialisation and compression
We can now explain how elements of
G₁
andG₂
can be serialised as bytestrings.It follows from the facts above that every element of
G₁
can be represented as a pair(x,y)
withx,y ∈ GF(q)
, and each of x and y take up 381 bits, so each fits into 48 bytes with 3 bits left over. Thus(x,y)
pairs can be serialised in this form, a pair taking up a total of 96 bytes. However, this is inefficient in terms of space: remember that the point satisfy an equationy² = x³ + ax +b
, so if you know the x-coordinate of a point and can compute square roots inGF(q)
then you can calculate the y-coordinate of the point and save yourself 48 bytes (with some complications that we'll see later). A similar situation holds for elements ofG₂
, except that each coordinate requires 96 bytes. There are thus two possibilities for how group elements can be serialised into bytestrings:GF(q)
has a square root, so your serialised value may not correspond to a point on the curve.The
blst
library supports both of these formats, using the final spare bit to indicate which serialisation method is intended (see the specification here) (the "spare" bits are the most significant (leftmost) bits in the bytestring encoding) . The first method is called "serialisation" and the second "compression". There's a trade-off here: the serialised format is space-inefficient but deserialisation is fast, whereas for the compressed format only half the space is required but uncompression is slower because of the need to extract a square root. In fact, experiments show that uncompression takes about 20 times longer than deserialisation forG₁
and about 30 times longer forG₂
. Despite this we have chosen to use the compressed format in Plutus Core, providing uncompression functionsbls12_381_G1_uncompress
andbls12_381_G1_uncompress
, along with corresponding compression functions. We may wish to rethink this at some point, possibly adding serialisation and deserialisation if the time penalty for uncompression is too great.Our representation is more restrictive than that of
blst
:blst
allows serialisation of any point on the curveE₁
(orE₂
), but during uncompression we reject any points which do not lie in the subgroupG₁
(orG₂
). The subgroups are much smaller than the entire curve group (|E₁|/|G₁| ≈ 7×10^37
and|E₂|/|G₂| ≈ 3×10^152
) so the chance of being able to successfully decompress a random bytestring to get a point in a subgroup is negligible.Concrete syntax for
G₁
andG₂
elements.We have also used the compressed representation for the concrete syntax for
G₁
andG₂
elements. A constant of typeG₁
is represented by(con bls12_381_G1_element 0x...)
where...
consists of 96 hex digits, and aG₂
element is similar but with 192 hex digits.Hashing
We also provide two functions
bls12_381_G1_hashToGroup
andbls12_381_G2_hashToGroup
which, as the names suggest, take an arbitrary bytestring and produce an element in the relevant group (not just on the curve). Hashing and uncompression are the only means of obtaining group elements directly. The other group operations preserve the property of being in the subgroup, so none of our functions can can give us arbitrary curve elements.Pairing operations and
bls12_381_mlresult
In the BLS12-381 situation, what one usually wishes to do is to calculate two pairings to get two elements of
G_T
and then compare these results for equality. This is an expensive computation and it is considerably more efficient (though still expensive) to calculate values of a so-called non-reduced pairing in an intermediate group closely related toG_T
and then at the end perform a calculation (final verification) that checks that two elements of the intermediate group become equal when lifted intoG_T
.In Plutus Core, the intermediate group is called
bls12_381_mlresult
, and the operations we can perform on this type are (deliberately) very limited. The only way to obtain a value of typebls12_381_mlresult
is as a result of callingbls12_381_millerLoop
(hence the name), or by multiplying together two existing elements of that type. We do not provide any concrete syntax, textual printing method, or serialisation format (includingflat
) for values of typebls12_381_mlresult
.Note also that the operations on
bls12_381_mlresult
are quite expensive: the (constant) CPU costs are (currently) as follows:Fortunately this is in line with the expected on-chain usage patterns: in realistic applications one would tend to perform a few calls to
millerLoop
, some multiplications, and one call tobls12_381_finalVerify
, and costing benchmarks (cabal run bls-costs
) suggest that this can be done in a reasonable time. For example, a realistic zk-snark verification takes about 23% of the on-chain script execution limit (memory costs are negligible).flat
serialisationThe canonical storage format for Plutus Core scripts is the
flat
format (see Appendix E of the Plutus Core specification (pdf here)). We use the compression format described earlier for flat serialisation of elements of typebls12_381_G1_element
andbls12_381_G2_element
: these are compressed to bytestrings which are then encoded using the normalflat
encoding (but with tags 9 and 10 respectively). Deccoding can fail for a number of reasons: the bytestring can be the wrong length (it must be exactly 48 bytes long forbls12_381_G1_element
and 96 forbls12_381_G2_element
); the encoding can be invalid because the three format bits are used improperly; the encoding may specify a valid field element but one which is not the x-coordinate of a point on the curve ("off-curve"); and the encoding may specify a genuine point on the curve but one which is not in the relevant subgroup ("out-of-group"). All of these are described in theBLSTError
type defined in theblst
Haskell bindings.There is no means of serialising
bls21_381_mlresult
values. The tag 11 has been allocated to this type, but if it is encountered while decoding aflat
file then a failure occurs.Costing
Fortunately costing is quite simple for the BLS12-381 operations: costing benchmarks indicate that everything is constant cost apart from the hashing functions and the scalar multiplication functions, which are linear in the size of their input.