-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ML-KEM refactor #1763
ML-KEM refactor #1763
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1763 +/- ##
==========================================
- Coverage 78.44% 78.32% -0.13%
==========================================
Files 580 581 +1
Lines 96780 97176 +396
Branches 13863 13930 +67
==========================================
+ Hits 75921 76113 +192
- Misses 20243 20440 +197
- Partials 616 623 +7 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with minor changes
} else { | ||
cbd3(r, buf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This else
is risky because release builds may remove the above assert
so this code path may be reached by code with eta1 != 3
. Might be worth replacing with else if (params->eta1 == 3)
and do a separate else
that is a no-op.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ or move the assert to that else
clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the assert
still be a no-op in release builds?
const uint8_t seed[KYBER_SYMBYTES]) | ||
const uint8_t *seed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't seed
always KYBER_SYMBYTES = 32
bytes long, irrespective of parameter set? And we use exactly KYBER_SYMBYTES
bytes of seed
, so seems worth keeping the constraint.
@@ -157,15 +162,15 @@ static unsigned int rej_uniform(int16_t *r, | |||
**************************************************/ | |||
#define GEN_MATRIX_NBLOCKS ((12*KYBER_N/8*(1 << 12)/KYBER_Q + XOF_BLOCKBYTES)/XOF_BLOCKBYTES) | |||
// Not static for benchmarking | |||
void gen_matrix(polyvec *a, const uint8_t seed[KYBER_SYMBYTES], int transposed) | |||
void gen_matrix(ml_kem_params *params, polyvec *a, const uint8_t *seed, int transposed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above for seed
.
uint8_t *c, | ||
const uint8_t *m, | ||
const uint8_t *pk, | ||
const uint8_t *coins) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
poly_getnoise_*
assumes coins
is KYBER_SYMBYTES
, so might be worth keeping the constraint.
#include "params.h" | ||
|
||
static void ml_kem_params_init(ml_kem_params *params, size_t k) { | ||
assert((k == 2) || (k == 3) || (k == 4)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
release builds may ignore this assert
, leading to unexpected data for the below conditionals.
|
||
for(k=0;k<8;k++) | ||
r->vec[i].coeffs[8*j+k] = ((uint32_t)(t[k] & 0x7FF)*KYBER_Q + 1024) >> 11; | ||
assert((params->poly_vec_compressed_bytes == params->k * 352) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this code would be a bit more legible if we did e.g. #define K_MLKEM_1024 352
for the 3 security levels in params.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code changes look good, no blockers. per sanketh's earlier comments, can we keep parameter-invariant buffers as byte arrays rather than pointers?
NOTE: my review consisted of validating that refactored code remained unchanged and that null dereferences aren't possible given current initialization logic. I assumed the correctness of the existing implementation.
@@ -6,22 +6,25 @@ | |||
#include "polyvec.h" | |||
|
|||
#define gen_matrix KYBER_NAMESPACE(gen_matrix) | |||
void gen_matrix(polyvec *a, const uint8_t seed[KYBER_SYMBYTES], int transposed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to sanketh's earlier comment -- seed size is invariant over parameters, right? why can't we define this as an array rather than a pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, will fix in #1775
## What's Changed * Use OPENSSL_STATIC_ASSERT which handles all the platform/compiler/C s… by @andrewhop in #1791 * ML-KEM refactor by @dkostic in #1763 * ML-KEM-IPD to ML-KEM as defined in FIPS 203 by @dkostic in #1796 * Add KDA OneStep testing to ACVP by @skmcgrail in #1792 * Updating erroneous documentation for BIO_get_mem_data and subsequent usage by @smittals2 in #1752 * No-op impls for several EVP_PKEY_CTX functions by @justsmth in #1759 * Drop "ipd" suffix from ML-KEM related code by @dkostic in #1797 * Upstream merge 2024 08 19 by @skmcgrail in #1781 * ML-KEM move to the FIPS module by @dkostic in #1802 * Reduce collision probability for variable names by @torben-hansen in #1804 * Refactor ENGINE API and memory around METHOD structs by @smittals2 in #1776 * bn: Move x86-64 argument-based dispatching of bn_mul_mont to C. by @justsmth in #1795 * Check at runtime that the tool is loading the same libcrypto it was built with by @andrewhop in #1716 * Avoid matching prefixes of a symbol as arm registers by @torben-hansen in #1807 * Add CI for FreeBSD by @justsmth in #1787 * Move curve25519 implementations to fips module except spake25519 by @torben-hansen in #1809 * Add CAST for SP 800-56Cr2 One-Step function by @skmcgrail in #1803 * Remove custom PKCS7 ASN1 functions, add new structs by @WillChilds-Klein in #1726 * NASM use default debug format by @justsmth in #1747 * Add KDF in counter mode ACVP Testing by @skmcgrail in #1810 * add support for OCSP_request_verify by @samuel40791765 in #1778 * Fix GitHub/CodeBuild Purge Lambda by @justsmth in #1808 * KBKDF_ctr_hmac FIPS Service Indicator by @skmcgrail in #1798 * Update x509 tool to write all output to common BIO which is a file or stdout by @andrewhop in #1800 * Add ML-KEM to speed.cc, bump AWSLC_API_VERSION to 30 by @andrewhop in #1817 * Add EVP_PKEY_asn1_* functions by @justsmth in #1751 * Improve portability of CI integration script by @torben-hansen in #1815 * Upstream merge 2024 08 23 by @justsmth in #1799 * Replace ECDSA_METHOD with EC_KEY_METHOD and add the associated API by @smittals2 in #1785 * Cherrypick "Add some barebones support for DH in EVP" by @samuel40791765 in #1813 * Add KDA OneStep (SSKDF_digest and SSKDF_hmac) to FIPS indicator by @skmcgrail in #1793 * Add EVP_Digest one-shot test XOFs by @WillChilds-Klein in #1820 * Wire-up ACVP Testing for SHA3 Signatures with RSA by @skmcgrail in #1805 * Make SHA3 (not SHAKE) Approved for EVP_DigestSign/Verify, RSA and ECDSA. by @nebeid in #1821 * Begin tracking RelWithDebInfo library statistics by @andrewhop in #1822 * Move EVP ed25519 function table under FIPS module by @torben-hansen in #1826 * Avoid C11 Atomics on Windows by @justsmth in #1824 * Improve pre-sandbox setup by @torben-hansen in #1825 * Add OCSP round trip integration test with minor fixes by @samuel40791765 in #1811 * Add various PKCS7 getters and setters by @WillChilds-Klein in #1780 * Run clang-format on pkcs7 code by @WillChilds-Klein in #1830 * Move KEM API and ML-KEM definitions to FIPS module by @torben-hansen in #1828 * fix socat integration CI by @samuel40791765 in #1833 * Retire out-of-module KEM folder by @torben-hansen in #1832 * Refactor RSA_METHOD and expand API by @smittals2 in #1790 * Update benchmark documentation in tool/readme.md by @andrewhop in #1812 * Pre jail unit test by @torben-hansen in #1835 * Move EVP KEM implementation to in-module and correct OID by @torben-hansen in #1838 * More minor symbols Ruby depends on by @samuel40791765 in #1837 * ED25519 Power-on Self Test / CAST / KAT by @skmcgrail in #1834 * ACVP ML-KEM testing by @skmcgrail in #1840 * ACVP ECDSA SHA3 Digest Testing by @skmcgrail in #1819 * ML-KEM Service Indicator for EVP_PKEY_keygen, EVP_PKEY_encapsulate, EVP_PKEY_decapsulate by @skmcgrail in #1844 * Add ML-KEM CAST for KeyGen, Encaps, and Decaps by @skmcgrail in #1846 * ED25519 Service Indicator by @skmcgrail in #1829 * Update Allowed RSA KeySize Generation to FIPS 186-5 specification by @skmcgrail in #1823 * Add ED25519 ACVP Testing by @skmcgrail in #1818 * Make EDDSA/Ed25519 POST lazy initalized by @skmcgrail in #1848 * add support for PEM Parameters without ASN1 hooks by @samuel40791765 in #1831 * Add OpenVPN tip of main to CI by @smittals2 in #1843 * Ensure SSE2 is enabled when using optimized assembly for 32-bit x86 by @graebm in #1841 * Add support for `EVP_PKEY_CTX_ctrl_str` - Step #1 by @justsmth in #1842 * Added SHA3/SHAKE XOF functionality by @jakemas in #1839 * Migrated ML-KEM SHA3/SHAKE usage to fipsmodule by @jakemas in #1851 * AVX-512 support for RSA Signing by @pittma in #1273
### Issues: CryptoAlg-2724 ### Description of changes: #### Parameterization of ML-DSA Previous to this change, ML-DSA was implemented such that the code for a parameter set was selected by defining the correct C pre-processor flag (for example, if you want to compile the code for ML-DSA-65 parameter set you would `#define DILITHIUM_MODE 3`). The consequence of this was that we had to compile the code three times for three ML-DSA parameter sets. We do this by adding a C file for each parameter set where we first define the corresponding `DILITHIUM_MODE` value and then include all the ML-DSA C files. Besides being an awkward way to handle multiple parameter sets, this will not work for the FIPS module where we bundle all C files inside `bcm.c` and compile it as a single compilation unit. In this change we refactor ML-DSA by parametrizing all functions that depend on values that are specific to a parameter set, i.e., that directly or indirectly depend on the value of `DILITHIUM_MODE`. To do this, in `params.h` we define a structure that holds those ML-DSA parameters and functions that initialize a given structure with values corresponding to a parameter set. This structure can then be passed to every function that requires it. For example, `polyvecl_add` function was previously implemented as: ``` void polyvecl_add(polyvecl *w, const polyvecl *u, const polyvecl *v) { unsigned int i; for(i = 0; i < L; ++i) poly_add(&w->vec[i], &u->vec[i], &v->vec[i]); } ``` Is now changed to: ``` void polyvecl_add(ml_dsa_params *params, polyvecl *w, const polyvecl *u, const polyvecl *v) { unsigned int i; for(i = 0; i < params->l; ++i) poly_add(&w->vec[i], &u->vec[i], &v->vec[i]); } ``` Of course, now we have to change all callers of `polyvecl_add` to also have `ml_dsa_params` as an input argument, and then callers of the caller, etc. These changes bubble up to the highest level API defined in sign.h where we now have: ``` int crypto_sign_keypair(ml_dsa_params *params, uint8_t *pk, uint8_t *sk); int crypto_sign_signature(ml_dsa_params *params, uint8_t *sig, size_t *siglen, const uint8_t *m, size_t mlen, const uint8_t *ctx, size_t ctxlen, const uint8_t *sk); int crypto_sign(ml_dsa_params *params, uint8_t *sm, size_t *smlen, const uint8_t *m, size_t mlen, const uint8_t *ctx, size_t ctxlen, const uint8_t *sk); int crypto_sign_verify(ml_dsa_params *params, const uint8_t *sig, size_t siglen, const uint8_t *m, size_t mlen, const uint8_t *ctx, size_t ctxlen, const uint8_t *pk); int crypto_sign_open(ml_dsa_params *params, uint8_t *m, size_t *mlen, const uint8_t *sm, size_t smlen, const uint8_t *ctx, size_t ctxlen, const uint8_t *pk); ``` Then we can easily implement these functions for specific parameter sets, which can be seen in `sig_dilithium3.c` file. For example: ``` int ml_dsa_65_keypair(uint8_t *public_key /* OUT */, uint8_t *secret_key /* OUT */) { ml_dsa_params params; ml_dsa_65_params_init(¶ms); return crypto_sign_keypair(¶ms, public_key, secret_key); } int ml_dsa_65_sign(uint8_t *sig /* OUT */, size_t *sig_len /* OUT */, const uint8_t *message /* IN */, size_t message_len /* IN */, const uint8_t *ctx /* IN */, size_t ctx_len /* IN */, const uint8_t *secret_key /* IN */) { ml_dsa_params params; ml_dsa_65_params_init(¶ms); return crypto_sign_signature(¶ms, sig, sig_len, message, message_len, ctx, ctx_len, secret_key); } int ml_dsa_65_verify(const uint8_t *message /* IN */, size_t message_len /* IN */, const uint8_t *sig /* IN */, size_t sig_len /* IN */, const uint8_t *ctx /* IN */, size_t ctx_len /* IN */, const uint8_t *public_key /* IN */) { ml_dsa_params params; ml_dsa_65_params_init(¶ms); return crypto_sign_verify(¶ms, sig, sig_len, message, message_len, ctx, ctx_len, public_key); } ``` As such, files: - `dilithium3r3_ref.c` - `api.h` - `config.h` are no longer required, and have been removed. #### Other Changes Also modified in this PR are the KAT test framework in `p_dilithium_test.cc`. This KAT framework has been modified to support multiple levels of ML-DSA (to be added in a later PR). ### Call-outs: See similar changes to ML-KEM in #1763 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.
Issues:
CryptoAlg-2614
Description of changes:
Previous to this change, ML-KEM was implemented such that
the code for a parameter set was selected by defining the correct
C pre-processor flag (for example, if you want to compile the code
for ML-KEM 512 parameter set you would
#define KYBER_K 2
).The consequence of this was that we had to compile the code
three times for three ML-KEM parameter sets. We do this by adding
a C file for each parameter set where we first define the corresponding
KYBER_K
value and then include all the ML-KEM C files. Besides beingan awkward way to handle multiple parameter sets, this will not work
for the FIPS module where we bundle all C files inside
bcm.c
andcompile it as a single compilation unit.
In this change we refactor ML-KEM by parametrizing all functions that
depend on values that are specific to a parameter set, i.e., that directly
or indirectly depend on the value of
KYBER_K
. To do this, inparams.h
we define a structure that holds those ML-KEM parameters and functions
that initialize a given structure with values corresponding to a parameter
set. This structure can then be passed to every function that requires it.
For example,
polyvec_ntt
function was previously implemented like this:We now change that to:
Of course, now we have to change all callers of
polyvec_ntt
to alsohave
ml_kem_params
as an input argument, and then callers of thecaller, etc. These changes bubble up to the highest level API defined
in
kem.h
where we now have:Then we can easily implement these functions for specific parameter sets,
which can be seen in
ml_kem.c
file. For example:Call-outs:
Point out areas that need special attention or support during the review process. Discuss architecture or design changes.
Testing:
How is this change tested (unit tests, fuzz tests, etc.)? Are there any testing steps to be verified by the reviewer?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.