Commit
Previously, the package worked by copying the input (or the output) into a buffer, and then XOR'ing (or copying) it into (or out of) the state. (Except for an input fast path.) There's no need for that! We can XOR straight into the state, and copy straight out of it, at least on little endian machines.

This is a bit faster, almost halves the state size, and will make it easier to implement marshaling, but most importantly look at how much simpler it makes the code!

go: go1.23.0
goos: linux
goarch: amd64
pkg: golang.org/x/crypto/sha3
cpu: AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics
                        │ v0.27.0-2-g42ee18b9637 │ v0.27.0-2-g42ee18b9637-dirty │
                        │         sec/op         │   sec/op      vs base        │
PermutationFunction-8   270.8n ± 0%   270.4n ± 0%         ~ (p=0.099 n=10)
Sha3_512_MTU-8          5.762µ ± 0%   5.658µ ± 0%    -1.80% (p=0.000 n=10)
Sha3_384_MTU-8          4.179µ ± 0%   4.070µ ± 0%    -2.60% (p=0.000 n=10)
Sha3_256_MTU-8          3.316µ ± 0%   3.214µ ± 0%    -3.08% (p=0.000 n=10)
Sha3_224_MTU-8          3.175µ ± 0%   3.061µ ± 0%    -3.61% (p=0.000 n=10)
Shake128_MTU-8          2.779µ ± 0%   2.681µ ± 0%    -3.51% (p=0.000 n=10)
Shake256_MTU-8          2.947µ ± 0%   2.957µ ± 0%    +0.32% (p=0.000 n=10)
Shake256_16x-8          44.15µ ± 0%   44.45µ ± 0%    +0.67% (p=0.000 n=10)
Shake256_1MiB-8         2.319m ± 0%   2.274m ± 0%    -1.93% (p=0.000 n=10)
Sha3_512_1MiB-8         4.204m ± 0%   4.219m ± 0%    +0.34% (p=0.000 n=10)
geomean                 13.75µ        13.54µ         -1.55%

                        │ v0.27.0-2-g42ee18b9637 │ v0.27.0-2-g42ee18b9637-dirty │
                        │          B/s           │    B/s        vs base        │
PermutationFunction-8   704.3Mi ± 0%   705.4Mi ± 0%        ~ (p=0.105 n=10)
Sha3_512_MTU-8          223.5Mi ± 0%   227.6Mi ± 0%   +1.83% (p=0.000 n=10)
Sha3_384_MTU-8          308.1Mi ± 0%   316.4Mi ± 0%   +2.67% (p=0.000 n=10)
Sha3_256_MTU-8          388.2Mi ± 0%   400.5Mi ± 0%   +3.17% (p=0.000 n=10)
Sha3_224_MTU-8          405.5Mi ± 0%   420.7Mi ± 0%   +3.73% (p=0.000 n=10)
Shake128_MTU-8          463.4Mi ± 0%   480.2Mi ± 0%   +3.64% (p=0.000 n=10)
Shake256_MTU-8          436.9Mi ± 0%   435.5Mi ± 0%   -0.32% (p=0.000 n=10)
Shake256_16x-8          353.9Mi ± 0%   351.5Mi ± 0%   -0.66% (p=0.000 n=10)
Shake256_1MiB-8         431.2Mi ± 0%   439.7Mi ± 0%   +1.97% (p=0.000 n=10)
Sha3_512_1MiB-8         237.8Mi ± 0%   237.1Mi ± 0%   -0.33% (p=0.000 n=10)
geomean                 375.7Mi        381.6Mi        +1.57%

Even stronger effect when patched on top of CL 616555 (forced on).

go: go1.23.0
goos: darwin
goarch: arm64
pkg: golang.org/x/crypto/sha3
cpu: Apple M2
                        │     old     │                new                 │
                        │   sec/op    │   sec/op      vs base              │
PermutationFunction-8   154.7n ± 2%   153.8n ± 1%         ~ (p=0.469 n=10)
Sha3_512_MTU-8          3.260µ ± 2%   3.143µ ± 2%    -3.60% (p=0.000 n=10)
Sha3_384_MTU-8          2.389µ ± 2%   2.244µ ± 2%    -6.07% (p=0.000 n=10)
Sha3_256_MTU-8          1.950µ ± 2%   1.758µ ± 1%    -9.87% (p=0.000 n=10)
Sha3_224_MTU-8          1.874µ ± 2%   1.686µ ± 1%   -10.06% (p=0.000 n=10)
Shake128_MTU-8          1.827µ ± 3%   1.447µ ± 1%   -20.80% (p=0.000 n=10)
Shake256_MTU-8          1.665µ ± 3%   1.604µ ± 3%    -3.63% (p=0.003 n=10)
Shake256_16x-8          25.14µ ± 1%   25.23µ ± 2%         ~ (p=0.912 n=10)
Shake256_1MiB-8         1.236m ± 2%   1.243m ± 2%         ~ (p=0.631 n=10)
Sha3_512_1MiB-8         2.296m ± 2%   2.305m ± 1%         ~ (p=0.315 n=10)
geomean                 7.906µ        7.467µ         -5.56%

                        │     old      │                new                  │
                        │     B/op     │    B/op       vs base               │
PermutationFunction-8   1.204Gi ± 2%   1.212Gi ± 1%        ~ (p=0.529 n=10)
Sha3_512_MTU-8          394.9Mi ± 2%   409.7Mi ± 2%   +3.73% (p=0.000 n=10)
Sha3_384_MTU-8          539.0Mi ± 2%   573.8Mi ± 2%   +6.45% (p=0.000 n=10)
Sha3_256_MTU-8          660.3Mi ± 2%   732.6Mi ± 1%  +10.95% (p=0.000 n=10)
Sha3_224_MTU-8          687.1Mi ± 2%   763.9Mi ± 1%  +11.17% (p=0.000 n=10)
Shake128_MTU-8          704.7Mi ± 2%   889.6Mi ± 2%  +26.24% (p=0.000 n=10)
Shake256_MTU-8          773.4Mi ± 3%   802.5Mi ± 3%   +3.76% (p=0.004 n=10)
Shake256_16x-8          621.6Mi ± 1%   619.3Mi ± 2%        ~ (p=0.912 n=10)
Shake256_1MiB-8         809.1Mi ± 2%   804.7Mi ± 2%        ~ (p=0.631 n=10)
Sha3_512_1MiB-8         435.6Mi ± 2%   433.9Mi ± 1%        ~ (p=0.315 n=10)
geomean                 653.6Mi        692.0Mi        +5.88%

Change-Id: I33a0a1ddf305c395f99bf17f81473e2f42c5ce42
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/616575
Reviewed-by: Daniel McCarney <[email protected]>
Reviewed-by: Michael Pratt <[email protected]>
Reviewed-by: Roland Shoemaker <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Auto-Submit: Filippo Valsorda <[email protected]>
Reviewed-by: Andrew Ekstedt <[email protected]>
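
For readers who want to see the idea in the commit message concretely, here is a minimal sketch of the two absorb strategies. It is not the code from this change: the names `rate`, `absorbViaBuffer`, `absorbDirect` and `lanes` are hypothetical, the rate of 136 bytes is just the SHA3-256 value picked for illustration, and the Keccak permutation itself is left out. The sketch assumes that storing the state as bytes and reading lanes as little-endian `uint64` values is equivalent to XOR'ing whole lanes, which is what the "at least on little endian machines" remark refers to.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// rate is a hypothetical sponge rate in bytes (136 is the SHA3-256 rate).
const rate = 136

// absorbViaBuffer mimics the older shape described in the commit message:
// copy the block into a separate buffer, then XOR that buffer into a
// lane-oriented state. The block must hold at least rate bytes.
func absorbViaBuffer(s *[25]uint64, block []byte) {
	var buf [rate]byte
	copy(buf[:], block[:rate])
	for i := 0; i < rate/8; i++ {
		s[i] ^= binary.LittleEndian.Uint64(buf[i*8:])
	}
}

// absorbDirect XORs the block straight into a byte-addressed state,
// with no intermediate buffer. The block must hold at least rate bytes.
func absorbDirect(s *[200]byte, block []byte) {
	for i, b := range block[:rate] {
		s[i] ^= b
	}
}

// lanes reads the byte-addressed state as 25 little-endian 64-bit lanes,
// the form a permutation working on uint64 lanes would consume.
func lanes(s *[200]byte) [25]uint64 {
	var out [25]uint64
	for i := range out {
		out[i] = binary.LittleEndian.Uint64(s[i*8:])
	}
	return out
}

func main() {
	block := make([]byte, rate)
	for i := range block {
		block[i] = byte(i*7 + 1)
	}

	var viaBuffer [25]uint64
	var direct [200]byte
	absorbViaBuffer(&viaBuffer, block)
	absorbDirect(&direct, block)

	// Both strategies should leave the state with identical lane values.
	fmt.Println(lanes(&direct) == viaBuffer)
}
```

In this sketch the direct version needs only the 200-byte state, while the buffered version carries an extra `rate`-sized array, which is the kind of saving the message alludes to with "almost halves the state size".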