Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chainhash, wire, btcutil, main: Memory efficient txhash #1978

Merged
merged 1 commit into from
Dec 15, 2023

Conversation

kcalvinalvin
Copy link
Collaborator

When profiling the ibd, I noticed that TxHash() is allocating quite a bit of memory.

IMG_0153

I also noticed this with utreexod in the past and I resolved this by creating a new double hash function. I've benchmarked it and made is more efficient in this PR.

Benchstat for BenchmarkTxHash() showed a slight bit of increase in sec/op which is likely due to the overhead of serializing into a hash.Hash instead of the bytes.Buffer in the previous TxHash(). However, for real life cases, the new TxHash() will be faster as we're saving on the unnecessary serialization into bytes.Buffer as sha256.Sum256()will call Write() into the hash anyways.

The memory allocation savings is ~37% compared to the old TxHash(). This matters a lot because TxHash() gets called a lot.

goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/wire
cpu: AMD Ryzen 5 3600 6-Core Processor              
          │   old.txt   │              new.txt               │
          │   sec/op    │   sec/op     vs base               │
TxHash-10   1.347µ ± 1%   1.411µ ± 2%  +4.79% (p=0.000 n=10)

          │  old.txt   │              new.txt               │
          │    B/op    │    B/op     vs base                │
TxHash-10   256.0 ± 0%   160.0 ± 0%  -37.50% (p=0.000 n=10)

          │  old.txt   │            new.txt             │
          │ allocs/op  │ allocs/op   vs base            │
TxHash-10   2.000 ± 0%   2.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

For bigger transactions, the memory savings is even greater. I performed the same test but with multiTx instead of genesisCoinbaseTx.

diff --git a/wire/bench_test.go b/wire/bench_test.go
index b9f18b83..e4cb9c88 100644
--- a/wire/bench_test.go
+++ b/wire/bench_test.go
@@ -598,7 +598,7 @@ func BenchmarkDecodeMerkleBlock(b *testing.B) {
 // transaction.
 func BenchmarkTxHash(b *testing.B) {
        for i := 0; i < b.N; i++ {
-               genesisCoinbaseTx.TxHash()
+               multiTx.TxHash()
        }
 }

Below is the benchstat for that test. We see less of a penalty for speed and see ~41% savings in memory allocated.

goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/wire
cpu: AMD Ryzen 5 3600 6-Core Processor              
          │ old-multitx.txt │          new-multitx.txt           │
          │     sec/op      │   sec/op     vs base               │
TxHash-10       1.507µ ± 0%   1.543µ ± 2%  +2.39% (p=0.000 n=10)

          │ old-multitx.txt │          new-multitx.txt           │
          │      B/op       │    B/op     vs base                │
TxHash-10        272.0 ± 0%   160.0 ± 0%  -41.18% (p=0.000 n=10)

          │ old-multitx.txt │        new-multitx.txt         │
          │    allocs/op    │ allocs/op   vs base            │
TxHash-10        2.000 ± 0%   2.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

The benchmarks for BenchmarkDoubleHash* show a significant speedup for DoubleHashRaw(). This is likely because the other hashes have to call digest.Write() in sha256.Sum256() while the DoubleHashRaw() function gets to skip that.

[I] calvin@bitcoin ~/b/g/u/g/s/g/b/b/wire (memory-efficient-txhash) [1]> go test -run=XXX -bench=BenchmarkDoubleHash -benchmem
goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/wire
cpu: AMD Ryzen 5 3600 6-Core Processor              
BenchmarkDoubleHashB-10          1581974               758.1 ns/op            32 B/op          1 allocs/op
BenchmarkDoubleHashH-10          1600311               750.5 ns/op             0 B/op          0 allocs/op
BenchmarkDoubleHashRaw-10        3185388               370.1 ns/op            32 B/op          1 allocs/op
PASS

I can also post some before and after ibd profiling if requested.

@coveralls
Copy link

coveralls commented May 5, 2023

Pull Request Test Coverage Report for Build 7197319343

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage remained the same at 56.07%

Files with Coverage Reduction New Missed Lines %
peer/peer.go 1 74.52%
Totals Coverage Status
Change from base Build 7168733027: 0.0%
Covered Lines: 27977
Relevant Lines: 49897

💛 - Coveralls

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice optimization!

@@ -31,3 +34,15 @@ func DoubleHashH(b []byte) Hash {
first := sha256.Sum256(b)
return Hash(sha256.Sum256(first[:]))
}

// DoubleHashRaw calculates hash(hash(h)) and returns the resulting bytes.
func DoubleHashRaw(h hash.Hash) []byte {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps specify that the hash.Hash instance should already have the message pre-image written to it?

I think with that added context, the test below is also easier to understand.

wire/msgtx.go Outdated Show resolved Hide resolved
wire/msgtx.go Outdated
buf := bytes.NewBuffer(make([]byte, 0, msg.SerializeSizeStripped()))
_ = msg.SerializeNoWitness(buf)
return chainhash.DoubleHashH(buf.Bytes())
txHash := sha256.New()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other optimization that I've found to really speed things up a lot of caching the result after the first instance. During validation, we end up recomputing this hash a few times: merkle tree check, sighash calculation, etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btcutil.Tx caches the hash right? Maybe try to use that more instead of the raw MsgTx. Though I've not looked yet and maybe that's not possible.

Copy link
Member

@Roasbeef Roasbeef Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btcutil.Tx caches the hash right?

Yeah that caches (the hash), but in most instances we use wire.MsgTx instead, mainly for historical reasons (but also because we decode from the wire into that struct). For other areas like sighash calculation, we don't quite want just the hash but also want the fully serialized transaction bytes as well. I was doing some syncing tests the other day and did some profiling of a catch up sync well past the checkpoints: #1376 (comment)

@@ -24,3 +24,5 @@ require (
replace github.com/btcsuite/btcd/btcutil => ../

replace github.com/btcsuite/btcd => ../..

replace github.com/btcsuite/btcd/chaincfg/chainhash => ../../chaincfg/chainhash
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid making the circular dependency problem between some of the packages more annoying, I think we should first push the change in chaincfg/chainhash, push a new tag, then update these two go.mod files to the latest tag.
I wanted to get rid of the other local replace directives as well in another PR, just never got to it...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I make a separate PR that only adds the DoubleHashRaw function?

I can definitely break this up into multiple PRs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, to avoid needing more local replace directives, this needs to be done in two PRs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ordering wise we'll need to:

  1. Remove this section.
  2. Merge the PR.
  3. Tag a new version.
  4. Make a new PR that actually uses the new functions.

@kcalvinalvin kcalvinalvin force-pushed the memory-efficient-txhash branch 2 times, most recently from 839657c to cd8309f Compare May 9, 2023 05:12
@kcalvinalvin
Copy link
Collaborator Author

The last 2 force pushes change the function signature of DoubleHashRaw to accept a closure that returns an hash. This is done as I found that the

buf := make([]byte, 0, 32)

escapes to the heap with how it was implemented in the previous commits. It ended up allocating less memory than the current master but did allocate 2 objects per hash instead of 1. This change allows the compiler to keep the buf on the stack and only do 1 allocation instead of 2.

Copy link
Collaborator

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool optimization!

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good, one question about the modification made to the benchmarks

for i := 0; i < b.N; i++ {
var buf bytes.Buffer
if err := genesisCoinbaseTx.Serialize(&buf); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ends up including the serialization over head in each invocation, which I don't think is what we want? So you can either use StopTimer, or just move it back to lie outside the loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reading the commit message, I think that' was the purpose of the BenchmarkTxHash benchmark: to account for that extra overhead. Also if you add ReportAllocs, then all these calls will also report on the allocations.

// in byte slice. Pre-allocating here saves an allocation on the second
// hash as we can reuse it. This allocation also does not escape to the
// heap, saving an allocation.
buf := make([]byte, 0, HashSize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a tangible difference here (re allocations) with delcaring this as a var buf [HashSize]byte vs using make and a slice here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping re this, not terribly blocking though.

Copy link
Collaborator Author

@kcalvinalvin kcalvinalvin Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the same right?

// DoubleHashRaw calculates hash(hash(w)) where w is the resulting bytes from
// the given serialize function and returns the resulting bytes as a Hash.
func DoubleHashRaw(serialize func(w io.Writer) error) Hash {
        // Encode the transaction into the hash.  Ignore the error returns
        // since the only way the encode could fail is being out of memory
        // or due to nil pointers, both of which would cause a run-time panic.
        h := sha256.New()
        _ = serialize(h)
        
        // This buf is here because Sum() will append the result to the passed
        // in byte slice.  Pre-allocating here saves an allocation on the second
        // hash as we can reuse it.  This allocation also does not escape to the
        // heap, saving an allocation.
        var buf [HashSize]byte
        first := h.Sum(buf[:])
        h.Reset()
        h.Write(first)
        res := h.Sum(buf[:])
        return *(*Hash)(res)
}

This will fail tests because the buf already has 0 values occupying it.

[I] calvin@bitcoin ~/b/g/b/g/s/g/b/b/c/chainhash (memory-efficient-txhash)> go test
--- FAIL: TestDoubleHashFuncs (0.00s)
   hashfuncs_test.go:148: DoubleHashRaw("") = 0000000000000000000000000000000000000000000000000000000000000000, want 5df6e0e2761359d30a8275058e299fcc0381534545f55cf43e41983f5d4c9456
   hashfuncs_test.go:148: DoubleHashRaw("a") = 0000000000000000000000000000000000000000000000000000000000000000, want bf5d3affb73efd2ec6c36ad3112dd933efed63c4e1cbffcfa88e2759c144f2d8
   hashfuncs_test.go:148: DoubleHashRaw("ab") = 0000000000000000000000000000000000000000000000000000000000000000, want a1ff8f1856b5e24e32e3882edd4a021f48f28a8b21854b77fdef25a97601aace
   hashfuncs_test.go:148: DoubleHashRaw("abc") = 0000000000000000000000000000000000000000000000000000000000000000, want 4f8b42c22dd3729b519ba6f68d2da7cc5b2d606d05daed5ad5128cc03e6c6358
   hashfuncs_test.go:148: DoubleHashRaw("abcd") = 0000000000000000000000000000000000000000000000000000000000000000, want 7e9c158ecd919fa439a7a214c9fc58b85c3177fb1613bdae41ee695060e11bc6
   hashfuncs_test.go:148: DoubleHashRaw("abcde") = 0000000000000000000000000000000000000000000000000000000000000000, want 1d72b6eb7ba8b9709c790b33b40d8c46211958e13cf85dbcda0ed201a99f2fb9
   hashfuncs_test.go:148: DoubleHashRaw("abcdef") = 0000000000000000000000000000000000000000000000000000000000000000, want ce65d4756128f0035cba4d8d7fae4e9fa93cf7fdf12c0f83ee4a0e84064bef8a
   hashfuncs_test.go:148: DoubleHashRaw("abcdefg") = 0000000000000000000000000000000000000000000000000000000000000000, want dad6b965ad86b880ceb6993f98ebeeb242de39f6b87a458c6510b5a15ff7bbf1
   hashfuncs_test.go:148: DoubleHashRaw("abcdefgh") = 0000000000000000000000000000000000000000000000000000000000000000, want b9b12e7125f73fda20b8c4161fb9b4b146c34cf88595a1e0503ca2cf44c86bc4
   hashfuncs_test.go:148: DoubleHashRaw("abcdefghi") = 0000000000000000000000000000000000000000000000000000000000000000, want 546db09160636e98405fbec8464a84b6464b32514db259e235eae0445346ffb7
   hashfuncs_test.go:148: DoubleHashRaw("abcdefghij") = 0000000000000000000000000000000000000000000000000000000000000000, want 27635cf23fdf8a10f4cb2c52ade13038c38718c6d7ca716bfe726111a57ad201
   hashfuncs_test.go:148: DoubleHashRaw("Discard medicine more than two years old.") = 0000000000000000000000000000000000000000000000000000000000000000, want ae0d8e0e7c0336f0c3a72cefa4f24b625a6a460417a921d066058a0b81e23429
   hashfuncs_test.go:148: DoubleHashRaw("He who has a shady past knows that nice guys finish last.") = 0000000000000000000000000000000000000000000000000000000000000000, want eeb56d02cf638f87ea8f11ebd5b0201afcece984d87be458578d3cfb51978f1b
   hashfuncs_test.go:148: DoubleHashRaw("I wouldn't marry him with a ten foot pole.") = 0000000000000000000000000000000000000000000000000000000000000000, want dc640bf529608a381ea7065ecbcd0443b95f6e4c008de6e134aff1d36bd4b9d8
   hashfuncs_test.go:148: DoubleHashRaw("Free! Free!/A trip/to Mars/for 900/empty jars/Burma Shave") = 0000000000000000000000000000000000000000000000000000000000000000, want 42e54375e60535eb07fc15c6350e10f2c22526f84db1d6f6bba925e154486f33
   hashfuncs_test.go:148: DoubleHashRaw("The days of the digital watch are numbered.  -Tom Stoppard") = 0000000000000000000000000000000000000000000000000000000000000000, want 4ed6aa9b88c84afbf928710b03714de69e2ad967c6a78586069adcb4c470d150
   hashfuncs_test.go:148: DoubleHashRaw("Nepal premier won't resign.") = 0000000000000000000000000000000000000000000000000000000000000000, want 590c24d1877c1919fad12fe01a8796999e9d20cfbf9bc9bc72fa0bd69f0b04dd
   hashfuncs_test.go:148: DoubleHashRaw("For every action there is an equal and opposite government program.") = 0000000000000000000000000000000000000000000000000000000000000000, want 37d270687ee8ebafcd3c1a32f56e1e1304b3c93f252cb637d57a66d59c475eca
   hashfuncs_test.go:148: DoubleHashRaw("His money is twice tainted: 'taint yours and 'taint mine.") = 0000000000000000000000000000000000000000000000000000000000000000, want 306828fd89278838bb1c544c3032a1fd25ea65c40bba586437568828a5fbe944
   hashfuncs_test.go:148: DoubleHashRaw("There is no reason for any individual to have a computer in their home. -Ken Olsen, 1977") = 0000000000000000000000000000000000000000000000000000000000000000, want 49965777eac71faf1e2fb0f6b239ba2fae770977940fd827bcbfe15def6ded53
   hashfuncs_test.go:148: DoubleHashRaw("It's a tiny change to the code and not completely disgusting. - Bob Manchek") = 0000000000000000000000000000000000000000000000000000000000000000, want df99ee4e87dd3fb07922dee7735997bbae8f26db20c86137d4219fc4a37b77c3
   hashfuncs_test.go:148: DoubleHashRaw("size:  a.out:  bad magic") = 0000000000000000000000000000000000000000000000000000000000000000, want 920667c84a15b5ee3df4620169f5c0ec930cea0c580858e50e68848871ed65b4
   hashfuncs_test.go:148: DoubleHashRaw("The major problem is with sendmail.  -Mark Horton") = 0000000000000000000000000000000000000000000000000000000000000000, want 5e817fe20848a4a3932db68e90f8d54ec1b09603f0c99fdc051892b776acd462
   hashfuncs_test.go:148: DoubleHashRaw("Give me a rock, paper and scissors and I will move the world.  CCFestoon") = 0000000000000000000000000000000000000000000000000000000000000000, want 6a9d47248ed38852f5f4b2e37e7dfad0ce8d1da86b280feef94ef267e468cff2
   hashfuncs_test.go:148: DoubleHashRaw("If the enemy is within range, then so are you.") = 0000000000000000000000000000000000000000000000000000000000000000, want 2e7aa1b362c94efdbff582a8bd3f7f61c8ce4c25bbde658ef1a7ae1010e2126f
   hashfuncs_test.go:148: DoubleHashRaw("It's well we cannot hear the screams/That we create in others' dreams.") = 0000000000000000000000000000000000000000000000000000000000000000, want e6729d51240b1e1da76d822fd0c55c75e409bcb525674af21acae1f11667c8ca
   hashfuncs_test.go:148: DoubleHashRaw("You remind me of a TV show, but that's all right: I watch it anyway.") = 0000000000000000000000000000000000000000000000000000000000000000, want 09945e4d2743eb669f85e4097aa1cc39ea680a0b2ae2a65a42a5742b3b809610
   hashfuncs_test.go:148: DoubleHashRaw("C is as portable as Stonehedge!!") = 0000000000000000000000000000000000000000000000000000000000000000, want 1018d8b2870a974887c5174360f0fbaf27958eef15b24522a605c5dae4ae0845
   hashfuncs_test.go:148: DoubleHashRaw("Even if I could be Shakespeare, I think I should still choose to be Faraday. - A. Huxley") = 0000000000000000000000000000000000000000000000000000000000000000, want 97c76b83c6645c78c261dcdc55d44af02d9f1df8057f997fd08c310c903624d5
   hashfuncs_test.go:148: DoubleHashRaw("The fugacity of a constituent in a mixture of gases at a given temperature is proportional to its mole fraction.  Lewis-Randall Rule") = 0000000000000000000000000000000000000000000000000000000000000000, want 6bcbf25469e9544c5b5806b24220554fedb6695ba9b1510a76837414f7adb113
   hashfuncs_test.go:148: DoubleHashRaw("How can you write a big system without C++?  -Paul Glick") = 0000000000000000000000000000000000000000000000000000000000000000, want 1041988b06835481f0845be2a54f4628e1da26145b2de7ad1be3bb643cef9d4f
FAIL
exit status 1
FAIL    github.com/btcsuite/btcd/chaincfg/chainhash     0.001s

wire/msgtx.go Outdated
buf := bytes.NewBuffer(make([]byte, 0, msg.SerializeSizeStripped()))
_ = msg.SerializeNoWitness(buf)
return chainhash.DoubleHashH(buf.Bytes())
return chainhash.DoubleHashRaw(msg.SerializeNoWitness)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Roasbeef
Copy link
Member

I think this just needs a rebase!

DoubleHashRaw provides a simple function for doing double hashes.  Since
it uses the digest instead of making the caller allocate a byte slice, it
can be more memory efficient vs other double hash functions.
@kcalvinalvin
Copy link
Collaborator Author

Rebased

For ibd and block verification, I think #2023 is going to be better but that one's still in the works.

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready to land, but we'll need to do the module dance to make sure we don't break things along the way (see the newly added comment).

@@ -24,3 +24,5 @@ require (
replace github.com/btcsuite/btcd/btcutil => ../

replace github.com/btcsuite/btcd => ../..

replace github.com/btcsuite/btcd/chaincfg/chainhash => ../../chaincfg/chainhash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ordering wise we'll need to:

  1. Remove this section.
  2. Merge the PR.
  3. Tag a new version.
  4. Make a new PR that actually uses the new functions.

// in byte slice. Pre-allocating here saves an allocation on the second
// hash as we can reuse it. This allocation also does not escape to the
// heap, saving an allocation.
buf := make([]byte, 0, HashSize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping re this, not terribly blocking though.

@Roasbeef Roasbeef added this to the v0.24 milestone Dec 9, 2023
@kcalvinalvin
Copy link
Collaborator Author

Removed commits that were changing the go.mod files

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 📸

@Roasbeef
Copy link
Member

Will tag the new version, then make a new PR updating relevant go.mods, before a final PR that swaps this in everywhere.

@Roasbeef Roasbeef merged commit 96c9fd8 into btcsuite:master Dec 15, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants