Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic in txpool with 2 sentries #4679

Closed
yperbasis opened this issue Jul 8, 2022 · 3 comments · Fixed by #6048
Closed

Panic in txpool with 2 sentries #4679

yperbasis opened this issue Jul 8, 2022 · 3 comments · Fixed by #6048
Assignees

Comments

@yperbasis
Copy link
Member

When I set up 2 sentries (backend.sentryServers in eth/backend.go), after some time Erigon crashes with

[INFO] [07-01|11:51:03.646] [1/16 Headers] Inserting headers         progress=14739189 queue=1
panic: interface conversion: interface {} is nil, not *simplelru.entry

goroutine 3496 [running]:
github.com/hashicorp/golang-lru/simplelru.(*LRU).removeElement(0x17b61e0?, 0xc0612b5660?)
        github.com/hashicorp/[email protected]/simplelru/lru.go:172 +0x107
github.com/hashicorp/golang-lru/simplelru.(*LRU).removeOldest(0x1693540?)
        
github.com/hashicorp/[email protected]/simplelru/lru.go:165 +0x34
github.com/hashicorp/golang-lru/simplelru.(*LRU).Add(0xc073811d80, {0x161a440?, 0xc0c67ab5c0}, {0x1664a60?, 0x2a7af50})
        github.com/hashicorp/[email protected]/simplelru/lru.go:67 +0x394
github.com/ledgerwatch/erigon-lib/txpool.(*TxPool).discardLocked(0xc08c3baf00, 0xc0c6b4dcc0, 0xe)
        github.com/ledgerwatch/[email protected]/txpool/pool.go:1077 +0x15a
github.com/ledgerwatch/erigon-lib/txpool.promote(0xc06aab7bc0, 0xc06aab7c00, 0xc06aab7c40, 0x0, 0xc0612b5be8)
        github.com/ledgerwatch/[email protected]/txpool/pool.go:1310 +0x394
github.com/ledgerwatch/erigon-lib/txpool.addTxs(0x0, {0x1f27260, 0xc0c6b18c78}, 0x1152?, {{0xc0b802c600, 0x2c, 0x40}, {0xc0c646b200, 0x370, 0x480}, ...}, ...)
        github.com/ledgerwatch/[email protected]/txpool/pool.go:937 +0x787
github.com/ledgerwatch/erigon-lib/txpool.(*TxPool).processRemoteTxs(0xc08c3baf00, {0x1f30168, 0xc000718300?})
        github.com/ledgerwatch/[email protected]/txpool/pool.go:510 +0x48d
github.com/ledgerwatch/erigon-lib/txpool.MainLoop({0x1f30168?, 0xc000718300}, {0x1f36ef0, 0xc05750a460}, {0x20?, 0x3fb2b6d0c?}, 0xc08c3baf00, 0xc0738627e0, 0xc06aab7c80, 0xc073801ec0, ...)
        github.com/ledgerwatch/[email protected]/txpool/pool.go:1343 +0x54e
created by github.com/ledgerwatch/erigon/eth.New
        github.com/ledgerwatch/erigon/eth/backend.go:477 +0x3031
@AskAlexSharov
Copy link
Collaborator

Great. Now we know that need 2 sentries to reproduce it.
Need run txpool with -race to find reason

@holiman
Copy link
Contributor

holiman commented Nov 11, 2022

https://github.com/ledgerwatch/erigon-lib/blob/f0140de612446011db101d28a217b347827f2dc4/txpool/pool.go#L579

RLock around Get is insufficient, because Get updates the recentness.

Same as

#3799
#4396

I made the same mistake in go-ethereum recently. Fix: ethereum/go-ethereum#26164

@yperbasis
Copy link
Member Author

https://github.com/ledgerwatch/erigon-lib/blob/f0140de612446011db101d28a217b347827f2dc4/txpool/pool.go#L579

RLock around Get is insufficient, because Get updates the recentness.

Same as

#3799 #4396

I made the same mistake in go-ethereum recently. Fix: ethereum/go-ethereum#26164

Oh, I see. We'll fix it. @holiman thank you for pointing this out!

yperbasis added a commit to ledgerwatch/erigon-lib that referenced this issue Nov 14, 2022
[simplelru](https://github.com/hashicorp/golang-lru/tree/master/simplelru)
is not thread safe. During the `Get` operation, the recentness of the
accessed item is updated, so it is not a pure read-operation. Therefore,
the mutex we need to protect the LRUs in txpool is a full mutex, not
`RLock`. See
erigontech/erigon#4679 (comment)
and ethereum/go-ethereum#26164.

Also, RWMutex has a performance overhead compared with the vanilla one
(see, for example, golang/go#38813).

Kudos to Martin Swende for pointing to the issue.
@yperbasis yperbasis linked a pull request Nov 14, 2022 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants