Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p2p: panic in enode DB Close on shutdown (#9237) #9240

Merged
merged 1 commit into from
Jan 16, 2024
Merged

Conversation

battlmonstr
Copy link
Contributor

If any DB method is called while Close() is waiting for db.kv.Close()
(it waits for ongoing method calls/transactions to finish)
a panic: "WaitGroup is reused before previous Wait has returned" might happen.

Use context cancellation to ensure that new method calls immediately return during db.kv.Close().

If any DB method is called while db.Close() is waiting for db.kv.Close()
(it waits for ongoing method calls to finish)
a panic: "WaitGroup is reused before previous Wait has returned" might happen.

Use context cancellation to ensure that new method calls immediately return during db.kv.Close().
Copy link
Collaborator

@AskAlexSharov AskAlexSharov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

problem is a bit deeper:

panic: sync: WaitGroup is reused before previous Wait has returned

goroutine 508149 [running]:
sync.(*WaitGroup).Wait(0x3?)
        sync/waitgroup.go:118 +0x7f
github.com/ledgerwatch/erigon-lib/kv/mdbx.(*MdbxKV).Close(0xc0076199e0?)
        github.com/ledgerwatch/[email protected]/kv/mdbx/kv_mdbx.go:497 +0x45
github.com/ledgerwatch/erigon/p2p/enode.(*DB).Close(0xc002708030)
        github.com/ledgerwatch/erigon/p2p/enode/nodedb.go:615 +0x65
github.com/ledgerwatch/erigon/p2p.(*Server).Stop(0xc000004c00)
        github.com/ledgerwatch/erigon/p2p/server.go:451 +0xed

kv_mdbx.go has WaitGroup+closed atomic bool. Purpose of this fields is: 1. guarantee all active readers are done before close (otherwise may get segfault/race). 2. must be safe to call Close multiple times (underlying env.Close will called once) 3. if db closing started - new readers must be rejected

So, i think we need solve it inside kv_mdbx.go (of-course using context.Background() here is not good also). Then we will not need additional mechanics in each db: txpool/nodedb/downloader/etc...

@AskAlexSharov
Copy link
Collaborator

related to #8678

@AskAlexSharov
Copy link
Collaborator

hmm... not sure if this race is still present or fixed: #8409

@AskAlexSharov AskAlexSharov merged commit e979d79 into devel Jan 16, 2024
7 checks passed
@AskAlexSharov AskAlexSharov deleted the pr/9237 branch January 16, 2024 08:34
battlmonstr added a commit that referenced this pull request Jan 17, 2024
If any DB method is called while Close() is waiting for db.kv.Close()
(it waits for ongoing method calls/transactions to finish)
a panic: "WaitGroup is reused before previous Wait has returned" might
happen.

Use context cancellation to ensure that new method calls immediately
return during db.kv.Close().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants