Skip to content
This repository has been archived by the owner on Jun 3, 2020. It is now read-only.

tmkms v0.7.0 hang #389

Closed
creamjun opened this issue Dec 31, 2019 · 10 comments
Closed

tmkms v0.7.0 hang #389

creamjun opened this issue Dec 31, 2019 · 10 comments

Comments

@creamjun
Copy link

no error, just hang in tmkms v0.7.0

After tmkms stops first, I get the following error from gaiad:

E[2020-01-01|06:07:02.792] SignerClient::SignVote                       module=privval err="read tcp 127.0.0.1:26666->127.0.0.1:47828: i/o timeout: endpoint read timed out"
E[2020-01-01|06:07:02.792] Error signing vote                           module=consensus height=248863 round=0 vote="Vote{48:68F5BBEACEF1 248863/00/1(Prevote) DA64AD3E2B74 000000000000 @ 2019-12-31T21:06:59.79003623Z}" err="read tcp 127.0.0.1:26666->127.0.0.1:47828: i/o timeout: endpoint read timed out"
E[2020-01-01|06:07:05.792] SignerListener: Ping timeout                 module=privval
E[2020-01-01|06:07:08.793] SignerClient::GetPubKey                      module=privval err="endpoint connection timed out"
E[2020-01-01|06:07:08.797] CONSENSUS FAILURE!!!                         module=consensus err="runtime error: invalid memory address or nil pointer dereference" stack="goroutine 883 [running]:\nruntime/debug.Stack(0xc000c8f358, 0x1047dc0, 0x1f50a00)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x9d\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine.func2(0xc002aa6700, 0x13aa1e0)\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:602 +0x57\npanic(0x1047dc0, 0x1f50a00)\n\t/usr/local/go/src/runtime/panic.go:679 +0x1b2\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).signAddVote(0xc002aa6700, 0x3cc02, 0xc00f754b20, 0x20, 0x20, 0x1, 0xc00f754b80, 0x20, 0x20, 0xb)\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1745 +0x585\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).enterPrecommit(0xc002aa6700, 0x3cc1f, 0x0)\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1155 +0x12ec\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).addVote(0xc002aa6700, 0xc004c2adc0, 0xc0020a20c0, 0x28, 0xc000c8faa8, 0x4466f7, 0xc002aa6798)\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1663 +0xdfa\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote(0xc002aa6700, 0xc004c2adc0, 0xc0020a20c0, 0x28, 0xcc2f5edad59daf76, 0x109, 0x101)\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1527 +0x59\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg(0xc002aa6700, 0x15612c0, 0xc00fda7e48, 0xc0020a20c0, 0x28)\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:691 +0x252\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine(0xc002aa6700, 0x0)\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:633 +0x6eb\ncreated by github.com/tendermint/tendermint/consensus.(*ConsensusState).OnStart\n\t/home/jun/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:334 +0x13a\n"
@tarcieri
Copy link
Contributor

This looks potentially related to #385

Can you post the KMS-side logs, preferably with debug logging enabled?

Which signing backend are you using?

@creamjun
Copy link
Author

creamjun commented Jan 1, 2020

This looks potentially related to #385

Can you post the KMS-side logs, preferably with debug logging enabled?

nothing logs in kms.
I'll post if there's a problem after debug logging enabled.

Which signing backend are you using?

yubihsm

@tarcieri
Copy link
Contributor

tarcieri commented Jan 1, 2020

nothing logs in kms.

Are you running it with tmkms start -v to enable verbose logging?

What do you see immediately before it stops?

Can you perhaps try running it under strace? What OS are you using?

@creamjun
Copy link
Author

creamjun commented Jan 3, 2020

nothing logs in kms.

Are you running it with tmkms start -v to enable verbose logging?

tmkms did not start with -v when an error occurred.
now restarted with -v.
next time I'm having trouble with tmkms, I'll post log.

What do you see immediately before it stops?

Before stopping, there is right sign log. (signed PreVote & PreCommit)

Can you perhaps try running it under strace? What OS are you using?

OS : CentOS7

@tarcieri
Copy link
Contributor

tarcieri commented Jan 3, 2020

OS : CentOS7

For what it's worth, we also use CentOS 7 and don't experience this issue, so it doesn't seem to be specific to that.

@creamjun
Copy link
Author

creamjun commented Jan 4, 2020

OS : CentOS7

For what it's worth, we also use CentOS 7 and don't experience this issue, so it doesn't seem to be specific to that.

more info..
i using 3 chain per 1 hsm. (cosmos, terra, kava)

@creamjun
Copy link
Author

creamjun commented Jan 15, 2020

@tarcieri again, this time tmkms hang without log. (although given the -v option)

What's special is that there is a "yubihsm: session ...." log every second, and that log hasn't been taken five seconds before tmkms hang.

@tarcieri
Copy link
Contributor

Any chance you can try running it under strace?

@creamjun
Copy link
Author

I didn't think of strace.
next time I will use strace to record the log.

@creamjun
Copy link
Author

creamjun commented Feb 7, 2020

Since the v0.7.1 update, it doesn't seem to be a problem anymore.
Even if prevote occurs continuously, it works wall by reconnecting.
thank you.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants