Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test: Identify tests failing intermittently #1781

Closed
p-shahi opened this issue May 30, 2023 · 2 comments
Closed

Test: Identify tests failing intermittently #1781

p-shahi opened this issue May 30, 2023 · 2 comments
Labels
kind/bug A bug in existing code (including security flaws) kind/test Testing work P0 Critical: Tackled by core team ASAP

Comments

@p-shahi
Copy link
Member

p-shahi commented May 30, 2023

The upgrader test that checks if disconnect and connect events are being emitted fails intermittently as well as https://github.com/libp2p/js-libp2p/actions/runs/5117265449/jobs/9200220113?pr=1779 as result of a stream reset which seems to be related to the identify tests not closing streams.

It also results in our node tests occasionally hanging which eventually get shutdown by Github's CI.

As @achingbrain mentioned perhaps we could instrument the setup of the tests with why is node running so that we could have a log dump when CI kills prolonged running tests. This isn't ideal as the logs will still be quite noisy but it can be a start.

@p-shahi p-shahi added the need/triage Needs initial labeling and prioritization label May 30, 2023
@maschad maschad changed the title flaky upgrader test Test: Disconnect and connect events test failing intermittently May 30, 2023
@maschad maschad self-assigned this May 31, 2023
@maschad maschad added P0 Critical: Tackled by core team ASAP kind/bug A bug in existing code (including security flaws) kind/test Testing work and removed need/triage Needs initial labeling and prioritization labels May 31, 2023
@maschad maschad changed the title Test: Disconnect and connect events test failing intermittently Test: Identify tests failing intermittently Jun 2, 2023
@achingbrain
Copy link
Member

I've turned on debug logging in this PR and the tests have obliged by failing:

libp2p libp2p is stopping +15ms
libp2p:transports closing listeners for @libp2p/websockets +15ms
libp2p:connection-manager closing 1 connections +16ms
libp2p:transports all listeners closed +1ms
libp2p:yamux sending GoAway reason=InternalError +874ms
libp2p:connection-manager stopped +1ms
libp2p:yamux sending GoAway reason=InternalError +2ms
libp2p libp2p has stopped +3ms
libp2p libp2p is stopping +0ms
libp2p:transports closing listeners for @libp2p/websockets +3ms
libp2p:connection-manager closing 1 connections +3ms
libp2p:transports all listeners closed +0ms
libp2p:connection-manager stopped +0ms
libp2p libp2p has stopped +2ms
libp2p:transports adding transport @libp2p/websockets +2ms
libp2p:transports adding transport libp2p/circuit-relay-v2 +0ms
libp2p libp2p is starting +2ms
libp2p importing self key into keychain +0ms
libp2p:connection-manager started +3ms
libp2p:connection-manager dial 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +3ms
libp2p:connection-manager:auto-dial not enough connections 0/50 - will dial peers to increase the number of connections +24ms
libp2p:connection-manager:auto-dial selected 0/0 peers to dial +1ms
libp2p:websockets dialing 127.0.0.1:15001 +1ms
libp2p:upgrader Starting the outbound connection upgrade +22ms
libp2p:upgrader encrypting outbound connection to 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +2ms
libp2p:plaintext write pubkey exchange to peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +31ms
libp2p:plaintext read pubkey exchange from peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +3ms
libp2p:plaintext plaintext key exchange completed successfully with peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +0ms
libp2p:upgrader /yamux/1.0.0 selected as muxer protocol +3ms
libp2p:upgrader Successfully upgraded outbound connection +0ms
libp2p:connection-manager:connection-pruner checking max connections limit 1/300 +33ms
libp2p:circuit-relay:transport:reservation-store add relay 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +24s
libp2p:connection-manager dial 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +20ms
libp2p:connection-manager had an existing connection to 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +1ms
libp2p:mss:handle respond with "na" for "/ipfs/id/1.0.0" +209ms
libp2p:upgrader:error @http://127.0.0.1:32975/bundle-out.js:57457:38 +219ms
libp2p:circuit-relay:transport:reservation-store created reservation on relay peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +25ms
libp2p libp2p has started +52ms
libp2p:transports adding transport @libp2p/websockets +2ms
libp2p:transports adding transport libp2p/circuit-relay-v2 +0ms
libp2p libp2p is starting +1ms
libp2p importing self key into keychain +1ms
libp2p:connection-manager started +30ms
libp2p:transports no addresses were provided for listening, this node is dial only +2ms
libp2p:connection-manager:auto-dial not enough connections 0/50 - will dial peers to increase the number of connections +51ms
libp2p libp2p has started +2ms
libp2p:connection-manager dial 12D3KooWNvSZnPi3RrhrTwEY4LuuBeB6K6facKUCJcyWG1aoDd2p +3ms
libp2p:connection-manager:auto-dial selected 0/0 peers to dial +2ms
libp2p:connection-manager dial 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +3ms
libp2p:connection-manager:dial-queue loading multiaddrs for 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +2ms
libp2p:websockets dialing 127.0.0.1:15001 +1ms
libp2p:upgrader Starting the outbound connection upgrade +44ms
libp2p:upgrader encrypting outbound connection to 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +4ms
libp2p:plaintext write pubkey exchange to peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +55ms
libp2p:plaintext read pubkey exchange from peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +3ms
libp2p:plaintext plaintext key exchange completed successfully with peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +0ms
libp2p:upgrader /yamux/1.0.0 selected as muxer protocol +3ms
libp2p:upgrader Successfully upgraded outbound connection +1ms
libp2p:connection-manager:connection-pruner checking max connections limit 1/300 +60ms
libp2p:mss:handle respond with "na" for "/ipfs/id/1.0.0" +59ms
libp2p:upgrader:error @http://127.0.0.1:32975/bundle-out.js:57457:38 +55ms
libp2p:mss:handle respond with "na" for "/libp2p/circuit/relay/0.2.0/stop" +12ms
libp2p:upgrader outbound: starting new stream on [ '/echo/1.0.0' ] +1ms
libp2p:upgrader No abort signal was passed while trying to negotiate protocols [ '/echo/1.0.0' ] falling back to default timeout +0ms
libp2p:upgrader:error @http://127.0.0.1:32975/bundle-out.js:57457:38 +22ms
libp2p:connection-manager:auto-dial not enough connections 0/50 - will dial peers to increase the number of connections +66ms
libp2p:connection-manager:auto-dial selected 1/1 peers to dial +1ms
libp2p:connection-manager:auto-dial connecting to a peerStore stored peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +2ms
libp2p:connection-manager dial 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +67ms
libp2p:connection-manager:dial-queue loading multiaddrs for 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +45ms
libp2p:websockets dialing 127.0.0.1:15001 +0ms
libp2p:yamux sending GoAway reason=NormalTermination +134ms
    1) should emit connect and disconnect events
libp2p libp2p is stopping +82ms
libp2p:transports closing listeners for @libp2p/websockets +83ms
libp2p:transports closing listeners for libp2p/circuit-relay-v2 +2ms
libp2p:connection-manager closing 1 connections +14ms
libp2p:transports all listeners closed +1ms
libp2p:upgrader Starting the outbound connection upgrade +62ms
libp2p:yamux sending GoAway reason=NormalTermination +29ms
libp2p:connection-manager stopped +19ms
libp2p libp2p has stopped +22ms
libp2p libp2p is stopping +0ms
libp2p:transports closing listeners for @libp2p/websockets +21ms
libp2p:transports closing listeners for libp2p/circuit-relay-v2 +0ms
libp2p:connection-manager closing 0 connections +4ms
libp2p:transports all listeners closed +2ms
libp2p:connection-manager stopped +1ms
libp2p libp2p has stopped +7ms
libp2p:upgrader encrypting outbound connection to 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +22ms
libp2p:plaintext write pubkey exchange to peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +93ms
libp2p:plaintext read pubkey exchange from peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +3ms
libp2p:plaintext plaintext key exchange completed successfully with peer 12D3KooWHFKTMzwerBtsVmtz4ZZEQy2heafxzWw6wNn5PPYkBxJ5 +0ms
  191 passing (35s)
  1 pending
  1 failing
  1) libp2p.upgrader
       should emit connect and disconnect events:
     stream reset

Yamux is selected as a muxer, but then the interesting thing here is that it fails to negotiate identify and circuit relay/stop over the connection. Yamux then closes the connection which I would imagine is what causes the reset.

libp2p:upgrader /yamux/1.0.0 selected as muxer protocol +3ms
libp2p:upgrader Successfully upgraded outbound connection +1ms
libp2p:mss:handle respond with "na" for "/ipfs/id/1.0.0" +59ms
libp2p:upgrader:error @http://127.0.0.1:32975/bundle-out.js:57457:38 +55ms
libp2p:mss:handle respond with "na" for "/libp2p/circuit/relay/0.2.0/stop" +12ms
libp2p:upgrader outbound: starting new stream on [ '/echo/1.0.0' ] +1ms
libp2p:upgrader No abort signal was passed while trying to negotiate protocols [ '/echo/1.0.0' ] falling back to default timeout +0ms
libp2p:upgrader:error @http://127.0.0.1:32975/bundle-out.js:57457:38 +22ms
libp2p:yamux sending GoAway reason=NormalTermination +134ms

So there may be a race condition on startup - are connections being opened before the protocol handlers are registered?

@achingbrain
Copy link
Member

Should have been fixed by #1795 please reopen if it reoccurs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) kind/test Testing work P0 Critical: Tackled by core team ASAP
Projects
Archived in project
Development

No branches or pull requests

3 participants