TestBootstrap fails occasionally #2745

Closed
Kubuxu opened this issue May 19, 2016 · 7 comments · Fixed by #2855
Labels
kind/test Testing work topic/test failure Topic test failure

Comments

@Kubuxu
Member

Kubuxu commented May 19, 2016

http://ci.ipfs.team:8111/viewLog.html?buildId=2045&buildTypeId=GoIpfs_CiTests&tab=buildLog#_focus=4136&state=4136

More here: #188

@Kubuxu Kubuxu added topic/test failure Topic test failure kind/test Testing work labels May 19, 2016
@whyrusleeping
Member

I hate this one. It's the 'misdial' issue; it affects a few different tests.

@whyrusleeping
Member

@Kubuxu
Member Author

Kubuxu commented Jun 11, 2016

My sanity tests caught it: https://travis-ci.org/ipfs/go-ipfs/jobs/136932615#L3812

@Kubuxu
Member Author

Kubuxu commented Jun 11, 2016

I am able to replicate the failure in less than a minute:

cd routing/dht
while time go test -run TestBootstrap; do :; done

It looks like it fails about one run in five.

And suddenly it now runs for a long time without a failure, but it is repeatable.

@whyrusleeping
Member

Here's another one: https://travis-ci.org/ipfs/go-ipfs/jobs/137017648

@Kubuxu
Member Author

Kubuxu commented Jun 14, 2016

I think I figured out what is happening: the SO_REUSEPORT flag on a socket makes it possible for sockets bound to 127.0.0.1:0 to be randomly assigned the same address.
https://github.com/ipfs/go-libp2p/blob/master/p2p/net/swarm/swarm_listen.go#L27

I am now running the tests on my server with the IPFS_REUSEPORT=false environment variable. Once I am sure that this is the cause (no failures for a few hours), I will work on a solution.

@Kubuxu
Member Author

Kubuxu commented Jun 14, 2016

I found it pretty interesting that this is a problem, but it makes sense: it is the birthday paradox with a calendar of 2^16 days instead of 365. As it turns out, binding 140 addresses is enough for a 10% failure rate: https://www.wolframalpha.com/input/?i=y%3D+1+-+(((2%5E16)+nPr+n+)%2F(2%5E16)%5En)++from+0+to+200

I will still wait a few hours to confirm that the issue is gone.
