port already allocated #1790
Close moby#1790 Signed-off-by: saiwl <[email protected]>
ping @mavenugo |
@saiwl is this a case with |
@saiwl the problem that you describe makes sense to me. I actually created PR #1805 with the objective of simplifying the logic in sandboxCleanup. As you correctly state in the description, it can happen that the driver and the sandbox stores go out of sync. The idea of my patch is to make the network the source of truth, so that we can reconstruct the sandbox endpoints directly from there. That simplifies the logic and should ensure that we are not missing endpoints. |
@fcrisciani Thanks. |
I am hitting this issue also, but I can't use live-restore as I am in swarm mode. Testing both your PRs; I will let you know how I go. |
Thanks @saiwl, yes, I took the part where you fetch all the endpoints from the network, and I also aggregated them by sandbox ID so as not to iterate through them again later; I also removed the logic that was looping over the ones from the sandbox. If @saiwl and @bdeluca can give it a try, given that you were seeing this issue, that would be great validation for the patch before proceeding further. This code path is kind of tricky, so we want to avoid introducing any new issues. Thanks! |
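A minimal sketch of the idea described above, assuming hypothetical Network and Endpoint types with a SandboxID field (the real libnetwork types and the actual code in #1805 differ): walk every endpoint of every network, treating the networks as the source of truth, and group the endpoints by owning sandbox so a later cleanup pass does not need a second iteration over the networks.

package main

import "fmt"

// Hypothetical stand-ins for libnetwork's endpoint/network types.
type Endpoint struct {
	ID        string
	SandboxID string
}

type Network struct {
	ID        string
	Endpoints []*Endpoint
}

// endpointsBySandbox treats the networks as the source of truth and
// groups their endpoints by owning sandbox, so the sandbox store can be
// reconstructed (and cleaned up) without a second pass over the networks.
func endpointsBySandbox(networks []*Network) map[string][]*Endpoint {
	bySandbox := make(map[string][]*Endpoint)
	for _, n := range networks {
		for _, ep := range n.Endpoints {
			if ep.SandboxID == "" {
				continue // endpoint not attached to any sandbox
			}
			bySandbox[ep.SandboxID] = append(bySandbox[ep.SandboxID], ep)
		}
	}
	return bySandbox
}

func main() {
	nets := []*Network{{ID: "bridge", Endpoints: []*Endpoint{
		{ID: "ep1", SandboxID: "sb1"},
		{ID: "ep2", SandboxID: "sb1"},
	}}}
	fmt.Println(len(endpointsBySandbox(nets)["sb1"])) // 2
}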
Hi @fcrisciani, your patch doesn't appear to fix my issue. I followed a trail of similar issues to @saiwl's last PR, and I think this might be related. But if you say this seems like a different issue I will disappear (my issue is trivial to reproduce). |
@bdeluca
|
1. Single node.
2. The more containers I have, the more likely it is to happen.
Example:
docker service create --name registry0 --constraint node.role==manager --publish 5000:5000 registry:2
docker service create --name registry1 --constraint node.role==manager --publish 5001:5000 registry:2
docker service create --name registry2 --constraint node.role==manager --publish 5002:5000 registry:2
......
docker service create --name registry10 --constraint node.role==manager --publish 5010:5000 registry:2
After the services are created the first time, you will be able to reach the ports on localhost 5000-5010.
Reboot the machine.
Ports will randomly not be available.
Note: I am just using the registry image as an example. Mostly the swarm cluster isn't down, but I was testing failure modes and discovered this one.
3. I see "address already in use", "port already in use", or just nothing; everything looks like it should be open but it is not.
On 13 June 2017 at 21:25, Flavio Crisciani wrote:
@bdeluca
couple of questions:
are you able to reproduce on single node or only multi node?
"a lot of containers" means? can you give an indication about your test?
is there a specific error message in the logs that can explain why the port is not exposed?
|
Somewhere something is very confused. Swarm thinks things have different IP addresses than they actually do.
|
My simple example with the registry doesn't work because on every container port 5000 is open. |
@fcrisciani I am also hitting this issue even with the live-restore option, and I am using the always-restart policy for all containers. This only happens when the system is rebooted. I have included the logs I receive below.
|
@fcrisciani @saiwl I get the same issue in my environment. My reproduce script is as follows:
PR #1805 does not resolve this case. |
Any chance of merging #1805? It would be great to address the original issue here, even if other problem scenarios remain. |
We met "port already allocated" problem in our docker environment. It always happened after docker-daemon restarts abnormally or machine restarts abnormally.
I read the related code, and found a possible bug about this.
In the source code, the process of creating a container using port mapping is like below:
And the restore process after the daemon restarts is like below:
In the creating process, if the docker daemon or the machine restarts abnormally between step 5 and step 7 which truely happened in our environment, after docker daemon restarts, the port mapping would be restored in step 1 of restore process and will not be released in step 2 of restore process because sandbox was not updated, which causing ports leak.
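A minimal sketch of this failure mode, assuming hypothetical stores, restore, and cleanupStale stand-ins (the real libnetwork port allocator, driver store, and sandbox store are different, and the numbered steps above are not reproduced here): the driver store was written before the crash, the sandbox store was not, so the restored port binding is never released.

package main

import "fmt"

// Hypothetical in-memory stand-ins for the driver and sandbox stores.
type stores struct {
	driverPorts  map[int]string // port -> endpoint ID, persisted by the driver
	sandboxPorts map[int]string // port -> endpoint ID, persisted by the sandbox
	allocated    map[int]bool   // live allocator state after restart
}

// restore re-registers every binding the driver store remembers
// (the "restore port mapping" step of the restore process).
func (s *stores) restore() {
	for port := range s.driverPorts {
		s.allocated[port] = true
	}
}

// cleanupStale releases only the ports that the sandbox store knows
// about. If the daemon died after the driver store was written but
// before the sandbox store was updated, the port stays allocated.
func (s *stores) cleanupStale() {
	for port := range s.sandboxPorts {
		delete(s.allocated, port)
	}
}

func main() {
	s := &stores{
		driverPorts:  map[int]string{5000: "ep1"}, // written before the crash
		sandboxPorts: map[int]string{},            // never updated: crash happened first
		allocated:    map[int]bool{},
	}
	s.restore()
	s.cleanupStale()
	// true: port 5000 leaked, so the next bind fails with "port already allocated"
	fmt.Println(s.allocated[5000])
}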
I have made a simple fix which has been tested in our environment. I will make a PR later. Looking forward to your suggestions!