Issues when tests are run inside a container #134
Hi Oliver,
Re 1, I'm definitely supportive in principle! My only slight worry is that you might have a fair amount of work to do to cover all the possibilities. I think if we could have a way to 'do the right thing automatically' 80% of the time but allow some kind of manual override for teams who manage to find a more exotic network configuration, that would be fine. I did a small portion of the work on this for #90, which is actually sitting in a branch and could be picked up if you want.
Re 2, the problem sounds interesting - like the docker TCP proxy is leaving its listening socket open after the first attempt at connecting through fails. I think you're quite right that the solution to this is some smarter liveness checks, and #133 seems like the right way to do this. I'll work with @outofcoffee to get this PR in as soon as possible.
Thanks
Richard
I share your concern about exotic network configurations. I guess there are a few variables:
If I make the change somewhere in DockerClientFactory.dockerHostIpAddress() or DockerClientConfigUtils, then the affected scope (as far as I can see) is just GenericContainer and its subclasses. Currently, there's only support in GenericContainer for running containers in "bridge" mode (unless a user subclasses applyConfiguration), which simplifies point 1. Point 2 is more of an issue. As I see the cases:
Point 3 is also nontrivial. Fortunately, since we're only concerned with cases where the tests are running inside a Docker container, we can assume that the test environment is Linux-based - no BSD or Windows to worry about. I suggested using "netstat -nr" to look at routing information, but that command is not installed by default in debian:jessie or in various Ubuntu images. The more "modern" replacement is "ip route", which is present everywhere... except the freshly released Ubuntu 16.04. I'm still looking for an alternative there.
Hi @ostacey, FYI
We would be happy if you test it!
My organization is running Jenkins build jobs in Docker. That makes configuring Jenkins a lot simpler; there's only one type of build node regardless of what you're building.
However, I've run into some problems when triggering testcontainers from inside Docker.
This issue is fixable. First, you need to detect if the test is running inside Docker. For that, you can look to see if the file "/.dockerenv" exists - inside a container, it is always present. Second, you need to know what IP to provide in that case. Luckily, you don't need the IP of the docker host; the default gateway (usually 172.17.0.1) is normally sufficient. You can get this by running "netstat -nr" and parsing the output. (That may vary with more complex networking scenarios.)
If there's interest, and the approach seems palatable, I can assemble a pull request for this.
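As a rough illustration of the two steps described above, here is a minimal Java sketch. The class and method names (`DockerGatewaySketch`, `isInsideDocker`, `parseDefaultGateway`) are hypothetical and not part of testcontainers. It assumes Linux-style "netstat -nr" output, where the default route has destination 0.0.0.0 and the gateway in the second column; more complex networking setups may need something different.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, for illustration only; not testcontainers API.
final class DockerGatewaySketch {

    // Step 1: the marker file check suggested above.
    static boolean isInsideDocker() {
        return Files.exists(Paths.get("/.dockerenv"));
    }

    // Step 2: find the default gateway in the text printed by "netstat -nr".
    // Assumes the Linux layout: Destination, Gateway, Genmask, ...
    static Optional<String> parseDefaultGateway(String netstatOutput) {
        for (String line : netstatOutput.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // The default route is the row whose destination is 0.0.0.0.
            if (cols.length >= 2 && cols[0].equals("0.0.0.0")) {
                return Optional.of(cols[1]);
            }
        }
        return Optional.empty();
    }
}
```

The parser takes the command output as a string so it can be exercised without actually running "netstat"; a caller would only consult it when `isInsideDocker()` returns true, and would fall back to the existing host detection otherwise.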
The final nc command will connect to the socket and just sit there. That's right, for some reason Docker is accepting connections on the mapped port, even though there's nothing listening. If you try to do anything with the port, you'll get "connection reset by peer" or another error. (I'm not sure if this is intended behavior on the Docker side or an accident; I haven't dug in that deeply.)
This behavior significantly breaks the container startup flow in GenericContainer. There, we consider the container to have started as soon as we can get a socket connection to the port. Sadly, the "new Socket().close()" code there is just as fooled as "nc", so it can proceed to testing with a not-yet-functional container.
Obviously, a Thread.sleep() call at the top of a test is a (bad) workaround for this. The changes in pull request 113 would provide a solution for most users out-of-the-box, and will allow developers who are testing non-http services to write their own wait strategies.
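To make the difference from a bare socket connect concrete, here is a minimal sketch of an HTTP-level readiness check. It is not the implementation from the pull request; `HttpReadinessSketch`, `respondsWithOk`, and `waitUntilReady` are hypothetical names, and the one-second timeouts and 100 ms polling interval are arbitrary assumptions. The idea is simply that a port which accepts connections but never answers will fail this check, whereas it would pass a connect-and-close check.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, for illustration only; not testcontainers API.
final class HttpReadinessSketch {

    // One probe: true only if the service actually answers with HTTP 200.
    // A refused, reset, or silent connection counts as "not ready".
    static boolean respondsWithOk(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(1000); // assumed timeout
            conn.setReadTimeout(1000);    // assumed timeout
            try {
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            return false;
        }
    }

    // Poll until the probe succeeds or the overall timeout elapses.
    static boolean waitUntilReady(String url, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (respondsWithOk(url)) {
                return true;
            }
            try {
                Thread.sleep(100); // assumed polling interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

A non-HTTP service would need its own probe in place of `respondsWithOk`, for example one that sends a protocol-specific greeting and waits for a valid reply.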