docker for mac beta and docker:start with wait/tcp/ports incorrectly waits on container port #430

Closed
stephenc opened this issue Apr 18, 2016 · 23 comments

Comments

@stephenc

I am running a Maven project that uses the wait/tcp/ports functionality to wait for the container to listen on a specific port.

Observed results:

With docker-machine, the log output is:

[INFO] DOCKER> [mysql:5.7] "db": Start container 5289bed0ef48
[INFO] DOCKER> [mysql:5.7] "db": Waiting for exposed ports [32780] on remote host (192.168.251.207), since they are not directly accessible.

With docker for mac beta:

[INFO] DOCKER> [mysql:5.7] "db": Start container d79b36323981
[INFO] DOCKER> [mysql:5.7] "db": Waiting for ports [3306] directly on container with IP (172.17.0.2).
[ERROR] DOCKER> [mysql:5.7] "db": Timeout after 60405 ms while waiting on tcp port '[/172.17.0.2:3306]'
[ERROR] DOCKER> Error occurred during container startup, shutting down...

Expected results:

Build should pass with either docker-machine or docker for mac

Preliminary analysis:

Something seems to be going wrong in the detection of whether the container is directly accessible.

@rhuss
Collaborator

rhuss commented Apr 25, 2016

Unfortunately I don't have access to Docker for Mac yet, so it's a bit hard to reproduce ;-)

I hope it's OK to wait until Docker for Mac is publicly available.

@agudian

agudian commented Apr 29, 2016

I had similar problems, but they were resolved with yesterday's update to Docker for Mac Version 1.11.0-beta9. @stephenc, perhaps your use case works now as well?

@rhuss
Collaborator

rhuss commented May 19, 2016

Finally I got my Docker-for-Mac token. @stephenc, do you have a pom.xml to share so that I can reproduce the issue?

@stephenc
Author

<?xml version="1.0" encoding="utf-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>localdomain.localhost</groupId>
  <artifactId>hangs-on-mac</artifactId>
  <version>1-SNAPSHOT</version>

  <build>
    <plugins>
      <plugin>
        <groupId>io.fabric8</groupId>
        <artifactId>docker-maven-plugin</artifactId>
        <version>0.15.2</version>
        <configuration>
          <autoPull>true</autoPull>
          <images>
            <image>
              <alias>db</alias>
              <name>mysql:5.7</name>
              <run>
                <wait>
                  <tcp>
                    <ports>
                      <port>3306</port>
                    </ports>
                  </tcp>
                  <time>60000</time>
                  <exec>
                    <postStart>mysqladmin -uroot -pnew-password create test-db</postStart>
                  </exec>
                </wait>
                <ports>
                  <port>+mysql.host:mysql.port:3306</port>
                </ports>
                <env>
                  <MYSQL_ROOT_PASSWORD>new-password</MYSQL_ROOT_PASSWORD>
                </env>
              </run>
            </image>
          </images>
        </configuration>
        <executions>
          <execution>
            <id>test</id>
            <goals>
              <goal>start</goal>
              <goal>stop</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

Using Docker:

[screenshot taken 2016-05-23 at 09:27:23]

I get the following output:

$ mvn clean verify
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building hangs-on-mac 1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hangs-on-mac ---
[INFO] Deleting /Users/stephenc/tmp/target
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hangs-on-mac ---
[WARNING] Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] skip non existing resourceDirectory /Users/stephenc/tmp/src/main/resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hangs-on-mac ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hangs-on-mac ---
[WARNING] Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] skip non existing resourceDirectory /Users/stephenc/tmp/src/test/resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hangs-on-mac ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ hangs-on-mac ---
[INFO] No tests to run.
[INFO] 
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ hangs-on-mac ---
[WARNING] JAR will be empty - no content was marked for inclusion!
[INFO] Building jar: /Users/stephenc/tmp/target/hangs-on-mac-1-SNAPSHOT.jar
[INFO] 
[INFO] --- docker-maven-plugin:0.15.2:start (test) @ hangs-on-mac ---
[INFO] DOCKER> [mysql:5.7] "db": Start container 26860bc3443c
[INFO] DOCKER> [mysql:5.7] "db": Waiting for ports [3306] directly on container with IP (172.17.0.2).
[ERROR] DOCKER> [mysql:5.7] "db": Timeout after 60359 ms while waiting on tcp port '[/172.17.0.2:3306]'
[ERROR] DOCKER> Error occurred during container startup, shutting down...
[INFO] DOCKER> [mysql:5.7] "db": Stop and remove container 26860bc3443c
[ERROR] DOCKER> [mysql:5.7] "db": Timeout after 60359 ms while waiting on tcp port '[/172.17.0.2:3306]'
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:02 min
[INFO] Finished at: 2016-05-23T09:28:12+01:00
[INFO] Final Memory: 20M/308M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal io.fabric8:docker-maven-plugin:0.15.2:start (test) on project hangs-on-mac: [mysql:5.7] "db": Timeout after 60359 ms while waiting on tcp port '[/172.17.0.2:3306]' -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

The project builds just fine if I switch to docker-machine start && eval $(docker-machine env) && mvn clean verify

joshua-rutherford pushed a commit to joshua-rutherford/docker-maven-plugin that referenced this issue Jun 22, 2016
@joshua-rutherford

Looking at this quickly, it seems that if the host is 'localhost' and the container has an IP address, then the plugin assumes that the container is reachable on that IP address (https://github.com/fabric8io/docker-maven-plugin/blob/master/src/main/java/io/fabric8/maven/docker/StartMojo.java#L192). This is not the case with my version of Docker for Mac (Version 1.12.0-rc2-beta16 (build: 9493)). I'm curious whether it would be simpler to just omit the special case of localhost and an IP address. Is there in fact a case where this is used?

At any rate, I've opened a pull request that does just that.

@rhuss
Collaborator

rhuss commented Jun 22, 2016

Now that Docker for Mac is in public beta, I can confirm the issue, and also that Docker for Mac joins the Mac's network space, so in this case localhost is the correct choice.

I have to dig through my memory to recall why this fallback was introduced. I'm not sure whether it ever worked, as it seems that the internal IP was used (which this plugin cannot reach from the outside).

@joshua-rutherford

One interesting point I'm seeing locally is that with my pull request, I'm getting different behavior with Docker for Mac than I was with docker-machine. In short, with docker-machine the TCP connection fails until the service is up, so the wait works. On Docker for Mac, the connection actually succeeds but is then closed immediately by the remote (but not before we pass the check). In particular, I'm seeing this with the following configuration:

                        <image>
                            <name>cassandra:2.2.6</name>
                            <run>
                                <ports>
                                    <port>cassandra.port:9042</port>
                                </ports>
                                <wait>
                                    <tcp>
                                        <ports>
                                            <port>9042</port>
                                        </ports>
                                    </tcp>
                                    <time>60000</time>
                                </wait>
                            </run>
                        </image>

I've not done a lot of troubleshooting here, but I'm moving to a log-based wait in the meantime.

@rhuss
Collaborator

rhuss commented Jun 25, 2016

Now that I've had a closer look, this is the issue we already had, as described in #304.

Waiting on a TCP port works by trying to open a connection to the port. Docker maps the internal container ports to ports on the Docker daemon's host. However, by default Docker uses a so-called user-proxy, which opens these ports immediately, even when they are not yet reachable on the container. So a connect always succeeds right from the beginning. That was also the reason why this localhost hack was introduced (see #304 for details).

It now seems that Docker for Mac doesn't use the user-proxy, or works differently in general.

I still need to dig a bit deeper to find a proper solution.

@rhuss
Collaborator

rhuss commented Jun 25, 2016

@stephenc @joshua-rutherford I had the chance to refresh my memory with respect to issue #304, and in particular the Docker user-proxy used for port mapping (as described in moby/moby#14856).

There are two ways to access a Docker daemon: either via a Unix socket on a Linux system (and now with Docker for Mac), or via remote HTTP. When using a Docker daemon via HTTP over TCP, everything is fine: the ports are reached via the external IP and the mapped ports.

However, when using a Unix socket, port mapping works by starting a proxy together with the container; the proxy immediately opens the mapped ports and forwards them to the container ports. This means that in this case a wait check returns immediately, even when the container is not ready. On a Linux system you can, however, access the container IP directly (via the Docker bridge), which is routed, so the check can reach the original (that is, non-mapped) ports. That was the 'localhost' mode mentioned by @joshua-rutherford.

Now, for Docker for Mac the container IP is no longer routed, so the solution above does not work anymore ;-( Unfortunately the user proxy is still the default mode (although it has been supposed to be switched off by default for quite some time).

I have now introduced a tcp 'mode', which can be direct or mapped, so that the behaviour can be selected explicitly (and not via the host). However, neither will work for Docker for Mac:

  • direct does not work because the container IP is not routed.
  • mapped does not work since the user proxy opens the port immediately, so the TCP wait returns immediately.
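
For illustration, selecting the mode explicitly might look like the sketch below. It assumes the option is exposed as a `<mode>` element inside `<tcp>`; that element name is an assumption based on the description here, so check the plugin documentation for your version.

```xml
<wait>
  <tcp>
    <!-- Assumed element name. "mapped" probes the Docker host on the mapped port,
         "direct" probes the container IP on the unmapped port. -->
    <mode>mapped</mode>
    <ports>
      <!-- Container port, as in the original report. -->
      <port>3306</port>
    </ports>
  </tcp>
  <time>60000</time>
</wait>
```

On Docker for Mac neither value avoids the problem described in the two points above; the setting only makes the choice independent of the host name.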

Currently I don't see a good solution ;-( Any ideas?

@rhuss rhuss mentioned this issue Jun 26, 2016
rhuss added a commit that referenced this issue Jun 26, 2016
Can be either "mapped", which uses a remote Docker IP and mapped ports, or "direct", where the container is contacted directly with the unmapped ports (needed to avoid issues with the user-proxy of Docker daemons, which is enabled by default).

"mapped" is the default when host is not "localhost", "direct" when host is "localhost".
@aromanet42

Hi

Some of my coworkers have the same issue.

Do you have any update ?
Is there a workaround ?

Cheers

@rhuss
Collaborator

rhuss commented Sep 14, 2016

Sorry, I'm not aware of a workaround. I haven't checked the upstream development recently, though, so it could be that the container IP is directly accessible nowadays. I'm not really sure, and it's probably not the case (since it's still running in a VM).

A solution could be to configure Docker for Mac to be accessed via TCP, but I don't know whether this is possible. Remember, when accessing the Docker daemon via TCP there is no problem at all.

Yet another idea is to support the new Docker health checks directly. To be honest, I haven't looked into it yet.

I'm open for any ideas to solve this.

@sina-golesorkhi

@aromanet42 The "workaround" is to keep using a docker machine, as before version 1.12, if you still want to use 1.12 and this plugin at the same time.

@joshua-rutherford

I found that in most cases I could wait on a log message that announced the port binding instead of hitting the port directly.
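
As a sketch of what that can look like for the Cassandra configuration above: the `<log>` wait replaces the `<tcp>` wait. The log pattern is an assumption about what the image prints once it accepts CQL connections, so adjust it to the line your image actually logs.

```xml
<image>
  <name>cassandra:2.2.6</name>
  <run>
    <ports>
      <port>cassandra.port:9042</port>
    </ports>
    <wait>
      <!-- Assumed log line; replace with the message your image prints
           when it starts listening on the port. -->
      <log>Starting listening for CQL clients</log>
      <time>60000</time>
    </wait>
  </run>
</image>
```

This only works when the message is printed after the port is actually usable, which is not the case for every image.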

@aromanet42

Indeed, waiting on a log message works for us as well.

Thanks :)

@anuruddhal

anuruddhal commented Apr 6, 2017

Any update on this? Waiting on a log message doesn't work for MySQL, as the server is not actually ready to serve requests by the time this log line is printed:
mysqld: ready for connections.

@Sammers21

Sammers21 commented Jun 30, 2017

I have the same problem with wait timeout on Mac

@ol2ka

ol2ka commented Nov 8, 2017

Hi! Any news/workarounds for this?

@Sammers21

@ol2ka, sure, just remove these tags:

<wait>
  <tcp>
    <ports>
      <port>

@joshua-rutherford

I for one haven't had to use another workaround, but after reviewing the documentation: you could configure a health check that is more robust than just establishing an ephemeral connection, and wait on that with the healthy wait: https://dmp.fabric8.io/#build-healthcheck

Haven't looked at this, just tossing it out as an idea.
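
A rough, untested sketch of that idea for the MySQL case from this issue. It assumes the `<healthCheck>` build option and the `<healthy>` wait work as described in the linked documentation. The image name `example/mysql-healthcheck` and the check command are hypothetical choices for illustration, and the `build` goal would have to run before `start`.

```xml
<image>
  <alias>db</alias>
  <!-- Hypothetical name for the locally built image. -->
  <name>example/mysql-healthcheck</name>
  <build>
    <from>mysql:5.7</from>
    <healthCheck>
      <interval>5s</interval>
      <timeout>3s</timeout>
      <retries>12</retries>
      <!-- Assumed check: ping the server over TCP from inside the container. -->
      <cmd>mysqladmin ping -h 127.0.0.1 -uroot -pnew-password</cmd>
    </healthCheck>
  </build>
  <run>
    <env>
      <MYSQL_ROOT_PASSWORD>new-password</MYSQL_ROOT_PASSWORD>
    </env>
    <wait>
      <!-- Wait until Docker reports the container as healthy. -->
      <healthy>true</healthy>
      <time>60000</time>
    </wait>
  </run>
</image>
```

Because the check runs inside the container, it does not depend on whether the container IP or the mapped port is reachable from the host.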

@DannyNoam

The following seems to do the trick for me:

export DOCKER_HOST=unix:///var/run/docker.sock

@nuzayats

nuzayats commented Jun 2, 2019

I ended up putting the following shell script as "/mysql/wait.sh" into my Maven project

#!/bin/sh
# Poll until MySQL answers a trivial query over TCP.
until echo 'select 1' | mysql -h localhost -uroot -pmy-secret-pw --protocol tcp --port=3306; do sleep 2; done

Then I configured the plugin like this:

<configuration>
    <images>
        <image>
            <alias>database</alias>
            <name>mysql:5.7.26</name>
            <run>
                <ports>
                    <port>3306:3306</port>
                </ports>
                <volumes>
                    <bind>
                        <volume>${project.basedir}/mysql:/mysql</volume>
                    </bind>
                </volumes>
                <wait>
                    <log>mysqld: ready for connections</log>
                    <time>20000</time>
                    <exec>
                        <postStart>sh /mysql/wait.sh</postStart>
                    </exec>
                </wait>
                <env>
                    <MYSQL_ROOT_PASSWORD>my-secret-pw</MYSQL_ROOT_PASSWORD>
                    <MYSQL_DATABASE>testdb</MYSQL_DATABASE>
                </env>
            </run>
        </image>
    </images>
</configuration>

It seems to be working for me. Here's a working example: https://github.com/nuzayats/junit5examples/blob/master/pom.xml

@stephenc
Author

stephenc commented Jun 2, 2019

I have had so many issues with images that lack proper health checks that I have developed the habit of making sure each one has a proper healthcheck. As a result I can bypass this issue by just waiting for the image to be healthy... though that does sometimes require (if the image connects to others) that I make the downstream images retry failed connections, but you'd want your images to do that anyway.

@stephenc
Author

stephenc commented Oct 1, 2021

Closing, as this is likely an upstream issue with Docker and there is not much this plugin can do about it.

@stephenc stephenc closed this as completed Oct 1, 2021