-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster up: use docker engine-api client #14729
cluster up: use docker engine-api client #14729
Conversation
@openshift/devex ptal Just a note on the client code ... a good bit of similar code exists in the kube code base and I copied a couple of functions from it. However, I decided against calling the Kube client directly because:
|
[testextended][extended:clusterup] |
1776315
to
72d9a8f
Compare
err error | ||
} | ||
|
||
type apiResult interface{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused?
I wish we could get away with not having yet another docker API wrapper, but assuming this wish can't come true, honestly I would prefer it to closely match the naming/semantics of the underlying client - for example:
My impression is that wherever we add restrictions on the underlying API (e.g. provide 2 underlying API calls in one wrapper function, or hard code arguments to the underlying API), we create additional future work for ourselves when we undo that. Also I don't see the benefit of adding additional layers of scaffolding (e.g. dockerClient, containerClient, imageClient, execClient...). Also it feels like we just have a huge amount of "helper" infrastructure in pkg/bootstrap. I think it adds quite a lot of cognitive load, e.g. when diagnosing concurrency issues in the underlying API, which I've had to do twice in the last 6 months. |
return c.endpoint | ||
} | ||
|
||
func (c *dockerClient) Version() (*types.Version, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would prefer ServerVersion, etc., etc. (audit all function names)
return &info, err | ||
} | ||
|
||
func (c *dockerClient) ContainerNames() ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would prefer this to be ContainerList - move the additional smarts to the caller - there's only one.
} | ||
|
||
type containerClient struct { | ||
name string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest id (since there's an ID() function)
} | ||
|
||
type execClient struct { | ||
name string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest id (since there's an ID() function)
stdinDone := make(chan struct{}) | ||
go func() { | ||
if inputStream != nil { | ||
io.Copy(resp.Conn, inputStream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no error reporting if inputStream copy fails
return c.name | ||
} | ||
|
||
func (c *execClient) StartAndWait(stdIn io.Reader, stdOut, stdErr io.Writer) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
misleading name
|
||
const ( | ||
// defaultTimeout is the default timeout of short running docker operations. | ||
defaultTimeout = 2 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaultDockerOpTimeout? :-)
|
||
const ( | ||
// defaultTimeout is the default timeout of short running docker operations. | ||
defaultTimeout = 2 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we've found this to be problematic elsewhere (s2i) and ultimately removed it. at a minimum i'd consider bumping it to 10 mins if we don't think we can live without it entirely. (problematic when the docker daemon is overloaded)
} | ||
} | ||
return names, nil | ||
return h.client.ContainerNames() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless you do what @jim-minter suggested in terms of moving the conversion to array logic here, i don't think there's even a reason for this function to exist anymore.
os::cmd::try_until_text "oc get pods -n logging -l logging-infra=deployer" "Completed" $(( 20*minute )) 2 | ||
os::cmd::try_until_text "oc get endpoints logging-kibana -o jsonpath='{ .subsets[*].ports[?(@.name==\"3000-tcp\")].port }' -n logging" "3000" $(( 10*minute )) 1 | ||
os::cmd::expect_success "oc login -u developer" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can i haz service-catalog test plz?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I tried running on my Mac with --service-catalog
and I get this:
oc cluster up --service-catalog
Starting OpenShift using openshift/origin:v3.6.0-alpha.2 ...
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for openshift/origin:v3.6.0-alpha.2 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... OK
-- Checking type of volume mount ...
Using Docker shared volumes for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ...
Using 127.0.0.1 as the server IP
-- Starting OpenShift container ...
Creating initial OpenShift configuration
Starting OpenShift using container 'origin'
Waiting for API server to start listening
OpenShift server started
-- Adding default OAuthClient redirect URIs ... OK
-- Installing registry ... OK
-- Installing router ... OK
-- Importing image streams ... OK
-- Importing templates ... OK
-- Installing service catalog ... FAIL
Error: cannot instantiate service catalog template
Caused By:
Error: cannot create objects from template openshift/service-catalog
Caused By:
Error: [policybindings.authorization.openshift.io "service-catalog:default" not found, policybindings.authorization.openshift.io "kube-system:default" not found]
The outcome is the same whether using this branch or master. I'll create an issue for it and when it's fixed, it should be very easy to add to the extended test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it needs to use a new origin image when running. the requirement to have a policybinding before creating a binding was recently lifted.
I would have expected this test to run using the local origin image anyway, though (--version=latest)
72d9a8f
to
63a9a69
Compare
Thanks for the reviews, comments addressed |
63a9a69
to
d1777b0
Compare
lgtm but @jim-minter has stronger opinions so will let him sign off before merging. |
// holdHijackedConnection holds the HijackedResponse, redirects the inputStream to the connection, and redirects the response | ||
// stream to stdout and stderr. | ||
func holdHijackedConnection(inputStream io.Reader, outputStream, errorStream io.Writer, resp types.HijackedResponse) error { | ||
receiveStdout := make(chan error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
must be buffered (make(chan error, 1)
) otherwise the goroutine will never be able to exit if the channel reader has gone away
}() | ||
} | ||
|
||
sendStdin := make(chan error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also must be buffered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(actually this is the key one)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not make the channel buffered if I don't have to. Looking at this function, I don't see how we wouldn't read the channel after starting the goroutine, and I want to wait for a result to be available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
buffered != non-blocking read. As things stand, if the select reads a value from receiveStdout before reading a value from sendStdin, the function will return. However in this case the inputStream goroutine will block forever trying to write a value to sendStdin, because it is unbuffered and no-one will read from it once the function has returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ic, got it. Thx for the explanation
return c.client.CopyToContainer(context.Background(), container, dest, src, options) | ||
} | ||
|
||
func (c *dockerClient) CopyFromContainer(container string, src string, dest io.Writer) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest you return all the values from c.client.CopyFromContainer straight. Then you get rid of the io.Copy here, and the goroutine and pipe in newContainerDownloader.
} | ||
|
||
type dockerClient struct { | ||
endpoint string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently appears to be unused - are you intending to use it in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the link didn't work for me - I can see it's used in Endpoint(), but I can't see Endpoint() being used anywhere...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line 202 of dockerhelper/helper.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks.
|
||
// The OSE version will have > 4 parts to the version string | ||
// We'll only take the first 3 | ||
parts := strings.Split(versionStr, ".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parts := strings.SplitN(versionStr, ".", 3)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not what I want. SplitN will give me the last segment including the dot. Here I want to remove 4th segment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry :(
minor comments, otherwise go for it. |
d1777b0
to
883c112
Compare
883c112
to
488c111
Compare
@jim-minter comments addressed |
@csrwng looks good to me, and also, I don't think I was sufficiently careful with the review and probably caused you unnecessary work with it. Apologies, I will take more care next time. |
thx for the review, no worries |
Evaluated for origin merge up to 488c111 |
[Test]ing while waiting on the merge queue |
Evaluated for origin test up to 488c111 |
[merge][severity:blocker] (this fixes a p1) |
Evaluated for origin testextended up to 488c111 |
continuous-integration/openshift-jenkins/testextended SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended/678/) (Base Commit: 8c00852) (PR Branch Commit: 488c111) (Extended Tests: clusterup) |
continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2475/) (Base Commit: 8c00852) (PR Branch Commit: 488c111) |
@@ -675,16 +671,17 @@ func (c *CommonStartConfig) CheckNsenterMounter(out io.Writer) error { | |||
// CheckDockerVersion checks that the appropriate Docker version is installed based on whether we are using the nsenter mounter | |||
// or shared volumes for OpenShift | |||
func (c *CommonStartConfig) CheckDockerVersion(out io.Writer) error { | |||
ver, _, err := c.DockerHelper().Version() | |||
ver, isRHDocker, err := c.DockerHelper().APIVersion() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have no idea how much this disappoints me
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/1085/) (Base Commit: 7657e99) (PR Branch Commit: 488c111) (Extended Tests: blocker) (Image: devenv-rhel7_6392) |
Switches cluster up from the fsouza go docker client to the docker engine-api client
Fixes #14546