cli failed to connect #80

Closed
lluunn opened this issue May 9, 2018 · 10 comments

Comments

@lluunn
Contributor

lluunn commented May 9, 2018

I was following the getting started guide.
I can see the pod using kubectl.

But the CLI fails to connect:

katib-cli -s gke-test-katib-default-pool-b88188d2-jnvp:30678 Getstudies
2018/05/09 11:38:13 connecting gke-test-katib-default-pool-b88188d2-jnvp:30678
2018/05/09 11:38:14 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup gke-test-katib-default-pool-b88188d2-jnvp on 127.0.0.1:53: no such host"; Reconnecting to {gke-test-katib-default-pool-b88188d2-jnvp:30678 <nil>}
2018/05/09 11:38:14 GetStudy failed: rpc error: code = 14 desc = grpc: the connection is unavailable
11:38:14lunkai@None:katib$ katib-cli
2018/05/09 11:38:34 connecting 127.0.0.1:6789
2018/05/09 11:38:34 Method not found: 

Do we need to configure the CLI somehow?

@gaocegege
Member

Did you download the CLI from GitHub or build it yourself?

@lluunn
Contributor Author

lluunn commented May 10, 2018

@gaocegege
Member

Then you need to check out v0.1.0-alpha to try it. We do not have a CD process for the CLI, so it is version-specific.

@gaocegege
Member

Wait a minute. @YujiOshima, do our images have the tag v0.1.0-alpha or v0.1.1-alpha for testing?

@YujiOshima
Contributor

@lluunn @gaocegege If you are using the latest tagged Katib images, you need to use v0.1.1-alpha.

@lluunn
Contributor Author

lluunn commented May 22, 2018

I am using v0.1.1-alpha now and got a different error:

13:28:10lunkai@None:katib$ katib-cli -s gke-test-katib-default-pool-b88188d2-jnvp:30678 get studies
2018/05/22 13:28:41 GetStudy failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: 
connection error: desc = "transport: Error while dialing dial tcp: lookup gke-test-katib-default-pool-b88188d2-jnvp on 127.0.0.1:53: no such host"

It seems it's trying to find the pool on 127.0.0.1?

@YujiOshima
Contributor

It looks like a problem with the DNS settings on your host.
Can you run nslookup gke-test-katib-default-pool-b88188d2-jnvp from your host?

@lluunn
Contributor Author

lluunn commented May 23, 2018

nslookup gke-test-katib-default-pool-b88188d2-jnvp
Server:         127.0.0.1
Address:        127.0.0.1#53

** server can't find gke-test-katib-default-pool-b88188d2-jnvp: NXDOMAIN

To clarify, I am deploying Katib to a GKE cluster and calling katib-cli from my laptop.
Is it expected to work this way?
Do I need to set up anything for my cluster?

@YujiOshima
Contributor

Your host failed to resolve the name gke-test-katib-default-pool-b88188d2-jnvp.
Try using the IP address of the node instead.
Also, on GKE you need to add a firewall rule to allow access to port 30678,
e.g. gcloud compute firewall-rules create katibservice --allow tcp:30678
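
For anyone landing here later, the steps above can be sketched roughly as follows. This is only a sketch under assumptions: kubectl is already pointed at the GKE cluster, the nodes have external IPs, and the Katib service is exposed on NodePort 30678 as in this thread. The helper names node_external_ips and katib_addr are made up for illustration and are not part of Katib.

```shell
#!/usr/bin/env bash
# Sketch only. Helper names are hypothetical, not part of Katib or kubectl.

# Print the external IPs of all cluster nodes, space-separated.
# Assumes kubectl is configured for the target cluster.
node_external_ips() {
  kubectl get nodes \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}'
}

# Build the host:port argument for "katib-cli -s".
# The port defaults to 30678, the NodePort used in this thread.
katib_addr() {
  local ip="$1"
  local port="${2:-30678}"
  printf '%s:%s\n' "$ip" "$port"
}

# Example usage (commented out because it needs a live cluster):
#   gcloud compute firewall-rules create katibservice --allow tcp:30678
#   ip="$(node_external_ips | awk '{print $1}')"
#   katib-cli -s "$(katib_addr "$ip")" get studies
```

Note that a firewall rule created this way, with no source range given, allows traffic from any address. gcloud compute firewall-rules create also accepts --source-ranges if you want to restrict it to your own IP.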

@lluunn
Contributor Author

lluunn commented May 24, 2018

Working now, thank you!
I will send out a PR to fix getting-started.md.
