clientv3: 'there is no address available' in second call, after Sync #7009

Closed
gyuho opened this issue Dec 14, 2016 · 16 comments

@gyuho
Contributor

gyuho commented Dec 14, 2016

package main

import (
	"fmt"
	"log"
	"time"

	"context"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/clientv3/naming"
	gnaming "google.golang.org/grpc/naming"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 3 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	if err := cli.Sync(cli.Ctx()); err != nil {
		log.Fatal(err)
	}

	gr := &naming.GRPCResolver{Client: cli}
	if err := gr.Update(context.Background(), "test", gnaming.Update{Op: gnaming.Add, Addr: "test"}); err != nil {
		log.Fatal(err)
	}

	// fails with 'rpc error: code = 14 desc = there is no address available' in grpc.Invoke
	if err := gr.Update(context.Background(), "test", gnaming.Update{Op: gnaming.Add, Addr: "test"}); err != nil {
		log.Fatal(err)
	}
	fmt.Println("Done!")
}

The second call blocks forever.

Shouldn't we at least return an error?
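Until the root cause is fixed, a caller can at least bound the wait with a context deadline. The following is a minimal sketch using only the standard library, assuming the blocked call honors context cancellation. `putFunc`, `callWithDeadline`, and `blockedCall` are hypothetical names for illustration; `blockedCall` merely simulates a request that never gets an address to dial and stands in for something like `cli.Put`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// putFunc is a hypothetical stand-in for a client call such as cli.Put.
// It is not part of clientv3.
type putFunc func(ctx context.Context) error

// callWithDeadline runs fn with a context that expires after d, so a call
// that would otherwise wait indefinitely returns the context error instead.
func callWithDeadline(d time.Duration, fn putFunc) error {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()
	return fn(ctx)
}

// blockedCall simulates a request that never finds an address to dial:
// it returns only once its context is done.
func blockedCall(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

func main() {
	err := callWithDeadline(50*time.Millisecond, blockedCall)
	// Reports whether the call was cut off by the deadline.
	fmt.Println(errors.Is(err, context.DeadlineExceeded))
}
```

With the real client, the same idea would be to pass a context created by `context.WithTimeout` to the request instead of `context.Background()`.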

@heyitsanthony
Contributor

Is this limited to naming? It's not being used by the client grpc connection and Update is just a marshal+put; is Sync/Put/Put broken too?

@gyuho
Contributor Author

gyuho commented Dec 15, 2016

@heyitsanthony Put also breaks:

if err := cli.Sync(cli.Ctx()); err != nil {
	log.Fatal(err)
}

resp, err := cli.Put(context.Background(), "foo", "bar")
if err != nil {
	log.Fatal(err)
}
fmt.Println(resp)
resp, err = cli.Put(context.Background(), "foo", "bar")
if err != nil {
	log.Fatal(err)
}
fmt.Println("Done!")

Something could be wrong in the clientv3.Client.Sync implementation.

@gyuho changed the title from "clientv3/naming: 'there is no address available' in second Update call" to "clientv3: 'there is no address available' in second call, after Sync" on Dec 15, 2016

@gyuho
Contributor Author

gyuho commented Dec 15, 2016

I should have made it clear.

This happens when the client's endpoint string does not exactly match the URL in the member list. For example, etcd advertises a detected default host such as http://10.7.3.90:2379 when started with this command:

./bin/etcd

Sync replaces the client's endpoints with http://10.7.3.90:2379. The client cannot connect to that endpoint, so grpc first returns "grpc: the connection is drained" and then reports that no address is available.
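To make the mismatch concrete, here is a small standard-library sketch that lists which configured endpoints would no longer be present once the advertised client URLs take over. `hostPort` and `droppedBySync` are hypothetical helpers, not clientv3 functions, and the comparison is purely textual: it does not resolve names, so `localhost` and `127.0.0.1` count as different.

```go
package main

import (
	"fmt"
	"strings"
)

// hostPort strips an optional scheme so that "http://10.7.3.90:2379" and
// "10.7.3.90:2379" compare equal.
func hostPort(endpoint string) string {
	if i := strings.Index(endpoint, "://"); i >= 0 {
		endpoint = endpoint[i+3:]
	}
	return strings.TrimSuffix(endpoint, "/")
}

// droppedBySync returns the configured endpoints that do not appear among
// the advertised client URLs, i.e. the ones a sync would replace.
func droppedBySync(configured, advertised []string) []string {
	seen := make(map[string]bool, len(advertised))
	for _, u := range advertised {
		seen[hostPort(u)] = true
	}
	var dropped []string
	for _, ep := range configured {
		if !seen[hostPort(ep)] {
			dropped = append(dropped, ep)
		}
	}
	return dropped
}

func main() {
	configured := []string{"localhost:2379"}
	advertised := []string{"http://10.7.3.90:2379"}
	fmt.Println(droppedBySync(configured, advertised))
}
```

A dropped endpoint is not a failure by itself. It only becomes one when, as here, the advertised URL that replaces it is not reachable from the client.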

But this works, because the client is still able to connect to 127.0.0.1:2379:

./bin/etcd --name my-etcd-1 \
    --listen-client-urls http://localhost:2379 \
    --advertise-client-urls http://localhost:2379 \
    --listen-peer-urls http://localhost:2380 \
    --initial-advertise-peer-urls http://localhost:2380 \
    --initial-cluster my-etcd-1=http://localhost:2380 \
    --initial-cluster-token my-etcd-token \
    --initial-cluster-state new

cli, err := clientv3.New(clientv3.Config{
	Endpoints:   []string{"127.0.0.1:2379"},
	DialTimeout: 3 * time.Second,
})
if err != nil {
	log.Fatal(err)
}
defer cli.Close()

if err := cli.Sync(cli.Ctx()); err != nil {
	log.Fatal(err)
}

if _, err := cli.Put(context.Background(), "foo", "bar"); err != nil {
	log.Fatal(err)
}
if _, err := cli.Put(context.Background(), "foo", "bar"); err != nil {
	log.Fatal(err)
}

So my question is whether Sync should use the advertised default hosts (e.g. http://10.7.3.90:2379) returned by the MemberList RPC.

@heyitsanthony
Contributor

@gyuho sync should always use the advertised client URL. This is happening because the auto-detected IP is being used for the advertise client URL even though the client listen URL is localhost; the cluster is misconfigured.

@gyuho
Contributor Author

gyuho commented Dec 15, 2016

@heyitsanthony Yeah, I think this is just my misconfiguration. Thanks!

@gyuho closed this as completed on Dec 15, 2016

@heyitsanthony
Contributor

Does this mean a cluster started with only ./etcd won't work with clientv3 when AutoSync is turned on? If so, should probably reopen.

@gyuho
Contributor Author

gyuho commented Dec 15, 2016

Yeah, if the cluster uses the auto-detected IP as its advertise client URL, it fails after auto sync.

@gyuho reopened this on Dec 15, 2016

@gyuho
Contributor Author

gyuho commented Dec 15, 2016

@heyitsanthony How do we want to handle this? The problem was that there was no error or warning when the client failed to connect to the default host. The error was returned from grpc right here: https://github.com/coreos/etcd/blob/master/clientv3/balancer.go#L153.

@heyitsanthony
Contributor

@gyuho the grpc code is fine; etcd's defaults are broken. Probably the fix is to advertise the auto-detected IP for the client only if the client listen address is something besides localhost and the client advertise address is set to the default? /cc @xiang90

@xiang90
Contributor

xiang90 commented Dec 15, 2016

> Yeah if the cluster uses auto detected IP as advertise client URL, it fails after auto sync.

I think if auto detect is enabled, it should work. The problem is that the default setting will both listen and advertise on localhost. The default only works for local development. So I guess it is fine?

/cc @heyitsanthony What is your expectation? Listen on 0.0.0.0 or default route and advertise default route by default?

@heyitsanthony
Contributor

@xiang90 if etcd is told to listen on 0.0.0.0 but gives a default client advertise url, have it advertise the default route ip?

@xiang90
Contributor

xiang90 commented Dec 15, 2016

@heyitsanthony I think it is, at least on Linux.

@heyitsanthony
Contributor

@xiang90 it's misadvertising with the default route IP when it defaults to listening on localhost for a plain ./etcd call. The brokenness is it should advertise localhost instead:

2016-12-15 14:30:17.112196 I | etcdmain: etcd Version: 3.1.0-rc.1+git
2016-12-15 14:30:17.112236 I | etcdmain: Git SHA: Not provided (use ./build instead of go build)
2016-12-15 14:30:17.112246 I | etcdmain: Go Version: go1.7.1
2016-12-15 14:30:17.112253 I | etcdmain: Go OS/Arch: linux/amd64
2016-12-15 14:30:17.112262 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2016-12-15 14:30:17.112271 W | etcdmain: no data-dir provided, using default data-dir ./default.etcd
2016-12-15 14:30:17.112314 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-12-15 14:30:17.112325 I | etcdmain: advertising using detected default host "A.B.C.D"
2016-12-15 14:30:17.112562 I | embed: listening for peers on http://localhost:2380
2016-12-15 14:30:17.112636 I | embed: listening for client requests on localhost:2379
2016-12-15 14:30:17.113887 I | etcdserver: name = default
2016-12-15 14:30:17.113895 I | etcdserver: data dir = default.etcd
2016-12-15 14:30:17.113903 I | etcdserver: member dir = default.etcd/member
2016-12-15 14:30:17.113910 I | etcdserver: heartbeat = 100ms
2016-12-15 14:30:17.113918 I | etcdserver: election = 1000ms
2016-12-15 14:30:17.113925 I | etcdserver: snapshot count = 10000
2016-12-15 14:30:17.113948 I | etcdserver: advertise client URLs = http://A.B.C.D:2379
2016-12-15 14:30:17.134828 I | etcdserver: restarting member edaa22c5c35dce32 in cluster 13c6edf06c610097 at commit index 6543

@xiang90
Contributor

xiang90 commented Dec 15, 2016

@heyitsanthony Agree.

@gyuho
Contributor Author

gyuho commented Dec 16, 2016

> The brokenness is it should advertise localhost instead:

How should we detect it? This only breaks in localhost with local etcd server, I assume.

@heyitsanthony
Contributor

only override the default advertised client URL if the client listen URL is 0.0.0.0
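The rule proposed above could look roughly like the sketch below. It is not etcd's actual code: `chooseAdvertiseURL` and `defaultAdvertise` are hypothetical names, the default value `http://localhost:2379` is an assumption, and reusing the listen URL's scheme and port for the detected host is one possible choice among several.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// defaultAdvertise is the assumed built-in default advertise client URL.
const defaultAdvertise = "http://localhost:2379"

// chooseAdvertiseURL keeps the advertise URL as given unless it is still the
// default AND the client listen URL uses an unspecified address (0.0.0.0 or
// ::). Only in that case is the detected default-route host advertised.
func chooseAdvertiseURL(listenURL, advertiseURL, detectedHost string) (string, error) {
	if advertiseURL != defaultAdvertise {
		return advertiseURL, nil // explicitly configured, leave untouched
	}
	lu, err := url.Parse(listenURL)
	if err != nil {
		return "", err
	}
	ip := net.ParseIP(lu.Hostname())
	if ip == nil || !ip.IsUnspecified() {
		// Listening on localhost or a specific address: keep the default.
		return advertiseURL, nil
	}
	return lu.Scheme + "://" + net.JoinHostPort(detectedHost, lu.Port()), nil
}

func main() {
	for _, listen := range []string{"http://localhost:2379", "http://0.0.0.0:2379"} {
		adv, err := chooseAdvertiseURL(listen, defaultAdvertise, "10.7.3.90")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(listen, "->", adv)
	}
}
```

Under this rule a plain ./etcd, which listens on localhost, would keep advertising localhost, so a client that syncs its endpoints could still reach the server.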
