Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple endpoints #55

Open
wants to merge 32 commits into
base: master
Choose a base branch
from
Open

Conversation

dariko
Copy link
Contributor

@dariko dariko commented Feb 25, 2019

This commits adds a new endpoints parameter for the BaseClient class.
This parameter, which must be used alternatively to host and port,
can be set to a list of EtcdEndpoint, which is a simple data class with
address and port attributes.

A new decorator is present: retry all hosts. This decorator if
applied to a function of a class descendant from BaseClient has the
function calls retried against all the endpoints if an exception
occours.
If one of the tries returns with no errors the return value is propagated;
if all the tries throws exceptions of the same type the first one is
propagated; if they throw different exceptions an Etcd3Exception is
thrown. All the failed tries are logged.

A new utility class EtcdCluster is added to manage the lifecycle of
a etcd cluster run in containers. This semplifies the testing code and
lays the groundwork for testing requests failovers.
This class replaces the methods in docker_cli and etcd_go_cli.

A set of fixtures(etcd_cluster, etcd_cluster_ssl, client and
io_client manages the lifecycle of the test resources.

All the tests are set to run using the BaseClient's endpoint parameter,
additional tests have been added to test host and port.

A fix for the aiter methods in python 3.7 is included.

TODO: docs, failover testing

@dariko
Copy link
Contributor Author

dariko commented Feb 25, 2019

this fixes the auto retry between endpoint part of #25

@dariko dariko mentioned this pull request Feb 25, 2019
2 tasks
@codecov
Copy link

codecov bot commented Feb 25, 2019

Codecov Report

Merging #55 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #55   +/-   ##
=======================================
  Coverage   90.41%   90.41%           
=======================================
  Files          52       52           
  Lines        2890     2890           
  Branches      324      324           
=======================================
  Hits         2613     2613           
  Misses        170      170           
  Partials      107      107           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8206a66...8206a66. Read the comment docs.

@Revolution1
Copy link
Owner

@stupidchen

This commits adds a new `endpoints` parameter for the `BaseClient` class.
This parameter, which must be used alternatively to `host` and `port`,
can be set to a list of EtcdEndpoint, which is a simple data class with
`address` and `port` attributes.

A new decorator is  present: `retry all hosts`. This decorator if
applied to a function of a class descendant from `BaseClient` has the
function calls retried against all the `endpoints` if an exception
occours.
If one of the tries returns with no errors the return value is propagated;
if all the tries throws exceptions of the same type the first one is
propagated; if they throw different exceptions an Etcd3Exception is
thrown. All the failed tries are logged.
A new utility class `EtcdCluster` is added to manage the lifecycle of
a etcd cluster run in containers.
This class replaces the methods in `docker_cli` and `etcd_go_cli`.

A set of fixtures(`etcd_cluster`, `etcd_cluster_ssl`, `client` and
`io_client` manages the lifecycle of the test resources.

All the tests are set to run using the `BaseClient`'s `endpoint` parameter,
additional tests have been added to test `host` and `port`.
@dariko
Copy link
Contributor Author

dariko commented Feb 25, 2019

This still needs works on the stateful. classes

Repository owner deleted a comment Feb 25, 2019
Repository owner deleted a comment Feb 25, 2019
etcd3/aio_client.py Show resolved Hide resolved
ret = func(self, *args, **kwargs)
got_result = True
break
except Exception as e:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only retry when connection fails

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just catch these errors aiohttp.ClientError requests.RequestException urllib3.exceptions.HTTPError

etcd3/baseclient.py Outdated Show resolved Hide resolved
NO_DOCKER_SERVICE = True
try: # pragma: no cover
import docker # noqa
NO_DOCKER_SERVICE = False
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was intended to mock all the api, so I added the NO_ETCD_SERVICE to make test runnable when no etcd server avaliable.

you can just delete this since NO_DOCKER_SERVICE always false now

@@ -382,3 +382,12 @@ def find_executable(executable, path=None): # pragma: no cover
f = os.path.join(p, execname)
if os.path.isfile(f):
return f


class EtcdEndpoint():
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this class contains only host and port but made creating a client less friendly

any further design on this?

Copy link
Owner

@Revolution1 Revolution1 Feb 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better put this into etcd3/__init__.py

tests/test_maintenance_apis.py Outdated Show resolved Hide resolved
@Revolution1
Copy link
Owner

Under current design, we should create a multi-endpoint client like this:

client=Client(endpoints=[EtcdEndpoint(), EtcdEndpoint()])

seems not very friendly

here are the some other code for your reference:

seems other clients does not do "auto switch" of one single request.

may be we should discuss about the influence of "auto switch"

Repository owner deleted a comment Feb 25, 2019
@dariko
Copy link
Contributor Author

dariko commented Feb 25, 2019

@Revolution1
Thank you for the references!

The constructor can still be called as before, the only change is that the the arguments needs to be named ([Aio]Client(host='127.0.0.1', port=2379)); calling it like this will populate the endpoints list with a single EtcdEndpoint (see https://github.com/dariko/etcd3-py/blob/multiple_endpoints/etcd3/baseclient.py#L98-L105) from which, when I'll have integrated server discovery, the list will be expanded.
Still, I think it's important to allow for many endpoints to be specified as this could prevent a situation in which a client is configured to use only a non-available node.
I'm open to suggestion as to how to accept more endpoints while requiring less keystrokes. right now the only idea I can propose is to accept a list of strings:

Client(endpoints=[EtcdEndpoint(host='127.0.0.1', port=1234),
                  EtcdEndpoint(host='127.0.0.2', port=5678)])`
# would become
Client(endpoints=['127.0.0.1:1234','127.0.0.2:5678']

but I'm not so sure it's really better... maybe accept both of them?
I don't think a cluster can have nodes replying to clients with different ssl configurations, but if that is possible the string configuration can become constricting.

About the error detection, my idea was to have something like the code you linked from python-etcd (which I think I think was my inspiration): react to some Exception types on a API request by failing over to the next node.
Automatically retrying should have little or no consequences for some operations (those who do not change data as kv.get or those who can be considered for lack of a better word 'transient' as lease.renew, watch.create ...), but the situation can be less obvious with some other operation (kv.put, transactions ...).
Right now I'd like to get to the point where it's acceptable to enable by default failover for reading methods: I'm pretty sure supporting writing methods requires handling more corner cases which I think are better left to the single application logic, at least as a default.

As per the interface, I was thinking about something on the lines of giving the user the ability to both completely disable request failover or to use a whitelist of safe methods (a failover_whitelist: List[string] parameter which defaults to the name of the 'safe' methods).

this is needed to realistically test discovery
every call to a decorated function will now try the call againts the
current in-use endpoint before trying to failover.

failover will be tried only if the exception thrown is in
failover_exceptions and the api method called is in failover_whitelist.

the failover process is blocking: Client and AioClient defines their own
Lock compatible object
@@ -62,5 +62,5 @@ def event_loop(): # pragma: no cover
async def test_aio_client_host_port(etcd_cluster_ssl):
endpoint = etcd_cluster_ssl.get_endpoints()[0]
aio_client = AioClient(host=endpoint.host, port=endpoint.port,
cert=(CERT_PATH, KEY_PATH), verify=CA_PATH)
cert=(CERT_PATH, KEY_PATH), verify=False)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

@dariko
Copy link
Contributor Author

dariko commented Mar 2, 2019

I'm gonna reenable certificate validation if needed: I'll need to regenerate the certificates.

I reworked the wrapper/endpoints logic. Now:

  • baseclient.baseurl refers to self.current_endpoint
  • self.current_endpoint is initially set to self.current_endpoints[0]

When calling a decorated function the call will be tried against self.current_endpoint.
If the call raises an exception included in failover_exceptions AND the wrapped method is in self.failover_whitelist then the failover process start.
The failover process is serialized using a lock by Client and AioClient: it tries the call againts all the defined endpoints, looping to the next if receiving an exception defined in failover_exceptions. If a call returns the return value is propagated, otherwise an exception will be thrown.

Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
Repository owner deleted a comment Mar 2, 2019
@Revolution1
Copy link
Owner

BTW: you can add your name in AUTHORS.rst 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants