Skip to content

Commit

Permalink
Reestablish watch and retry wait for some errors
Browse files Browse the repository at this point in the history
Armada uses a Kubernetes watch to implement its chart wait logic. This
can be a fairly long-lived connection to the Kubernetes API server, and
is vulnerable to disruption (if, for example, the kubernetes apiserver
chart is being upgraded).

This change allows Armada to retry the wait for some specific errors,
including the establishment of a new watch, until the overall chart
timeout is reached.

kubernetes-client/python#972
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

Change-Id: I3e68a54becadd5b2a2343960a120bdc3de8e8515
  • Loading branch information
philsphicas committed Apr 19, 2020
1 parent ae1281d commit 6b2c724
Show file tree
Hide file tree
Showing 2 changed files with 13 additions and 0 deletions.
12 changes: 12 additions & 0 deletions armada/handlers/wait.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,8 @@

from kubernetes import watch
from oslo_log import log as logging
from retry import retry
import urllib3.exceptions

from armada import const
from armada.exceptions import k8s_exceptions
Expand Down Expand Up @@ -318,6 +320,16 @@ def wait(self, timeout):
else:
self._wait(deadline)

# The Kubernetes Python Client does not always recover from broken
# connections to the k8s apiserver, and the resulting uncaught exceptions
# in the Watch.stream method cause the chart installation to fail. As long
# as the wait deadline has not passed, it is better to retry the entire
# wait operation.
@retry(
exceptions=(
urllib3.exceptions.ProtocolError,
urllib3.exceptions.MaxRetryError),
delay=1)
def _wait(self, deadline):
'''
Waits for resources to become ready.
Expand Down
1 change: 1 addition & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ PasteDeploy>=1.5.2
protobuf>=3.4.0
PyYAML==3.12
requests
retry
prometheus_client==0.7.0

# API
Expand Down

0 comments on commit 6b2c724

Please sign in to comment.