Change default of [kubernetes] enable_tcp_keepalive to True (#15338)
We've seen instances of connection resets happening, particularly in
Azure, that are remedied by enabling tcp_keepalive. Enabling it by
default should be safe and sane regardless of where we are running.
jedcunningham authored Apr 13, 2021
1 parent 1a85ba9 commit 6e31465
Showing 4 changed files with 12 additions and 8 deletions.
6 changes: 5 additions & 1 deletion UPDATING.md
@@ -74,7 +74,7 @@ https://developers.google.com/style/inclusive-documentation

Moved the pod launcher from `airflow.kubernetes.pod_launcher` to `airflow.providers.cncf.kubernetes.utils.pod_launcher`

-This will alow users to update the pod_launcher for the KubernetesPodOperator without requiring an airflow upgrade
+This will allow users to update the pod_launcher for the KubernetesPodOperator without requiring an airflow upgrade

### Default `[webserver] worker_refresh_interval` is changed to `6000` seconds

@@ -91,6 +91,10 @@ serve as a DagBag cache burst time.

The `default_queue` configuration option has been moved from `[celery]` section to `[operators]` section to allow for re-use between different executors.

+### Default `[kubernetes] enable_tcp_keepalive` is changed to `True`
+
+This allows Airflow to work more reliably with some environments (like Azure) by default.
+
## Airflow 2.0.1

### Permission to view Airflow Configurations has been removed from `User` and `Viewer` role
2 changes: 1 addition & 1 deletion airflow/config_templates/config.yml
@@ -2116,7 +2116,7 @@
version_added: ~
type: boolean
example: ~
-default: "False"
+default: "True"
- name: tcp_keep_idle
description: |
When the `enable_tcp_keepalive` option is enabled, TCP probes a connection that has
2 changes: 1 addition & 1 deletion airflow/config_templates/default_airflow.cfg
@@ -1045,7 +1045,7 @@ delete_option_kwargs =

# Enables TCP keepalive mechanism. This prevents Kubernetes API requests to hang indefinitely
# when idle connection is time-outed on services like cloud load balancers or firewalls.
-enable_tcp_keepalive = False
+enable_tcp_keepalive = True

# When the `enable_tcp_keepalive` option is enabled, TCP probes a connection that has
# been idle for `tcp_keep_idle` seconds.
10 changes: 5 additions & 5 deletions airflow/kubernetes/kube_client.py
@@ -85,9 +85,9 @@ def _enable_tcp_keepalive() -> None:

from urllib3.connection import HTTPConnection, HTTPSConnection

-tcp_keep_idle = conf.getint('kubernetes', 'tcp_keep_idle', fallback=120)
-tcp_keep_intvl = conf.getint('kubernetes', 'tcp_keep_intvl', fallback=30)
-tcp_keep_cnt = conf.getint('kubernetes', 'tcp_keep_cnt', fallback=6)
+tcp_keep_idle = conf.getint('kubernetes', 'tcp_keep_idle')
+tcp_keep_intvl = conf.getint('kubernetes', 'tcp_keep_intvl')
+tcp_keep_cnt = conf.getint('kubernetes', 'tcp_keep_cnt')

socket_options = [
(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
@@ -125,10 +125,10 @@ def get_kube_client(
if config_file is None:
config_file = conf.get('kubernetes', 'config_file', fallback=None)

-if conf.getboolean('kubernetes', 'enable_tcp_keepalive', fallback=False):
+if conf.getboolean('kubernetes', 'enable_tcp_keepalive'):
_enable_tcp_keepalive()

-if not conf.getboolean('kubernetes', 'verify_ssl', fallback=True):
+if not conf.getboolean('kubernetes', 'verify_ssl'):
_disable_verify_ssl()
_disable_verify_ssl()

client_conf = _get_kube_config(in_cluster, cluster_context, config_file)
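
For readers unfamiliar with what the keepalive settings in this change control, the following is a minimal standalone sketch using only Python's standard `socket` module. It is not the code from `kube_client.py`, which attaches its `socket_options` list to the urllib3 connection classes. The helper names `build_keepalive_options` and `apply_options` are hypothetical, and the default values mirror the `fallback` values removed in this diff (120, 30, 6). The `TCP_KEEP*` constants are platform-dependent, so the sketch guards each one.

```python
import socket


def build_keepalive_options(idle: int = 120, interval: int = 30, count: int = 6):
    """Return (level, option, value) tuples that enable TCP keepalive probing.

    Hypothetical helper for illustration; not part of Airflow.
    """
    # SO_KEEPALIVE switches keepalive probing on for the socket.
    options = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    # The tuning constants below are not defined on every platform,
    # so each one is added only if the socket module exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds a connection may sit idle before the first probe is sent.
        options.append((socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle))
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between probes that go unanswered.
        options.append((socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval))
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes tolerated before the connection is dropped.
        options.append((socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count))
    return options


def apply_options(sock: socket.socket, options) -> None:
    """Apply each (level, option, value) tuple to the given socket."""
    for level, name, value in options:
        sock.setsockopt(level, name, value)


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        apply_options(sock, build_keepalive_options())
        # A non-zero value means keepalive is enabled on this socket.
        print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    finally:
        sock.close()
```

With probing enabled, an idle connection keeps generating traffic, which is why intermediaries such as load balancers or firewalls are less likely to time it out silently.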