[AIRFLOW-6025] Add label to uniquely identify creator of Pod #6621

Merged
merged 2 commits into from
Nov 22, 2019

Member

@kaxil kaxil commented Nov 21, 2019

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    Two components natively create Kubernetes Pods:
  1. KubernetesPodOperator
  2. KubernetesExecutor

It would be ideal to add a label that identifies which of the two launched a given Pod.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
    • If you implement backwards-incompatible changes, please leave a note in the Updating.md so we can assign it to an appropriate release

@kaxil kaxil requested a review from ashb November 21, 2019 00:32
@kaxil kaxil requested a review from dimberman November 21, 2019 11:23
@turbaszek
Member

It seems that we have a flaky test:

   Traceback (most recent call last):
    tests/task/task_runner/test_standard_task_runner.py line 129 in test_on_kill
      with open(path, "r") as f:
   FileNotFoundError: [Errno 2] No such file or directory: '/tmp/airflow_on_kill'

   -------------------- >> begin captured stdout << ---------------------
   [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
   [2019-11-21 13:02:57,945] {test_task_view_type_check.py:49} INFO - class_instance type: <class 'unusual_prefix_5d280a9b385120fec3c40cfe5be04e2f41b6b5e8_test_task_view_type_check.CallableClass'>
   [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
   [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
   [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
   [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
   [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s
   [%(asctime)s] {{%(filename)s:%(lineno)d}} %(levelname)s - %(message)s

   --------------------- >> end captured stdout << ----------------------

@kaxil
Member Author

kaxil commented Nov 21, 2019

@nuclearpinguin Yeah, restarting the test worked :)

# And a label to identify that pod is launched by KubernetesPodOperator
self.labels.update(
    {
        'airflow_version': airflow_version.replace('+', '-'),
Member

Does Kube recommend that we qualify our labels, like org.apache.airflow.version?

Member

Or is that for annotations? When should we use one over another?

Member Author

https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/ recommends some, but given that we already use labels and don't follow such conventions, I made this change:

labels={
    'airflow-worker': worker_uuid,
    'dag_id': dag_id,
    'task_id': task_id,
    'execution_date': execution_date,
    'try_number': str(try_number),
},
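
A side note on the `airflow_version.replace('+', '-')` call in the hunk under review: Kubernetes label values are limited to 63 characters, may be empty, and otherwise must start and end with an alphanumeric character, with only `-`, `_`, `.` and alphanumerics in between. A `+` is therefore not allowed. The following is a minimal sketch of that rule, not Airflow's or Kubernetes' own validation code; the helper name and the sample version string are made up for illustration.

```python
import re

# Syntax for a Kubernetes label value: empty, or alphanumeric at both ends
# with '-', '_', '.' and alphanumerics allowed in between.
LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")
MAX_LABEL_VALUE_LENGTH = 63


def is_valid_label_value(value: str) -> bool:
    """Hypothetical helper: check a string against the label value syntax."""
    if len(value) > MAX_LABEL_VALUE_LENGTH:
        return False
    # fullmatch requires the whole string to satisfy the pattern.
    return LABEL_VALUE_RE.fullmatch(value) is not None


# A made-up version string with a local-version suffix, used only as an example.
raw_version = "1.10.6+local"
sanitized_version = raw_version.replace("+", "-")

print(is_valid_label_value(raw_version))        # the '+' makes this invalid
print(is_valid_label_value(sanitized_version))  # valid after the replacement
```

The same check applies to any other value placed in a label, such as the `dag_id` or `task_id` entries above.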

Member Author

Re. labels vs annotations:

You can use either labels or annotations to attach metadata to Kubernetes objects. Labels can be used to select objects and to find collections of objects that satisfy certain conditions. In contrast, annotations are not used to identify and select objects. The metadata in an annotation can be small or large, structured or unstructured, and can include characters not permitted by labels.
So labels make more sense here, e.g. to select all the pods that were created by KubernetesPodOperator.
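
To illustrate the selection use case, here is a small self-contained sketch of equality-based label selection over plain dictionaries. It does not talk to a cluster, and the label keys `kubernetes_pod_operator` and `kubernetes_executor` are assumptions for illustration, since this thread does not show the exact keys the PR adds. The pod names and helper functions are likewise hypothetical.

```python
from typing import Dict, List

# Assumed label keys; the exact names are not shown in this conversation.
POD_OPERATOR_LABEL = "kubernetes_pod_operator"
EXECUTOR_LABEL = "kubernetes_executor"


def parse_equality_selector(selector: str) -> Dict[str, str]:
    """Parse a comma-separated 'key=value' selector into a dict."""
    requirements = {}
    for part in selector.split(","):
        key, _, value = part.strip().partition("=")
        requirements[key] = value
    return requirements


def matches(labels: Dict[str, str], selector: str) -> bool:
    """True if every key=value requirement in the selector is in labels."""
    wanted = parse_equality_selector(selector)
    return all(labels.get(key) == value for key, value in wanted.items())


def select(pods: Dict[str, Dict[str, str]], selector: str) -> List[str]:
    """Return the sorted names of pods whose labels satisfy the selector."""
    return sorted(name for name, labels in pods.items() if matches(labels, selector))


# Toy stand-ins for pod metadata (name -> labels), not real cluster objects.
pods = {
    "etl-task-1": {"airflow_version": "1.10.6", POD_OPERATOR_LABEL: "True"},
    "worker-abc": {"airflow_version": "1.10.6", EXECUTOR_LABEL: "True"},
    "unrelated": {"app": "nginx"},
}

print(select(pods, "kubernetes_pod_operator=True"))
print(select(pods, "airflow_version=1.10.6,kubernetes_executor=True"))
```

Against a real cluster, the same selector string could be passed to `kubectl get pods -l`, again assuming those label keys. Annotations cannot be queried this way, which is the practical reason to prefer a label here.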

@kaxil kaxil merged commit 5e5685a into apache:master Nov 22, 2019
@kaxil kaxil deleted the add-k8s-labels branch November 22, 2019 12:34
ashb pushed a commit to ashb/airflow that referenced this pull request Dec 18, 2019
ashb pushed a commit that referenced this pull request Dec 18, 2019
ashb pushed a commit that referenced this pull request Dec 19, 2019
kaxil added a commit that referenced this pull request Dec 19, 2019