[AIRFLOW-5873] KubernetesPodOperator fixes and test #6523
@ashb how should I cherry pick the relevant changes made to kubernetes in |
Hey @ddelange. I saw your comment while writing this. Cool that you are working on it :). If those changes are based on an existing commit in master, it's better to pinpoint that commit and cherry-pick from there. If they are new changes, they should be added as a PR to master and cherry-picked after it gets merged. I think this is not described very well, but we are in the process of updating the contributor documentation with Google Season of Docs, and I think it's a good point to add to CONTRIBUTING.rst (@efedotova -> maybe we can add it to the workflows we were discussing)
Hi @potiuk, thanks for the swift reply. I can open a new PR based on master, but master is too different from this branch to test my v1.10 setup with these changes on our cluster (e.g. my pods won't even start because https://github.com/apache/airflow/blob/v1-10-test/airflow/models/user.py doesn't exist anymore in master) [?] To me it seems they have already diverged (master I could alternatively make two PR's, one main one based on |
Two PRs with the same JIRA issue are also fine, @ddelange! Thanks for doing this!
Opened up #6524 @potiuk
On my local machine I get the same error as Travis. When I add
```
$ git diff --name-only v1-10-test...HEAD
fatal: ambiguous argument 'v1-10-test...HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
$ git diff --name-only origin/v1-10-test...HEAD
.pre-commit-config.yaml
.travis.yml
BREEZE.rst
Dockerfile
Dockerfile-checklicence
airflow/contrib/kubernetes/pod.py
airflow/contrib/operators/kubernetes_pod_operator.py
airflow/models/baseoperator.py
airflow/utils/helpers.py
breeze
common/_files_for_rebuild_check.sh
docs/howto/custom-operator.rst
hooks/build
scripts/ci/_utils.sh
scripts/ci/ci_check_license.sh
scripts/ci/ci_docs.sh
scripts/ci/ci_flake8.sh
scripts/ci/ci_mypy.sh
scripts/ci/ci_refresh_pylint_todo.sh
scripts/ci/ci_run_all_static_tests.sh
scripts/ci/ci_run_all_static_tests_except_licence.sh
scripts/ci/docker-compose.yml
scripts/ci/docker_build/ci_build_install_deps.sh
scripts/ci/in_container/entrypoint_ci.sh
scripts/ci/local_ci_build_ci_slim_image.sh
scripts/ci/local_ci_cleanup.sh
scripts/ci/pre_commit_check_license.sh
scripts/ci/pre_commit_ci_build.sh
scripts/ci/pre_commit_flake8.sh
scripts/ci/pre_commit_lint_dockerfile.sh
scripts/ci/pre_commit_mypy.sh
scripts/ci/pre_commit_pylint_main.sh
scripts/ci/pre_commit_pylint_tests.sh
tests/contrib/operators/test_kubernetes_pod_operator.py
tests/test_impersonation.py
```
- `xcom_push` will be deprecated, for now used as `do_xcom_push`
- `KubernetesPodOperator` kwarg `in_cluster` erroneously defaults to False in comparison to `default_args.py`, also default `do_xcom_push` was overwritten to False in contradiction to `BaseOperator`
- `KubernetesPodOperator` kwarg `resources` is erroneously passed to `base_operator`, instead should only go to `PodGenerator`. The two have different syntax. (both on `master` and `v1-10-test` branches)
- `kubernetes/pod.py`: `Resources` does not have `__slots__` so accepts arbitrary values in `setattr`
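The `__slots__` item can be illustrated with a minimal sketch. Both classes below are hypothetical stand-ins, not Airflow's actual `Resources` class, and the attribute names are chosen for illustration only. The sketch shows that a class without `__slots__` silently stores a misspelled attribute, while a class with `__slots__` rejects it, and that passing a dict as keyword arguments surfaces unknown keys right away.

```python
class LooseResources:
    """Hypothetical stand-in: no __slots__, so any attribute name is accepted."""

    def __init__(self, request_memory=None, limit_memory=None):
        self.request_memory = request_memory
        self.limit_memory = limit_memory


class StrictResources:
    """Hypothetical stand-in: __slots__ restricts the allowed attribute names."""

    __slots__ = ("request_memory", "limit_memory")

    def __init__(self, request_memory=None, limit_memory=None):
        self.request_memory = request_memory
        self.limit_memory = limit_memory


def build(cls, resources):
    """Pass the dict as keyword arguments so unknown keys raise TypeError."""
    return cls(**resources) if resources else cls()


# Without __slots__, a misspelled attribute is stored without complaint.
loose = LooseResources()
setattr(loose, "requests_memory", "1Gi")

# With __slots__, the same misspelling raises AttributeError.
strict = StrictResources()
try:
    setattr(strict, "requests_memory", "1Gi")
    typo_rejected = False
except AttributeError:
    typo_rejected = True
```

Under these assumptions, `build(StrictResources, {"bogus": 1})` fails with a `TypeError` at construction time instead of producing an object with an unexpected attribute.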
Thanks @ddelange!
        return Resources(**resources) if resources else Resources()

    def _set_name(self, name):
        validate_key(name, max_length=63)
Hi,
I've just upgraded from v1.10.6, so this change is now included, and it breaks many of the DAGs we currently have set up.
I hugely appreciate the effort that went into this development, but fair warning of this breaking change would have been very welcome.
Just leaving this here in case someone else using KubernetesPodOperator hits the same issue when upgrading.
...
File "/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py", line 167, in __init__
self.name = self._set_name(name)
File "/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py", line 269, in _set_name
validate_key(name, max_length=63)
File "/usr/local/lib/python3.6/site-packages/airflow/utils/helpers.py", line 64, in validate_key
"The key has to be less than {0} characters".format(max_length))
airflow.exceptions.AirflowException: The key has to be less than 63 characters
I understand Kubernetes has a 63-character limit for its label names, but at least in v1.10.6 (the latest version before this change) Airflow appends another 9 characters to the name, so I'm guessing the limit to set here would be 54, right?
Please let me know if I'm missing something.
Again, thanks a lot for the work on this, I mean this as constructive feedback.
Interesting that it's a breaking change. I was under the impression that it would fail downstream anyway, and hence was a good check to move upstream to the point of DAG creation. If this causes reproducible errors that didn't occur before (@dsaiztc, do you have an MWE that shows the breakage?), it should either be loosened and definitely mentioned in UPDATING.md, or fixed to account for the 9 characters mentioned above. What do you say, @potiuk?
Sorry, I'm not an expert in Kubernetes.
I was creating the Pod names in KubernetesPodOperator by just concatenating the DAG name with the task name, and I never had any issue. After updating Airflow, I found that many of my DAGs no longer worked because they didn't pass the newly added validation.
Might we be confusing the limit on labels (63 characters) with the limit on subdomains (253 characters)?
I cannot get a clear view reading the docs:
Also the unofficial Kubernetes mentions a limit of 253 characters.
Hey @ddelange, I can confirm that the limit for the Pod name is 253 characters and not 63.
I tried it out with one of my colleagues in our k8s cluster; it only gives an error when trying to create a Pod with more than 253 characters:
The Pod "eot5ahm9ua4ocahmahghoma1xoh7neihuvohz3oofaiz1iesaet2eibuwee5didietheet7faelohceyo1ci2eithu9nutie5fee5ahhohy0haegaikeizohngahzahteseifo5au2eb0aizi7reph9eibo5sahlee7hiequ2aeko6yiam8ieca4si6hodeiquoh6ceu9ienge2ooh4uo2umaijaec4aeli4ohfeodie3xahkove2iogieno2i" is invalid: metadata.name: Invalid value: "eot5ahm9ua4ocahmahghoma1xoh7neihuvohz3oofaiz1iesaet2eibuwee5didietheet7faelohceyo1ci2eithu9nutie5fee5ahhohy0haegaikeizohngahzahteseifo5au2eb0aizi7reph9eibo5sahlee7hiequ2aeko6yiam8ieca4si6hodeiquoh6ceu9ienge2ooh4uo2umaijaec4aeli4ohfeodie3xahkove2iogieno2i": must be no more than 253 characters
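The two limits discussed above can be told apart with a small sketch. It assumes the RFC 1123 naming rules that Kubernetes documents for object names (lowercase alphanumerics, `-` and, for subdomains, `.`, with an alphanumeric character at both ends). The function names are hypothetical and this is not the validation code Kubernetes or Airflow actually runs.

```python
import re

# One RFC 1123 label: lowercase alphanumerics and '-', alphanumeric at both ends.
_LABEL = r"[a-z0-9](?:[-a-z0-9]*[a-z0-9])?"
_LABEL_RE = re.compile(_LABEL)
# A subdomain is one or more labels joined by dots.
_SUBDOMAIN_RE = re.compile(_LABEL + r"(?:\." + _LABEL + r")*")

MAX_LABEL_LENGTH = 63       # limit that applies to label-style names
MAX_SUBDOMAIN_LENGTH = 253  # limit that applies to subdomain-style names such as Pod names


def is_dns_label(name):
    """Hypothetical helper: True if name fits the 63-character label rules."""
    return len(name) <= MAX_LABEL_LENGTH and _LABEL_RE.fullmatch(name) is not None


def is_dns_subdomain(name):
    """Hypothetical helper: True if name fits the 253-character subdomain rules."""
    return len(name) <= MAX_SUBDOMAIN_LENGTH and _SUBDOMAIN_RE.fullmatch(name) is not None
```

With these helpers, a 64-character lowercase name fails the label check but passes the subdomain check, which matches the behavior reported for Pod names in this thread.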
    def _set_name(self, name):
        validate_key(name, max_length=63)
So, with all that into consideration I'd suggest the following change:
     def _set_name(self, name):
    -    validate_key(name, max_length=63)
    +    validate_key(name, max_length=244)
Considering a maximum char limit of 253 (244 plus the 9 characters mentioned in my first comment).
Thoughts?
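The arithmetic behind the suggested value can be sketched as follows. The 9-character suffix is the figure reported earlier in this thread and is treated here as an assumption; `check_name_length` is a hypothetical helper, not Airflow's `validate_key`.

```python
K8S_MAX_NAME_LENGTH = 253    # limit reported by the cluster in this thread
GENERATED_SUFFIX_LENGTH = 9  # assumption from the thread: characters appended to the name

# Budget left for the user-supplied part of the name.
MAX_USER_NAME_LENGTH = K8S_MAX_NAME_LENGTH - GENERATED_SUFFIX_LENGTH


def check_name_length(name, max_length=MAX_USER_NAME_LENGTH):
    """Hypothetical helper: fail early if the name leaves no room for the suffix."""
    if len(name) > max_length:
        raise ValueError(
            "name has {0} characters, at most {1} allowed".format(len(name), max_length)
        )
    return name
```

Deriving the limit from the two constants, rather than hard-coding it, keeps the check correct if the suffix length ever changes.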
Good catch! Agreed :) Do you wanna make a PR to master? Preferably with a small Jira ticket so it can be cherry-picked to 1-10?
I can give it a try... (never done this before)
Create a JIRA ticket + PR?
Should I base the PR on branch v1-10-stable?
Please base it on master (v2.0) - new releases for the v1 branches are mostly commits cherry-picked from master onto the 1-10-test and later the 1-10-stable branches
Not sure if it's complete enough, it's the first time I contribute (bear with me 🙈): #8829
Any feedback is welcome!
Make sure you have checked all steps below.
Jira
Description
- `KubernetesPodOperator` kwarg `resources` is erroneously passed to `base_operator`, instead should only go to `PodGenerator`. The two have different syntax. (both on `master` and `v1-10-test` branches)
- `kubernetes/pod.py`: `Resources` does not have `__slots__` so accepts arbitrary values in `setattr` (present on `v1-10-test` branch https://github.com/apache/airflow/blame/50343040ff4679e32e01f138ead80bc4bcef4b47/airflow/contrib/operators/kubernetes_pod_operator.py#L166-L171)
- `KubernetesPodOperator` kwarg `in_cluster` erroneously defaults to False in comparison to `default_args.py`, also default `do_xcom_push` was overwritten to False in contradiction to `BaseOperator`
- `v1-10-test` is behind `master` with KubernetesPodOperator fixes and refactors (will not be addressed in this PR,)
  - move kubernetes folder one level up from `/contrib` (https://github.com/apache/airflow/blame/4dd24a2c595d4042ffe745aed947eaaea6abb652/airflow/contrib/operators/kubernetes_pod_operator.py#L21)
  - fix `xcom_push` to `do_xcom_push` (https://github.com/apache/airflow/blame/4dd24a2c595d4042ffe745aed947eaaea6abb652/airflow/contrib/operators/kubernetes_pod_operator.py#L90)
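A rename such as `xcom_push` to `do_xcom_push` is commonly handled with a deprecation shim. The sketch below shows one way to do that with Python's standard `warnings` module. `ExampleOperator` is a hypothetical class and does not reproduce how Airflow's `BaseOperator` or `KubernetesPodOperator` handle these arguments.

```python
import warnings


class ExampleOperator:
    """Hypothetical operator, used only to illustrate a keyword rename."""

    def __init__(self, do_xcom_push=True, **kwargs):
        # Accept the old keyword for now, but tell the caller it is going away.
        if "xcom_push" in kwargs:
            warnings.warn(
                "`xcom_push` is deprecated, use `do_xcom_push` instead",
                DeprecationWarning,
                stacklevel=2,
            )
            do_xcom_push = kwargs.pop("xcom_push")
        # Anything left over is a genuine mistake by the caller.
        if kwargs:
            raise TypeError("unexpected keyword arguments: {0}".format(sorted(kwargs)))
        self.do_xcom_push = do_xcom_push
```

In this sketch the default stays `True` unless the caller overrides it, so a subclass does not silently change the default behind the caller's back.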
Tests
contrib/operators/test_kubernetes_pod_operator.py