[stable/airflow] Add a feature to run extra init scripts in the scheduler after initdb #21047
Hi @NBardelot. Thanks for your PR. I'm waiting for a helm member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Force-pushed from 8555df8 to fdc0fa0.
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: javamonkey79, NBardelot The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@javamonkey79: adding LGTM is restricted to approvers and reviewers in OWNERS files. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
# the scheduler restarts
# - if those scripts use sensitive data, you should provide the sensitive data using secrets
# (see .Values.airflow.extraConfigmapMounts and .Values.airflow.extraEnv)
extraStartupScripts:
OK, perhaps I'm missing something, but is there a way to do this with just Helm, or does it require creating a configmap on the cluster before installing the Helm chart?
- name: extra-startup-scripts
  configMap:
    name: {{ .Values.airflow.extraStartupScripts }}
    defaultMode: 0740
I got permission denied, I think this needs to be more permissive like 755:
[2020-02-26 23:48:53,468] {{__init__.py:51}} INFO - Using executor CeleryExecutor
1 of 1 variables successfully updated.
running extra startup scripts
bash: /usr/local/extra-startup-scripts/run.sh: Permission denied
A 755 would not change anything since it would mean u=rwx,g=rw,o=rw. Normally u=rwx,g=r,o= (i.e. 740) should've sufficed.
I'll change it to 770 then, maybe the files are executed via the group and not the user, OpenShift style...
Done.
OK, I pulled your changes and 770 does not work either:
[2020-02-28 15:21:03,242] {{__init__.py:51}} INFO - Using executor CeleryExecutor
4 of 4 variables successfully updated.
running extra startup scripts
bash: /usr/local/extra-startup-scripts/run.sh: Permission denied
@NBardelot did you try it yourself? If you don't get the same permissions error, I'd be curious why... I would think you'd get the same error though.
I've taken a more thorough look at it and...
- you were absolutely right about 755 (which is u=rwx,g=rx,o=rx and not what I said... stupid me)
- o=rx is needed because the airflow user is not included in the root group as it should be (it is a good practice for Docker/Kubernetes and especially useful when running OpenShift)
I'll fix this right away. Sorry for that.
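To make the mode discussion above easier to follow, here is a small sketch that decodes a three-digit octal mode into the u=,g=,o= notation used in this thread. The function name symbolic_mode is hypothetical and is not part of the chart; it only does the bit arithmetic and does not touch any file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the chart): decode a three-digit octal
# mode such as 740 into the symbolic u=,g=,o= form used in this thread.
symbolic_mode() {
  local mode=$1 out="" digit perms i
  local whos=(u g o)
  for i in 0 1 2; do
    digit=${mode:$i:1}          # one octal digit per class: user, group, other
    perms=""
    (( digit & 4 )) && perms+="r"
    (( digit & 2 )) && perms+="w"
    (( digit & 1 )) && perms+="x"
    out+="${whos[$i]}=${perms}"
    (( i < 2 )) && out+=","
  done
  printf '%s\n' "$out"
}

symbolic_mode 740   # u=rwx,g=r,o=
symbolic_mode 770   # u=rwx,g=rwx,o=
symbolic_mode 755   # u=rwx,g=rx,o=rx
```

Of the three modes tried in this thread, only 755 grants anything to "other", which is the class the airflow user falls into when it is neither the file owner nor a member of the owning group.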
hey @NBardelot no worries! I have been running this branch out, so I knew there was something up. I appreciate your efforts.
Now, it would be good to get it merged in :)
I have force-pushed the fix so the pull request is ready now. If you can confirm that this version of the commit works for you it'd be great.
Confirmed (I already had the change locally, exactly as you have it).
Force-pushed from fdc0fa0 to de4f8f2.
Force-pushed from 2a24fce to 21bd7a1.
…uler Signed-off-by: Noël Bardelot <[email protected]>
Force-pushed from 21bd7a1 to 7b2207d.
Hi @maver1ck, I understand that the Airflow team is going to redo the Helm chart soon (how soon? that I don't know :D), and that may mean a little less attention to PRs for the project currently. But this PR is especially useful in the meantime because it's the only way to automate post-deployment tasks properly. Would you be OK to give it a review? My use-case:
Thanks for your time.
@NBardelot looks like there are conflicts now.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
This issue is being automatically closed due to inactivity.
/reopen
Adds a new configuration value to the Helm chart for Airflow: .Values.airflow.extraStartupScripts.
If set, it must be the name of a configmap that contains at least one entry whose key is named run.sh. That entry is a shell script that will be executed in the scheduler container after it has finished initdb and the other standard startup scripts (i.e. just before the scheduler starts).
Other entries provided in the same configmap will also be mounted in the same directory (using the key name as their filename). This allows you to split initialization tasks into several scripts that can be called from the main run.sh script, for example.
This is needed because initContainers are executed before airflow initdb is executed in the scheduler at startup. Thus initContainers remain viable for things that do not require the DB to be ready, but this PR allows 'airflow' commands that were previously impossible to automate.
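As an illustration of splitting tasks across several scripts, here is a minimal sketch of what a run.sh could look like. It is an assumption, not the script shipped with this PR: the function name run_extra_scripts and the SCRIPTS_DIR variable are hypothetical, and the default directory is taken from the log output earlier in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical run.sh sketch: run every other *.sh file mounted from the
# same configmap, in lexical order, stopping at the first failure.
set -euo pipefail

run_extra_scripts() {
  local dir=$1 script
  for script in "$dir"/*.sh; do
    [ -e "$script" ] || continue                          # no match: nothing to do
    [ "$(basename "$script")" = "run.sh" ] && continue    # do not re-run ourselves
    echo "running $(basename "$script")"
    bash "$script"                                        # needs read permission only
  done
}

# SCRIPTS_DIR is an assumed override; the default path comes from the logs above.
run_extra_scripts "${SCRIPTS_DIR:-/usr/local/extra-startup-scripts}"
```

Invoking the sibling scripts through bash means they only need to be readable by the airflow user, not executable, although run.sh itself is still subject to the mode set on the configmap volume.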
See also, by @javamonkey79:
- Issue #20568 (which started the idea)
- PR #20593 (which was aborted because it was centered around the idea of doing the same thing in the wrong place)