-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: kedro-airflow DAG kwarg configuration #233
Conversation
0fa48bd
to
acf4aa9
Compare
acf4aa9
to
23036e1
Compare
@noklam @astrojuanlu @deepyaman Could anyone please have a brief review of this PR and #241? Conceptually they are relatively small changes, so they should not take much time to review. They do enable us and other users to switch from Feel free to be critical and brief. Happy to put in the refactor work to get these features supported! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some high-level comments, I didn't review the details of the implementation yet.
Can you provide some steps how can I test with this new feature? I also expect there will be some unit tests or e2e test to make sure it works.
3fdc664
to
b18b985
Compare
72639bc
to
df20823
Compare
d29c687
to
c37ffbf
Compare
@noklam Could you please have another look? The implementation is now simpler and has proper tests. In addition to the configuration this PR also:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some minor comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the plugin on a kedro starter project and the result seems correct.
c37ffbf
to
575360e
Compare
Thanks @luizvbo! Processed your comments. |
1ec5169
to
002093b
Compare
Happy to give this a deeper look when the linting and test failures are addressed |
002093b
to
3025dc2
Compare
A |
Nice, thanks - the last thing is some failures on Windows:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @sbrugman! I've left some comments about it's compatibility with OmegaConfigLoader
which needs to be addressed! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did another round of review and I tested it with a starter project. Some minor comments, mostly the README/Release notes. I think this is very close to the finish line, I'll add more reviewers for approval. 💯
@sbrugman I'm sorry, I believe committing suggestions directly from here don't get signed off making the DCO bot fail. 😅 |
0f31ce1
to
e8f2d7e
Compare
Signed-off-by: Simon Brugman <[email protected]>
e8f2d7e
to
42b15a2
Compare
Let me know if there is anything else there is to be done on our side to help! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbrugman Sorry for blocking this PR for so long, I have missed this notification. If you're on Slack feel free to ping me directly.
I made a commit to fix some minor thing about code style. I have tested both --params
and airflow.yml
, works pretty well for me.
One thing that is less intuitive to me is the airflow.yml
parameters doesn't work the same as Kedro's params, as the top level key is pipeline
. i.e. if you want to overide a subkey pipeline.key=3
, you will do kedro airflow create -p pipeline --params key=3
instead of kedro airflow create --params pipeline.key=3
. I think this is fine but just want to make a note about this.
In addition, it may be helpful to create a kedro airflow init
to generate the airflow.yml
with default argument at some point, I've manually test by creating this file for the sake of review.
Excellent work⭐️ We should follow up with a release after this PR.
if "airflow" not in context.config_loader.config_patterns.keys(): | ||
context.config_loader.config_patterns.update( # pragma: no cover | ||
{"airflow": ["airflow*", "airflow/**"]} | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏼 I like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a super minor comment, but mainly came here to say thank you for this great contribution @sbrugman ! 🤩 I haven't been able to try this yet, but the code looks good and thanks for adding docs as well 👍
Signed-off-by: Ankita Katiyar <[email protected]> Co-authored-by: Merel Theisen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @sbrugman for this! I hope you don't mind that I made some commits to resolve conflicts and incorporate the suggestions to the docs from the reviews. Very happy about getting this merged in! 🥳 💯
Thank you everyone for the collaboration! This feature allows us to make an important step in our usage of scheduled Kedro nodes in many real-world pipelines 👌 Great to see that this was merged during my holidays 🌴 |
@sbrugman Thank you for your contribution and hope you have enjoyed your holiday! We have already done a release so please go ahead and use it! |
* feature: kedro-airflow DAG kwarg configuration Signed-off-by: Simon Brugman <[email protected]> * feat: `kedro-airflow` DAG kwarg configuration Signed-off-by: Simon Brugman <[email protected]> * Consistent use of the __future__ annotations Signed-off-by: Nok <[email protected]> * Apply suggestions from code review Signed-off-by: Ankita Katiyar <[email protected]> Co-authored-by: Merel Theisen <[email protected]> --------- Signed-off-by: Simon Brugman <[email protected]> Signed-off-by: Nok <[email protected]> Co-authored-by: Ankita Katiyar <[email protected]> Co-authored-by: Nok <[email protected]> Co-authored-by: Ankita Katiyar <[email protected]> Co-authored-by: Merel Theisen <[email protected]>
Description
Implementation for #229
Development notes
Feature implementation for configuring kedro Airflow DAGs via
airflow.yml
.The arguments can be configured per pipeline.
The arguments are rendered in the template. Initially I attempted specifying variables to change directly in the yaml file. However, rendering the variables in Python requires dealing with the data types of each argument (e.g. datetime, dict, string, bool ...). This added a lot of complexity to the plugin/template. In practice, the user can take care of this simply by providing a (small) template.
In the setup, the user can update the jinja template for new arguments, then can use
airflow.yml
:airflow.yml
:Additionally, parameters can be passed with the
--params
CLI option.The configuration pattern is also customizable.
Tests are present.
Checklist
RELEASE.md
file