-
Notifications
You must be signed in to change notification settings - Fork 14.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't allow defaults other than None in context parameters, and improve error message #38015
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perfect
093858e
to
0c3a757
Compare
4208ad8
to
9e3664f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks really good now. @uranusjr ?
c550359
to
d2f65f3
Compare
…ne breaks signature
e3c8a73
to
1ed8b8e
Compare
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions. |
Uh, oh, late to the party. I noticed this change which - after some reading - really looks understandable. But when testing 2.9.0RC2 I had a lot of broken DAGs and it felt to be like a breaking change. Therefore I raised a PR to add a newsfragment such that we have less support issues and user complaints. Still I see it a bit problematic... in our cases the DAGs just broke because we have strong mypy/pylint checks and want to ensure we have typed code... I our case we had a DAG and wanted to have the DAG params being typed. Imageine something like
... so here the Then changing this to So we had the option to use either That is why in many DAGs we use the signature like With this change being now on the main line. What is the recommended way for key context variables and parallel type checks and demand for typed context? The full example we use typed today, broken in 2.9.0rc2
UPDATE: Oh even trying to get away from the default I now get a validation error: |
I think a little bit more context is needed on that one, becuase this is a follow-up and really fix of another problem introdued in 2.8 in #33430 (where we allowed context params to not have defaults at all). This change is really codifying what has been explained for quite some time - since 2.7.0. I think we explicitly explained that From https://airflow.apache.org/docs/apache-airflow/2.8.0/core-concepts/taskflow.html#context : @task
def print_ti_info(task_instance: TaskInstance | None = None, dag_run: DagRun | None = None):
print(f"Run ID: {task_instance.run_id}") # Run ID: scheduled__2023-08-09T00:00:00+00:00
print(f"Duration: {task_instance.duration}") # Duration: 0.972019
print(f"DAG Run queued at: {dag_run.queued_at}") # 2023-08-10 00:00:01+02:20 This is how we explain that we expect None in those parameters used by the context. That's the "canonical" way I think we suggest people doing it since 2.7. In 2.8 #33430 introduced the possibility of "NOT" having the defaults. Technically speaking - Say - for TaskInstance or DagRun - how would you provide defaults there? I think limiting only to params being set by dag.params as default is even more hack'y and confusing in this context. There is a class of difficult to spot errors that is avoided with this fix - for example this will not work as you think it will work (if you don't know that run_id will be injected by task_flow): @task
def print_ti_info(some_value="x", run_id: str = "my_run_id"):
print(some_value)
print(run_id)
print_ti_info(): In example above you might simply miss that the context param has the And there is another case - it's not However yes - I agree that this change might have much bigger effect than we initially anticipated. Others could have used it for similar purpose to yours. We might want to make some other follow-ups after that if we see that this is a problem for others. The case you have @jscheffl is really a hack, but one that is likely to be used. I don't think it is strong enough reason to make RC4 - don't think so, while I did not get approval from @uranusjr and @Taragolis - we all thought it's a reasonable one, I think (I'd love to hear your thought here as well). But answering the typing question - one of the solutions (and a simple one) is: @task
def print_ti_info(params: ParamsDict | None = None, task_instance: TaskInstance | None = None, dag_run: DagRun | None = None):
assert params and task_instance and dag_run
... A bit ugly, yes, but will work, and I personally think it's much more "straightforward" than smth like that: def print_ti_info(params: ParamsDict = dag.param, task_instance: TaskInstance = TaskInstance(), dag_run: DagRun = DagRun()):
... Ideally (and maybe that should be a follow-up) we should be able to have (and mybe even publish) Airflow MyPy plugin - someone even requested one for default args #38796 recently https://github.com/apache/airflow/tree/main/dev/mypy/plugin (the ones we have work with default args and outputs). Maybe it is possible - we could write a plugin that could lilely be used to also allow this case (though @uranusjr in #33430 hinted that it might be impossible). def print_ti_info(params: ParamsDict, task_instance: TaskInstance, dag_run: DagRun):
....
print_ti_info() And maybe that's the best idea to actually do so for anyone who wants to use MyPy in this case? |
Thanks @potiuk for the detailled background. Convinced. |
Agree :) |
…ve error message (apache#38015) * feat: don't allow defaults other than None for context parameters * feat: improve error message when replacing context parameters with None breaks signature * feat: improve error message for unsupported position of context key parameter * style: make stack trace more readable by putting the error message in a variable * feat: add details to exceptions and related test * fix: put task decorated function outside pytest.raises call
This PR does two things for the Python task decorator:
Closes: #38006
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.