"Dependency already registered for DAG" warnings during runs in taskflow based tasks #26599
Comments
cc: @ashb - this is the issue which I asked the user to open, FYI.
It turns out these warnings have been happening since 2.3.0, but due to the mistake fixed by #26779 we just never saw them! Correction: since 2.1.0!
While investigating apache#26599 and the change from AIP-45, I noticed that these warning messages weren't new! The only thing that was new was that we started seeing them.

This is because the logger for BaseOperator and all subclasses is `airflow.task.operators`, and the `airflow.task` logger is not configured (with `set_context()`) until we have a TaskInstance, so it just dropped all messages on the floor!

This changes it so that log messages are propagated to parent loggers by default, but when we configure a context (and thus have a file to write to) we stop that. A similar change was made for the `airflow.processor` logger (but that is unlikely to suffer the same fate).
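The dropped-messages behaviour described above can be sketched with plain stdlib logging. This is an illustrative sketch, not Airflow's actual logging configuration: the logger names mirror the comment, and a `NullHandler` stands in for the file handler that has nowhere to write before `set_context()` runs.

```python
import logging

captured = []

class ListHandler(logging.Handler):
    """Collects emitted messages so we can see what actually gets through."""
    def emit(self, record):
        captured.append(record.getMessage())

parent = logging.getLogger("airflow.task")
child = logging.getLogger("airflow.task.operators")
# Stand-in for the not-yet-configured file handler on the task logger:
child.addHandler(logging.NullHandler())

# Before the fix: propagation is off, so the record stops at the child's
# handler, which silently discards it.
child.propagate = False
child.warning("Dependency already registered for DAG ...")
assert captured == []

# After the fix: propagate by default until a context is configured, so
# warnings emitted during DAG parsing reach handlers further up the chain.
child.propagate = True
parent.addHandler(ListHandler())
child.warning("Dependency already registered for DAG ...")
assert captured == ["Dependency already registered for DAG ..."]
```

Once `set_context()` attaches a real file handler, turning propagation back off keeps task logs out of the parent's handlers, which is the trade-off the fix makes.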
Found the problem. We're calling … And …
…26779)

* Ensure the log messages from operators during parsing go somewhere

* Give a real row count value so logs don't fail

  The ArangoDB sensor test was logging a mock object, which previously was getting dropped before emitting, but with this change now fails with "Mock is not an integer" when attempting the `%d` interpolation. To avoid making the mock overly specific (`arangodb_client_for_test.db.` `return_value.aql.execute.return_value.count.return_value`!) I have changed the test to mock the hook entirely (which is already tested).
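The secondary test failure mentioned above can be reproduced in isolation: logging a `Mock` through a `%d` placeholder only blows up once the record is actually formatted, so it was invisible while the operator logger dropped records. The message text and names below are illustrative, not the real ArangoDB sensor code.

```python
from unittest import mock

record_count = mock.Mock()  # stands in for the over-mocked client's return value

failed = False
try:
    # The same %-interpolation logging performs when a handler emits the record:
    "Total records found: %d" % record_count
except TypeError:
    # A plain Mock is not a number, so %d formatting raises TypeError
    failed = True

assert failed, "formatting a plain Mock with %d raises TypeError"

# Giving the call a real integer (or mocking the hook wholesale, as the
# fix does) lets the message format cleanly:
assert "Total records found: %d" % 5 == "Total records found: 5"
```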
Fixed, will be in 2.4.2.
…26779) (cherry picked from commit 7363e35)
Apache Airflow version
2.4.0
What happened
On version 2.4.0, DAGs with simple taskflow-based tasks (nothing dynamic) were emitting "Dependency already registered for DAG" warnings that did not appear prior to 2.4.0. To test with a simpler case, I copied the exact DAG from the taskflow tutorial documentation, the simple extract-transform-load example located here: https://airflow.apache.org/docs/apache-airflow/stable/tutorial/taskflow.html#example-taskflow-api-pipeline
Running this DAG produces the warnings even though it contains no complex dependencies and no dynamic task generation:
Log from the last step:
What you think should happen instead
These tasks should run without warnings about dependencies being already registered.
How to reproduce
Copy the tutorial taskflow DAG and run it on 2.4.0.
Operating System
Red Hat Enterprise Linux 8.6 (Ootpa)
Versions of Apache Airflow Providers
No response
Deployment
Virtualenv installation
Deployment details
CeleryExecutor with rabbitmq, 1 main machine for webserver/scheduler and 1 additional worker node
Anything else
No response
Are you willing to submit PR?
Code of Conduct