Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix task_instance_mutation_hook #9910

Merged
merged 8 commits into from
Jul 22, 2020
Merged

fix task_instance_mutation_hook #9910

merged 8 commits into from
Jul 22, 2020

Conversation

waleedsamy
Copy link

@waleedsamy waleedsamy commented Jul 21, 2020

Fixes #9902.
Importing task_instance_mutation_hook will keep the default impl. To fix this I updated Dagrun model to import settings and then later call settings.task_instance_mutation_hook; The same way used while calling policy function.


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

potiuk and others added 6 commits July 14, 2020 17:15
(cherry picked from commit 021bb88bb82adf54b2299f723db0dc4d88fb6ab9)
Python 2.7 is no longer supported in Pygments 2.6.x
* add git sync sidecars

* add a helm test

* add more tests

* allow users to provide git username and pass via  a k8s secrets

* set default values for airflow worker repository & tag

* change ci timeout

* fix link

* add credentials_secret to airflow.cfg configmap

* set GIT_SYNC_ADD_USER on kubernetes worker pods, set uid

* add fsGroup to webserver and kubernete workers

* move gitSync to dags.gitSync

* rename valueFields

* turn off git sync and dag persistence by default

* provide option to specify known_hosts

* add git-sync details into the chart documentation

* Update .gitignore

Co-authored-by: Ash Berlin-Taylor <[email protected]>

* make git sync max failures configurable

* Apply suggestions from code review

Co-authored-by: Jarek Potiuk <[email protected]>

* add back requirements.lock

Co-authored-by: Ash Berlin-Taylor <[email protected]>
Co-authored-by: Jarek Potiuk <[email protected]>
(cherry picked from commit d93555b)
…9816)

Rather than only allowing specific pre-determined config settings, this
change allows the user to place _any_ config setting they like in the
generated airflow.cfg, including overwriting the "generated defaults".

This providers a nicer interface for the users of the chart (even if the
could already set these via the env vars).

(cherry picked from commit e4790d5)
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 21, 2020

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, pylint and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://apache-airflow-slack.herokuapp.com/

@waleedsamy
Copy link
Author

cc @milton0825 @ashb

@milton0825
Copy link
Contributor

Looks like static check is failing:

+from airflow import settings
 from airflow.exceptions import AirflowException
 from airflow.models.base import ID_LEN, Base
 from airflow.models.taskinstance import TaskInstance as TI
-from airflow import settings
 from airflow.stats import Stats
 from airflow.ti_deps.dep_context import DepContext
 from airflow.ti_deps.dependencies_states import SCHEDULEABLE_STATES

@kaxil
Copy link
Member

kaxil commented Jul 21, 2020

that is odd, I don't think it makes any difference. The module is always fully imported into the sys.modules mapping

@waleedsamy
Copy link
Author

@kaxil In dagrun.py, having

# airflow/__init__.py > airflow/models/__init__.py > airflow/models/dagrun.py
from airflow.settings import task_instance_mutation_hook

will load task_instance_mutation_hook with the default implementation inside settings.py which will later updated (when settings.initialize() called, and airflow_local_settings provided). The import of task_instance_mutation_hook will result in caching the old implementation and the change in this PR will drop this caching so calling settings.task_instance_mutation_hook will get you the fresh implementation of task_instance_mutation_hook( after settings.initialize load the airflow_local_settings).

@milton0825
Copy link
Contributor

This is good finding! Do you mind look into the test failures? I assume they are related to the change.

@waleedsamy
Copy link
Author

@milton0825 I'm checking them now (Static checks is passing now)

@kaxil
Copy link
Member

kaxil commented Jul 21, 2020

@kaxil In dagrun.py, having

# airflow/__init__.py > airflow/models/__init__.py > airflow/models/dagrun.py
from airflow.settings import task_instance_mutation_hook

will load task_instance_mutation_hook with the default implementation inside settings.py which will later updated (when settings.initialize() called, and airflow_local_settings provided). The import of task_instance_mutation_hook will result in caching the old implementation and the change in this PR will drop this caching so calling settings.task_instance_mutation_hook will get you the fresh implementation of task_instance_mutation_hook( after settings.initialize load the airflow_local_settings).

when you run airflow scheduler or airflow webserver it loads from airflow import settings that already initializes the settings

from airflow import settings

settings.initialize()

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running any airflow cli command calls

settings.initialize()

so I don't see the change in this PR making it any different.

@waleedsamy
Copy link
Author

waleedsamy commented Jul 21, 2020

This issue is happening with 1.10.11 - not related to master branch.
Looking into v1.10-stable airflow/__init__.py

from airflow import utils
from airflow import settings
from airflow.configuration import conf
from airflow.models import DAG
from flask_admin import BaseView
from importlib import import_module
from airflow.exceptions import AirflowException

settings.initialize()

it will do a bunch of imports before calling initialize. These import will reach dagrun.py and from airflow.settings import task_instance_mutation_hook will cache task_instance_mutation_hook so even when calling settings.initialize() it won't affect the task_instance_mutation_hook in dagrun.py anymore.

@milton0825
Copy link
Contributor

Should we target the change with v1-10-test branch then?

@waleedsamy
Copy link
Author

Ok, I'll create a new PR for v1-10-test.

@waleedsamy waleedsamy changed the base branch from master to v1-10-test July 21, 2020 20:43
@kaxil
Copy link
Member

kaxil commented Jul 21, 2020

That is a very good point, in that case the correct solution for v1-10-test would be to move the from airflow.models import DAG after settings.initialize or have it similar to master as this will fix other things too (i.e other airflow_local_settings functions that would used old cached functions)

WDYT @waleedsamy @milton0825 ?

@milton0825
Copy link
Contributor

+1

@waleedsamy
Copy link
Author

@kaxil Ok, I'll move from airflow.models import DAG after settings.initialize.

@kaxil
Copy link
Member

kaxil commented Jul 21, 2020

@kaxil Ok, I'll move from airflow.models import DAG after settings.initialize.

Thanks, really appreciate the fix. Can you also please add a comment in that file on why the import line is below settings.initialize?

Also pylint or isort might complain, so you might have to add some #pylint: disable and/or #noqa after the comments to disable the checks

@waleedsamy
Copy link
Author

waleedsamy commented Jul 21, 2020

Can you also please add a comment in that file on why the import line is below settings.initialize?

I'll, thanks guys :)

@kaxil kaxil added this to the Airflow 1.10.12 milestone Jul 21, 2020
@kaxil kaxil changed the base branch from v1-10-test to v1-10-stable July 22, 2020 14:23
@kaxil kaxil changed the base branch from v1-10-stable to v1-10-test July 22, 2020 14:24
@kaxil kaxil merged this pull request into apache:v1-10-test Jul 22, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 22, 2020

Awesome work, congrats on your first merged pull request!

kaxil pushed a commit that referenced this pull request Jul 22, 2020
potiuk added a commit that referenced this pull request Jul 22, 2020
@waleedsamy waleedsamy deleted the fix-task_instance_mutation_hook branch July 22, 2020 15:33
kaxil pushed a commit that referenced this pull request Aug 11, 2020
cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Mar 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

task_instance_mutation_hook not using my custom implementation
7 participants