Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Airflow Upgrade with Cherry-Picks [only] from 1.10.4+twtr #64

Merged
merged 32 commits into from
Jan 8, 2021

Conversation

ayushSethi22
Copy link

Make sure you have checked all steps below.

JIRA

  • My PR addresses the following Airflow JIRA issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

Code Quality

  • Passes flake8

Ayush Sethi and others added 28 commits December 15, 2020 01:32
* CP Contains fb64f2e: [TWTR][AIRFLOW-XXX] Twitter Airflow Customizations + Fixup job scheduling without explicit_defaults_for_timestamp

* reformat

* flake8 fix
[CP][EWT-548][AIRFLOW-6527] Make send_task_to_executor timeout configurable (twitter-forks#63)

Commit already available in 1.10.12
CP contains [EWT-16]: Airflow fix for manual trigger during version upgrade (twitter-forks#13)

* [EWT-16]: Airflow fix for manual trigger during version upgrade
[CP][EWT-548][AIRFLOW-6527] Make send_task_to_executor timeout configurable (twitter-forks#63)

CP of f757a54

Commit already available in 1.10.12
[CP] Contains [AIRFLOW-5597] Linkify urls in task instance log

CP of f757a54

Commit already available in 1.10.12
CP contains [TWTR] CP from 1.10+twtr (twitter-forks#35)

* 99ee040: CP from 1.10+twtr

* 2e01c24: CP from 1.10.4 ([TWTR][AIRFLOW-4939] Fixup use of fallback kwarg in conf.getint)

* 00cb4ae: [TWTR][AIRFLOW-XXXX] Cherry-pick d4a83bc and bump version (twitter-forks#21)

* CP 51b1aee: Relax version requiremets (twitter-forks#24)

* CP 67a4d1c: [CX-16266] Change with reference to 1a4c164 commit in open source (twitter-forks#25)

* CP 54bd095: [TWTR][CX-17516] Queue tasks already being handled by the executor (twitter-forks#26)

* CP 87fcc1c: [TWTR][CX-17516] Requeue tasks in the queued state (twitter-forks#27)

* CP 98a1ca9: [AIRFLOW-6625] Explicitly log using utf-8 encoding (apache#7247) (twitter-forks#31)

* fixing models.py and jobs.py file fix after CP

* fixing typo and version bump

Co-authored-by: Vishesh Jain <[email protected]>
CP Contains Experiment API path fix (twitter-forks#37)

Co-authored-by: Vishesh Jain <[email protected]>
CP Contains Export scheduler env variable into worker pods. (twitter-forks#38)
Cp Contains [EWT-115][EWT-118] Initialise dag var to None and fix for DagModel.fileloc (missed in EWT-16) (twitter-forks#39)

Co-authored-by: Vishesh Jain <[email protected]>
Co-authored-by: aoen <[email protected]>
[CX-16591] Fix regex to work with impersonated clusters like airflow_scheduler_ddavydov (twitter-forks#42)
[CP][EWT-128] Fetch task logs from worker pods (19ac45a) (twitter-forks#43)
[CP][AIRFLOW-6561][EWT-290]: Adding priority class and default resource for worker pod. (twitter-forks#47)
[CP][EWT-302]Patch Pool.DEFAULT_POOL_NAME in BaseOperator (apache#8587) (twitter-forks#49)

Open source commit id: b37ce29

Co-authored-by: Vishesh Jain <[email protected]>
[CP][AIRFLOW-3121] Define closed property on StreamLogWriter (apache#3955) (twitter-forks#52)

CP of 2d5b8a5
…itter-forks#51)

Update the dataflow URL regex as per AIRFLOW-9323

Co-authored-by: Rajat Srivastava <[email protected]>
[EWT-450] fixing sla miss triggering duplicate alerts every minute (twitter-forks#56)

Co-authored-by: Vishesh Jain <[email protected]>
[CP] Handle IntegrityErrors for trigger dagruns & add Stacktrace when DagFileProcessorManager gets killed (twitter-forks#57)

CP of faaf179 - from master
CP of 2102122 - from 1.10.12
[TWTR][EWT-472] Add lifecycle support while launching worker pods (twitter-forks#59)
[TWTTR] Don't enqueue tasks again if already queued for K8sExecutor(twitter-forks#60)

Basically reverting commit 87fcc1c  and making changes specifically into the Celery Executor class only.
[CP][TWTR][EWT-377] Fix DagBag bug when a Dag has invalid schedule_interval (twitter-forks#61)

CP of 5605d10
& apache#11462
[TWTR][EWT-350] Reverting the last commit partially (twitter-forks#62)
[CP][EWT-548][AIRFLOW-6527] Make send_task_to_executor timeout configurable (twitter-forks#63)

CP of f757a54

Commit already available in 1.10.12
…rib files as they have been moved to airflow
dag = self.get_dag()
return dag

def get_dag(self):
Copy link
Collaborator

@vshshjn7 vshshjn7 Jan 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like get_dag logic has changed.
This was in 1.10.4 (https://github.com/apache/airflow/blob/1.10.4/airflow/models/dag.py#L1508)

Can you remove this change?
Here, you just need to replace fileloc with get_local_fileloc(), rest will be same

@@ -1004,6 +1004,7 @@ def signal_handler(signum, frame):
self.state = State.SUCCESS
except AirflowSkipException as e:
# Recording SKIP
# This change is in reference to [AIRFLOW-5653][CX-16266]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this statement?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit was done by me in OS Airflow 1.10.6 so it need not be cherry picked. That should remove this comment as well.

@@ -1767,6 +1772,24 @@ class DagModel(Base):
def __repr__(self):
return "<DAG: {self.dag_id}>".format(self=self)

def get_local_fileloc(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure that we use get_local_fileloc instead of fileloc

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked for rest of the instances, looks good

# Should be supplied in the format: ``key = value``
# Should be supplied in the format: key = value

[kubernetes_worker_resources]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to add this section? Isn't it supposed to be available in 1.10.14 by default?

@@ -595,7 +595,7 @@ pool = prefork

# The number of seconds to wait before timing out ``send_task_to_executor`` or
# ``fetch_celery_task_state`` operations.
operation_timeout = 2
operation_timeout = 10
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add this override in the airflow.cfg we've in source/airflow?

@@ -1004,6 +1004,7 @@ def signal_handler(signum, frame):
self.state = State.SUCCESS
except AirflowSkipException as e:
# Recording SKIP
# This change is in reference to [AIRFLOW-5653][CX-16266]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit was done by me in OS Airflow 1.10.6 so it need not be cherry picked. That should remove this comment as well.

airflow/models/baseoperator.py Show resolved Hide resolved
Copy link

@abhishekbafna abhishekbafna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I do not see this change in PR.

abhishekbafna@4ac9a6d

.arcconfig Show resolved Hide resolved
airflow/models/baseoperator.py Show resolved Hide resolved
@@ -18,4 +18,6 @@
# under the License.
#

version = '1.10.14'
version = '1.10.14+test1'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of test, it should be twtr or something.

README_TWITTER.md Outdated Show resolved Hide resolved
UPDATING.md Outdated
```
Fix for this, https://github.com/apache/airflow/pull/8587

## Airflow 1.10.4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line can be removed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@msumit msumit merged commit d85a9a3 into twitter-forks:1.10.14+twtr Jan 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants