Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streamline & simplify __eq__ methods in models Dag and BaseOperator #13449

Merged
merged 1 commit into from
Jan 4, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions airflow/models/baseoperator.py
Original file line number Diff line number Diff line change
Expand Up @@ -520,8 +520,10 @@ def __init__(
)

def __eq__(self, other):
if type(self) is type(other) and self.task_id == other.task_id:
return all(self.__dict__.get(c, None) == other.__dict__.get(c, None) for c in self._comps)
if type(self) is type(other):
# Use getattr() instead of __dict__ as __dict__ doesn't return
# correct values for properties.
return all(getattr(self, c, None) == getattr(other, c, None) for c in self._comps)
return False

def __ne__(self, other):
Expand Down
3 changes: 1 addition & 2 deletions airflow/models/dag.py
Original file line number Diff line number Diff line change
Expand Up @@ -361,8 +361,7 @@ def __repr__(self):
return f"<DAG: {self.dag_id}>"

def __eq__(self, other):
if type(self) == type(other) and self.dag_id == other.dag_id:

if type(self) == type(other):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can remember, it's optimization so we can check the most common case much faster.

Copy link
Member

@mik-laj mik-laj Jan 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That optimisation may not make sense given two reasons below:

  • all() function has the "exit fast" property, i.e., whenever there is any False element, it will return False immediately, rather than traversing all elements in the iterable. reference
  • For dag model, dag_id is the 1st element in _comps; For BaseOperator, task_id is the 1st element in _comps.

So after the change I make here, there should be zero impact on the performance (actually it improves the performance very minorly: it helps avoid comparing dag_id in dag model's __eq__for two times. Similar for BaseOperator).

Kindly let me know if it makes sense to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

# Use getattr() instead of __dict__ as __dict__ doesn't return
# correct values for properties.
return all(getattr(self, c, None) == getattr(other, c, None) for c in self._comps)
Expand Down