[AIRFLOW-5268] Apply same DAG naming conventions as in literature #5874
Merged
Fokko
reviewed
Aug 20, 2019
@Fokko applied all your suggestions, fingers crossed for the CI
Fokko
approved these changes
Aug 20, 2019
kaxil
approved these changes
Aug 20, 2019
Awesome work @BasPH
Jerryguo pushed a commit to Jerryguo/airflow that referenced this pull request on Sep 2, 2019
Merged
6 tasks
kaxil pushed a commit that referenced this pull request on Oct 3, 2019
ashb pushed a commit that referenced this pull request on Oct 7, 2019
kaxil pushed a commit to astronomer/airflow that referenced this pull request on Oct 23, 2019
schnie pushed a commit to astronomer/airflow that referenced this pull request on Oct 24, 2019
…webserver scalability (#67)
* [AIRFLOW-5088][AIP-24] Add DAG serialization using JSON (apache#5701) It implements the method proposed in AIP-24 to serialize DAG. It will be used in DAG persistency in DB to solve webserver scalability issue. (cherry picked from commit 2bd1a51ec75f680a6e6e2101bd948a78421a644a)
* [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability (apache#5743)
* Make _primitive_types Py2 & Py3 compatible (cherry picked from commit 7ce34b2a959fc1f8322836f38f474a831e4901a1)
* Fix issue with different class for Pendulum Timezones (cherry picked from commit c068c67c48d294a589b58be0d0ad8b657c361a77) (cherry picked from commit 04fbf2beac57dcf26b118ebbe5a2bf175ce08af8)
* Update timezone class & Do not serialize dates in tasks if they have matching date in DAG (cherry picked from commit be412522cb95a19a51b2f208ae8ebea76e8b667a)
* Change type of data column to JSON & Add metric for dagbag size (cherry picked from commit d030b10bec9cd0e468f36e97e131d497d5a43fc6)
* Code Cleanup for JSON columns - Code Cleanup for JSON columns - Test code to allow old mysql & sqlite versions (cherry picked from commit 1db8044f9d29edf25f2b8ad4cd21c496c243534a)
* Add Debug info (cherry picked from commit d14497ff28d123d45d626019cabcbd977c5de79d)
* Reduce Sizing of SerializedDAG
* Support dateutil.relativedelta in SerializedDAGs This was a valid type for schedule_interval already, so we should continue supporting it (cherry picked from commit ec9d705f1a90790bdcb099196269c77d3cc3d53c) (cherry picked from commit 9805b4a183b87976dc33ae80c7e6a209849ba5d7) (cherry picked from commit f00d9237cd9224571e43bda67ad4dddfb009c402)
* Add specific test for schedule_interval serialization
* Delete non-existent Dags (cherry picked from commit 92d442d33dd8c81ea73026405d3978d133140807) (cherry picked from commit 7d371d329613c48deef0d8a812c817f2013db8f9)
* Remove comment (cherry picked from commit 549c1f9cd9ab0bfeac4f75fa713cbaae842a6e82)
* fix bugs that date/time/IntEnum are not supported in serialization. (cherry picked from commit d0ce27e3f3b6046016800855ad2e57fa67d8b57f) (cherry picked from commit 50a60b6a026e6d6249f069944be86560d87a67ca)
* Deactivate DAGs instead of deleting if their DAG file is deleted (cherry picked from commit 5a84ca517cef0dacff23f57b360a554b461b5034)
* Fix CI (cherry picked from commit 712ff47cbada7373eaa4fa92bb9220a453c445ae) (cherry picked from commit 52a0e9e39dc006501eb9d8ac0881357900548cf7)
* Just-in-time loading of DagBag in webserver To save start-up time (and memory) this changes the DabBag to not be populated by the webserver on start up - and when a specific dag is asked for it will be loaded on-demand from the SerializedDAG table. Co-Authored-By: Ash Berlin-Taylor <[email protected]>
* Fix flake8 (cherry picked from commit e91ad24b006823eadd6f3e21fc7cc5c8dd57b0d1)
* Add default args to decorated_fields (cherry picked from commit 3f08d2f986364315c3e43bde3524f12d069392ae)
* [AIRFLOW-5636] Allow adding or overriding existing Operator Links (apache#6302) (cherry-picked from 1de210b)
* Add support for OperatorLinks ExtraOperatorLinks are supported if Plugins are registered for them (cherry picked from commit 9cb6e28) (cherry picked from commit 72c75860ecfcd1930f1dedc7a0c713f122ea51a5)
* Cleanup (cherry picked from commit e840616) (cherry picked from commit 6d01d8e5bac1b6e829b9da6fc50c1a4b6d23bcaf)
* Move serialization directory out of dags folder (cherry picked from commit 8a07aee3e5cf133c45ee4ae26aad6104c84502ab)
* Update path
* [AIRFLOW-5268] Apply same DAG naming conventions as in literature (apache#5874) cherry-picked from apache@8f6ca53
* [AIRFLOW-4309] Remove Broken Dag error after Dag is deleted (apache#6102) (cherry picked from commit 3140c45) (cherry picked from commit df65f8e)
* [AIRFLOW-5481] Allow Deleting Renamed DAGs (apache#6101) (cherry picked from commit 99a5c2e)
* Fix bad merge_conflict resolution This was incorrectly removed while cherry-picking and resolving conflicts
* Add test for relativedelta
* Fix Import
* Backport for Py2
Make sure you have checked all steps below.
Jira
Description
The Airflow codebase is extremely confusing because the concept of a "root" node in Airflow is actually implemented as the last, finishing node of a DAG, while in DAG literature root nodes are the first nodes to execute. Or, as the literature also puts it: root nodes are nodes without upstream dependencies, whereas Airflow implemented them as nodes without downstream dependencies.
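To make the two definitions concrete, here is a minimal sketch on a toy graph. It is not Airflow's code: `ToyDag`, `add_edge`, and the task names are hypothetical, chosen only for illustration. `roots` follows the literature definition (no upstream dependencies), and `leaves` collects the nodes without downstream dependencies.

```python
class ToyDag:
    """Hypothetical stand-in for a DAG; this is not Airflow's DAG class."""

    def __init__(self):
        self.nodes = []        # node ids in insertion order
        self.upstream = {}     # node -> set of direct upstream nodes
        self.downstream = {}   # node -> set of direct downstream nodes

    def add_node(self, node):
        if node not in self.nodes:
            self.nodes.append(node)

    def add_edge(self, before, after):
        """Declare that `before` must run before `after`."""
        self.add_node(before)
        self.add_node(after)
        self.downstream.setdefault(before, set()).add(after)
        self.upstream.setdefault(after, set()).add(before)

    @property
    def roots(self):
        """Nodes without upstream dependencies: the first to execute."""
        return [n for n in self.nodes if not self.upstream.get(n)]

    @property
    def leaves(self):
        """Nodes without downstream dependencies: the last to execute."""
        return [n for n in self.nodes if not self.downstream.get(n)]


dag = ToyDag()
dag.add_edge("extract", "transform")
dag.add_edge("transform", "load")
dag.add_edge("transform", "report")

print(dag.roots)   # ['extract']
print(dag.leaves)  # ['load', 'report']
```

Under the old naming described above, the list printed by `leaves` is what would have been reported as the "roots".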
This PR aligns the Airflow implementation of a "root" node with DAG literature. I've also implemented a `leaves` property to make a clear distinction between first/starting nodes and last/finishing nodes. Also, to my surprise there weren't even tests for these basic properties, so I added 3 tests verifying the behaviour of `roots` and `leaves`.
The implementation involved:
Tests
Added 3 tests, see above.
Commits
Documentation
Code Quality
flake8