Add different modes to sort dag files for parsing #15046
airflow/utils/dag_processing.py
    # Sort the file paths by the parsing order mode
    list_mode = conf.get("scheduler", "file_parsing_sort_mode", fallback="modified_time")

    if list_mode not in FILE_PARSER_MODES:
This should be in configuration.py's validate method
Good point
Done in ade6223
airflow/config_templates/config.yml
    parse different DAG files.
    * ``alphabetical``: Sort by filename

    version_added: 2.0.2
Shouldn't this be 2.1? It feels like a new feature, not a bug fix to me
fixed in e394feb
airflow/configuration.py
    if list_mode not in file_parser_modes:
        raise AirflowConfigException(
            "`[scheduler] file_parsing_sort_mode` should not be "
            + list_mode
Suggested change: replace `+ list_mode` with `+ repr(list_mode)`
Maybe use an f-string for this whole lot too?
fixed in e394feb
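As a rough illustration of where this review thread ends up (validation in one place, an f-string, and `repr()` of the bad value), here is a minimal standalone sketch. It is not the code from the PR: `ConfigError`, `FILE_PARSING_SORT_MODES`, and `validate_file_parsing_sort_mode` are hypothetical names standing in for `AirflowConfigException`, the list of modes, and the check in `configuration.py`, and the wording of the message after "should not be" is an assumption.

```python
# Hypothetical stand-in for the allowed modes named in this PR.
FILE_PARSING_SORT_MODES = ("modified_time", "random_seeded_by_host", "alphabetical")


class ConfigError(Exception):
    """Hypothetical stand-in for AirflowConfigException."""


def validate_file_parsing_sort_mode(list_mode: str) -> str:
    """Return list_mode unchanged if it is a known mode, otherwise raise."""
    if list_mode not in FILE_PARSING_SORT_MODES:
        # {list_mode!r} is the f-string spelling of repr(list_mode); quoting the
        # value makes stray whitespace or an empty string visible in the error.
        raise ConfigError(
            f"`[scheduler] file_parsing_sort_mode` should not be {list_mode!r}. "
            f"Possible values are {', '.join(FILE_PARSING_SORT_MODES)}."
        )
    return list_mode
```

Using `repr()` matters mostly for values such as `"alphabetical "` with a trailing space, which would otherwise look valid in the message.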
91cdd08 to e394feb
d1179db to fe3f40b
This commit lets users choose one of the following modes, which the scheduler uses to list and sort the DAG files and so decide the parsing order:
- `modified_time`: Sort by the modified time of the files. This is useful at large scale to parse the recently modified DAGs first.
- `random_seeded_by_host`: Sort randomly across multiple schedulers but with the same order on the same host. This is useful when running the scheduler in HA mode, where each scheduler can parse different DAG files.
- `alphabetical`: Sort by filename.
fe3f40b to 31f5947
This commit lets users choose one of the following modes, which the scheduler uses to list and sort the DAG files and so decide the parsing order:
- `modified_time`: Sort by the modified time of the files. This is useful at large scale to parse the recently modified DAGs first.
- `random_seeded_by_host`: Sort randomly across multiple schedulers but with the same order on the same host. This is useful when running the scheduler in HA mode, where each scheduler can parse different DAG files.
- `alphabetical`: Sort by filename.

(cherry picked from commit 2e3eb42)
This commit lets users choose one of the following modes, which the scheduler uses to list and sort the DAG files and so decide the parsing order:
- `modified_time`: Sort by the modified time of the files. This is useful at large scale to parse the recently modified DAGs first.
- `random_seeded_by_host`: Sort randomly across multiple schedulers but with the same order on the same host. This is useful when running the scheduler in HA mode, where each scheduler can parse different DAG files.
- `alphabetical`: Sort by filename.
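The three modes can be sketched in a few lines. This is only an illustration under stated assumptions, not the PR's implementation: `sort_file_paths`, its `hostname` and `get_mtime` parameters, and the CRC32-based seed are all hypothetical choices, and `alphabetical` is taken here to mean sorting the path strings.

```python
import os
import random
import zlib
from typing import Callable, List


def sort_file_paths(
    file_paths: List[str],
    mode: str,
    hostname: str,
    get_mtime: Callable[[str], float] = os.path.getmtime,
) -> List[str]:
    """Return a new list of paths in parsing order for the given mode."""
    if mode == "modified_time":
        # Most recently modified files come first.
        return sorted(file_paths, key=get_mtime, reverse=True)
    if mode == "alphabetical":
        # Assumption: plain string ordering of the paths.
        return sorted(file_paths)
    if mode == "random_seeded_by_host":
        # Sort first so the shuffle starts from the same order on every call,
        # then shuffle with a seed derived only from the hostname (assumed
        # scheme): one host always gets the same order, other hosts may differ.
        shuffled = sorted(file_paths)
        seed = zlib.crc32(hostname.encode("utf-8"))
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"Unknown sort mode: {mode!r}")
```

Passing `get_mtime` as a parameter keeps the sketch testable without touching the filesystem; by default it reads real modification times with `os.path.getmtime`.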