Add support for multiple cron expressions in schedule_interval #24733
My only question is whether it is worth exposing this via schedule_interval, or should we make people use a timetable directly for this?
Do we also need a new timetable class, or could the existing CronDataIntervalTimetable be extended to take multiple patterns, so that all we change is what gets passed when upgrading schedule_interval?
You also haven't provided a description field (which is IIRC what is shown in the UI) for the new timetable class.
+----------------------------------+---------------------------------------------------------------------------+--------------------------------------+
| Cron preset (``str``)            | Convenience cron expression for readability                               | ``"@daily"``                         |
+----------------------------------+---------------------------------------------------------------------------+--------------------------------------+
| List of cron expressions/presets | To run at intervals that cannot be expressed by a single cron expression. | ``["0 3 * * *", "0 0 * * MON,TUE"]`` |
+----------------------------------+---------------------------------------------------------------------------+--------------------------------------+
This should say if the mode is "and" or "or" (as implemented it's "or")
It can be both "and" and "or". For example, when given ["0 3 * * MON,TUE", "@daily"], each expression is looked up in the cron presets and converted if found. Shall I change it to "and/or"?
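The preset lookup described in this comment can be sketched roughly as follows. This is only an illustration, not the PR's code: `CRON_PRESETS` is a local stand-in table written out with the conventional meanings of these aliases, and `expand_presets` is a hypothetical helper name.

```python
from typing import Dict, Iterable, List

# Local stand-in for a preset table (assumption: conventional alias meanings).
CRON_PRESETS: Dict[str, str] = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@yearly": "0 0 1 1 *",
}


def expand_presets(expressions: Iterable[str]) -> List[str]:
    """Replace each known preset with its cron expression; pass others through."""
    return [CRON_PRESETS.get(expr, expr) for expr in expressions]
```

With this helper, a mixed list such as ["0 3 * * MON,TUE", "@daily"] ends up as two plain cron expressions, so the rest of the logic only has to deal with one kind of input.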
Updated the example to demonstrate that ["@daily", "0 3 * * MON,TUE"] is a valid schedule_interval.
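To make the "or" semantics discussed in this thread concrete, here is a minimal sketch in which the next run is simply the earliest next fire time across all schedules. It does not parse cron strings: `daily_at` and `weekly_at` are hypothetical stand-ins that mimic "0 H * * *" and "0 H * * <weekdays>", and `next_run` is an assumed name, not something taken from the PR.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable, Set

NextFn = Callable[[datetime], datetime]


def daily_at(hour: int) -> NextFn:
    """Stand-in for "0 <hour> * * *": next occurrence strictly after `now`."""
    def next_after(now: datetime) -> datetime:
        candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
        if candidate <= now:
            candidate += timedelta(days=1)
        return candidate
    return next_after


def weekly_at(weekdays: Set[int], hour: int) -> NextFn:
    """Stand-in for "0 <hour> * * <weekdays>" (Monday is 0); weekdays must be non-empty."""
    daily = daily_at(hour)

    def next_after(now: datetime) -> datetime:
        candidate = daily(now)
        while candidate.weekday() not in weekdays:
            candidate += timedelta(days=1)
        return candidate
    return next_after


def next_run(now: datetime, schedules: Iterable[NextFn]) -> datetime:
    """"Or" semantics: the run fires whenever any one schedule fires."""
    return min(schedule(now) for schedule in schedules)
```

Under this reading, a list equivalent to ["0 3 * * *", "0 0 * * MON,TUE"] fires every day at 03:00 and additionally at midnight on Mondays and Tuesdays.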
I believe it's more convenient exposing it via
We could extend the existing implementation to take a list of cron expressions (and convert a single string to a list of strings internally to align the business logic).
Added.
Exposing this in I would propose we accept any
Merged the MultiCronDataIntervalTimetable into the CronDataIntervalTimetable, which now accepts both a single string and a list of strings. Converted the PR to draft for the moment because I had to change the logic on
@uranusjr Agreed, will add.
@uranusjr Added support for sets (& tuples). Didn't set the type to
@@ -105,7 +105,8 @@

DagStateChangeCallback = Callable[[Context], None]
ScheduleInterval = Union[None, str, timedelta, relativedelta]
MultiCron = Union[List[str], Set[str], Tuple[str, ...]]
This additional type alias doesn’t seem to be necessary?
(Also, this can probably be simply Collection[str].)
if isinstance(interval, str) or (
    isinstance(interval, (list, set, tuple)) and all(isinstance(element, str) for element in interval)
):
- if isinstance(interval, str) or (
-     isinstance(interval, (list, set, tuple)) and all(isinstance(element, str) for element in interval)
- ):
+ if isinstance(interval, str) or (
+     isinstance(interval, Collection) and all(isinstance(element, str) for element in interval)
+ ):
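A small standalone sketch of the suggested check, using `collections.abc.Collection` for the runtime test. The function name `is_cron_like` is hypothetical. Note a few consequences of the broader check that may or may not be wanted: an empty collection passes (because `all()` of nothing is true), a dict with string keys passes, and a generator does not, since it is not a `Collection`.

```python
from collections.abc import Collection
from datetime import timedelta


def is_cron_like(interval: object) -> bool:
    """True for a single string or any sized, iterable container of strings."""
    if isinstance(interval, str):
        return True
    return isinstance(interval, Collection) and all(
        isinstance(element, str) for element in interval
    )


# Quick demonstration of which inputs are accepted.
print(is_cron_like("@daily"))                    # single string
print(is_cron_like(("0 3 * * *", "@daily")))     # tuple of strings
print(is_cron_like(timedelta(days=1)))           # not a collection
```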
orm_dag.schedule_interval = (
    list(dag.schedule_interval)
    if isinstance(dag.schedule_interval, set)
    else dag.schedule_interval
)
Why is it necessary to special-case set?
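One possible answer, offered here as an assumption rather than as the author's stated rationale: if the ORM column is serialized as JSON, a set cannot be written directly, while lists and tuples can. The sketch below only shows the standard-library `json` behavior; `to_json_safe` is a hypothetical helper name.

```python
import json
from typing import Any


def to_json_safe(value: Any) -> Any:
    """Convert sets to lists so json.dumps accepts them; leave other values alone."""
    return list(value) if isinstance(value, (set, frozenset)) else value


def can_dump(value: Any) -> bool:
    """Report whether json.dumps accepts the value as-is."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


print(can_dump(["0 3 * * *", "@daily"]))    # lists serialize
print(can_dump(("0 3 * * *", "@daily")))    # tuples serialize (as JSON arrays)
print(can_dump({"0 3 * * *", "@daily"}))    # sets do not
```

If this is indeed the reason, normalizing to a list earlier would remove the need for the special case at this point.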
def __init__(
    self, crons: Union[str, List[str], Set[str], Tuple[str, ...]], timezone: Union[str, Timezone]
) -> None:
Instead of normalizing here, I wonder if it’s easier to make this only accept List[str] and normalize in DAG.__init__ instead.
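A rough sketch of what such a normalization step could look like, so that the timetable itself only ever sees a list of strings. The helper name `normalize_crons` is hypothetical, and sorting sets for a deterministic order is an assumption of this sketch, not something the PR specifies.

```python
from typing import Collection, List, Union


def normalize_crons(crons: Union[str, Collection[str]]) -> List[str]:
    """Turn a single expression or any collection of expressions into a list."""
    if isinstance(crons, str):
        # A lone string is one expression, not a collection of characters.
        return [crons]
    if isinstance(crons, (set, frozenset)):
        # Sets are unordered; sort so the result is stable across runs (assumption).
        return sorted(crons)
    return list(crons)
```

The caller (for example, the DAG constructor) would apply this once, and everything downstream could be typed as List[str].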
@@ -197,6 +210,9 @@ schedule interval put in place, the logical date is going to indicate the time
at which it marks the start of the data interval, where the DAG run's start
date would then be the logical date + scheduled interval.

.. tip::
    For more information on ``logical date``, see :ref:`data-interval` and :ref:`faq:what-does-execution-date-mean`.
- For more information on ``logical date``, see :ref:`data-interval` and :ref:`faq:what-does-execution-date-mean`.
+ For more information on *logical date*, see :ref:`data-interval` and :ref:`faq:what-does-execution-date-mean`.
nit (this word is not code so let’s not make it look like code)
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
Hi @BasPH !
Hi all,
+1, native option here would be great
Please refrain from posting messages such as “+1” since it does not add to the conversation nor move forward the feature. If you are interested in seeing the functionality, please start a pull request yourself.
This PR extends the behavior of CronDataIntervalTimetable to support passing a collection (list/set/tuple) of strings as the schedule_interval. For example: