From 30b36209d6a466b4d95fd55eb2968d9fc2cf47df Mon Sep 17 00:00:00 2001
From: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Date: Thu, 28 Jan 2021 21:06:50 -0700
Subject: [PATCH 1/2] Fix docs on scheduler latency

Also fix typo in the "promote new flags" output
---
 docs/apache-airflow/faq.rst       | 7 ++-----
 docs/apache-airflow/scheduler.rst | 4 ++--
 docs/build_docs.py                | 2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/docs/apache-airflow/faq.rst b/docs/apache-airflow/faq.rst
index edc24ab8847516..1a5df6d62bda06 100644
--- a/docs/apache-airflow/faq.rst
+++ b/docs/apache-airflow/faq.rst
@@ -205,11 +205,8 @@ This means ``explicit_defaults_for_timestamp`` is disabled in your mysql server
 How to reduce airflow dag scheduling latency in production?
 -----------------------------------------------------------
 
-- ``parsing_processes``: Scheduler will spawn multiple threads in parallel to parse dags.
-  This is controlled by ``parsing_processes`` with default value of 2.
-  User should increase this value to a larger value (e.g numbers of cpus where scheduler runs + 1) in production.
-- If you're using Airflow 1.10.x, consider moving to Airflow 2, which has reduced dag scheduling latency dramatically,
-  and allows for running multiple schedulers.
+Airflow 2 has low DAG scheduling latency out of the box (particularly when compare with Airflow 1.10.x),
+however if you need more throughput you can :ref:`start multiple schedulers`.
 
 Why next_ds or prev_ds might not contain expected values?
 ---------------------------------------------------------
diff --git a/docs/apache-airflow/scheduler.rst b/docs/apache-airflow/scheduler.rst
index 8e047fe9a482ef..54c8f66cb76446 100644
--- a/docs/apache-airflow/scheduler.rst
+++ b/docs/apache-airflow/scheduler.rst
@@ -66,11 +66,11 @@ This only has effect if your DAG has no ``schedule_interval``.
 If you keep default ``allow_trigger_in_future = False`` and try 'external trigger' to run future-dated execution dates,
 the scheduler won't execute it now but the scheduler will execute it in the future once the current date rolls over to the execution date.
 
+.. _scheduler:ha:
+
 Running More Than One Scheduler
 -------------------------------
 
-.. _scheduler:ha:
-
 .. versionadded: 2.0.0
 
 Airflow supports running more than one scheduler concurrently -- both for performance reasons and for
diff --git a/docs/build_docs.py b/docs/build_docs.py
index f0486ebea03d49..1080533c5189b6 100755
--- a/docs/build_docs.py
+++ b/docs/build_docs.py
@@ -75,7 +75,7 @@ def _promote_new_flags():
     print("Still too slow?")
     print()
     print("You can only build one documentation package:")
-    print(" ./breeze build-docs --package-filter ")
+    print(" ./breeze build-docs -- --package-filter ")
     print()
     print("This usually takes from 20 seconds to 2 minutes.")
     print()

From b187242e8be819b71660f0b8ec260454268f3101 Mon Sep 17 00:00:00 2001
From: Kaxil Naik
Date: Fri, 29 Jan 2021 16:44:57 +0000
Subject: [PATCH 2/2] Update docs/apache-airflow/faq.rst

---
 docs/apache-airflow/faq.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/apache-airflow/faq.rst b/docs/apache-airflow/faq.rst
index 1a5df6d62bda06..e5cdfd2050b6ea 100644
--- a/docs/apache-airflow/faq.rst
+++ b/docs/apache-airflow/faq.rst
@@ -205,7 +205,7 @@ This means ``explicit_defaults_for_timestamp`` is disabled in your mysql server
 How to reduce airflow dag scheduling latency in production?
 -----------------------------------------------------------
 
-Airflow 2 has low DAG scheduling latency out of the box (particularly when compare with Airflow 1.10.x),
+Airflow 2 has low DAG scheduling latency out of the box (particularly when compared with Airflow 1.10.x),
 however if you need more throughput you can :ref:`start multiple schedulers`.
 
 Why next_ds or prev_ds might not contain expected values?