[AIRFLOW-XXX] Documents about task_concurrency and pool (#5262)
(cherry picked from commit d04d7bc)
milton0825 authored and kaxil committed Dec 17, 2019
1 parent 9393b69 commit 6c3d7f9
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions docs/faq.rst
@@ -185,12 +185,13 @@ There are many layers of ``airflow run`` commands, meaning it can call itself.
 How can my airflow dag run faster?
 ----------------------------------
 
-There are three variables we could control to improve airflow dag performance:
+There are a few variables we could control to improve airflow dag performance:
 
-- ``parallelism``: This variable controls the number of task instances that the airflow worker can run simultaneously. User could increase the parallelism variable in the ``airflow.cfg``.
-- ``concurrency``: The Airflow scheduler will run no more than ``$concurrency`` task instances for your DAG at any given time. Concurrency is defined in your Airflow DAG. If you do not set the concurrency on your DAG, the scheduler will use the default value from the ``dag_concurrency`` entry in your ``airflow.cfg``.
+- ``parallelism``: This variable controls the number of task instances that runs simultaneously across the whole Airflow cluster. User could increase the parallelism variable in the ``airflow.cfg``.
+- ``concurrency``: The Airflow scheduler will run no more than ``concurrency`` task instances for your DAG at any given time. Concurrency is defined in your Airflow DAG. If you do not set the concurrency on your DAG, the scheduler will use the default value from the ``dag_concurrency`` entry in your ``airflow.cfg``.
+- ``task_concurrency``: This variable controls the number of concurrent running task instances across ``dag_runs`` per task.
 - ``max_active_runs``: the Airflow scheduler will run no more than ``max_active_runs`` DagRuns of your DAG at a given time. If you do not set the ``max_active_runs`` in your DAG, the scheduler will use the default value from the ``max_active_runs_per_dag`` entry in your ``airflow.cfg``.
-
+- ``pool``: This variable controls the number of concurrent running task instances assigned to the pool.
 
 How can we reduce the airflow UI page load time?
 ------------------------------------------------
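
The limits documented in this change (``parallelism``, ``concurrency``, ``task_concurrency``, ``max_active_runs`` and ``pool``) each cap a different scope, so for any single task the tightest one wins. The sketch below is a toy model of that interaction, not Airflow's scheduler code. The function name and its parameters are hypothetical, and it assumes that each DagRun contributes at most one instance of the task, that each running instance occupies one pool slot, and that nothing else is competing for the same limits.

```python
# Toy model only: this is NOT Airflow's scheduler code. It restates the limits
# described in the FAQ entry as a single min() over the caps that apply to one
# task. The function and parameter names are illustrative assumptions.

def max_running_instances_of_task(
    parallelism,            # cluster-wide cap on running task instances
    concurrency,            # per-DAG cap on running task instances
    max_active_runs,        # per-DAG cap on active DagRuns
    task_concurrency=None,  # per-task cap across dag_runs (None = not set)
    pool_slots=None,        # size of the task's pool (None = no pool limit modeled)
):
    """Return an upper bound on simultaneously running instances of one task."""
    caps = [parallelism, concurrency, max_active_runs]
    if task_concurrency is not None:
        caps.append(task_concurrency)
    if pool_slots is not None:
        caps.append(pool_slots)
    # The tightest limit decides how many instances can run at once.
    return min(caps)


# Example values chosen for illustration only.
without_task_limits = max_running_instances_of_task(32, 16, 16)
with_task_limits = max_running_instances_of_task(
    32, 16, 16, task_concurrency=4, pool_slots=8
)
```

Under this model, raising ``parallelism`` alone would not help a task whose ``task_concurrency`` or pool size is the smallest cap, so it is worth checking which limit is actually binding before changing ``airflow.cfg``.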