
[AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. #6627

Merged (3 commits into apache:master on Dec 10, 2019)

Conversation

@ashb (Member) commented Nov 21, 2019

Make sure you have checked all steps below.

Jira

Description

  • Rather than running a fresh python interpreter, which then has to re-load
    all of Airflow and its dependencies, we should use os.fork when it is
    available/suitable, which should speed up task running, especially for
    short-lived tasks.

    I've profiled this and it took the task duration (as measured by the
    duration column in the TI table) from an average of 14.063s down to
    just 0.932s!

    I could make this change deeper and bypass the CLIFactory to go directly to _run_raw_task, but this keeps the change to the minimum needed to work.
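The core of the change, condensed from the `_start_by_fork` implementation visible in the diff later in this thread (`run_task` here is a hypothetical stand-in for parsing and dispatching the CLI arguments):

```python
import os
import signal

import psutil


def run_task(command):
    print("would parse and run:", command)  # hypothetical stand-in for the CLI dispatch


def start_by_fork(command):
    pid = os.fork()
    if pid:
        # Parent: wrap the child in psutil.Process -- the same type the
        # exec-based path returns, so callers don't care which path ran.
        return psutil.Process(pid)

    # Child: restore default signal handlers and start a new process group,
    # so the "watcher" can later signal the whole task subtree at once.
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    os.setpgid(0, 0)

    try:
        run_task(command)
        os._exit(0)  # skip atexit handlers; the parent owns cleanup
    except Exception:
        os._exit(1)
```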

Tests

  • No unit tests added; hopefully the existing tests are good enough. Manual testing shows this working.

Other tests I need to perform:

  • Check if os._exit is right (this doesn't run atexit callbacks) - so I need to check if logging in the subprocess is tidied up properly.
  • Test if this leaves "dangling"/broken DB connections.
  • Check remote log uploading

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
    • If you implement backwards-incompatible changes, please leave a note in Updating.md so we can assign it to an appropriate release

@dimberman (Contributor)

I have tested this locally and it seems to work fine.

@ashb what are the situations where CAN_FORK is false, besides when doing run_as_user?

@dimberman dimberman self-requested a review November 21, 2019 15:16
@ashb (Member, Author) commented Nov 21, 2019

My benchmarking results of this change below.

All my tests were of BashOperator(..., bash_command="true"). This change brings down the average duration (14.062799955s to 0.9318446075s, 15x faster) and the maximum (from 22.927918s to 1.865752s, 12x faster!).
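For context, a minimal sketch of the kind of DAG behind these numbers -- the DAG file itself isn't in this thread, so the imports and 5-minute schedule are inferred from the tables below:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

# 40 trivial tasks, so per-task startup overhead dominates the measured duration.
with DAG("example_dag",
         start_date=datetime(2018, 1, 1),
         schedule_interval="*/5 * * * *") as dag:
    for i in range(1, 41):
        BashOperator(task_id="print_date%d" % i, bash_command="true")
```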

**Without change, 40 dag runs:**

$ psql airflow -c "select dag_id, task_id, execution_date, state, count(*), min(duration) as min_dur, avg(duration) as avg_dur, max(duration) as max_dur, sum(duration) as running_time, max(end_date)-min(start_date) as total_duration, avg(start_date - queued_dttm) as queue_delay from task_instance where state is not null group by grouping sets ((dag_id, task_id, state), (dag_id,execution_date, state), (dag_id, state)) order by 1,2,3"

   dag_id    |   task_id    |     execution_date     |  state  | count |  min_dur  |   avg_dur    |  max_dur  | running_time | total_duration  |   queue_delay
-------------+--------------+------------------------+---------+-------+-----------+--------------+-----------+--------------+-----------------+-----------------
 example_dag | print_date1  |                        | success |    10 | 13.540218 |   14.2083814 | 14.745729 |   142.083814 | 00:00:17.130217 | 00:00:14.927132
 example_dag | print_date10 |                        | success |    10 | 11.480039 |   13.4748527 | 14.403048 |   134.748527 | 00:01:45.278854 | 00:09:42.822216
 example_dag | print_date11 |                        | success |    10 | 10.084806 |   13.2927993 | 14.148362 |   132.927993 | 00:01:43.667161 | 00:09:44.954266
 example_dag | print_date12 |                        | success |    10 |  7.529389 |   13.1063045 | 14.152888 |   131.063045 | 00:01:41.729084 | 00:09:48.473788
 example_dag | print_date13 |                        | success |    10 | 13.352645 |    14.171775 | 15.294458 |    141.71775 | 00:05:34.40845  | 00:07:12.078185
 example_dag | print_date14 |                        | success |    10 | 12.685846 |   14.1814867 |  15.92092 |   141.814867 | 00:05:54.331766 | 00:07:00.183005
 example_dag | print_date15 |                        | success |    10 | 13.261651 |   13.6241582 |   14.4745 |   136.241582 | 00:06:01.605317 | 00:06:46.927206
 example_dag | print_date16 |                        | success |    10 | 13.463467 |   13.9461268 | 14.327437 |   139.461268 | 00:06:03.748068 | 00:06:43.670468
 example_dag | print_date17 |                        | success |    10 | 12.838915 |   13.7431276 | 14.127847 |   137.431276 | 00:06:16.971633 | 00:06:39.727049
 example_dag | print_date18 |                        | success |    10 | 13.478566 |   16.0986632 | 19.783983 |   160.986632 | 00:06:39.378521 | 00:06:20.19122
 example_dag | print_date19 |                        | success |    10 | 13.427385 |   15.7544223 | 22.927918 |   157.544223 | 00:06:40.99252  | 00:06:52.527796
 example_dag | print_date2  |                        | success |    10 | 13.692921 |   14.3700994 | 14.829447 |   143.700994 | 00:00:46.73697  | 00:00:24.922475
 example_dag | print_date20 |                        | success |    10 | 13.749788 |   15.9622433 |  21.03353 |   159.622433 | 00:06:46.79555  | 00:06:47.494261
 example_dag | print_date21 |                        | success |    10 | 13.533584 |   15.3698928 | 20.041363 |   153.698928 | 00:07:01.543776 | 00:06:40.238809
 example_dag | print_date22 |                        | success |    10 | 13.292851 |   14.3784663 | 15.897209 |   143.784663 | 00:07:07.038294 | 00:06:22.919986
 example_dag | print_date23 |                        | success |    10 | 13.571403 |   14.7570801 | 16.540063 |   147.570801 | 00:06:58.41123  | 00:06:17.252477
 example_dag | print_date24 |                        | success |    10 | 13.698992 |   14.5122418 | 17.094243 |   145.122418 | 00:06:51.823565 | 00:06:14.655425
 example_dag | print_date25 |                        | success |    10 | 11.696784 |   13.6606921 | 14.262454 |   136.606921 | 00:07:14.91126  | 00:06:10.941776
 example_dag | print_date26 |                        | success |    10 | 13.115977 |   13.8040275 | 14.581811 |   138.040275 | 00:07:22.865953 | 00:05:55.113162
 example_dag | print_date27 |                        | success |    10 | 13.203122 |   13.8406964 | 14.061522 |   138.406964 | 00:07:21.852219 | 00:05:50.572565
 example_dag | print_date28 |                        | success |    10 | 13.684101 |   14.1242237 | 14.625835 |   141.242237 | 00:07:52.563805 | 00:05:45.37103
 example_dag | print_date29 |                        | success |    10 | 13.392751 |   13.8983641 | 14.356764 |   138.983641 | 00:07:43.298189 | 00:05:33.6447
 example_dag | print_date3  |                        | success |    10 | 11.072175 |   13.7151706 | 14.351104 |   137.151706 | 00:00:47.050922 | 00:00:40.070479
 example_dag | print_date30 |                        | success |    10 | 13.220995 |   13.6871487 | 14.455784 |   136.871487 | 00:08:08.386634 | 00:05:19.734116
 example_dag | print_date31 |                        | success |    10 | 12.529552 |   13.7031299 |  14.13769 |   137.031299 | 00:08:00.575368 | 00:05:07.899519
 example_dag | print_date32 |                        | success |    10 | 13.318386 |   14.0317772 | 14.622018 |   140.317772 | 00:08:23.871494 | 00:05:02.388743
 example_dag | print_date33 |                        | success |    10 | 13.380067 |   14.1291853 | 14.981017 |   141.291853 | 00:08:24.882959 | 00:04:52.273283
 example_dag | print_date34 |                        | success |    10 | 12.821078 |   13.7153179 | 14.443366 |   137.153179 | 00:08:25.688936 | 00:04:45.993094
 example_dag | print_date35 |                        | success |    10 | 12.781473 |    13.671214 | 14.770308 |    136.71214 | 00:08:54.559094 | 00:04:38.23618
 example_dag | print_date36 |                        | success |    10 | 13.293374 |   14.0982102 | 14.837228 |   140.982102 | 00:08:54.508936 | 00:04:27.575895
 example_dag | print_date37 |                        | success |    10 | 13.657723 |   13.9828665 | 14.399264 |   139.828665 | 00:08:44.443426 | 00:04:24.720492
 example_dag | print_date38 |                        | success |    10 | 12.415427 |   13.4307027 | 14.742275 |   134.307027 | 00:09:11.21312  | 00:04:07.207969
 example_dag | print_date39 |                        | success |    10 | 12.977849 |   13.9103038 | 14.494109 |   139.103038 | 00:10:45.931352 | 00:06:18.113138
 example_dag | print_date4  |                        | success |    10 | 13.027779 |   13.9621166 | 14.589971 |   139.621166 | 00:01:59.062323 | 00:09:15.243934
 example_dag | print_date40 |                        | success |    10 | 12.899318 |   13.6532077 |   14.3119 |   136.532077 | 00:05:36.128265 | 00:07:45.713126
 example_dag | print_date5  |                        | success |    10 | 13.364638 |   13.9671966 | 14.606845 |   139.671966 | 00:01:57.868484 | 00:09:15.995714
 example_dag | print_date6  |                        | success |    10 | 13.454817 |   14.1199325 | 14.922081 |   141.199325 | 00:01:50.984789 | 00:09:18.331126
 example_dag | print_date7  |                        | success |    10 | 13.316182 |   14.0112888 | 14.782732 |   140.112888 | 00:01:49.556653 | 00:09:20.466958
 example_dag | print_date8  |                        | success |    10 | 10.542196 |   13.6500996 | 14.425894 |   136.500996 | 00:00:46.269235 | 00:00:15.901673
 example_dag | print_date9  |                        | success |    10 |  3.733766 |   12.7930044 | 14.344923 |   127.930044 | 00:01:36.054977 | 00:09:49.170081
 example_dag |              | 2018-01-01 00:00:00+00 | success |    40 | 12.695903 |  14.51603295 | 20.041363 |   580.641318 | 00:13:04.943902 | 00:04:04.183146
 example_dag |              | 2018-01-01 00:05:00+00 | success |    40 | 11.072175 | 14.490386525 | 19.999951 |   579.615461 | 00:13:04.031953 | 00:04:01.462174
 example_dag |              | 2018-01-01 00:10:00+00 | success |    40 | 10.542196 |  14.67187005 | 22.927918 |   586.874802 | 00:12:53.483644 | 00:04:01.159616
 example_dag |              | 2018-01-01 00:15:00+00 | success |    40 | 11.696784 |   14.1309867 | 18.610719 |   565.239468 | 00:13:09.216955 | 00:04:06.713355
 example_dag |              | 2018-01-01 00:20:00+00 | success |    40 | 12.838915 |  13.97867895 | 19.082492 |   559.147158 | 00:12:32.573585 | 00:06:25.172815
 example_dag |              | 2018-01-01 00:25:00+00 | success |    40 |  3.733766 |   13.2246647 | 14.763435 |   528.986588 | 00:13:11.267078 | 00:07:08.041724
 example_dag |              | 2018-01-01 00:30:00+00 | success |    40 | 12.786672 |  13.91106905 | 14.638537 |   556.442762 | 00:12:38.990129 | 00:06:58.554466
 example_dag |              | 2018-01-01 00:35:00+00 | success |    40 | 12.415427 |  13.84971725 | 14.589971 |    553.98869 | 00:11:49.876577 | 00:06:52.566075
 example_dag |              | 2018-01-01 00:40:00+00 | success |    40 | 12.685846 |  13.92356865 | 14.829447 |   556.942746 | 00:12:01.941327 | 00:08:30.816995
 example_dag |              | 2018-01-01 00:45:00+00 | success |    40 | 13.027779 | 13.931024725 | 14.770308 |   557.240989 | 00:12:05.666341 | 00:08:49.990763
 example_dag |              |                        | success |   400 |  3.733766 | 14.062799955 | 22.927918 |  5625.119982 | 00:13:13.096719 | 00:06:05.866113

**With change, 40 dag runs:**

$ psql airflow -c "select dag_id, task_id, execution_date, state, count(*), min(duration) as min_dur, avg(duration) as avg_dur, max(duration) as max_dur, sum(duration) as running_time, max(end_date)-min(start_date) as total_duration, avg(start_date - queued_dttm) as queue_delay from task_instance where state is not null group by grouping sets ((dag_id, task_id, state), (dag_id,execution_date, state), (dag_id, state)) order by 1,2,3"

   dag_id    |   task_id    |     execution_date     |  state  | count | min_dur  |   avg_dur    | max_dur  | running_time | total_duration  |   queue_delay
-------------+--------------+------------------------+---------+-------+----------+--------------+----------+--------------+-----------------+-----------------
 example_dag | print_date1  |                        | success |    10 | 1.146504 |    1.4268349 | 1.694726 |    14.268349 | 00:00:02.171718 | 00:00:13.903464
 example_dag | print_date10 |                        | success |    10 | 0.593307 |    0.8229632 | 1.042284 |     8.229632 | 00:00:41.72518  | 00:06:17.844413
 example_dag | print_date11 |                        | success |    10 |  0.62847 |    0.8636334 | 1.173789 |     8.636334 | 00:00:40.747711 | 00:06:18.975095
 example_dag | print_date12 |                        | success |    10 | 0.396318 |    0.8446297 | 1.207533 |     8.446297 | 00:00:38.101728 | 00:06:20.102727
 example_dag | print_date13 |                        | success |    10 | 0.845284 |    1.0616951 | 1.417735 |    10.616951 | 00:03:26.884824 | 00:04:13.506178
 example_dag | print_date14 |                        | success |    10 | 0.710254 |    0.9734616 | 1.304478 |     9.734616 | 00:03:30.503875 | 00:04:08.587778
 example_dag | print_date15 |                        | success |    10 | 0.729193 |    0.9978077 | 1.659975 |     9.978077 | 00:03:34.779666 | 00:04:02.680916
 example_dag | print_date16 |                        | success |    10 | 0.736481 |    1.1479496 | 1.660697 |    11.479496 | 00:03:38.17759  | 00:03:58.284443
 example_dag | print_date17 |                        | success |    10 | 0.721791 |     1.089326 | 1.865752 |     10.89326 | 00:03:43.899201 | 00:03:52.045602
 example_dag | print_date18 |                        | success |    10 |  0.82254 |    1.0861933 | 1.602212 |    10.861933 | 00:03:46.563687 | 00:03:45.934179
 example_dag | print_date19 |                        | success |    10 | 0.694369 |    0.9150857 | 1.185493 |     9.150857 | 00:03:52.568222 | 00:04:02.353
 example_dag | print_date2  |                        | success |    10 | 0.806452 |    1.0095171 | 1.230248 |    10.095171 | 00:00:04.82439  | 00:00:20.988461
 example_dag | print_date20 |                        | success |    10 | 0.743303 |    0.8742865 | 0.983655 |     8.742865 | 00:03:53.99485  | 00:03:58.726793
 example_dag | print_date21 |                        | success |    10 | 0.684588 |    0.8367918 | 1.023203 |     8.367918 | 00:03:55.738761 | 00:03:54.74774
 example_dag | print_date22 |                        | success |    10 | 0.589924 |    0.8825712 | 1.029147 |     8.825712 | 00:03:58.548224 | 00:03:50.84068
 example_dag | print_date23 |                        | success |    10 | 0.767636 |    0.8957238 | 0.986097 |     8.957238 | 00:04:04.217391 | 00:03:46.144873
 example_dag | print_date24 |                        | success |    10 |  0.74223 |    0.9251518 | 1.239134 |     9.251518 | 00:04:05.765597 | 00:03:42.429237
 example_dag | print_date25 |                        | success |    10 | 0.594681 |    0.8781399 | 1.121106 |     8.781399 | 00:04:05.490946 | 00:03:39.28868
 example_dag | print_date26 |                        | success |    10 | 0.663151 |    0.8689899 | 1.052992 |     8.689899 | 00:04:09.399829 | 00:03:34.559888
 example_dag | print_date27 |                        | success |    10 | 0.531863 |    0.8011121 | 1.065627 |     8.011121 | 00:04:13.508844 | 00:03:28.977669
 example_dag | print_date28 |                        | success |    10 | 0.623941 |    0.8106821 | 1.027548 |     8.106821 | 00:04:14.959353 | 00:03:25.053474
 example_dag | print_date29 |                        | success |    10 | 0.744605 |    0.8441154 | 0.964904 |     8.441154 | 00:04:11.377597 | 00:03:22.705902
 example_dag | print_date3  |                        | success |    10 | 0.442053 |    0.9375549 | 1.567445 |     9.375549 | 00:00:15.142371 | 00:00:15.495181
 example_dag | print_date30 |                        | success |    10 | 0.675152 |    0.8852712 | 1.107291 |     8.852712 | 00:04:14.102078 | 00:03:18.933734
 example_dag | print_date31 |                        | success |    10 | 0.519751 |     0.843545 | 1.042148 |      8.43545 | 00:04:24.862378 | 00:03:12.841289
 example_dag | print_date32 |                        | success |    10 | 0.483924 |    0.8193908 | 0.998283 |     8.193908 | 00:04:25.487382 | 00:03:08.866348
 example_dag | print_date33 |                        | success |    10 | 0.562085 |    0.9288077 | 1.227881 |     9.288077 | 00:04:23.784358 | 00:03:06.368832
 example_dag | print_date34 |                        | success |    10 | 0.744678 |    1.0033072 | 1.408322 |    10.033072 | 00:04:44.668717 | 00:02:55.888481
 example_dag | print_date35 |                        | success |    10 | 0.652233 |    0.9915692 | 1.152128 |     9.915692 | 00:05:01.192531 | 00:02:48.305163
 example_dag | print_date36 |                        | success |    10 | 0.553713 |     0.933974 | 1.250523 |      9.33974 | 00:05:02.069341 | 00:02:40.600184
 example_dag | print_date37 |                        | success |    10 | 0.688596 |    1.0893281 | 1.532913 |    10.893281 | 00:05:20.416081 | 00:02:29.20499
 example_dag | print_date38 |                        | success |    10 | 0.565303 |    0.9242697 | 1.503941 |     9.242697 | 00:06:15.347112 | 00:04:09.462437
 example_dag | print_date39 |                        | success |    10 | 0.558524 |    0.8705852 | 1.049251 |     8.705852 | 00:03:20.690685 | 00:04:42.223374
 example_dag | print_date4  |                        | success |    10 | 0.581263 |     0.895164 | 1.130376 |      8.95164 | 00:00:41.89009  | 00:05:49.37534
 example_dag | print_date40 |                        | success |    10 |  0.65816 |    0.9504808 | 1.267472 |     9.504808 | 00:03:23.057506 | 00:04:39.000299
 example_dag | print_date5  |                        | success |    10 | 0.615587 |    0.8474517 | 1.051941 |     8.474517 | 00:00:42.64173  | 00:05:50.834835
 example_dag | print_date6  |                        | success |    10 | 0.675339 |    0.9569241 | 1.506232 |     9.569241 | 00:00:42.380933 | 00:05:51.989674
 example_dag | print_date7  |                        | success |    10 |   0.7404 |    0.9408764 | 1.439553 |     9.408764 | 00:00:41.822069 | 00:05:53.02817
 example_dag | print_date8  |                        | success |    10 | 0.514494 |    0.8535814 |  1.28338 |     8.535814 | 00:00:16.014045 | 00:00:12.423858
 example_dag | print_date9  |                        | success |    10 | 0.354399 |    0.7450411 | 0.949153 |     7.450411 | 00:00:35.280334 | 00:06:20.324074
 example_dag |              | 2018-01-01 00:00:00+00 | success |    40 | 0.354399 |  0.903148125 | 1.621628 |    36.125925 | 00:07:43.652252 | 00:02:38.275005
 example_dag |              | 2018-01-01 00:05:00+00 | success |    40 | 0.627125 |  0.967237575 | 1.532913 |    38.689503 | 00:07:31.482163 | 00:02:36.899465
 example_dag |              | 2018-01-01 00:10:00+00 | success |    40 | 0.442053 |   0.94155605 |  1.52581 |    37.662242 | 00:07:09.337594 | 00:02:37.102932
 example_dag |              | 2018-01-01 00:15:00+00 | success |    40 | 0.520419 |  0.896940775 | 1.452449 |    35.877631 | 00:07:34.200548 | 00:02:41.155998
 example_dag |              | 2018-01-01 00:20:00+00 | success |    40 |  0.60063 |  0.878752225 | 1.230248 |    35.150089 | 00:07:14.674988 | 00:03:55.471032
 example_dag |              | 2018-01-01 00:25:00+00 | success |    40 | 0.554367 |  0.981219525 | 1.621297 |    39.248781 | 00:07:12.30197  | 00:04:14.648725
 example_dag |              | 2018-01-01 00:30:00+00 | success |    40 | 0.558524 |    0.9800735 | 1.660697 |     39.20294 | 00:07:18.737654 | 00:04:14.88136
 example_dag |              | 2018-01-01 00:35:00+00 | success |    40 | 0.483924 |    0.9204226 | 1.603143 |    36.816904 | 00:07:25.533756 | 00:04:15.256257
 example_dag |              | 2018-01-01 00:40:00+00 | success |    40 | 0.664791 |     0.977641 | 1.865752 |     39.10564 | 00:07:39.449438 | 00:05:15.616032
 example_dag |              | 2018-01-01 00:45:00+00 | success |    40 | 0.515129 |    0.8714547 | 1.282473 |    34.858188 | 00:07:40.570824 | 00:05:26.655057
 example_dag |              |                        | success |   400 | 0.354399 | 0.9318446075 | 1.865752 |   372.737843 | 00:07:44.01714  | 00:03:47.596186
(51 rows)

@ashb (Member, Author) commented Nov 21, 2019

And the effect on time to complete 10 DAG runs:

Time for 10 dag runs with 40 tasks each: 803.298s (±8.5607s)
vs with change Time for 10 dag runs with 40 tasks each: 473.165s (±13.7191s)

A 1.69x speed-up.

Time for 10 dag runs with 12 tasks each: 240.400s (±3.0390s)
vs with change Time for 10 dag runs with 12 tasks each: 141.476s (±2.2151s)

A 1.7x speed-up.

@ashb (Member, Author) commented Nov 21, 2019

> @ashb what are the situations where CAN_FORK is false, besides when doing run_as_user?

Windows mostly :) Just being defensive.
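For reference, the guard in question, reconstructed from the `-` lines of the diff later in this thread (an excerpt of StandardTaskRunner, not a standalone script):

```python
import os

CAN_FORK = hasattr(os, "fork")  # False on Windows, which has no fork()


def start(self):
    # run_as_user re-execs through sudo as a different user, so it cannot
    # reuse the current interpreter's memory by forking.
    if CAN_FORK and not self.run_as_user:
        self.process = self._start_by_fork()
    else:
        self.process = self._start_by_exec()
```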

Inline review comment from a Contributor on this part of the diff:

-        if self.process and psutil.pid_exists(self.process.pid):
-            reap_process_group(self.process.pid, self.log)
+        if self.process:
+            if self.process.is_running():

can you break this into a function?

@mik-laj (Member) commented Nov 21, 2019

I was thinking about this very thing this week. When I was working on the CLI I saw this problem, and it was on my list.

I think this will be better done using multiprocessing.Process. Is there a reason why you did it this way?

@dimberman (Contributor)

@mik-laj it's because multiprocessing.Process has to re-parse all dependencies/DAGs. It causes a lot of slowdown.

@bolkedebruin (Contributor)

Can you explain the security of running the tasks and the different processes involved?

Afaik it does Executor -> Task -> Rawtask. So with your change it would now do "Executor -> Task -> Rawtask -> New Process"? I.e. it hasn't become part of the executor, I assume (that would be a no-go). Just verifying.

Inline review comment from a Contributor on this part of the diff:

-            reap_process_group(self.process.pid, self.log)
+        if self.process:
+            if self.process.is_running():
+                reap_process_group(self.process.pid, self.log)

NIT: process groups are non-deterministic. We would be better off using cgroups.

@ashb (Member, Author) replied Nov 22, 2019:

This is just so that os.killpg behaves itself, nothing more than that.

We've already got a Cgroup task runner (not that I know what state it is in!)

Errr, this is not on the line I thought it was. Yeah, process groups might not be perfect, but this isn't a change to the existing behaviour.
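For anyone following along, a standalone sketch of the fork/setpgid/killpg mechanics under discussion -- plain POSIX behaviour, not the PR's code:

```python
import os
import signal
import time

pid = os.fork()
if pid == 0:
    os.setpgid(0, 0)  # child: become leader of a new process group
    time.sleep(60)
    os._exit(0)

# Parent sets the child's pgid too, closing the race between fork()
# and the child's own setpgid() call.
try:
    os.setpgid(pid, pid)
except OSError:
    pass  # the child may have already set it, or already exited

os.killpg(pid, signal.SIGTERM)  # pgid == pid: signals the child and any grandchildren
os.waitpid(pid, 0)
```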

@potiuk (Member) commented Nov 21, 2019

But @dimberman -> multiprocessing also uses os.fork() underneath in fork mode (the default on Linux). I have my reservations about using multiprocessing, but mostly because people do not realise that it actually uses fork (and we plan to use it anyway, so no difference).

Using multiprocessing might be a more portable way if we consider running in different environments. Note that in Python 3.8 the default start method for new processes on MacOS is spawn, because forking on MacOS can cause crashes: threads are not safe across fork, and some MacOS system libraries run threads. So using multiprocessing.Process will be slower on MacOS with 3.8, but won't crash.
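A minimal illustration of the start-method selection @potiuk describes (nothing Airflow-specific; note that the "fork" context is unavailable on Windows):

```python
import multiprocessing


def work():
    print("hello from the child")


if __name__ == "__main__":
    # "fork" is the default on Linux (and on macOS before Python 3.8);
    # from 3.8 macOS defaults to "spawn", because forking a process whose
    # system libraries have started threads can crash.
    ctx = multiprocessing.get_context("fork")  # or "spawn" for portability
    p = ctx.Process(target=work)
    p.start()
    p.join()
```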

@bolkedebruin (Contributor)

Aside from this performance enhancement, I think a much bigger gain can be gotten from removing the raw task execution. I.e. the process is now as follows: Executor -> Task -> Rawtask, which doesn't make sense. It should just be Executor -> Task. Ideally the task would then signal its state to the executor rather than setting its own state, which is both non-deterministic (in case the task crashes) and insecure (it requires Airflow DB access by the task, which is available to all operators).

@bolkedebruin (Contributor)

BTW I agree with @potiuk that it's a bit strange that we get this speed-up as both are using fork(). What is the trade-off?

@dimberman (Contributor)

@bolkedebruin having the tasks access the DB is a central part of the k8sexecutor, unless we want to set up some sort of messaging system/message queue.

@dimberman (Contributor)

@bolkedebruin the understanding I have is that when you spawn a totally new process, you are reinitializing the interpreter, re-loading all dependencies, and restarting airflow. Using os.fork directly allows you to keep the same memory state (at least that's what I understood from @ashb).

@potiuk (Member) commented Nov 21, 2019

@dimberman @bolkedebruin: just to clarify -> psutil.Process indeed reloaded everything (it just started a new python interpreter). So changing to fork makes sense - I fully agree. What @mik-laj asked about was using multiprocessing.Process instead of os.fork.

@potiuk (Member) commented Nov 21, 2019

Currently the change manually checks if we "can fork". When using multiprocessing.Process the check is done by the library: it uses "fork" by default on Linux (and on MacOS with Python < 3.8), but "spawn" on Windows and on MacOS with Python >= 3.8. So maybe we can simplify the code a bit here.

@bolkedebruin (Contributor)

@dimberman It's inherently an insecure design to have tasks directly access the DB, and it has been a pain point in Airflow for a long time. The executor is fine, but not tasks.

There are many ways to do this, but what makes the k8s executor so special that tasks executed require access to the DB apart from the current paradigm?

@bolkedebruin (Contributor)

@potiuk Gotcha.

@ashb (Member, Author) commented Nov 22, 2019

> Afaik it does Executor -> Task -> Rawtask. So with your change it would now do "Executor -> Task -> Rawtask -> New Process"? I.e. it hasn't become part of the executor, I assume (that would be a no-go). Just verifying.

Not quite. The existing flow is:

Executor -> exec to spawn new python to run Task "watcher" -> spawn new python to run actual Task

My PR changes this to:

Executor -> exec to spawn new python to run Task "watcher" -> fork to run actual Task

The number of processes in use remains the same -- the only difference is how we create the processes, and whether we have to reload all of python and the airflow modules or not. I am happy that the same semantics and isolation are maintained.

We could also look at merging the "watcher" into the executor -- the main thing the watcher does is set the task to failed if it errors, or kill it if the TaskInstance state is changed externally (i.e. the watcher is responsible for sending a term/kill signal to the task when you clear it in the UI).

And yes, longer term I also want to stop the workers accessing the DB directly.

(Once this is merged/working I plan to fix the Local and Celery executors to tackle the exec-vs-fork question there too.)

I'll try using multiprocessing to do this.

> Note that in Python 3.8 the default start method for new processes on MacOS is spawn, because forking on MacOS can cause crashes: threads are not safe across fork, and some MacOS system libraries run threads.

Sad panda. Interestingly, the bug report seems to say it's been a problem since OSX 10.13, but I haven't noticed a problem on 10.14 with this code.

@ashb (Member, Author) commented Nov 22, 2019

So from a quick look: multiprocessing.Process does not like it when something else reaps the process (which we do via our reap_process_group) -- it thinks it is still running and errors if I attempt to call close() on it.

So multiprocessing.Process is not just a drop-in replacement.
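A standalone sketch of the failure mode, assuming nothing beyond the stdlib: once the child has been reaped outside multiprocessing (as reap_process_group does), the Process object can never observe the exit status:

```python
import multiprocessing
import os


def work():
    pass


if __name__ == "__main__":
    p = multiprocessing.Process(target=work)
    p.start()
    os.waitpid(p.pid, 0)  # something else reaps the child, like reap_process_group does
    print(p.exitcode)     # None -- multiprocessing can no longer collect the status
    try:
        p.close()
    except ValueError as err:
        print(err)        # "Cannot close a process while it is still running..."
```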

@codecov-io commented Nov 25, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@25e9047).
The diff coverage is 69.89%.

@@            Coverage Diff            @@
##             master    #6627   +/-   ##
=========================================
  Coverage          ?   84.28%
=========================================
  Files             ?      672
  Lines             ?    38316
  Branches          ?        0
=========================================
  Hits              ?    32296
  Misses            ?     6020
  Partials          ?        0

Impacted Files                                      Coverage Δ
airflow/cli/commands/task_command.py                68.11% <100%> (ø)
airflow/utils/dag_processing.py                     87.8% <100%> (ø)
airflow/task/task_runner/standard_task_runner.py    66.66% <60.78%> (ø)
airflow/utils/helpers.py                            82.75% <78.12%> (ø)
airflow/utils/log/es_task_handler.py                94.17% <85.71%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 25e9047...1151395.

@ashb (Member, Author) commented Nov 25, 2019

Right, this should now pass its tests (I had problems with the logging causing an infinite loop; I decided the fix for that was to just re-configure the logging in the subprocess).

✅ on my fork: https://travis-ci.com/ashb/airflow/builds/138072561

@ashb (Member, Author) commented Nov 25, 2019

This is what the change for getting it working with multiprocessing looks like.

(Give or take -- this diff is against a different version.) Do we think it's worth doing it like this? I will test the timing, but this should be about as fast as os.fork overall (give or take).

diff --git airflow/task/task_runner/standard_task_runner.py airflow/task/task_runner/standard_task_runner.py
index 87b046377..9b0ef162a 100644
--- airflow/task/task_runner/standard_task_runner.py
+++ airflow/task/task_runner/standard_task_runner.py
@@ -19,11 +19,13 @@
 
 import os
 
+import multiprocessing
 import psutil
 from setproctitle import setproctitle
 
+from airflow.configuration import conf
 from airflow.task.task_runner.base_task_runner import BaseTaskRunner
-from airflow.utils.helpers import reap_process_group
+from airflow.utils.helpers import reap_process_group, DEFAULT_TIME_TO_WAIT_AFTER_SIGTERM
 
 CAN_FORK = hasattr(os, 'fork')
 
@@ -34,67 +36,106 @@ class StandardTaskRunner(BaseTaskRunner):
     """
     def __init__(self, local_task_job):
         super().__init__(local_task_job)
+        self.mp_context = multiprocessing.get_context(
+            conf.get('core', 'executor_multiprocess_start_method', fallback=None)
+        )
         self._rc = None
+        self._mp_process = None
 
     def start(self):
-        if CAN_FORK and not self.run_as_user:
-            self.process = self._start_by_fork()
-        else:
+        if self.run_as_user:
             self.process = self._start_by_exec()
+        else:
+            self.process = self._start_by_subprocess()
 
     def _start_by_exec(self):
         subprocess = self.run_command()
         return psutil.Process(subprocess.pid)
 
-    def _start_by_fork(self):
-        pid = os.fork()
-        if pid:
-            self.log.info("Started process %d to run task", pid)
-            return psutil.Process(pid)
-        else:
-            from airflow.bin.cli import get_parser
-            import signal
-            import airflow.settings as settings
-
-            signal.signal(signal.SIGINT, signal.SIG_DFL)
-            signal.signal(signal.SIGTERM, signal.SIG_DFL)
-            # Start a new process group
-            os.setpgid(0, 0)
-
-            # Force a new SQLAlchemy session. We can't share open DB handles
-            # between process. The cli code will re-create this as part of its
-            # normal startup
-            settings.engine.pool.dispose()
-            settings.engine.dispose()
+    def _start_by_subprocess(self):
+        self._mp_process = self.mp_context.Process(
+            target=type(self)._run_in_subprocess,
+            args=(self._command,),
+            daemon=False,
+        )
+        self._mp_process.start()
+
+        self.log.info("Started process %d to run task", self._mp_process.pid)
+        # Return a psutil.process so that we have the same type when doing run_as_user or this
+        return psutil.Process(self._mp_process.pid)
+
+    @classmethod
+    def _run_in_subprocess(cls, command):
+        from airflow.bin.cli import get_parser
+        import signal
+        import airflow.settings as settings
+
+        signal.signal(signal.SIGINT, signal.SIG_DFL)
+        signal.signal(signal.SIGTERM, signal.SIG_DFL)
+        # Start a new process group
+        os.setpgid(0, 0)
+
+        # Force a new SQLAlchemy session. We can't share open DB handles
+        # between process. The cli code will re-create this as part of its
+        # normal startup
+        settings.engine.pool.dispose()
+        settings.engine.dispose()
+
+        parser = get_parser()
+        args = parser.parse_args(command[1:])
+
+        proc_title = "airflow task runner: {0.dag_id} {0.task_id} {0.execution_date}"
+        if hasattr(args, "job_id"):
+            proc_title += " {0.job_id}"
+        setproctitle(proc_title.format(args))
 
-            parser = get_parser()
-            args = parser.parse_args(self._command[1:])
-
-            proc_title = "airflow task runner: {0.dag_id} {0.task_id} {0.execution_date}"
-            if hasattr(args, "job_id"):
-                proc_title += " {0.job_id}"
-            setproctitle(proc_title.format(args))
-
-            try:
-                args.func(args)
-                os._exit(0)
-            except Exception:
-                os._exit(1)
+        try:
+            args.func(args)
+            os._exit(0)
+        except Exception:
+            os._exit(1)
 
     def return_code(self, timeout=0):
         # We call this multiple times, but we can only wait on the process once
-        if self._rc is not None or not self.process:
+        if self._rc is not None:
             return self._rc
 
-        try:
-            self._rc = self.process.wait(timeout=timeout)
-            self.process = None
-        except psutil.TimeoutExpired:
-            pass
+        if self._mp_process:
+            self._mp_process.join(timeout)
+            self._rc = self._mp_process.exitcode
+
+            if self._rc is not None:
+                self._mp_process.close()
+                self._mp_process = None
+                self.process = None
+        elif self.process:
+            try:
+                self._rc = self.process.wait(timeout=timeout)
+                self.process = None
+            except psutil.TimeoutExpired:
+                pass
 
         return self._rc
 
     def terminate(self):
+        if self._mp_process:
+            pgid = os.getpgid(self._mp_process.pid)
+
+            self._mp_process.terminate()
+            self._rc = self._mp_process.exitcode
+            if self._rc is None:
+                self._mp_process.join(DEFAULT_TIME_TO_WAIT_AFTER_SIGTERM)
+                self._mp_process.kill()
+                self._mp_process.join(0)
+                self._rc = self._mp_process.exitcode
+
+            self._mp_process.close()
+            self.process = None
+            self._mp_process = None
+
+            # Make sure we kill and reap any sub-processes that are left!
+            reap_process_group(pgid, self.log)
+
         if self.process:
             if self.process.is_running():
                 rcs = reap_process_group(self.process.pid, self.log)

@kaxil (Member) left a review: Looks Good.

@ashb ashb force-pushed the task-performance-only-exec-when-needed branch 2 times, most recently from 0d74f81 to 9a9b246 on November 26, 2019 12:12
@ashb ashb force-pushed the task-performance-only-exec-when-needed branch from 2c95de6 to 3908bfc on December 6, 2019 15:54
Commit messages on the rebased branch (the full messages appear in the referenced-commit entry below):

  • [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
  • Don't re-initialize JSON/stdout logging ElasticSearch inside forked processes
@ashb ashb force-pushed the task-performance-only-exec-when-needed branch from 3908bfc to 1151395 on December 9, 2019 15:10
@ashb ashb merged commit f69aa14 into apache:master Dec 10, 2019
@ashb ashb deleted the task-performance-only-exec-when-needed branch December 10, 2019 10:47
kaxil pushed a commit that referenced this pull request on Dec 17, 2019:
[AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. (#6627)

* [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

  Rather than running a fresh python interpreter which then has to re-load
  all of Airflow and its dependencies we should use os.fork when it is
  available/suitable, which should speed up task running, especially for
  short-lived tasks.

  I've profiled this and it took the task duration (as measured by the
  `duration` column in the TI table) from an average of 14.063s down to
  just 0.932s!

* Allow `reap_process_group` to kill processes even when the "group
  leader" has already exited.

* Don't re-initialize JSON/stdout logging ElasticSearch inside forked processes

  Most of the time we will run the "raw" task in a forked subprocess (the
  only time we don't is when we use impersonation) that will have the
  logging already configured. So if the EsTaskHandler has already been
  configured we don't want to "re"configure it -- otherwise it will
  disable JSON output for the actual task!
kaxil pushed a commit that referenced this pull request on Dec 18, 2019 (same commit message as above).
ashb added a commit that referenced this pull request on Dec 19, 2019 (same commit message as above).
ashb added a commit to astronomer/airflow that referenced this pull request on Jan 11, 2020 (same commit message as above; cherry picked from commit f69aa14).
galuszkak pushed a commit to FlyrInc/apache-airflow that referenced this pull request on Mar 5, 2020 (same commit message as above).
@pkqs90 commented Mar 10, 2020

Hi guys, I hit some logging issues when upgrading to 1.10.7: subprocess logs can't be found in the airflow webserver nor in the local task logs anymore. After digging around, I found that this PR is related. The old way of spawning a process has a daemon thread that handles the log streams:
https://github.com/apache/airflow/blob/master/airflow/task/task_runner/base_task_runner.py#L134

After I changed the code to always create a subprocess without forking, the logs came right back. Can someone take a look? Thanks in advance.

Jira ticket: https://issues.apache.org/jira/browse/AIRFLOW-7030

@potiuk (Member) commented Mar 11, 2020

@pkqs90 -> maybe you can add a PR fixing it?

@ewjmulder commented Mar 11, 2020

@pkqs90 We experienced a similar issue after upgrading from 1.10.3 to 1.10.9: no more task logging visible in the UI. In our case it was specifically for the KubernetesPodOperator, but not for the BashOperator for instance. We fixed it by adding a logger for the airflow.contrib package (in the main airflow logging configuration file), since the KubernetesPodOperator lives there. So if the operator you're missing logs from is also from the contrib package, you can try this fix to see if it re-appears.

My guess is that the combination of forking instead of starting a whole new process, plus having no specific logger for the package, triggered this particular issue. Hope it solves it for you too.
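A sketch of what that fix might look like, assuming the stock DEFAULT_LOGGING_CONFIG dictConfig layout and the 1.10-era "task" handler name; adjust names to your own log config:

```python
# e.g. in airflow_local_settings.py (or wherever your custom log config lives)
from copy import deepcopy

from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)
LOGGING_CONFIG["loggers"]["airflow.contrib"] = {
    "handlers": ["task"],  # route contrib operator logs to the task log handler
    "level": "INFO",
    "propagate": False,
}
```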

@davido912 (Contributor)

@ewjmulder could you please elaborate a bit more on what you actually added to the log_config file?

@ewjmulder

@davido912 See further discussion in #8484

@mik-laj (Member) commented Jun 7, 2020

If anyone has problems with this change, they can use this snippet to disable it (the sed rewrites the installed module so that CAN_FORK is always False, forcing the old exec path).
MacOS

sed -i '' "s/CAN_FORK = /CAN_FORK = False # /" $(python -c "from airflow.task.task_runner import standard_task_runner; print(standard_task_runner.__file__)")

Debian

sed -i "s/CAN_FORK = /CAN_FORK = False # /" $(python -c "from airflow.task.task_runner import standard_task_runner; print(standard_task_runner.__file__)")
