[AIRFLOW-6529] Pickle error occurs when the scheduler tries to run on macOS. #7128
Conversation
I've tried to add a new test case for this issue but found it's difficult because …

tests/utils/test_dag_processing.py (outdated):

```diff
@@ -168,6 +169,7 @@ def test_set_file_paths_when_processor_file_path_is_in_new_file_paths(self):
     processor_factory=MagicMock().return_value,
     processor_timeout=timedelta.max,
     signal_conn=MagicMock(),
+    pickle_dags=True,
```
We should use False in all these tests, which is I think the default?
OK, I'll replace those changes with False.
…lt value of pickle_dags to be passed.
Force-pushed from dc36777 to 15c0bc5.
Force-pushed from 15c0bc5 to 0cee214.
…e where it's defined is changed.
Codecov Report

```
@@            Coverage Diff             @@
##           master    #7128      +/-   ##
==========================================
- Coverage   85.40%   85.08%   -0.33%
==========================================
  Files         710      723      +13
  Lines       39485    39564      +79
==========================================
- Hits        33724    33663      -61
- Misses       5761     5901     +140
```

Continue to review the full report at Codecov.
I found another multiprocessing issue with spawn mode (actually, non-fork mode).
Yet another reason why we should test it automatically, I think.
I think we need just one test to run (and the current way will not work as you think) - see the details in the review.
BTW, it would be great if you rebased, to make sure that we only see the changes from your commit @sarutak.
@potiuk All right. I'll do it.
airflow/utils/dag_processing.py (outdated):

```python
for section in inherited_conf:
    for key, value in inherited_conf[section].items():
        if value not in conf:
            conf.set(section, key, value.replace("%", "%%"))
```
This looks suspect. What's going on here?
I also don't see anywhere that we actually pass a value here?
The spawn method doesn't inherit the parent's environment except for a few things, so conf is not inherited by the child process either. I noticed that test_reload_module in test_dag_processing.py fails because the configured DAG_PROCESSOR_MANAGER_LOG_LOCATION is not inherited.
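To illustrate why the copied values are escaped in the snippet above, here is a minimal, standalone sketch of the same idea: re-applying a parent-process snapshot of settings (`inherited_conf`, mirroring the PR's variable name) into a fresh ConfigParser in the spawned child. Percent signs must be doubled because ConfigParser treats `%` as its interpolation character; the section and option names are made up for the example.

```python
import configparser

# Hypothetical snapshot captured in the parent before spawning a child.
# With the "spawn" start method the child does not inherit this state,
# so it has to be re-applied explicitly.
inherited_conf = {
    "logging": {"log_format": "%(asctime)s %(message)s"},
}

conf = configparser.ConfigParser()
for section, options in inherited_conf.items():
    if not conf.has_section(section):
        conf.add_section(section)
    for key, value in options.items():
        # Double "%" so ConfigParser's interpolation does not choke on
        # raw percent signs in the stored value.
        conf.set(section, key, value.replace("%", "%%"))

# Reading the value back undoes the escaping.
print(conf.get("logging", "log_format"))
```

Without the `replace("%", "%%")` step, `conf.get` would raise an interpolation error on values such as log format strings.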
Sorry, not DAG_PROCESSOR_MANAGER_LOG_LOCATION but logging_config_class.
@ashb If only test_reload_module needs to dynamically change the configuration and propagate it to child processes, I'll remove inherited_conf and its related code. What do you think?
airflow/jobs/scheduler_job.py (outdated):

```python
if value not in conf:
    conf.set(section, key, value.replace("%", "%%"))

setproctitle("airflow scheduler - DagFileProcessor {}".format(file_path))
```
When is inherited_conf used -- I can't find this one used anywhere either.
We want to set the process title always.
Ah, this is just a mis-indentation. I'll fix it.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
```python
        return self._result


def fake_dag_file_processor_factory(file_path, zombies, dag_ids, pickle_dags):
```
For clarity, it might be worth keeping the argument order consistent. So the preceding line would be:
def fake_dag_file_processor_factory(file_path, pickle_dags, dag_ids, zombies):
```python
if conf.has_option('core', 'mp_start_method'):
    mp_start_method = conf.get('core', 'mp_start_method')
else:
    mp_start_method = mp.get_start_method()

possible_value_list = mp.get_all_start_methods()
if mp_start_method not in possible_value_list:
    raise AirflowConfigException(
        "mp_start_method should not be " + mp_start_method +
        ". Possible value is one of " + str(possible_value_list))
cxt = mp.get_context(mp_start_method)

self._parent_signal_conn, child_signal_conn = cxt.Pipe()
self._process = cxt.Process(
```
It looks like this block of code is identical to https://github.com/apache/airflow/pull/7128/files#diff-c35269bcfbbe386e269ffa7487e86192R171-R184. Given that they would probably need to change at the same time if either ever changed, this might be a good case for a utility mixin. Lines 343-352 could go into a MultiProcessingConfigMixin class or similar.
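A rough sketch of what the suggested mixin could look like. This is only an illustration of the reviewer's idea, not code from the PR: the class and method names are hypothetical, and ValueError stands in for Airflow's AirflowConfigException so the snippet is self-contained.

```python
import multiprocessing as mp


class MultiProcessingConfigMixin:
    """Hypothetical mixin factoring out the duplicated start-method logic."""

    def _get_multiprocessing_context(self, configured_method=None):
        # `configured_method` stands in for conf.get('core', 'mp_start_method');
        # when unset, fall back to the platform default (fork on Linux,
        # spawn on macOS as of Python 3.8 and on Windows).
        start_method = configured_method or mp.get_start_method()

        allowed = mp.get_all_start_methods()
        if start_method not in allowed:
            # The PR raises AirflowConfigException here instead.
            raise ValueError(
                "mp_start_method should not be {}. "
                "Possible value is one of {}".format(start_method, allowed))
        return mp.get_context(start_method)
```

Both call sites could then do `cxt = self._get_multiprocessing_context(...)` followed by `cxt.Pipe()` / `cxt.Process(...)`, so a future change to the selection logic happens in one place.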
```python
# as its processing result w/o actually parsing anything.
def __init__(self, file_path, pickle_dags, dag_id_white_list, zombies):
    super().__init__(file_path, pickle_dags, dag_id_white_list, zombies)
    self._result = zombies, 0
```
Might be worth defining the zero to provide some context for what it represents.
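One way to apply that suggestion, sketched standalone. The diff doesn't say what the zero represents, so the constant name below is a placeholder chosen only to show the shape of the change, and the class is reduced to its relevant lines without the real base class:

```python
# Hypothetical named constant; the actual meaning of the 0 in the PR's
# `self._result = zombies, 0` would determine the real name.
FAKE_PROCESSING_COUNT = 0


class FakeDagFileProcessor:
    """Trimmed stand-in for the PR's fake processor test helper."""

    def __init__(self, file_path, pickle_dags, dag_id_white_list, zombies):
        self.file_path = file_path
        # Returns the zombies it was given, plus the named count, as its
        # processing result without actually parsing anything.
        self._result = zombies, FAKE_PROCESSING_COUNT
```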
@jhtimmins have you tried running with this PR? Is it closer to py3.8 support with it? It looks like someone will need to resolve the conflicts too.
When we try to run the scheduler on macOS, we get a serialization error like the following.
The reason is that the scheduler tries to run subprocesses via multiprocessing with the spawn start method, and that method pickles objects in order to send them to the child process. In this case, the processor_factory inner method is what gets pickled. Note that as of Python 3.8, spawn is the default start method on macOS.
The solution I propose is to pull the method out of its enclosing method.
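The failure mode can be reproduced in a few lines, independent of Airflow: functions defined inside another function cannot be pickled by the standard pickle module, which is exactly what spawn relies on to ship the target callable to the child. The function names below are illustrative.

```python
import pickle


def make_processor_factory():
    # An inner (local) function: pickle cannot serialize it because it
    # is not reachable as a module-level attribute.
    def processor_factory(file_path):
        return file_path
    return processor_factory


# A module-level function, by contrast, pickles fine (by reference).
def module_level_factory(file_path):
    return file_path


factory = make_processor_factory()
try:
    pickle.dumps(factory)
    inner_picklable = True
except (pickle.PicklingError, AttributeError):
    # CPython reports "Can't pickle local object
    # 'make_processor_factory.<locals>.processor_factory'".
    inner_picklable = False

print(inner_picklable)  # the inner function is not picklable
```

Moving the factory to module level, as this PR does, is what makes it picklable and therefore usable under spawn.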
Issue link: AIRFLOW-6529
- Commit message starts with [AIRFLOW-NNNN], where AIRFLOW-NNNN = JIRA ID. For document-only changes the commit message can start with [AIRFLOW-XXXX].
- In case of fundamental code change, an Airflow Improvement Proposal (AIP) is needed.
- In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
- In case of backwards incompatible changes, please leave a note in UPDATING.md.
- Read the Pull Request Guidelines for more information.