-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Sort __globals__ dict to make pickle string more deterministic #418
Changes from all commits
8d63a2e
b9d2bdc
ca7b0e8
1491be9
b92c296
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,6 +24,7 @@ | |
import enum | ||
import typing | ||
from functools import wraps | ||
import time | ||
|
||
import pytest | ||
|
||
|
@@ -49,6 +50,7 @@ | |
from cloudpickle.cloudpickle import _extract_class_dict, _whichmodule | ||
from cloudpickle.cloudpickle import _lookup_module_and_qualname | ||
|
||
from .testutils import subprocess_pickle_string | ||
from .testutils import subprocess_pickle_echo | ||
from .testutils import assert_run_python_script | ||
from .testutils import subprocess_worker | ||
|
@@ -57,6 +59,9 @@ | |
|
||
|
||
_TEST_GLOBAL_VARIABLE = "default_value" | ||
_TEST_GLOBAL_VARIABLE2 = "another_value" | ||
|
||
exec("def _TEST_BIG_GLOBAL_SPACE():\n return %s" % ", ".join([f"a{i}" for i in range(1000)])) | ||
|
||
|
||
class RaiserOnPickle(object): | ||
|
@@ -2321,6 +2326,32 @@ def __type__(self): | |
o = MyClass() | ||
pickle_depickle(o, protocol=self.protocol) | ||
|
||
@pytest.mark.skipif( | ||
sys.version_info < (3, 6, 0), | ||
reason="Dict determinism is a lost cause in Python < 3.6") | ||
def test_sorted_globals(self): | ||
vals = set() | ||
|
||
def func_with_globals(): | ||
return _TEST_GLOBAL_VARIABLE + _TEST_GLOBAL_VARIABLE2 | ||
|
||
for i in range(5): | ||
vals.add( | ||
subprocess_pickle_string(func_with_globals, | ||
protocol=self.protocol, | ||
add_env={"PYTHONHASHSEED": str(i)})) | ||
assert len(vals) == 1 | ||
|
||
def test_efficient_sorted_globals(self): | ||
# Non regression test to demonstrate that large numbers of globals | ||
# do not cause slowdown | ||
gvars = set(f"a{i}" for i in range(1000)) | ||
assert cloudpickle.cloudpickle._extract_code_globals( | ||
_TEST_BIG_GLOBAL_SPACE.__code__) == gvars | ||
tic = time.time() | ||
subprocess_pickle_string(_TEST_BIG_GLOBAL_SPACE, protocol=self.protocol) | ||
assert time.time() - tic < 0.5 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would rather not put tests that depend on the runtime because the CI workers can overloaded from time to time and those tests would fail at random. Instead just run a one-off benchmark and post the results in the comments of this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No need to address this if we chose to go for #428 (which I think we should ;). |
||
|
||
|
||
class Protocol2CloudPickleTest(CloudPickleTest): | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With 3.5 EOL, it does not seem worth putting any effort into this. Downstream tools that care about determinism have likely been 3.6+ anyway.