Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ordering check for ImmutableSet creation #18

Open
ConstantineLignos opened this issue Jan 24, 2019 · 1 comment
Open

Improve ordering check for ImmutableSet creation #18

ConstantineLignos opened this issue Jan 24, 2019 · 1 comment

Comments

@ConstantineLignos
Copy link
Contributor

We currently do the following check for _require_ordered_input:

if (
            self._require_ordered_input
            and not (isinstance(items, Sequence) or isinstance(items, ImmutableSet))
            and not self._order_key
        ):
            raise ValueError(
                "Builder has require_ordered_input on, but provided collection "
                "is neither a sequence or another ImmutableSet.  A common cause "
                "of this is initializing an ImmutableSet from a set literal; "
                "prefer to initialize from a list instead to help preserve "
                "determinism."
            )

In #12, @rgabbard and I discussed some of the challenges in determining whether the input is ordered. We decided to change the flag to disable_order_check, and have the order check look for specific bad classes as sources. Specifically:

  1. A set that is not an ImmutableSet
  2. Dictionaries or dictionary views that come from unordered dictionaries.

Regarding 2, @rgabbard said before:

Note in particular that things like itertools.chain don't return generators, which blocks us from whitelisting Generator to cover all these cases. So we decided to switch to a blacklisting approach where we block AbstractSet-derived classes which are not ImmutableSet and also KeysView, ValuesView, and ItemsView coming from built-in dicts on Python versions/implementations not guaranteeing deterministic iteration ordering. This should hit most of the common cases. We will allow disabling this check via a disable_order_check flag.

For KeysView, etc., we propose a hack: at module initialization we will make a dummy dict and cache the class of the returned views. We will then use these in the checks. These will be deterministic for Python 3.7+ and on Python 3.6 with the CPython implementation. This may be a little brittle, but since you can always override it we don't think it is too big of a problem.

@rgabbard Anything you want to add?

@gabbard
Copy link

gabbard commented Jan 24, 2019

Nope, this looks good. Thanks for splitting this out to its own issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants