New check dict-init-mutate #7794

Merged
merged 7 commits into from
Nov 23, 2022

clavedeluna
Collaborator

Type of Changes

Type

| ✓ | ✨ New feature |

Description

Taking over #5765

Add a new check that detects a dict being mutated immediately after it is created.

Closes #2876
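
For readers skimming the thread, here is a minimal sketch of the pattern in question. The variable names and values are borrowed from the documentation examples further down the review; the exact contents of the "bad" example are an assumption.

```python
# Pattern the new check is meant to flag: a dict literal followed
# immediately by item assignments on the same name.
fruit_prices = {}
fruit_prices["apple"] = 1
fruit_prices["banana"] = 10

# Equivalent single-statement initialization that the check nudges towards.
fruit_prices_literal = {"apple": 1, "banana": 10}
```

Both forms build the same dictionary; the second simply states all known key/values up front.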

@clavedeluna
Collaborator Author

@nickdrozd I apologize for not being able to retain your commits as they were. I tried lots of things but got enough permission errors when trying to push directly to the original PR that I gave up.

@coveralls

coveralls commented Nov 18, 2022

Pull Request Test Coverage Report for Build 3533781397

  • 29 of 30 (96.67%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.002%) to 95.427%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pylint/extensions/dict_init_mutate.py | 29 | 30 | 96.67% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 3533578986: | 0.002% |
| Covered Lines: | 17550 |
| Relevant Lines: | 18391 |

💛 - Coveralls

@clavedeluna
Collaborator Author

Primer output looks really good. In fact, this suggestion is just what we would've wanted to suggest.

Dict-Init-Mutate checker Messages
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:dict-init-mutate (W3301): *Dictionary mutated immediately after initialization*
Dictionaries can be initialized with a single statementusing dictionary
Collaborator

Suggested change
Dictionaries can be initialized with a single statementusing dictionary
Dictionaries can be initialized with a single statement using dictionary

@nickdrozd
Collaborator

This looks great! Thanks for picking it up. The primer results are exactly right.

I noticed that there is a typo in the PR title ("mudate"). I didn't see it anywhere else, but you should double check just to be sure 😄

@clavedeluna clavedeluna changed the title New check dict-init-mudate New check dict-init-mutate Nov 18, 2022

Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

The change looks pretty refined already, let's discuss how we can make the name / message clearer/better.

    name = "dict-init-mutate"
    msgs = {
        "W3301": (
            "Dictionary mutated immediately after initialization",
Member

I think this message could be more explicit about the solution and the problem, but I'm not sure. We added a function to print big suggestions of dict recently, so maybe it's possible to do something like

Suggested change
"Dictionary mutated immediately after initialization",
"Declare all known values directly in the constructor with '%s'",

I.e. the result would be:

Declare all known values directly in the constructor with ' {"apple": 1, "banana": 10}'
Declare all known values directly in the constructor with ' {"apple": 1, "melon" : 4, ... "banana": 10}'

Collaborator Author

I like the updated language, but the suggestion with %s may be tricky because it may be multiple lines after. I'll see what I can do...

class DictInitMutateChecker(BaseChecker):
    name = "dict-init-mutate"
    msgs = {
        "W3301": (
Member

Throughout pylint we have some warning inflation where things that should be Convention(C) are Warning(W). This seems like something that could be demoted to convention. What do you think?

Collaborator Author

yeah I agree!

Comment on lines 20 to 30
def stringify_dict_items(node: nodes.NodeNG) -> str:
    try:
        key = node.targets[0].slice.value
        value = node.value.value
    except AttributeError:
        # More complex cases will be handled later on
        return ""

    if isinstance(value, str):
        value = f"'{value}'"
    return f"'{key}': {value}"
Member

Suggested change
def stringify_dict_items(node: nodes.NodeNG) -> str:
    try:
        key = node.targets[0].slice.value
        value = node.value.value
    except AttributeError:
        # More complex cases will be handled later on
        return ""
    if isinstance(value, str):
        value = f"'{value}'"
    return f"'{key}': {value}"

We can use node.as_string(). Check RefactoringChecker._dict_literal_suggestion and the result in use-dict-literal functional tests. I think it would be better to move this function to pylint.util and reuse it, it's pretty polished already. The only thing it could do better is to take care of really long terminating value when the last element is close to 60 chars.
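
To illustrate the idea of rendering nodes back to source instead of reaching into `.value`, here is a rough sketch. It uses the standard library's `ast.unparse` (Python 3.9+) as a stand-in for astroid's `as_string()`, so it is not the Pylint implementation, and `suggest_dict_literal` is a hypothetical helper name. It assumes the first statement is a plain `name = {...}` literal without `**` unpacking.

```python
import ast


def suggest_dict_literal(source: str) -> str:
    """Fold `name = {...}` plus following `name[key] = value` lines into one literal."""
    statements = ast.parse(source).body
    first = statements[0]
    # Assumption: the first statement is `name = {...}` with ordinary keys.
    name = first.targets[0].id
    items = [
        f"{ast.unparse(key)}: {ast.unparse(value)}"
        for key, value in zip(first.value.keys, first.value.values)
    ]
    for stmt in statements[1:]:
        target = stmt.targets[0] if isinstance(stmt, ast.Assign) else None
        if not (
            isinstance(target, ast.Subscript)
            and isinstance(target.value, ast.Name)
            and target.value.id == name
        ):
            break  # stop at the first statement that is not `name[...] = ...`
        # Rendering the nodes back to source handles any key or value
        # expression, not only constants that expose a `.value` attribute.
        items.append(f"{ast.unparse(target.slice)}: {ast.unparse(stmt.value)}")
    return f"{name} = {{{', '.join(items)}}}"


example = 'fruit_prices = {}\nfruit_prices["apple"] = 1\nfruit_prices["banana"] = 10\n'
print(suggest_dict_literal(example))
```

Because the keys and values are unparsed rather than inspected, non-constant expressions no longer need a special `AttributeError` fallback.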

if not isinstance(sibling_name, nodes.Name):
    return [node] + all_siblings

return [node] + self._get_dict_mutating_siblings(sibling, all_siblings)
Collaborator

Much as I love recursion, this will crash Pylint if the dictionary is big enough:

config = {}
config[0] = None
config[1] = None
...  # 2 ... 981
config[982] = None
config[983] = None

Hopefully nobody would write something like that by hand! But it could be an issue with generated files.

Maybe it would be best to limit the suggested replacement to checking only a few items (say, three). Then if there are more than that, add an ellipsis: Declare all known key/values when initializing the dictionary with 'config = {'dir': 'bin', 'user': 'me', 'workers': 5, ...}'

This could be done in a later PR.
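
A possible shape for that follow-up, sketched with the standard library `ast` module rather than astroid: collect the mutating siblings with a loop instead of recursion, and cap the number of items shown. The helper names and the `MAX_SHOWN_ITEMS` constant are hypothetical.

```python
import ast

MAX_SHOWN_ITEMS = 3  # hypothetical cut-off, taken from the "say, three" above


def mutating_siblings(statements: list[ast.stmt], name: str) -> list[ast.Assign]:
    """Collect consecutive `name[...] = ...` statements with a loop, not recursion."""
    found = []
    for stmt in statements:
        if not isinstance(stmt, ast.Assign):
            break
        target = stmt.targets[0]
        if not (
            isinstance(target, ast.Subscript)
            and isinstance(target.value, ast.Name)
            and target.value.id == name
        ):
            break
        found.append(stmt)
    return found


def truncated_suggestion(name: str, siblings: list[ast.Assign]) -> str:
    """Render at most MAX_SHOWN_ITEMS items, then an ellipsis."""
    items = [
        f"{ast.unparse(s.targets[0].slice)}: {ast.unparse(s.value)}"
        for s in siblings[:MAX_SHOWN_ITEMS]
    ]
    if len(siblings) > MAX_SHOWN_ITEMS:
        items.append("...")
    return f"{name} = {{{', '.join(items)}}}"


# A generated file with far more assignments than the default recursion limit.
source = "config = {}\n" + "".join(f"config[{i}] = None\n" for i in range(5000))
body = ast.parse(source).body
siblings = mutating_siblings(body[1:], "config")
print(len(siblings))
print(truncated_suggestion("config", siblings))
```

The loop does constant work per statement and uses no call stack, so the size of the file no longer matters.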

Collaborator Author

@clavedeluna clavedeluna Nov 21, 2022

Very good point. I'm actually not loving the additional suggestion. I would vote to simply say Declare all known key/values when initializing the dictionary and not include the help for constructing the dictionary. It adds code that is difficult to read and maintain and doesn't add THAT much value. @Pierre-Sassoulas what do you think?

With the example shown in another comment:
pylint/extensions/bad_builtin.py:21:0: C3301: Declare all known key/values when initializing the dictionary with 'BUILTIN_HINTS = {'filter': Name.BUILTIN_HINTS(name='BUILTIN_HINTS')}' (dict-init-mutate)

I'm all for keeping the message simple.

Member

RefactoringChecker._dict_literal_suggestion (and the function around it) can probably handle that complexity for us. I'm okay with doing it later on.

Collaborator Author

That would be awesome. I'll update the PR to leave the general suggestion and open a new issue to add the improvement suggestion.

)

def _get_dict_mutating_siblings(
    self, node: nodes.Assign, all_siblings: List[nodes.Assign] | None = None
Collaborator

Can't use optional-bar syntax in older Python versions 😢

Member

We're on Python > 3.7.2, so we can do it as long as we use from __future__ import annotations.

Collaborator Author

AHA that's the magic line!!! I couldn't figure it out, thanks Pierre!
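
For anyone else who trips over this, a small sketch of why the future import helps. The `collect` function is a toy stand-in, not the checker's method. With the import, annotations are stored as strings and not evaluated when the function is defined, so `X | None` in a signature does not need runtime support for `|` between types.

```python
from __future__ import annotations  # must be the first statement in the module

from typing import List


def collect(node: str, all_siblings: List[str] | None = None) -> List[str]:
    """Toy stand-in for a helper with an optional accumulator argument."""
    if all_siblings is None:
        all_siblings = []
    return [node] + all_siblings


# The annotation is kept as a plain string rather than an evaluated type.
print(type(collect.__annotations__["all_siblings"]).__name__)
print(collect("a", ["b"]))
```

Note that this only covers annotations; using `|` between types in ordinary runtime expressions still requires a Python version that supports it.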

@nickdrozd
Collaborator

Running this checker against Pylint itself yields a funny edge case:

BUILTIN_HINTS = {"map": "Using a list comprehension can be clearer."}
BUILTIN_HINTS["filter"] = BUILTIN_HINTS["map"]

This gives a confusing error message:

pylint/extensions/bad_builtin.py:21:0: C3301: Declare all known key/values when initializing the dictionary with 'BUILTIN_HINTS = {'filter': Name.BUILTIN_HINTS(name='BUILTIN_HINTS')}' (dict-init-mutate)

Considering the difficulty in catching these edge cases and how much complexity it adds to the checker, it might be worth keeping the error message simple for now and thinking it over further. Perhaps that could be done in conjunction with figuring out how to catch nested dictionaries. The big challenge is determining what exactly goes in the dictionary.
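
To make the edge case concrete, here is a sketch of why the second statement cannot simply be folded into the literal, plus one rewrite that does work. The function name, the local name `hints`, and `MAP_HINT` are hypothetical and only serve the illustration.

```python
def merged_initialization() -> dict:
    # Hypothetical "fix" that folds the second statement into the literal.
    # The right-hand side is evaluated before `hints` is bound, so the
    # lookup of `hints["map"]` fails.
    hints = {
        "map": "Using a list comprehension can be clearer.",
        "filter": hints["map"],
    }
    return hints


try:
    merged_initialization()
    folding_failed = False
except NameError:  # UnboundLocalError is a subclass of NameError
    folding_failed = True

# One valid single-statement form: bind the shared value first.
MAP_HINT = "Using a list comprehension can be clearer."
BUILTIN_HINTS = {"map": MAP_HINT, "filter": MAP_HINT}

print(folding_failed)
```

So any generated suggestion would have to know that the value depends on the dict itself, which is part of what makes a precise message hard.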

Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

This looks pretty good at this point, the continuous integration needs to be fixed but we can merge otherwise 👍

Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

It's possible to rebase now that #7833 is merged.

@@ -0,0 +1,3 @@
fruits = {} # [dict-init-mutate]
Member

Suggested change
fruits = {} # [dict-init-mutate]
fruit_prices = {} # [dict-init-mutate]

@@ -0,0 +1 @@
fruits = {"apple": 1, "banana": 10}
Member

Suggested change
fruits = {"apple": 1, "banana": 10}
fruit_prices = {"apple": 1, "banana": 10}

@github-actions
Contributor

🤖 Effect of this PR on checked open source code: 🤖

Effect on music21:
The following messages are now emitted:

  1. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/cuthbertLab/music21/blob/b274aa58d44b89484273071703d17dc2450016eb/music21/stream/base.py#L8655
  2. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/cuthbertLab/music21/blob/b274aa58d44b89484273071703d17dc2450016eb/music21/analysis/floatingKey.py#L131

Effect on pandas:
The following messages are now emitted:

  1. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/_version.py#L276
  2. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/core/indexes/numeric.py#L311
  3. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/tests/reshape/test_get_dummies.py#L110
  4. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/tests/frame/test_query_eval.py#L464
  5. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/tests/frame/test_constructors.py#L1483
  6. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/tests/io/test_stata.py#L1292
  7. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/tests/io/generate_legacy_storage_files.py#L139
  8. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/tests/io/formats/test_to_excel.py#L295
  9. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/io/pytables.py#L2065
  10. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/io/pytables.py#L4150

Effect on pytest:
The following messages are now emitted:

  1. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/pytest-dev/pytest/blob/56544c11b563c1e0e3d3796a124a3f2570376e73/src/_pytest/junitxml.py#L77

Effect on sentry:
The following messages are now emitted:

  1. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/getsentry/sentry/blob/33066f46c1fcf8459f8ba9194a526243a090e631/src/sentry/utils/snowflake.py#L78
  2. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/getsentry/sentry/blob/33066f46c1fcf8459f8ba9194a526243a090e631/src/sentry/api/endpoints/authenticator_index.py#L30
  3. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/getsentry/sentry/blob/33066f46c1fcf8459f8ba9194a526243a090e631/src/sentry/api/serializers/models/plugin.py#L132
  4. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/getsentry/sentry/blob/33066f46c1fcf8459f8ba9194a526243a090e631/src/sentry/plugins/sentry_webhooks/plugin.py#L100
  5. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/getsentry/sentry/blob/33066f46c1fcf8459f8ba9194a526243a090e631/src/sentry/testutils/factories.py#L981
  6. dict-init-mutate:
    Declare all known key/values when initializing the dictionary.
    https://github.com/getsentry/sentry/blob/33066f46c1fcf8459f8ba9194a526243a090e631/src/sentry/testutils/cases.py#L1239

This comment was generated for commit 1f2cdd8

@Pierre-Sassoulas Pierre-Sassoulas merged commit f7d681b into pylint-dev:main Nov 23, 2022
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.16.0 milestone Nov 23, 2022
@uzzell

uzzell commented Dec 28, 2022

I'm concerned about the warning for https://github.com/pandas-dev/pandas/blob/d1ecf63e2040daecad6e0a8485a35e8a23393795/pandas/tests/reshape/test_get_dummies.py#L110 (item 3 in the Pandas list in this github-actions comment):

expected_counts = {"int64": 1, "object": 1}
expected_counts[dtype_name] = 3 + expected_counts.get(dtype_name, 0)

(Here, dtype_name depends on an argument of the method that contains this snippet.)

How would one rewrite the snippet? Doing expected_counts = {"int64": 1, "object": 1, dtype_name: 3 + expected_counts.get(dtype_name, 0)} would throw an error.
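
One conceivable guard against this kind of false positive, sketched with the standard library `ast` module: skip the message when the key or value of the mutating statement reads the dictionary itself. This is only an illustration of the idea; `value_reads_dict` is a hypothetical helper, and nothing here is claimed to be what Pylint does.

```python
import ast


def value_reads_dict(assign: ast.Assign) -> bool:
    """Return True if `name[key] = value` reads `name` inside key or value."""
    target = assign.targets[0]
    if not (isinstance(target, ast.Subscript) and isinstance(target.value, ast.Name)):
        return False
    name = target.value.id
    for part in (target.slice, assign.value):
        for node in ast.walk(part):
            if isinstance(node, ast.Name) and node.id == name:
                return True
    return False


self_referential = ast.parse(
    "expected_counts[dtype_name] = 3 + expected_counts.get(dtype_name, 0)"
).body[0]
plain = ast.parse('expected_counts["float64"] = 2').body[0]

print(value_reads_dict(self_referential))
print(value_reads_dict(plain))
```

A statement for which this returns True cannot be merged into the literal without changing behavior, so it would be a candidate for being left alone.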

@clavedeluna
Collaborator Author

#7997

Labels
Enhancement ✨ Improvement to a component
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New check: Use single-statement dict initialization
6 participants