Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Less false positives for len-as-conditions (pandas's DataFrame, numpy's Array) #3821

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 1 addition & 5 deletions CONTRIBUTORS.txt
Original file line number Diff line number Diff line change
Expand Up @@ -286,11 +286,7 @@ contributors:

* Michael Scott Cuthbert: contributor

* Pierre Sassoulas : contributor
- Made C0412 (ungrouped import) compatible with isort
- Made multiple message with the same old name possible
- Made Pylint a little faster by refactoring the message store
- Broke down "missing-docstrings" between "module", "class" and "function"
* Pierre Sassoulas : maintainer, contributor

* Nathan Marrow

Expand Down
4 changes: 4 additions & 0 deletions ChangeLog
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,10 @@ Release date: TBA

Close #3735

* `len-as-conditions` is now triggered only for classes that are inheriting directly from list, dict, or set and not implementing the `__bool__` function, or from generators like range or list/dict/set comprehension. This should reduce the false positives for other classes, like pandas's DataFrame or numpy's Array.

Close #1879

What's New in Pylint 2.6.0?
===========================

Expand Down
2 changes: 2 additions & 0 deletions doc/whatsnew/2.6.rst
Original file line number Diff line number Diff line change
Expand Up @@ -56,3 +56,5 @@ Other Changes
* Fix vulnerable regular expressions in ``pyreverse``. The ambiguities of vulnerable regular expressions are removed, making the repaired regular expressions safer and faster matching.

.. _the migration guide in isort documentation: https://timothycrosley.github.io/isort/docs/upgrade_guides/5.0.0/#known_standard_library

* `len-as-conditions` is now triggered only for classes that are inheriting directly from list, dict, or set and not implementing the `__bool__` function, or from generators like range or list/dict/set comprehension. This should reduce the false positive for other classes, like pandas's DataFrame or numpy's Array.
55 changes: 45 additions & 10 deletions pylint/checkers/refactoring/len_checker.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,10 @@

# Licensed under the GPL: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
# For details: https://github.com/PyCQA/pylint/blob/master/COPYING
from typing import List

import astroid
from astroid import DictComp, GeneratorExp, ListComp, SetComp

from pylint import checkers, interfaces
from pylint.checkers import utils
Expand Down Expand Up @@ -55,17 +58,41 @@ def visit_call(self, node):
# a len(S) call is used inside a test condition
# could be if, while, assert or if expression statement
# e.g. `if len(S):`
if utils.is_call_of_name(node, "len"):
# the len() call could also be nested together with other
# boolean operations, e.g. `if z or len(x):`
parent = node.parent
while isinstance(parent, astroid.BoolOp):
parent = parent.parent
if not utils.is_call_of_name(node, "len"):
return
# the len() call could also be nested together with other
# boolean operations, e.g. `if z or len(x):`
parent = node.parent
while isinstance(parent, astroid.BoolOp):
parent = parent.parent
# we're finally out of any nested boolean operations so check if
# this len() call is part of a test condition
if not utils.is_test_condition(node, parent):
return
len_arg = node.args[0]
generator_or_comprehension = (ListComp, SetComp, DictComp, GeneratorExp)
if isinstance(len_arg, generator_or_comprehension):
# The node is a generator or comprehension as in len([x for x in ...])
self.add_message("len-as-condition", node=node)
return
instance = next(len_arg.infer())
mother_classes = self.base_classes_of_node(instance)
affected_by_pep8 = any(
t in mother_classes for t in ["str", "tuple", "list", "set"]
)
if "range" in mother_classes or (
affected_by_pep8 and not self.instance_has_bool(instance)
):
self.add_message("len-as-condition", node=node)

# we're finally out of any nested boolean operations so check if
# this len() call is part of a test condition
if utils.is_test_condition(node, parent):
self.add_message("len-as-condition", node=node)
@staticmethod
def instance_has_bool(class_def: astroid.ClassDef) -> bool:
try:
class_def.getattr("__bool__")
return True
except astroid.AttributeInferenceError:
...
return False

@utils.check_messages("len-as-condition")
def visit_unaryop(self, node):
Expand All @@ -78,3 +105,11 @@ def visit_unaryop(self, node):
and utils.is_call_of_name(node.operand, "len")
):
self.add_message("len-as-condition", node=node)

@staticmethod
def base_classes_of_node(instance: astroid.nodes.ClassDef) -> List[astroid.Name]:
Pierre-Sassoulas marked this conversation as resolved.
Show resolved Hide resolved
"""Return all the classes names that a ClassDef inherit from including 'object'."""
try:
return [instance.name] + [x.name for x in instance.ancestors()]
except TypeError:
return [instance.name]
50 changes: 50 additions & 0 deletions tests/checkers/unittest_refactoring.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
# -*- coding: utf-8 -*-

# Licensed under the GPL: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
# For details: https://github.com/PyCQA/pylint/blob/master/COPYING

import astroid

from pylint.checkers.refactoring import LenChecker


def test_class_tree_detection():
module = astroid.parse(
"""
class ClassWithBool(list):
def __bool__(self):
return True

class ClassWithoutBool(dict):
pass

class ChildClassWithBool(ClassWithBool):
pass

class ChildClassWithoutBool(ClassWithoutBool):
pass
"""
)
with_bool, without_bool, child_with_bool, child_without_bool = module.body
assert LenChecker().base_classes_of_node(with_bool) == [
"ClassWithBool",
"list",
"object",
]
assert LenChecker().base_classes_of_node(without_bool) == [
"ClassWithoutBool",
"dict",
"object",
]
assert LenChecker().base_classes_of_node(child_with_bool) == [
"ChildClassWithBool",
"ClassWithBool",
"list",
"object",
]
assert LenChecker().base_classes_of_node(child_without_bool) == [
"ChildClassWithoutBool",
"ClassWithoutBool",
"dict",
"object",
]
75 changes: 74 additions & 1 deletion tests/functional/l/len_checks.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
if not len('TEST'): # [len-as-condition]
pass

z = False
z = []
if z and len(['T', 'E', 'S', 'T']): # [len-as-condition]
pass

Expand Down Expand Up @@ -102,3 +102,76 @@ def github_issue_1331_v4(*args):

b = bool(len(z)) # [len-as-condition]
c = bool(len('TEST') or 42) # [len-as-condition]


def github_issue_1879():

class ClassWithBool(list):
def __bool__(self):
return True

class ClassWithoutBool(list):
pass

class ChildClassWithBool(ClassWithBool):
pass

class ChildClassWithoutBool(ClassWithoutBool):
pass

assert len(ClassWithBool())
assert len(ChildClassWithBool())
assert len(ClassWithoutBool()) # [len-as-condition]
assert len(ChildClassWithoutBool()) # [len-as-condition]
assert len(range(0)) # [len-as-condition]
assert len([t + 1 for t in []]) # [len-as-condition]
assert len(u + 1 for u in []) # [len-as-condition]
assert len({"1":(v + 1) for v in {}}) # [len-as-condition]
assert len(set((w + 1) for w in set())) # [len-as-condition]

# pylint: disable=import-outside-toplevel
import numpy
hippo91 marked this conversation as resolved.
Show resolved Hide resolved
numpy_array = numpy.array([0])
if len(numpy_array) > 0:
print('numpy_array')
if len(numpy_array):
print('numpy_array')
if numpy_array:
print('b')

import pandas as pd
Pierre-Sassoulas marked this conversation as resolved.
Show resolved Hide resolved
pandas_df = pd.DataFrame()
if len(pandas_df):
print("this works, but pylint tells me not to use len() without comparison")
if len(pandas_df) > 0:
print("this works and pylint likes it, but it's not the solution intended by PEP-8")
if pandas_df:
print("this does not work (truth value of dataframe is ambiguous)")

def function_returning_list(r):
if r==1:
return [1]
return [2]

def function_returning_int(r):
if r==1:
return 1
return 2

# def function_returning_generator(r):
# for i in [r, 1, 2, 3]:
# yield i

# def function_returning_comprehension(r):
# return [x+1 for x in [r, 1, 2, 3]]

# def function_returning_function(r):
# return function_returning_generator(r)

assert len(function_returning_list(z)) # [len-as-condition]
assert len(function_returning_int(z))
# This should raise a len-as-conditions once astroid can infer it
# See https://github.com/PyCQA/pylint/pull/3821#issuecomment-743771514
# assert len(function_returning_generator(z))
# assert len(function_returning_comprehension(z))
# assert len(function_returning_function(z))
8 changes: 8 additions & 0 deletions tests/functional/l/len_checks.txt
Original file line number Diff line number Diff line change
Expand Up @@ -13,3 +13,11 @@ len-as-condition:98:github_issue_1331_v3:Do not use `len(SEQUENCE)` without comp
len-as-condition:101:github_issue_1331_v4:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:103::Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:104::Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:124:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:125:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:126:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:127:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:128:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:129:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:130:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty
len-as-condition:171:github_issue_1879:Do not use `len(SEQUENCE)` without comparison to determine if a sequence is empty