Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing Rule breaks due to hash collisions for dataclass MatchError #4297

Closed
marcandre-larochelle opened this issue Aug 22, 2024 · 3 comments · Fixed by #4307
Closed

Missing Rule breaks due to hash collisions for dataclass MatchError #4297

marcandre-larochelle opened this issue Aug 22, 2024 · 3 comments · Fixed by #4307
Labels

Comments

@marcandre-larochelle
Copy link

marcandre-larochelle commented Aug 22, 2024

Summary

Rule breaks with the same exact message and line number, but different filename are missing due to a hash collision.

Issue Type
  • Bug Report
OS / ENVIRONMENT

ansible-lint version 24.7.0

STEPS TO REPRODUCE

The MatchError data class has the unsafe_hash=True, which generates automatically a hash function based on the members of the class, however the field filename is non-existent (not defined) in the dataclass, it is added at a later stage (see: var_naming.py#L204) and then all the errors are added to a set (see: runner.py#L682). Due to the filename member being non-existent at the class definition, the hash function does not take into account the filename and cause collisions if you have the same exact rule break in another file at the same line number.

  1. Create 2 different files with var naming issues on the same lines number (but in the 2 sperate file)
  2. Run ansible-lint
  3. There should be only 1 issue raised due to the bug.
Desired Behavior

The same rule break happening in a different file on the same line number should be recorded properly as different rule break.

Actual Behavior

Simplified code example of the reproducible issue we are experiencing:

from dataclasses import dataclass

@dataclass(unsafe_hash=True)
class MatchError(ValueError):
    message: str
    tag: str
    lineno: int = 1

    @property
    def _hash_key(self) -> any:
        # line attr is knowingly excluded, as dict is not hashable
        return (
            self.filename,
            self.lineno,
            self.message,
            self.tag
        )

    def __eq__(self, other):
        return self.__hash__() == other.__hash__()
    


error1 = CustomError(message="An error occurred", tag="ERROR_TAG", lineno=42)
error1.filename = "test.py"
error2 = CustomError(message="An error occurred", tag="ERROR_TAG", lineno=42)
error2.filename = "test2.py"
error3 = CustomError(message="An error occurred", tag="ERROR_TAG", lineno=43)
error3.filename = "test2.py"

print(error1.filename)
print(error2.filename)
print(error3.filename)
print(hash(error1))
print(hash(error2))
print(hash(error3))

Potential fix:

  • Add the filename member to the class definition of MatchError (as "None" by default to preserve the same behaviors, but fix the hash function)
@marcandre-larochelle marcandre-larochelle added bug new Triage required labels Aug 22, 2024
@marcandre-larochelle marcandre-larochelle changed the title Missing Rule breaks due to hash collisions for MatchErrors Missing Rule breaks due to hash collisions for dataclass MatchError Aug 22, 2024
@cavcrosby
Copy link
Contributor

filename was a field for the MatchError dataclass at one point and only was recently removed from it when looking at #4202. I was also able to reproduce this on the main branch by running the following below.

mkdir -p "./vars/test"

cat << _EOF_ > "./vars/test/foo.yml"
---
CamelCaseIsBad: foo
ALL_CAPS_ARE_BAD_TOO: foo
_EOF_


cat << _EOF_ > "./vars/test/bar.yml"
---
CamelCaseIsBad: bar
ALL_CAPS_ARE_BAD_TOO: bar
_EOF_

ansible-lint --offline "./vars/test"

That said, when running the following in addition, this issue seems to disappear.

cat << _EOF_ > "./vars/test/bar.yml"
---
AnotherCamelCaseIsBad: bar
ANOTHER_ALL_CAPS_ARE_BAD_TOO: bar
_EOF_

ansible-lint --offline "./vars/test"

@marcandre-larochelle
Copy link
Author

@cavcrosby for the 2nd case you highlighted, it has to be exactly the same rule "description" being broken (which in the case of var naming requires to have the same variable name), otherwise the hash calculated will be different (the only thing not taking into account for the hash right now that we noticed was the filename, which causes the conflict between hashes if there are 2 rules broken with the same description, at the same line number, but in different files).

@cavcrosby
Copy link
Contributor

Ahh, I see what you mean. I'll look into opening a PR for this sometime soon.

cavcrosby added a commit to cavcrosby/ansible-lint that referenced this issue Aug 30, 2024
Setting the filename for match_error has been removed because this will
be done on MatchError instance creation via the __post_init__ method.
cavcrosby added a commit to cavcrosby/ansible-lint that referenced this issue Aug 30, 2024
Setting the filename for match_error has been removed because this will
be done on MatchError instance creation via the __post_init__ method.
cavcrosby added a commit to cavcrosby/ansible-lint that referenced this issue Sep 3, 2024
Setting the filename for match_error has been removed because this will
be done on MatchError instance creation via the __post_init__ method.
@audgirka audgirka removed the new Triage required label Sep 18, 2024
@audgirka audgirka moved this to In Progress in 🧰 devtools project board Sep 18, 2024
ssbarnea pushed a commit to cavcrosby/ansible-lint that referenced this issue Sep 19, 2024
Setting the filename for match_error has been removed because this will
be done on MatchError instance creation via the __post_init__ method.
@github-project-automation github-project-automation bot moved this from In Progress to Done in 🧰 devtools project board Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants