Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FALSE NEGATIVE] "S608 Possible SQL injection" #12044

Closed
mpolyakov-plutoflume opened this issue Jun 26, 2024 · 13 comments · Fixed by #13574
Closed

[FALSE NEGATIVE] "S608 Possible SQL injection" #12044

mpolyakov-plutoflume opened this issue Jun 26, 2024 · 13 comments · Fixed by #13574
Labels
bug Something isn't working good first issue Good for newcomers rule Implementing or modifying a lint rule

Comments

@mpolyakov-plutoflume
Copy link

Hello.
I'm very exited to migrate our codebase to ruff. However, while doing so I've noticed, that rule S608 works different from the corresponding B608.
It only triggers if SELECT is on the same line.
So SELECT * FROM {foo}.table is an error and

SELECT *
FROM {foo}.table

Is not.

Kind regards, Mikhail

  • B608
  • S608
  • sql injection
@AlexWaygood AlexWaygood added bug Something isn't working rule Implementing or modifying a lint rule labels Jun 26, 2024
@MichaReiser
Copy link
Member

I just tried and the rule works as expected for triple-quoted f-strings. Any chance that you're using a line continuation token?

https://play.ruff.rs/1c1c9715-2f69-48e1-9e31-f2edcef405ac

@mpolyakov-plutoflume
Copy link
Author

Than you for a quick reply.
I've added a couple of examples to https://play.ruff.rs/d3adeb3f-9657-4314-8d99-a45d426d6674

Basically, it does not work, when SELECT is on a new line or SELECT and FROM are on different lines.

@bricker
Copy link

bricker commented Aug 12, 2024

The rule seems to be sensitive to certain indentation. Here are a few examples where the rule is not triggered but it should be:

def sql():
    foo = "foo"

    f"""
SELECT {foo}
    FROM bar
    """
def sql():
    foo = "foo"

    f"""
    SELECT
        {foo}
    FROM bar
    """
def sql():
    foo = "foo"

    f"""
    SELECT {foo}
    FROM
        bar
    """

@dhruvmanila
Copy link
Member

The rule seems to be sensitive to certain indentation.

Yeah, I think the regex is not considering the case where there could be "one or more whitespace characters":

static SQL_REGEX: Lazy<Regex> = Lazy::new(|| {
Regex::new(r"(?i)\b(select\s.+\sfrom\s|delete\s+from\s|(insert|replace)\s.+\svalues\s|update\s.+\sset\s)")
.unwrap()
});

So, the \s should be \s+ instead.

https://regex101.com/r/UPYDXC/2

@MichaReiser
Copy link
Member

@dhruvmanila do you plan to PR or should we mark this as a good first issue?

@dhruvmanila dhruvmanila added the good first issue Good for newcomers label Aug 13, 2024
@DataEnggNerd
Copy link
Contributor

@dhruvmanila Shall I pick this issue?

@dhruvmanila
Copy link
Member

@DataEnggNerd For sure! Thanks

@DataEnggNerd
Copy link
Contributor

Need help here.
I have added \s+ before from but it is failing with examples given at #12044 (comment) during compilation, but being detected at regex101.

@DataEnggNerd
Copy link
Contributor

@MichaReiser I need a clarity here. As per my understanding based on the documentation here, every SQL statement which have string formatting should be reported with S608. Did I get it right?

@MichaReiser
Copy link
Member

@DataEnggNerd Yes, the rule ideally flags all SQL statements with as few false positives as possible.

The change mentioned in this issue is that we should re-visit Ruff's regex to allow optional whitespace between the different keywords. It's a non-goal to change how the rule detects SQL expressions significantly.

As reference, the Regex used by bandit

https://github.com/PyCQA/bandit/blob/4ac55dfaaacf2083497831294b2dcd7e679f8428/bandit/plugins/injection_sql.py#L74-L80

@DataEnggNerd
Copy link
Contributor

@MichaReiser Yes, I have started with modifying the regex first and faced some problem with the regex. @dhruvmanila mentioned there is something missing beyond the regex(you can refer closer PR linked to the issue), so i started looking into this. Apologies it is making more time to implement at code.

@MichaReiser
Copy link
Member

@DataEnggNerd

Playing with it in Regex101, the following should work: (?i)\b(select\s+.*\s+from) https://regex101.com/r/T8ojD0/1

@DataEnggNerd
Copy link
Contributor

@MichaReiser With the changes you suggested, along with the regex from bandit, unfortunately both are not helping.
I have added details in the draft PR - #13445?
Help me out if I am missing something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers rule Implementing or modifying a lint rule
Projects
None yet
6 participants