Support for non-Python post-write hooks #1275

lanzz · 2023-07-16T00:33:16Z

Describe the bug
I'm using Ruff to lint and reformat my code, and I'm trying to get it working as a post-write hook in Alembic, but Ruff is actually written in Rust and is installed as a compiled binary, so it isn't included in importlib metadata as a console-scripts entrypoint, and Alembic throws an "Could not find entrypoint console_scripts.ruff" error.

Expected behavior
There should be an out of the box write hook runner that can execute arbitrary binaries, and not just the console-scripts runner that only supports Python modules' entrypoints.

To Reproduce
Install Ruff:

$ pip install ruff

Confirm that the ruff binary is in path and can be executed:

$ ruff --version
ruff 0.0.278

In alembic.ini:

...
[post_write_hooks]
hooks = ruff
ruff.type = console_scripts
ruff.entrypoint = ruff
ruff.options = --fix REVISION_SCRIPT_FILENAME
...

Error

  Generating /srv/app/alembic/versions/2023-07-15_2341_ecc9e22f7cf5_.py ...  done
  Running post write hook {name!r} ...
  FAILED
  FAILED: Could not find entrypoint console_scripts.ruff

Versions.

OS: Debian 12.0
Python: 3.11.4
Alembic: 1.11.1
SQLAlchemy: 2.0.19
Database: irrelevant
DBAPI: irrelevant

Additional context

Have a nice day!

The text was updated successfully, but these errors were encountered:

zzzeek · 2023-07-16T02:18:58Z

OK. I see the use case here, though I wish "ruff" was better integrated in Python (for example, it can't run flake8 plugins, which is why I cant use it).

But since this has "path to executable" and "exec" in it, convince me nobody is going to report CVEs against this - let's be creative. What are they going to come up with ? "someone writes a program that generates alembic configs on the fly" - that would be silly. There are folks who literally do code searches looking for "exec" and "eval" calls in code so they can post frivolous CVEs. Just got an almost CVE the other day because someone was using a traceback tool that dumped all the variables in each frame in the traceback into the logs - so what happens when their database wont connect? the "password" variable is in the stackframe, yes it's cleartext because we have to send it to db.connect()! and we get the security email that SQLAlchemy is printing cleartext passwords in logs. It's really that bad nowadays.

Sorry for the rant! This has to be watertight against CVEs.

lanzz · 2023-07-16T09:25:46Z

Ruff does support lots of popular plugins' tests out of the box, but they're all ported to Rust for speed, as it's one of its main goals.

I do appreciate the concern about CVEs, but I'm really not in a position to do a proper full security audit of a feature like that. I just figured that if I'm asking for a feature it's only fair to offer an implementation, but I definitely don't have the credentials to take responsibility for any CVEs found there.

zzzeek · 2023-07-16T16:05:49Z

OK now that it's a new day let me restate what i mean. I was not asking you to take responsibility for CVEs found later - if I was, then we can just merge this and CVEs would be on you to fix :) . Instead, by contributing this code to Alembic, it will be on me to respond to CVEs. Once we merge this, while we appreciate more help from you if needed, technically you'd be done :).

So what I am asking is this. verbiage in response to this dialogue:

Interviewer: So what's your code contribution?
Interviewee: This feature allows one to specify the name of an arbitrary executable that will be run when alembic formats an autogenerate file
Interviewer: OK, I know nothing about alembic. Does this feature that invokes an arbitrary executable introduce a security risk?
Interviewee: no, because ...... (answer here will be, "no, because this is a directive that is placed into a configuration file that is only manipulated by the programmer, and only is used within development within the code-generation feature of Alembic. There is already the ability to name arbitrary Python modules which could just as easily include arbitrary execution)

Now it seems like I just answered my own question, which I did, however, since I have not been close to this code in some years I'm looking for a quick "what am I missing?" on that response. cc @CaselIT can you take a look?

CaselIT · 2023-07-16T16:21:51Z

The change seem mostly ok.
Not sure about the name, but that's a secondary concern.

Regarding possibility for cve, I'm not sure it's any worse that what we have now, since console script already uses subprocess

lanzz · 2023-07-16T16:27:41Z

I can see how my PR expands the security scope by introducing arbitrary execution — the existing console-scripts runner limits available execution vectors to only Python modules that are already installed and declare a console-scripts entrypoint, while my PR allows execution of truly arbitrary binaries, but as I said I don't have enough experience specifically in security audits to be able to assess the impact myself. One obvious attack vector I can see would be placing a binary somewhere higher in the PATH thus overriding the one intended by the maintainer of alembic.ini; another would be arbitrary binary execution in an environment where the maintainer of alembic.ini might not be authorized to execute arbitrary binaries. It is difficult for me to gauge the risks and consequences of those though.

zzzeek · 2023-07-16T16:59:02Z

One obvious attack vector I can see would be placing a binary somewhere higher in the PATH thus overriding the one intended by the maintainer of alembic.ini;

right, things like that. I think we should just document examples here like "/path/to/ruff" and also make sure in the test suite that's in #1276 we test an absolute path to the executable works

lanzz · 2023-07-17T19:28:18Z

I've added a unit test and documentation to the PR

sqla-tester · 2023-07-17T23:27:25Z

Mihail Milushev has proposed a fix for this issue in the main branch:

Implement new exec write-hook runner that will execute arbitrary binaries https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4788

sqla-tester · 2023-08-22T18:21:30Z

Mihail Milushev has proposed a fix for this issue in the main branch:

Implement new exec write-hook runner that will execute arbitrary binaries https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4788

lanzz added the requires triage New issue that requires categorization label Jul 16, 2023

lanzz mentioned this issue Jul 16, 2023

Implement new exec write-hook runner that will execute arbitrary binaries #1276

Closed

3 tasks

zzzeek added autogenerate - rendering use case not quite a feature and not quite a bug, something we just didn't think of and removed requires triage New issue that requires categorization labels Jul 16, 2023

zzzeek assigned sqla-tester and unassigned sqla-tester Jul 17, 2023

sqlalchemy-bot closed this as completed in 70463e9 Aug 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for non-Python post-write hooks #1275

Support for non-Python post-write hooks #1275

lanzz commented Jul 16, 2023

zzzeek commented Jul 16, 2023

lanzz commented Jul 16, 2023

zzzeek commented Jul 16, 2023

CaselIT commented Jul 16, 2023

lanzz commented Jul 16, 2023

zzzeek commented Jul 16, 2023

lanzz commented Jul 17, 2023

sqla-tester commented Jul 17, 2023

sqla-tester commented Aug 22, 2023

Support for non-Python post-write hooks #1275

Support for non-Python post-write hooks #1275

Comments

lanzz commented Jul 16, 2023

zzzeek commented Jul 16, 2023

lanzz commented Jul 16, 2023

zzzeek commented Jul 16, 2023

CaselIT commented Jul 16, 2023

lanzz commented Jul 16, 2023

zzzeek commented Jul 16, 2023

lanzz commented Jul 17, 2023

sqla-tester commented Jul 17, 2023

sqla-tester commented Aug 22, 2023