Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for non-Python post-write hooks #1275

Closed
lanzz opened this issue Jul 16, 2023 · 9 comments
Closed

Support for non-Python post-write hooks #1275

lanzz opened this issue Jul 16, 2023 · 9 comments
Labels
autogenerate - rendering use case not quite a feature and not quite a bug, something we just didn't think of

Comments

@lanzz
Copy link
Contributor

lanzz commented Jul 16, 2023

Describe the bug
I'm using Ruff to lint and reformat my code, and I'm trying to get it working as a post-write hook in Alembic, but Ruff is actually written in Rust and is installed as a compiled binary, so it isn't included in importlib metadata as a console-scripts entrypoint, and Alembic throws an "Could not find entrypoint console_scripts.ruff" error.

Expected behavior
There should be an out of the box write hook runner that can execute arbitrary binaries, and not just the console-scripts runner that only supports Python modules' entrypoints.

To Reproduce
Install Ruff:

$ pip install ruff

Confirm that the ruff binary is in path and can be executed:

$ ruff --version
ruff 0.0.278

In alembic.ini:

...
[post_write_hooks]
hooks = ruff
ruff.type = console_scripts
ruff.entrypoint = ruff
ruff.options = --fix REVISION_SCRIPT_FILENAME
...

Error

  Generating /srv/app/alembic/versions/2023-07-15_2341_ecc9e22f7cf5_.py ...  done
  Running post write hook {name!r} ...
  FAILED
  FAILED: Could not find entrypoint console_scripts.ruff

Versions.

  • OS: Debian 12.0
  • Python: 3.11.4
  • Alembic: 1.11.1
  • SQLAlchemy: 2.0.19
  • Database: irrelevant
  • DBAPI: irrelevant

Additional context

Have a nice day!

@lanzz lanzz added the requires triage New issue that requires categorization label Jul 16, 2023
@zzzeek zzzeek added autogenerate - rendering use case not quite a feature and not quite a bug, something we just didn't think of and removed requires triage New issue that requires categorization labels Jul 16, 2023
@zzzeek
Copy link
Member

zzzeek commented Jul 16, 2023

OK. I see the use case here, though I wish "ruff" was better integrated in Python (for example, it can't run flake8 plugins, which is why I cant use it).

But since this has "path to executable" and "exec" in it, convince me nobody is going to report CVEs against this - let's be creative. What are they going to come up with ? "someone writes a program that generates alembic configs on the fly" - that would be silly. There are folks who literally do code searches looking for "exec" and "eval" calls in code so they can post frivolous CVEs. Just got an almost CVE the other day because someone was using a traceback tool that dumped all the variables in each frame in the traceback into the logs - so what happens when their database wont connect? the "password" variable is in the stackframe, yes it's cleartext because we have to send it to db.connect()! and we get the security email that SQLAlchemy is printing cleartext passwords in logs. It's really that bad nowadays.

Sorry for the rant! This has to be watertight against CVEs.

@lanzz
Copy link
Contributor Author

lanzz commented Jul 16, 2023

Ruff does support lots of popular plugins' tests out of the box, but they're all ported to Rust for speed, as it's one of its main goals.

I do appreciate the concern about CVEs, but I'm really not in a position to do a proper full security audit of a feature like that. I just figured that if I'm asking for a feature it's only fair to offer an implementation, but I definitely don't have the credentials to take responsibility for any CVEs found there.

@zzzeek
Copy link
Member

zzzeek commented Jul 16, 2023

OK now that it's a new day let me restate what i mean. I was not asking you to take responsibility for CVEs found later - if I was, then we can just merge this and CVEs would be on you to fix :) . Instead, by contributing this code to Alembic, it will be on me to respond to CVEs. Once we merge this, while we appreciate more help from you if needed, technically you'd be done :).

So what I am asking is this. verbiage in response to this dialogue:

Interviewer: So what's your code contribution?
Interviewee: This feature allows one to specify the name of an arbitrary executable that will be run when alembic formats an autogenerate file
Interviewer: OK, I know nothing about alembic. Does this feature that invokes an arbitrary executable introduce a security risk?
Interviewee: no, because ...... (answer here will be, "no, because this is a directive that is placed into a configuration file that is only manipulated by the programmer, and only is used within development within the code-generation feature of Alembic. There is already the ability to name arbitrary Python modules which could just as easily include arbitrary execution)

Now it seems like I just answered my own question, which I did, however, since I have not been close to this code in some years I'm looking for a quick "what am I missing?" on that response. cc @CaselIT can you take a look?

@CaselIT
Copy link
Member

CaselIT commented Jul 16, 2023

The change seem mostly ok.
Not sure about the name, but that's a secondary concern.

Regarding possibility for cve, I'm not sure it's any worse that what we have now, since console script already uses subprocess

@lanzz
Copy link
Contributor Author

lanzz commented Jul 16, 2023

I can see how my PR expands the security scope by introducing arbitrary execution — the existing console-scripts runner limits available execution vectors to only Python modules that are already installed and declare a console-scripts entrypoint, while my PR allows execution of truly arbitrary binaries, but as I said I don't have enough experience specifically in security audits to be able to assess the impact myself. One obvious attack vector I can see would be placing a binary somewhere higher in the PATH thus overriding the one intended by the maintainer of alembic.ini; another would be arbitrary binary execution in an environment where the maintainer of alembic.ini might not be authorized to execute arbitrary binaries. It is difficult for me to gauge the risks and consequences of those though.

@zzzeek
Copy link
Member

zzzeek commented Jul 16, 2023

One obvious attack vector I can see would be placing a binary somewhere higher in the PATH thus overriding the one intended by the maintainer of alembic.ini;

right, things like that. I think we should just document examples here like "/path/to/ruff" and also make sure in the test suite that's in #1276 we test an absolute path to the executable works

@lanzz
Copy link
Contributor Author

lanzz commented Jul 17, 2023

I've added a unit test and documentation to the PR

@zzzeek zzzeek assigned sqla-tester and unassigned sqla-tester Jul 17, 2023
@sqla-tester
Copy link
Collaborator

Mihail Milushev has proposed a fix for this issue in the main branch:

Implement new exec write-hook runner that will execute arbitrary binaries https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4788

1 similar comment
@sqla-tester
Copy link
Collaborator

Mihail Milushev has proposed a fix for this issue in the main branch:

Implement new exec write-hook runner that will execute arbitrary binaries https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4788

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autogenerate - rendering use case not quite a feature and not quite a bug, something we just didn't think of
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants