Provide audit hook events during package installation #8938
Comments
For anyone wanting to ask in the future, the API pip would need to use is |
I wonder if it makes sense to make this a pip feature, or should we have a PEP? I imagine a number of these audit hooks would make sense not just for pip, but for other tools too. However, I've never really used the Python audit hooks, so I don't know if that's a crazy idea or not. |
That line says to me that it should be a standard, as the relevant tools would quite probably want to be able to catch If it were to be a PEP, I'd propose something like "Standard Audit Events for Package Installations", and define what information tools should audit (and specifying that the auditing should be via the core Python APIs), leaving it to individual tools like pip to decide how to implement it. Beyond that, though, I don't really have much of an opinion, as I'm not that familiar with how or why auditing tools handle this sort of thing. |
Agreed that it would be very beneficial to have this standardized as a PEP, since alternatives such as poetry could use the same mechanism. I personally don't have any experience with the process of creating and proposing a PEP, so I probably can't do it alone, but I am more than happy to help, for example by researching the internals of other package managers so we can define the attribute set as universally as possible. Is there anything I can do to help or implement from my side? |
@RootLUG A good place to start would be figuring out what would be good points for the hook -- i.e. at which point in the process of handling a package pip would call the audit hook. |
Alright so I took a look at pip internals and found that in my opinion, the ideal location for inserting the audit hook would be pip/src/pip/_internal/req/req_install.py Line 775 in facc1e5
There were also other locations that I found but haven't passed testing where I considered the following:
The last point was causing a little bit of trouble as I found some potential entry points where the hook could be inserted but failed that condition. To explain a little bit more; pip is run to install a package Tested using tox and the following matrix of environments: Are there any other conditions that I should check for at that location? |
This hook would be fired after we run |
Consider looking at #6607 (comment), for context on how pip handles packages. Basically everything marked "install", "legacy", "modern" or "develop-install" in that graph results in code execution (except for wheel installs). |
FWIW, I wouldn't worry too much about standardising audit events across tools. They're low-level enough that you should only worry about pip's. Name them |
Let's make these audit hooks pip-specific. If someone wants to pick this up, please say so here and let us know how you're thinking of implementing this! :) |
It would be great if this hook also got an argument with a local path to the directory with the sources that are (verbatim) going to be used for the following installation process. For example, if the installation requires extracting some archive and then installing the package from that, the argument would be the path to these extracted sources. The reasons I am suggesting this:
|
Events are cheap, no reason not to raise them before/after each major operation. For anyone watching logs, it'll also help correlate the rush of filesystem events that get logged in between, so they can tell that they're related to a specific pip operation. |
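A minimal sketch of what raising an event before and after each major operation could look like, using only sys.audit and sys.addaudithook from PEP 578. The event names (pip.unpack.begin, pip.unpack.end) and the audited helper are hypothetical, chosen for illustration; they are not names pip uses.

```python
import sys
from contextlib import contextmanager

seen = []  # events collected by the listener below


def log_pip_events(event, args):
    # Audit hooks receive every event in the process, so filter by prefix.
    if event.startswith("pip."):
        seen.append((event, args))


sys.addaudithook(log_pip_events)


@contextmanager
def audited(operation, *args):
    """Raise a begin/end audit event around one operation (hypothetical names)."""
    sys.audit(f"pip.{operation}.begin", *args)
    yield
    # Only reached if the wrapped operation did not raise.
    sys.audit(f"pip.{operation}.end", *args)


# Stand-in for a real operation such as unpacking a wheel.
with audited("unpack", "simplewheel-1.0-py2.py3-none-any.whl"):
    pass
```

Anyone reading logs could then pair the begin and end events to bracket the filesystem events that occur in between.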
What's the problem this feature will solve?
This feature will enable third party tools to intercept package installations that could provide features such as:
This is just an example list of features that the third party implementations can provide by listening to the installation audit hooks.
Describe the solution you'd like
pip can leverage the functionality described in PEP 578 to fire (custom) sys audit hooks before installation takes place. The audit mechanism is native in Python 3.8+ and aims to provide visibility into the Python process for external monitoring systems.
The audit hook (for example "pip.install") should also pass metadata about the installation in the same manner (a tuple of arguments) as the built-in Python hooks already do.
Proposed example of audit arguments:
(x, y, z, ...) where x is the package being installed (name), y is the parent package which has x as a dependency, z has y as a dependency, and so on up to the root level/package
simplewheel-1.0-py2.py3-none-any.whl
Since not all installation/invocation methods provide the necessary attributes, those that are missing (for example the hash of the package) should be replaced with None. All of the proposed example metadata arguments are already available in the InstallRequirement class, from which they can be extracted into the tuple and passed to the audit hook.
These arguments should be sufficient for an external monitoring tool (a listening audit hook) to make an informed decision about the package and, by raising an exception inside the hook handler, to prevent its installation and any code execution, e.g. running setup.py. As described in the PEP, an exception raised inside the audit hook should not be caught by pip but propagated further, resulting in an unhandled exception, maybe including cleanup of temporary data created by pip?
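To illustrate the idea, here is a minimal sketch of a listener that blocks an installation by raising from inside the hook. It assumes the example event name "pip.install" from this proposal and an argument tuple of (name, version, url, hash); the install function, the denylist, and the InstallationBlocked exception are hypothetical stand-ins and not part of pip.

```python
import sys

BLOCKED = {"evil-package"}  # hypothetical denylist kept by the monitoring tool


class InstallationBlocked(Exception):
    """Raised by the hook to abort an installation (hypothetical name)."""


def pip_install_hook(event, args):
    # Hooks are called for every audit event, so ignore everything but ours.
    if event != "pip.install":
        return
    name = args[0] if args else None
    if name in BLOCKED:
        raise InstallationBlocked(f"installation of {name!r} blocked by audit hook")


sys.addaudithook(pip_install_hook)


def install(name, version=None, url=None, sha256=None):
    """Stand-in for pip's install step; this is not pip's real API."""
    # Attributes that are not available are passed as None, as proposed.
    # The event is raised before any package code would run.
    sys.audit("pip.install", name, version, url, sha256)
    return f"installed {name}"


install("requests", "2.24.0")  # allowed: the hook returns without raising

try:
    install("evil-package")
except InstallationBlocked as exc:
    print(exc)
```

The try/except is only there to keep the demonstration running. Under this proposal pip itself would not catch the exception, so the process would stop before setup.py is executed.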
There could also be other audit hooks fired by pip, such as for the uninstall of a package or for the pip invocation itself (e.g. pip.invoked with sys.argv as the audit tuple arguments).
Alternative Solutions
Entry points would allow for almost the same functionality; however, there might be a few additional problems with that approach. The first is speed, as the entry points would need to be imported during pip invocation, which could add delay. Exceptions thrown during that time (entry point import) could also cause pip to crash if not handled properly. I believe the system audit hooks are superior because they fit nicely into the Python ecosystem: this is what audit hooks were designed for in the first place, and using them avoids reinventing the wheel. Also, the cited PEP would provide a better reference for decisions such as the above-mentioned exception throwing.
Additional context
There have already been similar tickets and discussions about providing a "plugin" or "hook" functionality that allows extending the installation process or gives visibility/auditing into the packages that are going to be installed. The closest feature is probably #1035.
There are a few distinctions between the previous discussion/feature request and firing an audit hook. The discussion in that ticket got steered towards how signature verification should be correctly implemented, since getting the cryptography right is difficult. The same could be argued about audit hooks; however, they are not designed to provide security mechanisms or sandboxing, but merely visibility into the black box that Python is. The same principle can be applied to pip installing packages, as pip is the de facto default tool in all modern Python installations.
I understand the reluctance to provide a public API, as that brings problems with maintainability. That could be mitigated by carefully selecting the attributes that are passed as metadata to the audit hook and combining them with version numbers, so that the hook is future-proof against any changes that might occur. Alternatively, the maintainability problem could be resolved almost completely by just passing (<pip_version>, <pip._internal.req.InstallRequirement object instance>) as the hook arguments and leaving the extraction of the necessary information to the monitoring system itself.
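A rough sketch of that alternative, where the hook receives only the pip version and the requirement object and the listener extracts what it needs. FakeRequirement is a hypothetical stand-in for InstallRequirement with assumed attribute names (name, link), and the version string is a placeholder.

```python
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRequirement:
    """Hypothetical stand-in for pip's InstallRequirement; real attributes may differ."""
    name: str
    link: Optional[str] = None


observed = []


def monitor(event, args):
    if event != "pip.install" or len(args) < 2:
        return
    pip_version, req = args[0], args[1]
    # The monitoring tool decides what to extract, and can branch on
    # pip_version if the object's layout changes between releases.
    observed.append((pip_version, getattr(req, "name", None), getattr(req, "link", None)))


sys.addaudithook(monitor)

# What pip's side of the contract would look like under this alternative.
sys.audit("pip.install", "20.2.3", FakeRequirement("simplewheel"))
```

Using getattr with a default keeps the listener tolerant of attributes that are missing in a given pip version, which is the trade-off of exposing an internal object instead of a fixed tuple.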