pip should not execute arbitrary code from the Internet #425
Comments
Indeed, thanks for the concise summary. In addition to TLS cert verification and package signature verification, we should also have an option to forbid downloading any off-PyPI sdist that isn't served by HTTPS. |
Ultimately, package signature verification is the main thing. If the bytes are properly signed and authenticated, the transport can be any old insecure crud and it shouldn't matter: the software being executed is the right software regardless of how it got there. |
If pip gets support for "wheel" (see this fork: https://github.com/qwcode/pip), we'd be doing this for wheels at some point at least, since the wheel spec provides for it, but @dholth can speak to that better than me. Wheel docs: http://wheel.readthedocs.org/en/latest/ |
The main reason not to rely on package signatures alone is that old signatures can be replayed. Defense in depth seems to be a reasonable idea when it comes to installing and updating code. A rather good overview of the entire nightmare was written by Cappos et al: |
Any progress on this? I'd really like to see this issue be given the highest priority given the recent attack on Rubygems.org. Package authors aren't going to sign their packages unless they know the installer supports it. |
+1 |
+1 |
+1 |
+1 |
This is a little easier today than it was a year ago when we last talked about it. Pip no longer supports Python 2.4, which caused much trouble. Python 2.5's SSL support is stone-age at best, but most users are on 2.6 or better. If pip included the backported hostname checking code from 3.2 (http://pypi.python.org/pypi/backports.ssl_match_hostname/3.2a3) and only validated certificates on Python 2.6 and newer (the same way Mercurial does), this might be possible with a relatively small patch. |
with that code, where does the CA bundle come from? Wouldn't pip also need something like certifi? It looks like you can't get root certs out of the box on 2.6+. |
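A minimal sketch of what certificate and hostname verification looks like, assuming a modern Python 3 standard library rather than the 2.6-era backport discussed above. The function names `make_verifying_context` and `fetch_verified` are hypothetical and are not part of pip. Passing `cafile=None` falls back to the system trust store; a bundle such as the one certifi ships could be passed instead.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit


def make_verifying_context(cafile=None):
    """Return an SSLContext that verifies the peer certificate and hostname.

    Hypothetical helper: ssl.create_default_context() turns on
    CERT_REQUIRED and hostname checking, and loads the system trust
    store when no explicit CA bundle is given.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch_verified(url, cafile=None, timeout=30):
    """Download url, refusing anything that is not served over HTTPS."""
    if urlsplit(url).scheme != "https":
        raise ValueError("refusing non-HTTPS URL: %s" % url)
    context = make_verifying_context(cafile)
    # urlopen raises an error if the certificate chain or hostname is invalid.
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

Rejecting plain HTTP before any connection is made also covers the earlier suggestion of forbidding off-PyPI downloads that are not served over HTTPS.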
Hello, I'm one of the pip maintainers. I don't claim to have the security expertise to lead this effort, but I'm certainly interested in helping anyone who's willing to attempt pull requests when it comes to the basics of code placement and writing tests. |
Hi. I'm writing a small subsystem that can be plugged into pip (but also into any other tools, including the Ruby world), that manages trust in stuff downloaded from the Internet. Ping me on twitter @ZYGOON, here on github (zyga) or irc (again zyga) if you are interested in helping out. |
distutils supports package signing with GPG: http://docs.python.org/2/distutils/uploading.html It creates a PACKAGE.asc file that pip could potentially download and verify with gpg (by adding a flag to pip, not by default). It won't solve the key management problem, but at least if you're interested you can get the GPG key of the developer(s) and add it to your keyring so the signature can be verified. That could be a good start. PyPI should then encourage packagers to sign their packages (maybe including a "how to" for GPG newbies: create a key, make a backup, create a revocation cert, make a backup, potentially export the key to a keyserver, etc.). Potentially it would be a good recommendation that the author_email from setup.py matches the GPG key email, so it can be checked by pip. |
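A rough sketch of the check proposed above, assuming the `gpg` binary is on `PATH` and the developer's public key is already in the local keyring. The helper names are hypothetical; pip has no such flag in this discussion.

```python
import subprocess


def build_gpg_verify_command(signature_path, package_path):
    """Return the argv for verifying a detached signature with gpg."""
    return ["gpg", "--batch", "--verify", signature_path, package_path]


def signature_is_valid(signature_path, package_path):
    """Return True when gpg exits with status 0 for this signature.

    Assumption: a zero exit status means the signature is good and was
    made by some key in the local keyring. It says nothing about whether
    that key belongs to the package's real author.
    """
    command = build_gpg_verify_command(signature_path, package_path)
    result = subprocess.run(command, capture_output=True)
    return result.returncode == 0
```

For example, `signature_is_valid("PACKAGE.asc", "PACKAGE")` would check the file produced by the distutils upload step against the downloaded archive.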
@reidrac That is insufficient, for all it does is allow anyone to do a MITM attack by repackaging any software as "Joe User" who has a valid GPG signature (for that user). |
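One way to picture the missing piece is a mapping from package name to the key fingerprints allowed to sign it, so that a valid signature from an unrelated key is rejected. This is only an illustrative sketch: the mapping, the fingerprint, and the function name are all made up for the example.

```python
# Hypothetical trust policy: which key fingerprints may sign which package.
# The fingerprint below is a placeholder, not a real key.
TRUSTED_SIGNERS = {
    "django": {"0123456789ABCDEF0123456789ABCDEF01234567"},
}


def signer_is_trusted(package, fingerprint, trusted=TRUSTED_SIGNERS):
    """Return True only if this fingerprint is allowed to sign this package."""
    # Normalize the spacing and case that gpg uses when printing fingerprints.
    normalized = fingerprint.replace(" ", "").upper()
    return normalized in trusted.get(package.lower(), set())
```

With a policy like this, a signature by "Joe User" on a repackaged archive verifies cryptographically but still fails the check, because Joe's key is not listed for that package.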
I've started working on a tool that could be integrated with pip (and other tools) to verify downloaded software. It does not require SSL or any trusted networking of any kind. Have a look and help me design and implement it: https://github.com/zyga/distrust |
With digital signatures you would probably want a system that trusts |
@dholth yes, this is exactly what distrust aims to implement |
@zyga There's no code in your repo, but as spec it looks interesting. Looks like a good answer for pip signature verification. |
Code is coming this evening, I'm still working on it and I'm busy doing my regular job stuff ATM |
The most engineered [Python] update security system is probably https://www.updateframework.com/ . It has a lot of interesting ideas, most importantly the ability to survive certain types of key compromises. |
+1 last PyCon (or the one before?) a speaker was going to show us how to intercept the pip communication via injecting a packet before pip could respond. I love the idea that all my pip packages are signed so I can use 3rd party repos or mirrors without worrying. |
@zyga please read about: SDSI/SPKI: http://crypto.stackexchange.com/questions/790/need-an-introduction-to-spki-or-spki-for-dummies Wheel signatures: http://www.python.org/dev/peps/pep-0427/#signed-wheel-files Did you know you can use the ssh-agent to do public key signing and verification? |
@dholth I read all of that quickly but I don't know which part of that I should find interesting. Correct me if I skipped something essential. Wheel signatures are good but they are in no way improving over the existing signatures for source tarballs. Note that I'm not implementing a crypto system or a certificate authority replacement as that is all not really solving the problem for software distribution (so what that code is signed if anyone can sign it). As for all the other things, how are they going to improve the situation? Code signing in itself is not useful for anything as anyone can sign everything. The idea I proposed builds a thin layer of trust semantics on top of the existing GPG system. Do you think I could reuse any of the tools you've mentioned to implement that faster/better/more correct? |
The wheel command is pretty much identical to what I've proposed but weaker, as it 1) cannot take advantage of the existing GPG identity network and 2) has no support for improving trust in unsigned files. It's still interesting, though, as other ideas seem to match exactly what I wrote |
Signed packages and using verified SSL by default are two separate issues. The former is more difficult to do (every developer has to sign their packages), whereas the latter, I'm honestly shocked doesn't happen. Even a simple one line fix of changing the default index to https://pypi.python.org/simple/ would go some way, but verifying SSL certificates is a must. |
Are there any all-python solutions for signing? That may make it more likely to work cross-platform without a lot of overhead (/me looks at windows). |
Even if we used certifi the cert is a cacert one, which IIRC is not in the Mozilla bundle https://bugzilla.mozilla.org/show_bug.cgi?id=215243#c158
|
My impression is that this ticket is too vaguely specified so that comments will grow endlessly, but the ticket will never be closed, or if it is, some people will complain that it should not be. So: I propose we replace this ticket with different tickets which are more specific and distinct, such as those for TLS verification, and those related to package signatures. If you think of more specific issues, please create more specific tickets and cross-link them here. I'm just a pip fan, not a core part of the community, so if anyone prefers not following my suggestion, please speak up. I'll start the ball rolling with: ticket #1035 with a package signature verification "hook" that could allow people to experiment and users to choose and opt-in to their preferred scheme. |
@dstufft Awesome. This is all really good to hear! Especially that 1.4 will allow opting out of pip's URL crawling via a flag, because then it can be forcefully enabled per project via requirements.txt. I was about one more email away from sending my bug report and POC to [email protected] for pip's crawling behaviour. If this is still helpful to push distros to update their packages, I can still do so. And, of course, I can give you/the PyPI team my audit notes and discussions of the current/recent CVEs. I am really glad that you all are working on more secure update mechanisms. Thanks a lot! :) Though, I must point out -- as others have on this ticket -- that the following are separate issues:
The first is happily completed. I have yet to re-audit it to confirm. The second...well, pip and PyPI both default to md5. In PyPI (please correct me if I missed something!), there doesn't seem to be a way to give SHA256 or similar-grade hash digests to package URLs. I was glad to find that there is now a "links" interface in PyPI for maintainers to control what URLs pip downloads from, though if there is a way to specify an alternative hash digest which is actually checked on the client end, I missed it. The brief version of why you should not use MD5:
And on to the short version of why you should not switch to SHA1:
For the third issue, code signature verification, there should be a way to specify which developer is allowed to sign which packages. see my above post on this ticket. @nejucomo I am happy to split my responses into multiple tickets, as the pypa devs see fit. I'm not a contributor either, and if there is some pattern to project management I haven't been able to decipher it. Don't wanna mess with their flow. :) References:
|
If you want to fire off a CVE for it I'll gladly include it in the release notes. I was going to figure out how to do it myself to be honest but I don't care how it happens :) As far as hashes go, pip itself doesn't default to anything. It can use any hashing algorithm supplied by a URL that is guaranteed to be in hashlib (notably this is md5, sha1, and any of the sha-2's). I added this I think in 1.2? Maybe 1.3, so that I could use sha256 hashes on Crate.io. The md5's come from PyPI itself and currently they are still md5's because of setuptools/easy_install, which only support md5. As far as I'm aware it is not currently feasible to generate a second preimage attack against md5 (if you know of an attack that allows this please please tell me so I can use it to convince people we need to switch). This is another thing on my list of things I want to fix but I have currently put it on the back burner due to there being no preimage attack on md5 that I was aware of. As far as pip itself goes unless ultimately a different scheme than As far as package signatures go #1305 was recently opened and is probably the best place to talk about that currently. Again this was something on my list and was punted to deal with more pressing issues. I should probably also mention that any changes to the hash function or package signatures will likely need to go through distutils-sig and go through the bikeshedding contained within. |
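To illustrate the "hash supplied by a URL" idea, here is a small sketch that assumes the digest travels in the URL fragment as `#<algorithm>=<hexdigest>`. The function names and the `SUPPORTED` set are assumptions made for this example and are not pip's actual internals.

```python
import hashlib
import hmac
from urllib.parse import urlsplit

# Algorithms named in the comment above: md5, sha1, and the SHA-2 family.
SUPPORTED = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}


def expected_hash_from_url(url):
    """Return (algorithm, hexdigest) from the URL fragment, or None."""
    fragment = urlsplit(url).fragment
    if "=" not in fragment:
        return None
    name, _, digest = fragment.partition("=")
    return name.lower(), digest.lower()


def verify_download(url, data):
    """Return True if data matches the digest embedded in url."""
    expected = expected_hash_from_url(url)
    if expected is None:
        raise ValueError("URL carries no hash fragment")
    name, digest = expected
    if name not in SUPPORTED:
        raise ValueError("unsupported hash algorithm: %s" % name)
    actual = hashlib.new(name, data).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, digest)
```

Note that a check like this only protects against corruption or tampering after the index page was served; if the page carrying the URL is itself fetched insecurely, an attacker can rewrite the digest along with the file.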
I'm going to close this ticket as it has no clear goal. Pip no longer downloads things from PyPI without TLS. If there are specific deficiencies with what pip offers per item tickets should be opened for each one. |
Also could I point out that doing |
Using pip vs OS packages is a user choice with various trade offs to both sides of the argument. |
If your system doesn't have a package manager, why not use a virtualenv? If you are using a system with any kind of package manager, you should never If a system-wide package requires a python component, the dependency should be resolved with the package manager. If not, I don't think it is a personal choice, because it's a terrible habit and I've seen many systems get screwed up due to it by people who are making a "user choice" and have no idea the implications of what they are doing. If you are using, say, OSX for development, use a virtualenv, if you are creating a distributable package, then use brew (or whatever) to install Python dependencies (and if brew doesn't have the package, change that by making a brew package). For instance - I want to run uWSGI as a system-wide daemon. The version of uWSGI in my package manager (let's say Debian wheezy) is totally out of date. So, I create a virtualenv in /opt/uwsgi and install it there, then have my init script reference It's just being responsible. |
If my system doesn't have a package manager how am I supposed to install virtualenv ;) Also needing to activate a virtualenv for using command line tools is ugly. I don't know what Linux systems you use, but in my preferred Debian based ones |
Ok, makes sense - though I would explicitly check your distro does this first. But then, what's the problem with doing |
It's only available to my user :) Gets annoying when I'm bouncing between different accounts for different things. |
AFAIK, sudo pip install (like every other unchecked source of executable code) should not be run as root on an actual system. Debian mitigates around this with fakeroot and virtual containers for building packages. Files built are signed and checksummed. If pip dropped privs to the minimum it needs to install into a path in PATH and sys.path, it would still be a risk to execute sudo pip install. Could someone indicate to me how live.sysinternals.com is more or less of a risk than just 'sudo rm *'? Don't open (executable) email attachments. Don't sudo pip install. |
I like virtualenv and virtualenvwrapper a lot. I like not having write permission to scripts that I execute often (e.g. --user). Does pip support --prefix? Does pip do snapshots/backups prior to installation? |
You don’t have to: just download virtualenv.py and run it. From that first venv, you can get pip, venvwrapper, etc. Yes, it’s per-user.
I add the bin directory of the bootstrap venv I describe above to my PATH. |
Also, perhaps one could avoid depending on pip --user (which I find irritating) by running |
For the sake of consistency, a |
I’m not sure consistency is an argument here. The venvs that I manage with virtualenvwrapper are used for my Python projects and can be deleted or re-created at all time, but the global/bootstrap venv I created in ~/.usr (or ~/local in your example) is not throw-away: I depend on having things installed in there (mostly scripts/programs, not libs). |
This is way OT. To each their own. I should have been more clear. I believe consistency to be an argument here because I like to consistently apply the same tools and processes to managing virtual environments. Commands like When I backup a virtualenv, I usually only need the |
I consider the discussion about user permissions and install locations when running pip to be orthogonal to this ticket, which is about TLS verification. So, I created #1169 to capture that orthogonal issue. |
When you 'pip install' something, it fetches the code from the internet, and then executes it. If you follow the advice of many projects and 'sudo pip install' something, pip executes that code from the internet as root.
pip does not do TLS certificate verification, nor does it do package signature verification, nor does it even do DNSSEC. There is no assurance whatsoever that the code being installed came from the intended source. The archetypical hipster hacker doing a 'pip install django' over some cafe's wifi will be pwned within seconds if the DNS for pypi.python.org happens to be spoofed.
I believe that this might be addressed by #402 but that deals with a bunch of other issues as well, and I felt there should be a report somewhere about this somewhat well-known deficiency in pip's download and update procedures.