Skip to content
This repository has been archived by the owner on Jul 4, 2023. It is now read-only.

use sha256 by default #21654

Closed
wants to merge 1 commit into from
Closed

use sha256 by default #21654

wants to merge 1 commit into from

Conversation

camillol
Copy link
Contributor

@camillol camillol commented Aug 5, 2013

Because it's what's recommended now.

@adamv
Copy link
Contributor

adamv commented Aug 5, 2013

Not to be a jerk, but now there's two commits in the thing I wanted one commit for.

@MikeMcQuaid
Copy link
Member

suggest sha256 by default

Why? The attack is only theoretical? I haven't seen any collision generators.

@camillol
Copy link
Contributor Author

camillol commented Aug 5, 2013

@adamv can you cherry-pick commits when accepting a pull request?

@adamv
Copy link
Contributor

adamv commented Aug 5, 2013

Sure, but I'm "bad at it" and often lose authorship information.

@camillol
Copy link
Contributor Author

camillol commented Aug 5, 2013

You should make a script to help you do it right or something.

@adamv
Copy link
Contributor

adamv commented Aug 5, 2013

I'm not the one who put two changes in one pull request.

@camillol
Copy link
Contributor Author

camillol commented Aug 5, 2013

Does there even need to be a pull request at all? That's just a tool to let you know there is stuff I'd like you to pull. But the change you want is already isolated in a single commit in my public repository: you can just grab it whenever you want.

@camillol
Copy link
Contributor Author

camillol commented Aug 5, 2013

This pull request is now exclusively about the sha256 change. The debug stuff is in #21680.

@ghost
Copy link

ghost commented Aug 6, 2013

Nice one :) Wanted this for some time.

@MikeMcQuaid
Copy link
Member

This is a maintainer discussion I guess but due to reasons mentioned above I think this is unnecessary.

@andypiper
Copy link
Contributor

So does this change also suggest that we need to update docs on Formula building to recommend authors use shasum -a 256 <tarball> to generate checksums, and use sha256 in formulae instead of sha1?

@MikeMcQuaid
Copy link
Member

@andypiper If it gets merged, yep. I don't want it to.

@MikeMcQuaid
Copy link
Member

@mxcl @jacknagel @mistydemeo @samueljohn Thoughts?

@@ -98,11 +98,11 @@ def test_download_strategy

def test_verify_download_integrity_missing
fn = Object.new
checksum = @spec.sha1('baadidea'*5)
checksum = @spec.sha256('baadidea'*8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on a side note the i in baadidea makes this an invalid hex value ... I don't know if that matters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it was a baddidea in the first place.

@samueljohn
Copy link
Contributor

Overall: I am fine with this as I have to copy/paste anyways, so there is not more work involved by switching to sha256 for new formulae.

@samueljohn
Copy link
Contributor

I cannot judge if this is really necessary - would tend to agree to Mike here. But on the other hand the few bytes more won't hurt much.

@jacknagel
Copy link
Contributor

I don't have particularly strong feelings either way, because simple checksums are not a security measure. They are for verifying data integrity, but have absolutely nothing to say about authenticity.

Further, in order for a checksum to become broken for the purpose of verifying data integrity, it has to be vulnerable to a second-preimage attack, not a simple collision attack. There aren't even any feasible preimage attacks on MD5 yet.

By using packages from Homebrew at all you are placing your trust in the provider of the data (upstream project) and the provider of the checksum (us). Actual security requires cryptographic signatures and a pre-established web of trust.

@mxcl
Copy link
Contributor

mxcl commented Aug 7, 2013

If we do this, please avoid what happened last time, ie. brew update reported that every formula in the whole repo was updated.

So, I'd say insist on sha256 for new submissions only, don't update.

And. What is the performance of this? I'd rather not if it means it'll take noticeable time to checksum boost for instance.

@mistydemeo
Copy link
Member

A little slower, hardly a bottleneck:

$ time /usr/local/opt/openssl/bin/openssl sha1 $(brew --cache boost)
SHA1(/Users/vlcice/Library/Caches/Homebrew/boost-1.54.0.tar.bz2)= 230782c7219882d0fab5f1effbe86edb85238bf4
/usr/local/opt/openssl/bin/openssl sha1 $(brew --cache boost)  0.15s user 0.03s system 97% cpu 0.181 total
$ time /usr/local/opt/openssl/bin/openssl sha256 $(brew --cache boost)
SHA256(/Users/vlcice/Library/Caches/Homebrew/boost-1.54.0.tar.bz2)= 047e927de336af106a24bceba30069980c191529fd76b8dff8eb9a328b48ae1d
/usr/local/opt/openssl/bin/openssl sha256 $(brew --cache boost)  0.37s user 0.03s system 98% cpu 0.401 total

@MikeMcQuaid
Copy link
Member

Given what @jacknagel says my position hardens further here a bit; I'm totally willing to be overruled here but as it stands I actively don't want this merged.

@camillol
Copy link
Contributor Author

camillol commented Aug 7, 2013

@mxcl yes, I wasn't suggesting that all existing formulas be switched to sha256. It can be done when the formula is updated for other reasons.

@jacknagel that's what people always say after they have had a security problem: we weren't trying to do security in the first place! :-) And there's truth to it, of course. But there is also such a thing as mitigation. If you trust that you're getting the right formula files from Homebrew, then the checksum makes sure that you're using the same upstream packages that the Homebrew maintainers used. And while you know that they didn't audit every single package for safety, at least there's an obstacle against things getting switched around under your nose.

The big hole right now is patches, of course. If any of the many external services that host patches is compromised, someone can slip anything into your code. Can they attack GitHub? Can they attack your own network? Yes, but you still want to put obstacles in their way and reduce the attack surface.

So I went and added checksums for patches. At that point I thought: is there any real reason to use SHA1, when we have SHA2 and that's what's being recommended for new applications? What is the point of using a hash that's begun showing issues when you have a better option at hand?

But @MikeMcQuaid said I should support SHA1 for consistency, since that's what used by default for formulas. Well, if that's the problem, why not make SHA256 the default for formulas instead?

Yes, we're not building a hardened security system here, and we make no guarantees that the packages have been audited, etc.. But it's not like we have to make things as insecure as possible to avoid liability. We can choose to make things a little harder to break, because it costs us nothing.

@jacknagel
Copy link
Contributor

As I said, I neither support nor oppose this patch, but I think arguing that this improves the security situation is disingenuous. There aren't even any theoretical preimage attacks on SHA1, and if there were, we'd be far more screwed because the integrity of every git repository in the world, including ours, would be compromised.

@MikeMcQuaid
Copy link
Member

That Git relies so heavily on SHA-1 and we rely so heavily on Git I think adds weight to this being somewhat pointless/overkill.

We can choose to make things a little harder to break, because it costs us nothing.

It doesn't cost nothing; it costs consistency, readability of formulae, and encouraging another change on our community with little benefit compared to the MD5 -> SHA-1 move.

@adamv @samueljohn @mistydemeo Can you chip in here?

@camillol
Copy link
Contributor Author

camillol commented Aug 8, 2013

@jacknagel what protects the git repository is not the SHA1 (it's truly only used for data integrity there), but GitHub's security. You're comparing apples with oranges.

@jacknagel
Copy link
Contributor

I said nothing about SHA1 providing repository security. I am speaking solely about data integrity.

If you're concerned about security, write a patch that adds a signature verification step. But stop pretending checksums can provide security. They can't.

@camillol
Copy link
Contributor Author

camillol commented Aug 8, 2013

I'd like to add that I don't consider this a huge deal; if I'm expending points here, or if I could choose to spend @mxcl's rare appearance on any discussion of my choosing, I certainly wouldn't waste it here. But @jacknagel is (involuntarily) spreading misinformation here.

@jacknagel, how do you think you apply a digital signature to a 20 gb file, internally? You sign a cryptographic hash of the file. The signature adds authentication for the hash (and non-repudiation, but we don't care about that), and the hash (if it has the right cryptographic properties) guarantees that the 20gb file is really the one that you signed.

Now, for homebrew's repository the situation is: write access is controlled so we trust that the data hasn't been tampered with, read access is authenticated (everyone's using HTTPS to pull from GitHub, right?) so we trust that it's the right data. The internal git checksums don't come into play in guaranteeing security.

For external resources the situation is: write access is completely outside our control, and we'd rather not have to trust it if possible. We have checksums, and since we trust our authenticated connection to GitHub, and we trust that GitHub authenticates people who write to the repository, we trust that those checksums are authenticated as well. Therefore we can rely on those checksums to ensure that the external, unauthenticated data is what it says it is. But only as long as the checksums themselves are not vulnerable, of course.

@jacknagel
Copy link
Contributor

how do you think you apply a digital signature to a 20 gb file, internally? You sign a cryptographic hash of the file. The signature adds authentication for the hash (and non-repudiation, but we don't care about that), and the hash (if it has the right cryptographic properties) guarantees that the 20gb file is really the one that you signed.

I'm aware of how this works. It doesn't change the fact that the checksum itself does not provide security. You need both a signature and pre-established trust in the provider of the signature.

Now, for homebrew's repository the situation is: write access is controlled so we trust that the data hasn't been tampered with, read access is authenticated (everyone's using HTTPS to pull from GitHub, right?) so we trust that it's the right data. The internal git checksums don't come into play in guaranteeing security.

I don't know what you're trying to prove. I never claimed this.

For external resources the situation is: write access is completely outside our control, and we'd rather not have to trust it if possible. We have checksums, and since we trust our authenticated connection to GitHub, and we trust that GitHub authenticates people who write to the repository, we trust that those checksums are authenticated as well. Therefore we can rely on those checksums to ensure that the external, unauthenticated data is what it says it is. But only as long as the checksums themselves are not vulnerable, of course.

In other words, a user can't trust the data alone, but if she pretends she can trust a checksum provided by a third-party, then she can trust the data. Well, I guess you must be right! Better write a paper, people are going to want to know about this.

The fact that she obtained the checksum from GitHub means nothing. For all she knows, I am inserting wrong checksums and conspiring with servers upstream from her to provide malicious tarballs that match those checksums.

Checksums alone are not a security measure. Switching from one hash function to another, sans any type of authentication, does not improve security.

@camillol
Copy link
Contributor Author

camillol commented Aug 8, 2013

Please stop resorting to mantras. I told you where the authentication comes in. Yes, the user has to trust you not to mess up the homebrew code. You may call into question the reasonableness of that trust, but still, it's better than having to trust you and any number of people that can mess with an external server that we are accessing with plain http.

@MikeMcQuaid
Copy link
Member

This PR is getting merged if multiple other maintainers feel strongly about it being included. Otherwise it will not. @jacknagel and I both understand the issues here sufficiently that new information will not change our mind. If you want to talk further take this to the mailing list.

@afb
Copy link
Contributor

afb commented Aug 11, 2013

If you are changing the default checksum to sha256 from sha1, you might want to consider a feature that allows using base-32 instead of base-16 for the checksum. That allows you to write the sha256 checksum as DX5RHTQPMFB74Z23KJP4TYLIVWZCCXC5LFS4T5LTA25ZSMLQSFHQ instead of 1dfb13ce0f6143fe675b525fc9e168adb2215c5d5965c9f57306bb993170914f which makes it shorter. I have some old brew code that does this, if you are interested. It's the same bits in both cases, in case you wondered...

Making the change in needed characters less:

sha1 '482e737739d946b7c8cbaf127d9ee9c148b999f5'
sha256 'DX5RHTQPMFB74Z23KJP4TYLIVWZCCXC5LFS4T5LTA25ZSMLQSFHQ'

@adamv
Copy link
Contributor

adamv commented Aug 11, 2013

I vote -1 on this whole thread.

@Homebrew Homebrew locked and limited conversation to collaborators Feb 17, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants