generate hashes uses up disk space #544

Closed
graingert opened this issue Jul 24, 2017 · 1 comment

Comments

@graingert
Member

Hash algorithms are (generally) designed to run in constant space, so computing a hash does not require holding or storing the whole file.

Currently the steps are:

  1. Download the file to disk
  2. Stream the file through the hasher
  3. Delete the file
  4. Extract the hash from the hasher state

It should be:

  1. Stream the download through the hasher
  2. Extract the hash from the hasher state

This might need fixes upstream
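
A minimal sketch of the proposed flow, assuming only the standard library. `hash_stream` is a hypothetical helper, not a function from pip-tools or pip; it accepts any binary file-like object, so an HTTP response body could be passed in place of the in-memory buffer used here.

```python
import hashlib
import io


def hash_stream(fileobj, algorithm="sha256", chunk_size=64 * 1024):
    """Hash a binary file-like object chunk by chunk (hypothetical helper)."""
    hasher = hashlib.new(algorithm)
    # iter() with a sentinel keeps calling read() until it returns b"",
    # so at most chunk_size bytes are held in memory at any time.
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


# Stand-in for a download: any object with a read() method works.
payload = b"example package contents" * 1000
digest = hash_stream(io.BytesIO(payload))
```

Nothing is written to disk, and memory use is bounded by `chunk_size` regardless of the size of the download.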

suutari added a commit to suutari/prequ that referenced this issue Sep 2, 2017
The PyPIRepository._get_file_hash used to call unpack_url, when
generating the hash.  It only needed the side effect of the downloaded
package being left in the download directory and the unpacking part was
actually unnecessary.  Change it to just open the (local or remote)
package as a file object and hash the contents without unpacking.

This makes it faster and lighter, since unpacking consumes CPU cycles
and disk space, and more importantly, avoids problems which happen when
some distribution has a file with the same name as a directory in
another.  Unpacking both packages to the same directory will then
fail.  E.g. matplotlib-2.0.2.tar.gz has a directory named LICENSE, but
many other packages have a file named LICENSE.

Fixes jazzband#512, jazzband#544
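
The idea in the commit message above (open a local or remote package as a file object and hash it without unpacking) could look roughly like the sketch below. This is not the code from the commit: `hash_url` is a hypothetical name, and it relies on `urllib.request.urlopen`, which handles `file://` URLs as well as `http(s)://` ones. The `sha256:<hex>` return format is modelled on the `--hash` option in requirements files.

```python
import hashlib
import tempfile
from pathlib import Path
from urllib.request import urlopen


def hash_url(url, algorithm="sha256", chunk_size=64 * 1024):
    """Hash the bytes behind a URL without unpacking them (hypothetical helper)."""
    hasher = hashlib.new(algorithm)
    # The response is a file-like object for both local and remote URLs.
    with urlopen(url) as response:
        for chunk in iter(lambda: response.read(chunk_size), b""):
            hasher.update(chunk)
    return "{}:{}".format(algorithm, hasher.hexdigest())


# Demonstrate with a local file so the example needs no network access.
content = b"not a real archive, just bytes to hash"
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "example-1.0.tar.gz"
    archive.write_bytes(content)
    local_hash = hash_url(archive.as_uri())
```

Because the archive is only read as bytes, its internal layout never touches the filesystem, so a `LICENSE` directory in one package cannot collide with a `LICENSE` file in another.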
vphilippon added a commit that referenced this issue Sep 27, 2017
Hash packages without unpacking (Fixes #512 and #544)
@vphilippon
Member

Fixed by #557
