Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding _base64_crc32c function for storage integrity checks. #692

Closed
wants to merge 1 commit into from

Conversation

dhermes
Copy link
Contributor

@dhermes dhermes commented Mar 9, 2015

@tseaver I have yet to do two things:

  • Add crcmod as a dependency in setup.py.
  • Add tests for the added method (wanted to figure out the depedency first)

crcmod is compiled C code and it should be optional. Is there a way to ask setuptools to try to install, but continue on if it fails?


This is work towards #547. Picks up where #611 left off.


Some relevant snippets I used:

(ADDED 3/11/2015): https://github.com/GoogleCloudPlatform/gsutil/blob/748c9da8497816dbf9d2044dc25ff16ecf6494f0/gslib/copy_helper.py#L769
https://github.com/GoogleCloudPlatform/gsutil/blob/748c9da8497816dbf9d2044dc25ff16ecf6494f0/gslib/hashing_helper.py#L99-L116
https://github.com/GoogleCloudPlatform/gsutil/blob/748c9da8497816dbf9d2044dc25ff16ecf6494f0/gslib/copy_helper.py#L771
https://github.com/GoogleCloudPlatform/gsutil/blob/748c9da8497816dbf9d2044dc25ff16ecf6494f0/gslib/copy_helper.py#L778

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Mar 9, 2015
@dhermes dhermes added the api: storage Issues related to the Cloud Storage API. label Mar 9, 2015
@tseaver
Copy link
Contributor

tseaver commented Mar 9, 2015

crcmod is compiled C code and it should be optional. Is there a way to ask setuptools to try to install, but continue on if it fails?

It already handles that, via the following in its own setup.py:

try:
    setup(**setup_dict)
except KeyboardInterrupt:
    raise
except:
    # If there are any compilation errors or there are no build tools available
    # for the extension module, delete the extension module and try to install
    # the pure Python version.
    del setup_dict['ext_modules']
    setup(**setup_dict)

Note that crcmod doesn't explicitly support Python 3.3 or 3.4.

@dhermes
Copy link
Contributor Author

dhermes commented Mar 9, 2015

D'oh! My bad for not checking. Great news though.

RE: Python 3.3/3.4 support

  • the module has not been updated since 2010
  • there is a gsutil mirror since the ship it in their binary
  • there is a somewhat official repo that was created in 2012, but commits still stop in 2010 (it doesn't look like the author "Ray Buvel" is in any of the commits)

I don't know if there is a formal way to "refresh" the package that is 4+ years old.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.06%) to 99.94% when pulling 48e1e0a on dhermes:add-crc32c-check into 83f92cc on GoogleCloudPlatform:master.

@dhermes
Copy link
Contributor Author

dhermes commented Mar 12, 2015

Running the generator is not so helpful:
https://gist.github.com/dhermes/1cecf85c706ef8491c9a

One of the biggest perks is the .update() and .hexdigest() interface of the Python interface.

import binascii
try:
import crcmod
except ImportError:

This comment was marked as spam.

@tseaver
Copy link
Contributor

tseaver commented Mar 12, 2015

Because crcmod is an "optional" dependency (conditionally imported, with an exception raised at the one point it is needed if it is not present), we should declare it via an "extra" in setup.py, e.g.:

setup(
    name='gcloud',
    ...
    extras_require={
        'storage_checksum': ['crcmod'],
    },

It needs to be in tox.ini as well.

@dhermes
Copy link
Contributor Author

dhermes commented Mar 12, 2015

@tseaver We need to determine what to do about Python 3 compatibility before moving too far forward.

It seems relying on this library from PyPI is not possible in the current state.

@tseaver
Copy link
Contributor

tseaver commented Mar 12, 2015

We could fork it to https://github.com/GoogleCloudPlatform/gcloud-python, I guess, and ensure that its tests support the versions we care about.

@dhermes
Copy link
Contributor Author

dhermes commented Mar 12, 2015

@cmcqueen Do you have any contact with the maintainers of crcmod? (I am inquiring due to your fork: https://bitbucket.org/cmcqueen1975/crcmod)

@cmcqueen
Copy link

I have had contact in the past, though that was several years ago (I contributed some changes to crcmod). I could look up my old e-mail correspondence.

@dhermes
Copy link
Contributor Author

dhermes commented Mar 12, 2015

Thanks for the quick reply! That would be great. We essentially just want to get Python 3.3. and Python 3.4 supported in the library without forking it. I don't know what happens to abandoned PyPI packages, but it seems something like crcmod2 might be necessary.

@tseaver
Copy link
Contributor

tseaver commented Mar 12, 2015

I reached out today to Ray Buvel (the author) via the SourceForge message UI

@dhermes
Copy link
Contributor Author

dhermes commented Mar 12, 2015

Thanks @tseaver. Maybe we could file an issue with https://github.com/pypa/pip or reach out to https://groups.google.com/forum/#!forum/pypa-dev (Python Packaging Authority).

@cmcqueen
Copy link

What is needed for 3.3 and 3.4 support? Do you mean just compile Windows binaries? I compiled the old ones I think, but only for 32-bit not 64-bit.

(I wish the Python foundation provided a service to compile Windows binaries for Python packages, since it's difficult to get set up to compile them, for both 32-bit and 64-bit. It would be really helpful.)

There is this unofficial repository of Python binary packages, including crcmod

@tseaver
Copy link
Contributor

tseaver commented Mar 13, 2015

@cmcqueen Yes, we'd like to have Windows users able to use gcloud and its dependencies on the Python versions we support, without needing to be able to build extensions. I found an explicit e-mail address today for Ray Buvel, and tried again (on the theory that the SF.net profile might have an out-of-date address).

@dhermes
Copy link
Contributor Author

dhermes commented Mar 23, 2015

@tseaver Did you ever hear from Ray Buvel?

@dhermes
Copy link
Contributor Author

dhermes commented Sep 17, 2015

@tseaver Did you ever hear from Ray Buvel?

@tseaver
Copy link
Contributor

tseaver commented Sep 17, 2015

@dhermes Nope, crickets.

@dhermes dhermes added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Jan 22, 2016
@lukesneeringer
Copy link
Contributor

Declaring bankruptcy on this PR, which is two years old and no activity for 18 months.

@dhermes dhermes deleted the add-crc32c-check branch March 16, 2017 23:02
@tseaver
Copy link
Contributor

tseaver commented Aug 6, 2018

@frankyn This is the ticket for our never-merged CRC32C support.

@frankyn
Copy link
Member

frankyn commented Aug 6, 2018

Thank you @tseaver!

@tseaver
Copy link
Contributor

tseaver commented Aug 6, 2018

crcmod is effectively abandoned (last release more than eight years ago). We might do better to use crccheck, which is pure Python, and supports modern Python versions (it is GPL v3, though).

@frankyn
Copy link
Member

frankyn commented Aug 9, 2018

So the open question is we need to define a package to use for crc32c checksum support?

@tseaver
Copy link
Contributor

tseaver commented Aug 9, 2018

@frankyn Yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API. cla: yes This human has signed the Contributor License Agreement. do not merge Indicates a pull request not ready for merge, due to either quality or timing.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants