What is the purpose of checksum and what can I expect #1509

ladidadida · 2024-01-23T12:28:45Z

Hi there,

Very cool project, and I am very happy I stumbled across it at just the right time.

I am implementing yet another synchronization app, and fsspec already provides access to every file storage I could think of. Another benefit (at least for 10 seconds) is that fsspec already provides a checksum method, which might come in handy when comparing files across different storage backends...

You might already guess my question: why does checksum behave completely differently for different filesystem implementations? I would have assumed that this function follows a base specification on all filesystems, which would make its results interchangeable, but it seems not.

Is this expected? Or do you see a chance to update the behaviors to follow a common scheme?

For me this behavior is not a showstopper; I will just use another hashing algorithm.

And again, thanks for this awesome library.

BR ladi

Answered by martindurant

martindurant · 2024-01-23T16:40:36Z

Maintainer

fsspec does not do any checksumming of its own. The method checksum could perhaps be better named "UID" or similar: it is a value based on whatever information the target storage provides for the path in question. For some backends that information includes read checksums (e.g., S3 or GCS), for others it does not (e.g., the local filesystem), and for some it might be different on every call (e.g., HTTP responses).

In addition, some backends verify checksums on write, in which case fsspec does calculate a checksum of the bytes as they are written.
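
To illustrate why a value derived from storage metadata cannot be compared across backends, here is a minimal sketch. It is not fsspec's code: `metadata_token` is a hypothetical helper, and the two metadata dictionaries are made up to resemble what a local filesystem and an object store might report for the same five bytes.

```python
import hashlib
import json


def metadata_token(info: dict) -> int:
    """Turn a metadata dict into an integer ID (hypothetical helper, not fsspec code)."""
    # Serialize deterministically so that equal dicts always give the same token.
    payload = json.dumps(info, sort_keys=True, default=str).encode("utf-8")
    return int(hashlib.sha256(payload).hexdigest(), 16)


# Made-up metadata for the *same* bytes, as two different backends might report it.
local_info = {"name": "/data/a.bin", "size": 5, "mtime": 1706012925.0}
object_store_info = {
    "name": "bucket/a.bin",
    "size": 5,
    "ETag": '"5d41402abc4b2a76b9719d911017c592"',
}

# Same content, different metadata: the tokens do not match.
print(metadata_token(local_info) == metadata_token(object_store_info))  # False
# Same metadata: the token is stable.
print(metadata_token(local_info) == metadata_token(dict(local_info)))  # True
```

A token like this is useful for detecting that a path has changed on one backend (for cache invalidation, say), but it says nothing about whether two paths on different backends hold the same bytes.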

ladidadida Jan 23, 2024
Author

Hi @martindurant, thank you for the fast reply.

This is what I expected. Thank you for the clarification.
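
For readers who land here with the same sync use case, the "use another hashing algorithm" route from the question could look roughly like the sketch below. It assumes only that the filesystem object has an `open(path, "rb")` method returning a file-like object, which fsspec filesystems do. The names `content_digest` and `same_content` are hypothetical, and the SHA-256 default and 1 MiB chunk size are arbitrary choices.

```python
import hashlib


def content_digest(fs, path, algorithm="sha256", chunk_size=1 << 20):
    """Hash the bytes stored at `path`, reading through a filesystem object.

    `fs` is anything with an fsspec-style `open(path, "rb")` method.
    """
    digest = hashlib.new(algorithm)
    with fs.open(path, "rb") as f:
        # Read in chunks so large files are never loaded into memory at once.
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def same_content(fs_a, path_a, fs_b, path_b):
    """Return True if both paths hold identical bytes, regardless of backend."""
    return content_digest(fs_a, path_a) == content_digest(fs_b, path_b)
```

With fsspec installed, this could be called as `same_content(fsspec.filesystem("file"), "/tmp/a.bin", fsspec.filesystem("memory"), "/a.bin")`. The trade-off is that every byte has to be read from both sides, so for remote storage it may be worth comparing sizes first and hashing only when they match.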
