feat: Add crc32c checksums to S3 Service #4533

Merged: 8 commits, Apr 26, 2024

JWackerbauer
Contributor

For #4527

@JWackerbauer JWackerbauer marked this pull request as draft April 25, 2024 20:41
@JWackerbauer
Contributor Author

PutObject is working but Multipart upload fails with code: "InvalidRequest", message: "Checksum Type mismatch occurred, expected checksum Type: null, actual checksum Type: crc32c".

This has me scratching my head a bit... I tried to remove the checksum header from multipart upload requests, but then I get the error code: "InvalidRequest", message: "Content-MD5 OR x-amz-checksum- HTTP header is required for Put Part requests with Object Lock parameters" again.

Could it be that multipart upload only supports the Content-MD5 header?

@Xuanwo
Member

Xuanwo commented Apr 26, 2024

This has me scratching my head a bit... I tried to remove the checksum header from multipart upload requests, but then I get the error code: "InvalidRequest", message: "Content-MD5 OR x-amz-checksum- HTTP header is required for Put Part requests with Object Lock parameters" again.

We need to specify x-amz-checksum-algorithm in CreateMultipartUpload: https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html#API_CreateMultipartUpload_RequestSyntax
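A minimal sketch of the header split this suggests, assuming CRC32C is the only algorithm in play: the algorithm is declared once on CreateMultipartUpload, and each UploadPart then carries its own checksum value. The function names and the tuple-based header representation are hypothetical and are not taken from the PR. The header names are the ones documented for the S3 API.

```rust
/// Headers for the CreateMultipartUpload request (hypothetical helper):
/// declare the checksum algorithm up front so later parts are accepted.
fn create_multipart_headers() -> Vec<(&'static str, String)> {
    vec![("x-amz-checksum-algorithm", "CRC32C".to_string())]
}

/// Headers for each UploadPart request (hypothetical helper):
/// carry the Base64-encoded CRC32C of that part's body.
fn upload_part_headers(checksum_b64: &str) -> Vec<(&'static str, String)> {
    vec![("x-amz-checksum-crc32c", checksum_b64.to_string())]
}
```

If the algorithm is not declared at creation time, the service sees a checksum type on the part that it was never told to expect, which would be consistent with the "expected checksum Type: null" error quoted above.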

@@ -821,6 +846,17 @@ pub struct OutputCommonPrefix {
    pub prefix: String,
}

pub enum S3ChecksumAlgorithm {
Member

We can remove the S3 prefix.

if let Some(checksum_algorithm) = self.checksum_algorithm.as_ref() {
    let checksum = match checksum_algorithm {
        S3ChecksumAlgorithm::Crc32c => {
            BASE64_STANDARD.encode(crc32c::crc32c(body.to_vec().as_slice()).to_be_bytes())
Member

Please don't call body.to_vec(), which involves an extra allocation and copy. Please call crc32c_append in a loop instead.
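A minimal sketch of the incremental approach, assuming the body can be iterated as a sequence of byte slices. To keep the example free of external crates, it uses a plain bitwise CRC32C whose signature mirrors `crc32c::crc32c_append(crc, data)`. It is a stand-in for illustration, not the crate's implementation, and `checksum_chunks` is a hypothetical helper.

```rust
/// Reflected CRC-32C (Castagnoli) polynomial.
const CRC32C_POLY: u32 = 0x82F6_3B78;

/// Bitwise software CRC32C that continues from a previous checksum value.
/// Stand-in for `crc32c::crc32c_append`; slow but dependency-free.
fn crc32c_append(crc: u32, data: &[u8]) -> u32 {
    // Undo the final inversion of the previous call to recover the running state.
    let mut crc = !crc;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones if the low bit is set, all zeros otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (CRC32C_POLY & mask);
        }
    }
    !crc
}

/// Fold the checksum over the chunks without concatenating them first.
fn checksum_chunks<'a>(chunks: impl IntoIterator<Item = &'a [u8]>) -> u32 {
    chunks
        .into_iter()
        .fold(0u32, |crc, chunk| crc32c_append(crc, chunk))
}
```

Feeding the chunks one at a time gives the same value as checksumming the concatenated bytes, so no intermediate `Vec` is needed. The resulting `u32` would then go through `to_be_bytes()` and Base64 encoding, as in the diff.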

BASE64_STANDARD.encode(crc32c::crc32c(body.to_vec().as_slice()).to_be_bytes())
ChecksumAlgorithm::Crc32c => {
    let mut crc = 0u32;
    body.clone()
Contributor Author

This clone should be okay, since it does not do any allocation (?)

Member

Yep, this clone is Arc::clone.
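A small illustration of why such a clone is cheap, assuming the body is backed by reference-counted bytes. `Arc<[u8]>` is used here as a stand-in, since the actual body type is not shown in this thread, and `share` is a hypothetical helper.

```rust
use std::sync::Arc;

/// Cloning an `Arc` bumps a reference count; the bytes are not copied.
fn share(body: &Arc<[u8]>) -> Arc<[u8]> {
    Arc::clone(body)
}
```

Both handles point at the same allocation, which can be checked with `Arc::ptr_eq`.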

    Crc32c,
}
impl ChecksumAlgorithm {
    pub fn to_header_key(&self) -> &str {
Member

@Xuanwo Xuanwo Apr 26, 2024

How about using to_header_name()? And I think we can use HeaderName here directly.
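A rough sketch of the renamed method, assuming a single `Crc32c` variant. It returns a `&'static str` to stay within the standard library. The suggestion to return a `HeaderName` directly would presumably use the `http` crate's type instead, which this sketch does not attempt.

```rust
pub enum ChecksumAlgorithm {
    Crc32c,
}

impl ChecksumAlgorithm {
    /// Name of the request header that carries the checksum value.
    pub fn to_header_name(&self) -> &'static str {
        match self {
            Self::Crc32c => "x-amz-checksum-crc32c",
        }
    }
}
```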

req = req.header(
    "x-amz-checksum-algorithm",
    match checksum_algorithm {
        ChecksumAlgorithm::Crc32c => "CRC32C",
Member

We can implement Display for ChecksumAlgorithm
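A minimal sketch of what that could look like, assuming the enum has only the `Crc32c` variant shown in the diff. It is an illustration, not the code that was merged.

```rust
use std::fmt;

pub enum ChecksumAlgorithm {
    Crc32c,
}

impl fmt::Display for ChecksumAlgorithm {
    /// Render the variant as the value expected in `x-amz-checksum-algorithm`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Self::Crc32c => "CRC32C",
        })
    }
}
```

With this in place, the inline `match` at the call site could be replaced by `checksum_algorithm.to_string()`.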

@Xuanwo Xuanwo changed the title Add crc32c checksums to S3 Service feat: Add crc32c checksums to S3 Service Apr 26, 2024
Member

@Xuanwo Xuanwo left a comment

Perfect, thanks!

@Xuanwo Xuanwo marked this pull request as ready for review April 26, 2024 08:35
@Xuanwo Xuanwo merged commit b1f6eb0 into apache:main Apr 26, 2024
211 checks passed
@JWackerbauer
Contributor Author

Awesome, Thank you too!

@@ -350,6 +350,7 @@ prometheus-client = { version = "0.22.2", optional = true }
tracing = { version = "0.1", optional = true }
# for layers-dtrace
probe = { version = "0.5.1", optional = true }
crc32c = "0.6.5"
Contributor

This dependency doesn't seem to be grouped cleanly like others.

Member

@Xuanwo Xuanwo May 13, 2024

Nice catch, would you like to send a PR to fix this? Also, I think this dep should be hidden under services-s3.
