Error with multipart upload to Azure storage: BlockID not a valid base64 string #653

Closed
joaniegannon opened this issue Jun 17, 2024 · 4 comments

@joaniegannon

Hello all,

I have s3proxy set up to point to an Azure storage endpoint. All was well until I attempted to upload a relatively large mp4 file (about 400 MB), at which point the multipart upload failed with the following error (with confidential info removed):

Cannot retry after server error, command is not replayable: [method=org.jclouds.azureblob.AzureBlobClient.public abstract void org.jclouds.azureblob.AzureBlobClient.putBlock(java.lang.String,java.lang.String,java.lang.String,org.jclouds.io.Payload)[XXXXXXXXXXX, XXXXXXXXXXX/666b2296638fe0de53a54633.mp4, AAK_IA==, [content=true, contentMetadata=[cacheControl=null, contentDisposition=null, contentEncoding=null, contentLanguage=null, contentLength=8388608, contentMD5=null, contentType=application/unknown, expires=null], written=false, isSensitive=false]], request=PUT https://XXXXXXXXXXX.blob.core.XXXXXXXXXXX.net/XXXXXXXXXXX/666b2296638fe0de53a54633.mp4?comp=block&blockid=AAK_IA%3D%3D HTTP/1.1]

After some further digging, I found out that Azure was sending back a 400 to the proxy complaining about the blockid query param not being base64 encoded:
</Message><QueryParameterName>blockid</QueryParameterName><QueryParameterValue>AAK_IA==</QueryParameterValue><Reason>Not a valid base64 string.</Reason></Error>

Indeed, that blockid AAK_IA== doesn't look like what I would expect for a base64 encoded string.


Repro Steps/Logs:

I cloned and stood up an instance of the latest version of s3proxy locally and ran the following command from the AWS CLI:

aws s3 cp 666b2296638fe0de53a54633/666b2296638fe0de53a54633.mp4 s3://XXXXXX/666b2296638fe0de53a54633/666b2296638fe0de53a54633.mp4 --endpoint-url=http://localhost:8050

Got the following response:

upload failed: 666b2296638fe0de53a54633/666b2296638fe0de53a54633.mp4 to s3://XXXXXXX/666b2296638fe0de53a54633.mp4 An error occurred (BadDigest) when calling the UploadPart operation (reached max retries: 4): Bad Request

Here's the relevant chunk of the debug trace from s3proxy (with confidential info removed):

[s3proxy] D 06-17 15:05:43.418 S3Proxy-Jetty-51 jclouds.headers:56 |::] >> PUT https://XXXXXXXXXXX.blob.core.XXXXXXXXXXX.net/XXXXXXXXXXX/666b2296638fe0de53a54633.mp4?comp=block&blockid=AAK_IA%3D%3D HTTP/1.1
[s3proxy] D 06-17 15:05:43.418 S3Proxy-Jetty-51 jclouds.headers:56 |::] >> x-ms-version: 2017-11-09
[s3proxy] D 06-17 15:05:43.418 S3Proxy-Jetty-51 jclouds.headers:56 |::] >> Date: Mon, 17 Jun 2024 19:05:40 GMT
[s3proxy] D 06-17 15:05:43.418 S3Proxy-Jetty-51 jclouds.headers:56 |::] >> Authorization: SharedKeyLite XXXXXXXXXXX:XXXXXXXXXXX
[s3proxy] D 06-17 15:05:43.418 S3Proxy-Jetty-51 jclouds.headers:56 |::] >> Content-Type: application/unknown
[s3proxy] D 06-17 15:05:43.418 S3Proxy-Jetty-51 jclouds.headers:56 |::] >> Content-Length: 8388608
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Receiving response 235085911: HTTP/1.1 400 Value for one of the query parameters specified in the request URI is invalid.
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << HTTP/1.1 400 Value for one of the query parameters specified in the request URI is invalid.
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << x-ms-version: 2017-11-09
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << x-ms-error-code: InvalidQueryParameterValue
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << x-ms-request-id: 8a72448e-801e-0070-7ee9-c05393000000
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << Date: Mon, 17 Jun 2024 19:05:42 GMT
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << Content-Type: application/xml
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.headers:56 |::] << Content-Length: 415
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.wire:56 |::] << "[0xef][0xbb][0xbf]<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidQueryParameterValue</Code><Message>Value for one of the query parameters specified in the request URI is invalid.[\n]"
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.wire:56 |::] << "RequestId:8a72448e-801e-0070-7ee9-c05393000000[\n]"
[s3proxy] D 06-17 15:05:44.907 S3Proxy-Jetty-51 jclouds.wire:56 |::] << "Time:2024-06-17T19:05:42.9571502Z</Message><QueryParameterName>blockid</QueryParameterName><QueryParameterValue>AAK_IA==</QueryParameterValue><Reason>Not a valid base64 string.</Reason></Error>"
[s3proxy] W 06-17 15:05:44.907 S3Proxy-Jetty-51 o.j.a.s.h.AzureStorageClientErrorRetryHandler:74 |::] Cannot retry after server error, command is not replayable: [method=org.jclouds.azureblob.AzureBlobClient.public abstract void org.jclouds.azureblob.AzureBlobClient.putBlock(java.lang.String,java.lang.String,java.lang.String,org.jclouds.io.Payload)[XXXXXXXXXXX, XXXXXXXXXXX/666b2296638fe0de53a54633.mp4, AAK_IA==, [content=true, contentMetadata=[cacheControl=null, contentDisposition=null, contentEncoding=null, contentLanguage=null, contentLength=8388608, contentMD5=null, contentType=application/unknown, expires=null], written=false, isSensitive=false]], request=PUT https://XXXXXXXXXXX.blob.core.XXXXXXXXXXX.net/XXXXXXXXXXX/666b2296638fe0de53a54633.mp4?comp=block&blockid=AAK_IA%3D%3D HTTP/1.1]
[s3proxy] D 06-17 15:05:44.908 S3Proxy-Jetty-51 o.gaul.s3proxy.S3ProxyHandler:2971 |::] sendSimpleErrorResponse: 400 BadDigest Bad Request {}

Let me know if there is any more info that would be helpful to provide (configs, etc.) or if there is anything else I should try. I appreciate your time and effort :)

@twick00

twick00 commented Jul 26, 2024

I can confirm that it seems to fail with Azure in all cases where the blockId contains - or _. It seems Azure requires strict base64 encoding rather than base64url encoding.

I solved locally by forking and replacing the special handling of azureblob here. All Azure deployments since 2019 have had a new max blob size of 4000MiB so it doesn't really make a difference anymore to separate them. Maybe a flag to enable legacy handling?

Unfortunately, this will still fail once it gets to partNumber 248 (which is AAAA-A==) so this is just a temp solution but works for me for up to ~1.2gig uploads.

I can do a bit of work to support more, but ultimately the fix lies with jclouds and its use of BaseEncoding.base64Url here, which uses - and _.
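
To make the alphabet difference concrete, here is a minimal sketch that encodes the same integers with both encoders. It assumes the block ID is a 4-byte big-endian integer, which is what the values in this thread decode to (AAK_IA== decodes to 180000 and AAAA-A== to 248). It uses java.util.Base64 from the JDK rather than Guava's BaseEncoding, and the class and method names are made up for illustration; this is not the jclouds code.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

// Hypothetical demo class, not part of jclouds or S3Proxy.
final class BlockIdEncodingDemo {
    // Assumption: the block ID is a 4-byte big-endian integer.
    static byte[] toBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // base64url alphabet: uses '-' and '_' for the values 62 and 63.
    static String base64Url(int value) {
        return Base64.getUrlEncoder().encodeToString(toBytes(value));
    }

    // Standard base64 alphabet: uses '+' and '/' for the values 62 and 63.
    static String base64Standard(int value) {
        return Base64.getEncoder().encodeToString(toBytes(value));
    }

    public static void main(String[] args) {
        // 180000 and 248 hit the differing characters; 1 encodes identically in both.
        for (int value : new int[] {180000, 248, 1}) {
            System.out.printf("%d: base64url=%s base64=%s%n",
                    value, base64Url(value), base64Standard(value));
        }
    }
}
```

Only inputs whose encoding contains the values 62 or 63 differ between the two alphabets, which would explain why some parts upload fine and others are rejected.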

@twick00

twick00 commented Jul 31, 2024

After doing more investigation I believe the issue lies with Azure on this. Everything I've read indicates that blockid should accept base64url encoded strings despite the fact that - and _ are erroring.

I have reported this issue to Azure and hopefully they will resolve this on their end. Unfortunately, this API will remain practically unusable until Azure fixes it unless we can think of some other way to handle it.

Edit: just wanted to comment that Azure responded saying that the blockid property doesn't accept base64url encoding. It only accepts base64 encoding, which then needs to be url encoded. There is a difference.
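
As a rough illustration of "base64 encoding, which then needs to be url encoded", the sketch below converts a base64url block ID into standard base64 and then percent-encodes it for use as a query parameter value. The class and method names are hypothetical, it uses only JDK classes (URLEncoder.encode with a Charset needs Java 10 or later), and it is not the change made in apache/jclouds#208.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only, not the jclouds fix.
final class AzureBlockIdQueryDemo {
    // Decode a base64url string and re-encode the same bytes with the standard alphabet.
    static String toStandardBase64(String base64UrlBlockId) {
        byte[] raw = Base64.getUrlDecoder().decode(base64UrlBlockId);
        return Base64.getEncoder().encodeToString(raw);
    }

    // Percent-encode the standard base64 string so '+', '/' and '=' survive in a query string.
    static String toQueryValue(String standardBlockId) {
        return URLEncoder.encode(standardBlockId, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        for (String urlSafe : new String[] {"AAK_IA==", "AAAA-A=="}) {
            String standard = toStandardBase64(urlSafe);
            System.out.println(urlSafe + " -> " + standard + " -> blockid=" + toQueryValue(standard));
        }
    }
}
```

With this approach the failing ID from the original report would be sent as blockid=AAK%2FIA%3D%3D instead of blockid=AAK_IA%3D%3D.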

@gaul
Owner

gaul commented Aug 6, 2024

apache/jclouds#208 will address this. For now you can override jclouds.version with 2.7.0-SNAPSHOT and build S3Proxy to include this fix.

@gaul
Owner

gaul commented Oct 7, 2024

You might also try the new azureblob-sdk provider tracked by #606 which works today in master.

@gaul gaul closed this as completed Oct 7, 2024