File Upload API can not handle base64 binary correctly. #18368

Closed
webcrawls opened this issue Jan 23, 2022 · 2 comments
Labels
issue/confirmed (Issue has been reviewed and confirmed to be present or accepted to be implemented), type/bug

Comments

@webcrawls

webcrawls commented Jan 23, 2022

Gitea Version

1.15.10

Git Version

git version 2.30.2 (Docker host)

Operating System

Linux ns541627 5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux (Docker host)

How are you running Gitea?

Gitea runs in a Docker container using the gitea/gitea:latest tag, with docker-compose. Gitea is then reverse-proxied through an Nginx container.

Database

SQLite

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

https://gist.github.com/kadenscott/7dcb2446e2ada216408b8239ea52787a

Description

I have created a script to automate uploading files using the File Upload API. My script PUTs to the content location where I want to upload the file, with the output of base64 -w 0 {file} set as content in the JSON payload.

This is the exact endpoint I am PUTing to: https://try.gitea.io/api/swagger#/repository/repoUpdateFile

When uploading these files, however, the contents are completely mangled: &#65533; (the HTML numeric reference for U+FFFD, the Unicode replacement character) is inserted many times. The files are inflated by 4-5x because of this, and are entirely corrupted.

You can check out the repository here with some corrupted files: https://git.kaden.sh/kaden/test_65533/

This is the script I am using: https://gist.github.com/kadenscott/4f21f26e603e6ec67a178bc86bb3d9c6
(I have also tried writing this in Python, where the same issue occurs.)

The files I am uploading are Java jar files with UTF-8 encoding. Line 21 of the linked log shows that Gitea detected windows-1252 encoding, which is most definitely not what I'm using.

It appears the issue only occurs for certain files. In the linked repository, the arcovia-models.jar file is uploaded fine, and the Gitea logs detect encoding as utf-8 (fast) which is good. So, this only applies to the arcovia-chat.jar file for some reason.

Let me know if you need any more information. This issue has been plaguing me off and on for the past week and I'd love to help fix it!

Screenshots

No response

@wxiaoguang
Contributor

This seems to be a bug in Gitea: it doesn't handle uploaded binary data correctly.

content := opts.Content
if bom {
	content = string(charset.UTF8BOM) + content
}
if encoding != "UTF-8" {
	charsetEncoding, _ := stdcharset.Lookup(encoding)
	if charsetEncoding != nil {
		result, _, err := transform.String(charsetEncoding.NewEncoder(), content)
		if err != nil {
			// Look if we can't encode back in to the original we should just stick with utf-8
			log.Error("Error re-encoding %s (%s) as %s - will stay as UTF-8: %v", opts.TreePath, opts.FromTreePath, encoding, err)
			result = content
		}
		content = result
	} else {
		log.Error("Unknown encoding: %s", encoding)
	}
}

@wxiaoguang wxiaoguang changed the title from "File Upload API mangling base64 contents with #65533's." to "File Upload API can not handle base64 binary correctly." Jan 23, 2022
@wxiaoguang wxiaoguang added the type/bug and issue/confirmed labels Jan 23, 2022
@delvh
Member

delvh commented Jul 8, 2023

Is it working now, or why was this issue closed?

silverwind pushed a commit that referenced this issue Jul 12, 2023
…25828)

Related issue: #18368

It doesn't seem right to "guess" the file encoding/BOM when using API to
upload files.

The API should save the uploaded content as-is.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023