Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support compression for Actions logs #31761

Merged
merged 25 commits into from
Aug 9, 2024

Conversation

wolfogre
Copy link
Member

@wolfogre wolfogre commented Aug 2, 2024

Support compression for Actions logs to save storage space and bandwidth. Inspired by #24256 (comment)

The biggest challenge is that the compression format should support seekable. So when users are viewing a part of the log lines, Gitea doesn't need to download the whole compressed file and decompress it.

That means gzip cannot help here. And I did research, there aren't too many choices, like bgzip and xz, but I think zstd is the most popular one. It has an implementation in Golang with zstd and zstd-seekable-format-go, and what is better is that it has good compatibility: a seekable format zstd file can be read by a regular zstd reader.

This PR introduces a new package zstd to combine and wrap the two packages, to provide a unified and easy-to-use API.

And a new setting LOG_COMPRESSION is added to the config, although I don't see any reason why not to use compression, I think's it's a good idea to keep the default with none to be consistent with old versions.

LOG_COMPRESSION takes effect for only new log files, it adds .zst as an extension to the file name, so Gitea can determine if it needs decompression according to the file name when reading. Old files will keep the format since it's not worth converting them, as they will be cleared after #31735.

image

@wolfogre wolfogre added type/feature Completely new functionality. Can only be merged if feature freeze is not active. topic/gitea-actions related to the actions of Gitea labels Aug 2, 2024
@wolfogre wolfogre added this to the 1.23.0 milestone Aug 2, 2024
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Aug 2, 2024
@pull-request-size pull-request-size bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 2, 2024
@github-actions github-actions bot added modifies/go Pull requests that update Go code modifies/dependencies labels Aug 2, 2024
modules/actions/log.go Outdated Show resolved Hide resolved
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Aug 6, 2024
@silverwind
Copy link
Member

silverwind commented Aug 7, 2024

And a new setting LOG_COMPRESSION is added to the config, although I don't see any reason why not to use compression, I think's it's a good idea to keep the default with none to be consistent with old versions.

Why not enable compression by default? Can the log reading code handle a mix of compressed and uncompressed data? If so, I see no reason for not defaulting to it.

@wolfogre
Copy link
Member Author

wolfogre commented Aug 8, 2024

Can the log reading code handle a mix of compressed and uncompressed data?

Yes, it works fine with that.

If so, I see no reason for not defaulting to it.

TBH, I also don't see a strong reason for no compression by default. I cannot think of any case where enabling compression would have a negative impact. But I have some concerns:

  1. zstd-seekable-format-go is not so popular. It needs more time to prove it's stable enough. If it has critical bugs like memory leaks or deadlocks, it will have a lesser effect to Gitea when LOG_COMPRESSION is set to none by default.
  2. I'm not familiar with tiny servers like Raspberry Pi, or other servers with limited memory and busy CPU, so I have no idea if compression could cause a negative impact for them.

So my idea is to introduce LOG_COMPRESSION in this PR, and set it to zstd by default in another one when we are sure it works well.


I opened a issue #31801 in case we forget to do that.

@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Aug 8, 2024
@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Aug 8, 2024
@wolfogre wolfogre merged commit 33cc583 into go-gitea:main Aug 9, 2024
26 checks passed
@wolfogre wolfogre deleted the feature/compress_actions_log branch August 9, 2024 02:10
@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Aug 9, 2024
@wolfogre
Copy link
Member Author

wolfogre commented Aug 9, 2024

I almost forget to update https://gitea.com/gitea/docs.

So please help review:

  1. https://gitea.com/gitea/docs/pulls/50, which updates the docs.
  2. Add label docs-update-needed for PRs that modify app.example.ini #31810, in case I forget again (maybe a label can still be easily overlooked, but it's better than having none).
  3. Fix typo for LOG_COMPRESSION in ini #31809, it fixes a typo in this PR, and it can be used to test the new labeler rule.

lunny pushed a commit that referenced this pull request Aug 9, 2024
Follow #31761

---------

Co-authored-by: silverwind <[email protected]>
zjjhot added a commit to zjjhot/gitea that referenced this pull request Aug 11, 2024
* giteaofficial/main:
  Show lock owner instead of repo owner on LFS setting page (go-gitea#31788)
  Move repository visibility to danger zone in the settings area (go-gitea#31126)
  [skip ci] Updated translations via Crowdin
  Add types to various low-level functions (go-gitea#31781)
  Add warning message in merge instructions when `AutodetectManualMerge` was not enabled (go-gitea#31805)
  Show latest run when visit /run/latest (go-gitea#31808)
  Fix typo for `LOG_COMPRESSION` in ini (go-gitea#31809)
  Add label `docs-update-needed` for PRs that modify `app.example.ini` (go-gitea#31810)
  Fix `IsObjectExist` with gogit (go-gitea#31790)
  Support compression for Actions logs (go-gitea#31761)
  Add issue comment when moving issues from one column to another of the project (go-gitea#29311)
  [skip ci] Updated translations via Crowdin
  Fix RPM resource leak (go-gitea#31794)
lunny pushed a commit that referenced this pull request Sep 9, 2024
Close #31801. Follow #31761.

Since there are so many benefits of compression and there are no reports
of related issues after weeks, it should be fine to enable compression
by default.
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Nov 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. modifies/dependencies modifies/go Pull requests that update Go code size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. topic/gitea-actions related to the actions of Gitea type/feature Completely new functionality. Can only be merged if feature freeze is not active.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants