A solution for podman containers max log size #100
Comments
Hey, thanks, and excellent write-up. I would put this in its own folder called container-common, or something like DNS common. Also, make sure to update the README.md to describe it, maybe even highlight it in red, and update other READMEs if needed. If you can get the script in, I can help with that as well. Thanks!
Hello John, I have raised PR #102 with the proposed changes. Let me know if any additional commits are required or wanted. Thanks, PK
* Added container-common

  Initial release of container-common section that includes setting a limit of container log size any container can have, to prevent filling up UDM storage with excessive logging.

* Update README.md

  Clarified description of max log size

Co-authored-by: TRUPaC <[email protected]>
Merged!
Sometimes the logs can take over 20GB in a few weeks. Setting log_size_max to 1GB should avoid running out of disk space a few times per week.

The feature was added to the podman containers.conf file in the podman 2.2.0 release [1], but on CentOS 7 the podman version is below 2.2.0. According to the libpod.conf man page [2], the option should also be available in podman 1.6.4, but there it is located in the libpod.conf file. More info [3].

[1] https://github.com/containers/podman/releases/tag/v2.2.0
[2] https://manpages.debian.org/unstable/podman/libpod.conf.5.en.html
[3] unifi-utilities/unifios-utilities#100

Change-Id: Ic6d01e11606c9526d1880583876d76c4415250ac
After a while, the service logs can become really huge. This change limits the log file size to 1GB.

The feature was added to the podman containers.conf file in the podman 2.2.0 release [1], but on CentOS 7 the podman version is below 2.2.0. According to the libpod.conf man page [2], the option should also be available in podman 1.6.4, but there it is located in the libpod.conf file. More info [3].

[1] https://github.com/containers/podman/releases/tag/v2.2.0
[2] https://manpages.debian.org/unstable/podman/libpod.conf.5.en.html
[3] unifi-utilities/unifios-utilities#100

Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/28529
Change-Id: Ia6071e5214644bdd126cf696cd437c140fa95c94
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31675
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31690

Here are the stashed commits from the common 3.8.3 tag.

    git format-patch -N 20d7af3..origin/3.8
    git am *.patch

There were some conflicts that have been fixed manually.

Remove Opensearch Dashboards autologin feature
After moving to Keycloak, such feature is not required.

Fixes - After d/s upgrade
- logprocessing clean of old components
- opensearch-dashboard and opensearch use CA chain ca-trust
- add sf_purgelogs_additional_params vars (mount additional volume)

Set host network binding for some services and containerized tools
Almost all containers that we are starting in Software Factory are using host binding.

Render zuul_api_url as python list
The logscraper tool gets zuul_api_url parameter as a list and there can be multiple values provided.

Change url path for Opensearch Dashboards
The new URL will not use autologin feature.

Add condition to verify that stdout item exists
The item might not exist when infrastructure is updated each time when Software Factory is released.

sf-keycloak: quote passwords in parameters
Passwords may include special characters that break command lines.

Add option gerrit_use_truststore

Enable increase innodb_log_file_size and innodb_buffer_pool_size
After increasing parameters, some queries performed by Zuul are working faster. This change is mostly helpful for those Zuul deployments where some scripts are making a complicated query with many job_name variables to Zuul web to receive latest build results and the SQL "inner join" takes a long time.

Ensure backup dir exists; change backup host
After changing service name from Kibana to Opensearch Dashboards, when the arch.yaml file was not updated to new values, the backup directory for opensearch-dashboards service might not be available on the host.

Use new mysql container version
Depends-on: https://softwarefactory-project.io/r/c/containers/+/27429

Adding conditional for zuul-web check on grafana postconfig stage

Add debug flag for purgelogs; remove :Z flag for log dir in purgelogs
The log directory might have a lot of files, so restarting the purgelogs script might take ages until the SELinux labeling is done. Also added debug flag parameter into the purgelogs service to see removal progress logs.

Logserver trailing slash fix
This change fixes the trailing slash problem raised by OSP CI team. The issue is due to requests not working when made to logserver without an ending trailing slash.

Mount MariaDB cache dir
Without mounting the cache dir, the container delta overlay dir might be very big.

Change retention policy in influxdb; increase buffer
This commit fixes various issues related to the telegraf and influxdb errors:

    Metric buffer overflow; 831 metrics have been dropped

Also changed retention policy to wipe data after 4 weeks.

Update purgelogs container image
The new purgelogs container image will provide log messages about its progress.

config-repo: Pull centos image from quay rather than registry.centos.org
registry.centos.org seems down, investigation pending. This breaks config-update jobs, which rebuild containers defined in the config repo. In the meantime, switch to quay.io for pulling.

zuul-web: mount /var/lib/zuul/
When a connection requires a SSH key, it is stored in /var/lib/zuul/.ssh - which isn't exposed to zuul-web, resulting in errors when the configuration is loaded.

Use zuul-executor-ubi-sf38 to benefit last managesf release
See https://softwarefactory-project.io/cgit/containers/commit/images-sf/3.8?id=87dea1ceae4719e48193e85a8bc7fdfd5553216f

Set log_size_max size for podman logs
The service logs after a while can be really huge. This change is limiting log file size to 1GB.
The feature has been added into the podman containers.conf file in podman 2.2.0 release [1], but on CentOS 7, version is below 2.2.0. According to the libpod.conf man [2], that option should be also available in podman 1.6.4, but it is located in libpod.conf file. More info [3].
[1] https://github.com/containers/podman/releases/tag/v2.2.0
[2] https://manpages.debian.org/unstable/podman/libpod.conf.5.en.html
[3] unifi-utilities/unifios-utilities#100
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/28529

Use managesf-sf38 last container image; drop encoding parameter in managesf
The "encoding" parameter is raising an error on starting managesf service.

Ensure nodepool services are restarted when config files are updated
Nodepool services must be restarted when labels are added

zuul/nodepool: bump to the latest version (10.0.0)
This change sets the ansible_root zuul.conf variable to avoid ansible installation on startup. Also bump MariaDB version 10.5 because of the renaming index feature (needed for Zuul DB Migration) not available in 10.3.
Depends-On: https://softwarefactory-project.io/r/c/containers/+/31361
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31362
Depends-On: https://softwarefactory-project.io/r/c/containers/+/31412

Provided fixes to enable mariadb upgrade from 10.3 to 10.5
Running the sfconfig --upgrade is then required.
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31390

arch allinone - add missing zuul-merger component

Update sf-gerrit to latest build
3.7.8-2 was built somewhat recently[1] and addresses a couple of CVEs.
[1] https://quay.io/repository/software-factory/gerrit-sf38?tab=tags

Add --golden-tests feature to validate generated playbooks
This change enables testing the deployment playbooks without installing sf-config.
Run with:

    PYTHONPATH=$(pwd) python3 ./sfconfig/cmd.py \
      --golden-tests ./refarch-golden-tests/ \
      --arch ./refarch/softwarefactory-project.io.yaml \
      --config ./defaults/sfconfig.yaml --share $(pwd)

Remove unused host_public_url facts
This change removes a fact that is no longer used.

Sort the /etc/hosts alias to avoid random update
This change ensures the /etc/hosts is defined in a fixed order

Combine zuul-executor and zuul-merger hosts in the generated deployment playbook
This change improves the deployment process by combining the common host into a single target so that the roles can be applied in parallel

Setup user_namespaces before the restore tasks
When restoring a backup on a fresh instance, make sure that the userns is configured to ensure the container can be created correctly.

Do not use the zuul_wrapper for restore tasks
When restoring a backup on a fresh instance, the zuul_wrapper command does not exist.

Restore zookeeper lib ownership after a restore
This change ensures the zookeeper setup is correct after restore.

Revert "Combine zuul-executor and zuul-merger hosts in the generated deployment playbook"

Change-Id: I1742905336af06de3d35814413932f7558317036
Problem description
Today I ran out of disk space on the UDM Pro's 12.2G storage area, which negatively impacted the UDM's original functionality.
Upon investigation, the disk was consumed by custom container logs (homebridge, hoobs) that had reached 8GB+.
The log file grew because one of the Homebridge addons kept printing an error that went unnoticed for a while.
Our custom containers are set up without any log rotation, and by default the logs grow without limit.
It is only a matter of time before each of us using custom containers runs into a full disk, unless a solution is added to limit the max log size a container can have and thus keep the disk usage footprint stable.
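To check how much space container logs currently take, something like the following can help. This is only a sketch: the function name `largest_logs` is made up for illustration, and the storage path in the usage comment is an assumption about where podman keeps per-container logs, so adjust it to your system.

```shell
#!/bin/sh
# Sketch: list the largest *.log files under a directory, biggest first.
# Sizes are in KiB as reported by du -k.
largest_logs() {
    log_root="$1"       # directory to search
    count="${2:-10}"    # how many entries to show (default 10)
    find "$log_root" -type f -name '*.log' -exec du -k {} \; 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}

# Hypothetical usage; the path below is an assumption, not taken from the issue:
# largest_logs /var/lib/containers/storage/overlay-containers 5
```

Running this periodically (or before and after applying a limit) gives a quick picture of which container is filling the disk.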
Research
According to the podman documentation and this issue, the ability to configure the max log size per container was added in podman 2.2.0, while as of today the UDM Pro runs podman 1.6.1.
There was a reference to changing the setting for all containers via the containers.conf file. This documentation covers the log_size_max property.
However, podman 1.6.1 does not yet support the containers.conf file; it was first mentioned in the release notes of podman 1.9.0.
I have investigated the source code of podman 1.6.1 and traced down the default setting to be coming from config file libpod.conf:
property:
The content of this config file is reset on reboot, so an extra early on_boot.d script makes it possible to change the default before the custom containers start.
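A minimal sketch of what such a boot script could do is shown below. It makes several assumptions that are not stated in the issue text: that the key in libpod.conf is named `max_log_size` (as in the libpod.conf man page), that the value is in bytes, and that the file lives at the path in the usage comment. The function name and the 100 MiB value are purely illustrative.

```shell
#!/bin/sh
# Sketch: rewrite an existing max_log_size setting in a libpod.conf-style file.
# Assumptions: key name max_log_size, value in bytes, path supplied by the caller.
set_max_log_size() {
    conf_file="$1"
    limit_bytes="$2"
    pattern='^[[:space:]]*max_log_size[[:space:]]*=.*'

    # Fail rather than guess where to add the key if it is not present.
    if ! grep -q "$pattern" "$conf_file"; then
        echo "max_log_size not found in $conf_file" >&2
        return 1
    fi

    # Write through a temp file, then copy back so the original file's
    # permissions and ownership are kept.
    tmp_file="${conf_file}.tmp.$$"
    sed "s/${pattern}/max_log_size = ${limit_bytes}/" "$conf_file" > "$tmp_file" \
        && cat "$tmp_file" > "$conf_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Hypothetical usage from an early on_boot.d script (path is an assumption):
# set_max_log_size /etc/containers/libpod.conf 104857600   # 100 MiB
```

Because the config file is reset on reboot, the script has to run on every boot and before any custom container is started, which is why a low-numbered on_boot.d entry is the natural place for it.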
You can verify that the setting has been applied by looking at the conmon parameters.
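One way to do that check is sketched below. It assumes that podman passes the limit to conmon as a separate `--log-size-max <bytes>` argument pair; the helper name `extract_log_size_max` is hypothetical, and the `ps` invocation in the usage comment may need adjusting for the `ps` available on the device.

```shell
#!/bin/sh
# Sketch: print the value that follows --log-size-max in each command line
# read from stdin. Assumes the flag and its value are separate arguments.
extract_log_size_max() {
    tr ' ' '\n' | awk 'prev == "--log-size-max" { print } { prev = $0 }'
}

# Hypothetical usage on a live system:
# ps ax -o args | grep '[c]onmon' | extract_log_size_max
```

If the limit was applied, each running custom container should report the configured byte count; a value of -1 would suggest the container was started before the default was changed.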
I have verified that the log is getting truncated by setting a limit of 10 kilobytes. It is not a rotation, but a truncation.
The unifi-os container does not write anything to the log, so UniFi itself does not suffer from the lack of log size limits; only the additional custom containers do.
Proposal
Set a limit on the container log size.
Would the above be a strategic approach worth a Pull Request?
If so, what would be the right place in this repo to place a generic script like this that impacts all containers?