Add docker_storage_to_overlay2 role #5216
Conversation
```yaml
    state: started

- name: Check if storage driver is already overlay2
  shell: docker info | grep --ignore-case 'storage driver:.*overlay2'
```
We try really hard to not use `shell`. It's a very, very powerful module, which, security-wise, is a bit scary.

More information from the openshift-ansible best practices guide:
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#The-Ansible-command-module-SHOULD-be-used-instead-of-the-Ansible-shell-module

Could we instead use `command`?
Something like:

```yaml
- name: Check if storage driver is already overlay2
  command: docker info
  register: storage_driver_check
  changed_when: False
  ignore_errors: yes

- name: Transition storage driver to overlay2
  when: not (storage_driver_check.stdout | lower | search('storage driver:.*overlay2'))
  block:
    # [... snip ...]
```
Ah ha, thanks for the link. I'll rework to avoid `shell`.
```yaml
  with_items: "{{ service_names }}"

- name: Check if storage driver is now overlay2
  shell: docker info | grep --ignore-case 'storage driver:.*overlay2'
```
Same as above, consider switching to `command` instead of `shell`.
```yaml
    state: absent
  with_items:
    - "EXTRA_STORAGE_OPTIONS"
    - "EXTRA_DOCKER_STORAGE_OPTIONS"  # Deprecated
```
We use options like `dm.basesize` that I believe should work on both overlay and devicemapper. Maybe allow us to pass these options in? We need this option to be set (and possibly others).
`dm.anything` is devicemapper-specific, according to the docs. The Docker daemon choked on it when I tried to pass it for overlay2:

```
$ systemctl status -l docker
...
... fatal_msg="Error starting daemon: error initializing graphdriver: overlay2: Unknown option dm.basesize\n"
... Failed to start Docker Application Container Engine.
...
```

The only documented option for overlay2 is `overlay2.override_kernel_check`, which we don't need.
Ah, interesting. So are we losing the ability to specify basesize?
If so, that's not good. What will keep a single container from eating up all of our docker storage volume?
@rhatdan: ^^ Can you answer this?
I'm iterating on your initial devicemapper-to-overlay2 playbook.
@twiest Did some Googling on your question and found this Moby PR. The PR enables disk quota limits to be set per-container by way of a `--storage-opt size` option when using the overlay2 backend on an XFS filesystem, e.g. `docker run --storage-opt size=3g -it fedora /bin/bash`.

But the catch is that the XFS filesystem must be mounted with project quota (`pquota`) support instead of the `noquota` default. And I don't see a way to inject that mount option by way of `container-storage-setup`.

Does this help at all, or is a limit-per-container a non-starter?
@mbarnes that is the functionality we want, but we'd need to ensure that whatever is starting the containers supports it; in this case, that'd mean openshift-node / CRI-O. As for the `pquota` mount option, we'll need to have the container storage volume mounted via /etc/fstab, I assume, so we can work with that, I think.

@mrunalp Does CRI-O support setting `--storage-opt size=3g` on container start? We need a way to limit the amount of space a container can take on the container storage volume.
Submitted projectatomic/container-storage-setup#255 to see if we can get the `pquota` option into `container-storage-setup`.

The `docker-storage-setup.service` unit is what actually mounts the volume during boot, and I'd prefer to solve this in the tooling, but we might have to hack `/etc/fstab` as an interim solution.
Ah, ok, I didn't realize that.
I agree with you that it'd be better to fix in the tooling.
@twiest: I found what we need! "Add overlay2.size daemon storage-opt". But this feature was only just added in June and isn't available where we need it to be. I emailed you and @rhatdan about it.
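Once available, the daemon-level option would presumably be wired in along these lines; a hypothetical sketch using the `overlay2.size` option named above (file path and size value are illustrative):

```yaml
# Hypothetical: set a default per-container size limit via the daemon
# storage options (requires overlay2 on XFS mounted with pquota).
- name: Set a default container size limit
  lineinfile:
    path: /etc/sysconfig/docker-storage
    regexp: '^DOCKER_STORAGE_OPTIONS='
    line: 'DOCKER_STORAGE_OPTIONS="--storage-driver overlay2 --storage-opt overlay2.size=3G"'
```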
@mbarnes that's great, good find!
```yaml
#

- name: Transition Docker storage driver to overlay2
  hosts: "tag_Name_{{ cli_tag_name }}"
```
My guess is this needs to use the openshift-ansible inventory format.
Yeah, I had no idea how to write that part; I was copying off some other adhoc playbook. Can you elaborate on what the openshift-ansible inventory format looks like?
Yeah, this should either be `hosts: nodes`, or we'd have to initialize the groups, in which case it'd be `hosts: oo_nodes_to_config`.
Sorry, let me clarify: if we leave it in adhoc, `hosts: nodes` is the right thing to do. If we expect this to be supported for OCP use, we should initialize the groups in `playbooks/byo/openshift-node/docker_storage_to_overlay2.yml` and move the logic to `playbooks/common/openshift-node/docker_storage_to_overlay2.yml`.
Okay, thanks for the clarification. I'll make it `hosts: nodes` for now.
⬆️ Fixup commit addresses @twiest's comments. I managed to avoid the `shell` module. Other miscellaneous tweaks here too. I'm still not sure what to do with the […].
I took a closer look at the final state of the system after running this playbook, and found that […]. The contents of […]. Reading the https://github.com/projectatomic/container-storage-setup code, it looks like the setup tool is looking for a couple of additional parameters for […].
Update: Looks like the best way to set these is via […].
2nd fixup commit adds […]. Does that look like the configuration we want for […]?
FTR, I matched the generated configuration to what a subject-matter expert recommends.
@nalind PTAL
⬆️ Squashed the fixup commits and rebased on […].
⬆️ Added a test for […].
/test tox
⬆️ Squashed fixups and rebased again...
```yaml
#

- name: Transition Docker storage driver to overlay2
  hosts: nodes
```
Just checking -- this will also update the driver on the masters, right? Ops would like the storage driver configurations to be consistent across masters and nodes.
@dak1n1 Ah okay, it wasn't clear to me what to use for `hosts`. Should this then be the following?

```yaml
hosts:
  - masters
  - nodes
```
```yaml
# XXX Is there no easier way to do this?
- name: Get {{ item.key }} version
  command: rpm --query --queryformat="%{version}-%{release}" "{{ item.key }}"
  args:
```
We have a module we prefer to use for this, `repoquery`. For example:

```yaml
- name: Get {{ item.key }} version
  repoquery:
    name: "{{ item.key }}"
    query_type: installed
    show_duplicates: True
  register: package_version
```

And I believe the data you want would be in `package_version.results.versions.available_versions_full`.
(This is more of a "nice to have" feature, btw. It doesn't have to hold up the stg testing.)
```yaml
# Note, this makes no attempt to migrate existing containers/images.
# It momentarily stops docker.service and uses "atomic storage reset"
# to wipe all local Docker storage.
#
```
I just noticed this comment. How about if this playbook was a role instead, so Ops could call it after we migrate containers onto new nodes? Then we could control how many hosts have their storage wiped at once, and make sure users' apps remain running on other nodes. (Otherwise this would be too much of an outage to do in production). @mwoodson @jupierce thoughts?
@sdodson Do you guys have an existing framework we could plug destructive roles like this into? For example, it would drain 10% of nodes at a time, and then run whatever role you need. Like a yum update+reboot role, or this docker storage reset role.
(I have something that I use for that, but I'm supposed to check upstream before making custom playbooks).
@sdodson unping. Turns out we have approval to just use my playbook as a wrapper for this.
@dak1n1 Does your playbook just wrap the group evaluation, or have you moved this into a role?
```yaml
  command: atomic storage reset

- name: Capture storage setup configuration
  command: cat "{{ storage_setup_path }}"
```
Another "nice to have"... Ops likes everything to be in module form if possible, rather than commands. The `slurp` module will do what you need for this task: https://docs.ansible.com/ansible/latest/slurp_module.html
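A minimal sketch of the `slurp`-based replacement; the `register` variable name is illustrative (the change the PR actually adopted appears in a later diff hunk):

```yaml
- name: Capture storage setup configuration
  slurp:
    src: "{{ storage_setup_path }}"
  register: storage_setup_config

# slurp returns the file content base64-encoded, so decode it before
# inspecting it, e.g. "{{ storage_setup_config.content | b64decode }}"
```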
@dak1n1 ⬆️ Addressed all the "nice to haves". Awaiting guidance on whether/how to convert this to a role, like you mentioned above.
```yaml
    that: ansible_distribution_version | version_compare("7.4", "ge")
    msg: "The OverlayFS file system only works with SELinux in {{ ansible_distribution }} 7.4 or later."

- name: Get "{{ item.key }}" version
```
Wrap these two tasks in a block that only fires when a quota is specified, so that they can migrate to overlayfs without quotas if they don't meet the version requirements?

```yaml
- block:
    # [... the two version-check tasks ...]
  when: disk_quota | length > 0
```
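Concretely, combining this with the version-check tasks visible in the diff hunk above, the guarded block might look like the following sketch; `disk_quota` comes from the comment and the assert fields from the quoted hunk, while the task names and ordering are assumptions:

```yaml
- block:
    # Only enforced when a quota is requested, per the suggestion above.
    - name: Verify OverlayFS + SELinux support
      assert:
        that: ansible_distribution_version | version_compare("7.4", "ge")
        msg: "The OverlayFS file system only works with SELinux in {{ ansible_distribution }} 7.4 or later."

    - name: Get "{{ item.key }}" version
      # [... snip ...]
  when: disk_quota | length > 0
```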
Good idea
⬆️ Implemented @sdodson's suggestion. Working now on converting this all to a role so it can be wrappered for ops.
Please weigh in on this. This is the migration from device-mapper to overlay for docker.
It's looking good. Added a few notes.
```diff
@@ -0,0 +1,3 @@
---
container_storage_lvname: "docker-root-lv"
container_storage_lvsize: "100%FREE"
```
Googling this gives some weird results!
```yaml
- name: Capture storage setup configuration
  slurp:
    src: "{{ storage_setup_path }}"
```
```yaml
service_names: []

optional_service_names:
  - "docker.service"  # } One or the other
```
No change needed, but openshift_facts does set `openshift.docker.service_name`, which will expand to either `docker.service` or `container-engine.service`.
👍 Nice, that's handy. Also see my question above about `docker info`.
```yaml
- name: Start docker.service
  service:
    name: docker
```
`openshift.docker.service_name` might be used here if `container-engine` is also supported.
Yeah, about that. What would be the equivalent to `docker info` when using the `container-engine` system container?

In my playbook that uses this role, I run `docker info` at the beginning and extract the "Storage Driver" value. If it's already "overlay2" I skip the whole thing, so the playbook is idempotent. I'd like to preserve that feature when using `container-engine`.
`docker info` should be the same. The `docker` client is installed and will use the `container-engine` service.
I really want to see this make it into the installer. This would make using overlay, which is becoming common, much easier.
```diff
   service:
-    name: docker
+    name: "{{ openshift.docker.service_name }}"
```
👍
```diff
-optional_service_names:
+docker_storage_to_overlay2_optional_service_names:
   - "docker.service"           # } One or the other
   - "container-engine.service" # } must be present
```
Note that when `origin` is installed, the service is called `origin-node.service`. You can use `{{ openshift.common.service_type }}` to get the proper one.
General question: Does Ansible (or […])?
@mbarnes here's the task we use for this elsewhere; you don't need the waiting-for-the-API part.
@mbarnes Check out […].
@giuseppe PTAL as well.
I think we should support overlay2, but I don't think we should support it in an ad-hoc manner. I think this should be a supported storage option, in line with how we support other storage options.

We should not do any hard-removing of existing storage. This should be an up-front decision, and if a node needs to go from one docker storage type to another, that node needs to be reconfigured from scratch.

Let's hold off on this, and pick back up when we start work on 3.8.
```diff
@@ -0,0 +1,3 @@
---
dependencies:
- openshift_facts
```
I'm hoping to refactor all of the docker portions out of openshift_facts. That almost shipped this cycle, but slipped due to feature freeze.
```diff
@@ -0,0 +1,19 @@
---
```
We're stripping out the docker portions of openshift_facts. I agree, this doesn't need to be its own role.
Just to add context here, I'm writing this for the OpenShift Online operations team as a prerequisite for moving to CRI-O, and for the Online clusters I wrappered this role with a playbook that drains each node before doing the storage conversion. I was planning on trying to adapt that playbook for inclusion here, unless you think it's the wrong approach.
```diff
@@ -0,0 +1,19 @@
---
```
Yes, my intention is to remove everything under `openshift.docker` and remove all `openshift.x` variables from the docker roles.
```diff
@@ -0,0 +1,32 @@
---
```
My preference is to put things in `defaults/main.yml` unless there's a specific reason not to.
I think it's best to leave that kind of playbook in contrib or a related repo. I don't think we should have playbooks that wipe away containers and storage, as that just opens the door for users to misapply them. These plays would carry a large support burden to build in tons of safeguards, and any regressions could have catastrophic effects. I would assume these nodes need to be drained, configured, and re-added, so someone might as well rebuild the node, or add new nodes and retire the old ones.
@mbarnes let @michaelgugino know when you are ready to have this re-reviewed.
I think I fixed or commented on everything mentioned, so it's ready for re-review. I still think it makes sense to merge the […].

I wasn't privy to all the discussions about this last week. Is current thinking that just the role is being considered, or should I resume trying to add some node-draining wrapper playbook like what I'm currently using for OpenShift Online?
https://trello.com/c/IGCV4VRH/561-add-devicemapper-to-overlayfs-migration-playbook card for CL team to track getting this into 3.9 in a supported manner.
I added this topic as a discussion item for this week's architecture meeting.
Don't forget to explicitly invite @mbarnes to the meeting!
What's the status of this PR post-meeting?
Reviewing my notes, we said we'd put in a card to deliver this for 3.9 in a supported manner: https://trello.com/c/IGCV4VRH/561-3-add-devicemapper-to-overlayfs-migration-playbook
@sdodson ok thanks!
Returns the current Docker storage driver according to `docker info`. Sets fact: `docker_storage_driver`.
This role changes the Docker storage driver to overlay2. Note, this makes no attempt to migrate existing containers/images. It momentarily stops `docker.service` and uses `atomic storage reset` to wipe all local Docker storage.
@mbarnes: The following tests failed, say `/retest` to rerun them all:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
I'd recommend closing this, as it's been over a year.
Master branch is closed! A major refactor is ongoing in devel-40. Changes for 3.x should be made directly to the latest release branch they're relevant to and backported from there.
This adds an adhoc playbook to change the Docker storage driver to overlay2.

Note, this makes no attempt to migrate existing containers/images. It momentarily stops `docker.service` and uses `atomic storage reset` to wipe all local Docker storage.

Tested on: […]

Some necessary changes for OverlayFS to support SELinux security labels landed in RHEL 7.4, so I had the playbook do a version check for `RedHat`/`CentOS` distributions. Presumably this will work on CentOS 7.4 once it's released.
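For orientation, a rough shape of the adhoc play described above, assembled from the diff hunks quoted earlier in the review; the task ordering and the snipped steps are assumptions:

```yaml
---
- name: Transition Docker storage driver to overlay2
  hosts: nodes
  tasks:
    - name: Check if storage driver is already overlay2
      command: docker info
      register: storage_driver_check
      changed_when: False

    # [... assumed remaining steps, per the hunks quoted above:
    #      distribution version check, service stop,
    #      "atomic storage reset", storage reconfiguration,
    #      service restart ...]
```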