Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cope with atomic upgrade overwriting /etc/sysconfig/docker-storage-setup #271

Merged
merged 9 commits into from
May 22, 2018

Conversation

rhvgoyal
Copy link
Collaborator

Our current default driver is "devicemapper" and recently it was changed to "overlay2". Normally it works as over upgrades rpms don't overwrite /etc/sysconfig/docker-storage-setup. (It is not a config(noreplace) file).

But atomic does not understand it and overwrites this file over upgrade. That means we are left with a scenario where a devicemapper thin pool is already configured but /etc/sysconfig/docker-storage-setup
has STORAGE_DRIVER=overlay2.

This is a corner case. Deal with it. During start, bring up thin pool if one is already configured. And during reset, continue to reset if there is a configured thin pool.

There are two call sites calling same logic and soon I am going to
introduce third one. So it makes sense to move this logic in a function
and call that function from every callsite. No functionality change.

Signed-off-by: Vivek Goyal <[email protected]>
Right now we continue processing after bringing up existing thin pool.
Only reason to continue proessing seems to be that we might want to
enable/disable auto thin pool extension.

So call thin pool extension logic after bringing up thin pool and return.

This will allow us to replace this logic with a helper function and
at the same time it is easier to understand.

Signed-off-by: Vivek Goyal <[email protected]>
Add an helper to figure out if thin pool is managed by container-storage-setup
or not. 

Signed-off-by: Vivek Goyal <[email protected]>
This will allow us to call this logic early when storage driver has changed
in atomic due to upgrade issues. Also makes it little easier to read code.

Signed-off-by: Vivek Goyal <[email protected]>
Ideally, STORAGE_DRIVER should not change while storage is configured. If
this changes, we detect this and give info message and exit.

Atomic upgrade does not have mechanism to handle /etc/sysconfig/* files
which are created/modified by rpm posttrans sections. As long as rpm 
drops a file, it is treated as equivalent of "config(noreplace)" directive
of rpm spec files and that file overwrites the previous file over upgrade.
(assuming user did not touch the previous /etc/sysconfig/docker-storage-setup).

Recently we changed default storage driver to overlay2.  And over atomic
upgrade, previous config file is overwritten with new content with
overlay2 driver. If there is already configured thin pool, container
storage setup does not like this and exits and does not wait for thin
pool to come up. And that means docker might not find thin pool and 
error out.

I think this is a deficiency of atomic upgrade mechanism that it can't
handle files dropped by %posttrans section of spec file and treats
any file dropped by rpm as %config(noreplace).

For now, let us create a hack path where we bring up thin pool if it
is already configured (found in /etc/sysconfig/docker-storage), even
if storage driver has changed in /etc/sysconfig/docker-storage-setup. 

Signed-off-by: Vivek Goyal <[email protected]>
If there is already configured storage, parse these options little early.
We want to use it in reset storage path as well soon.

Signed-off-by: Vivek Goyal <[email protected]>
During reset we look at /etc/sysconfig/docker-storage-setup and figure
out which driver is in use and reset thin pool accordingly.

But now atomic upgrade seems to overwrite /etc/sysconfig/docker-storage-setup
and changes STORAGE_DRIVER=overlay2. That means reset path is broken.

If /etc/sysconfig/docker-storage is still around (and not corrupt), look
at it and find out configured thin pool and associated VG and reset
that. If that information is not present then look at
/etc/sysconfig/docker-storage-setup and continue as usual.

Signed-off-by: Vivek Goyal <[email protected]>
/etc/sysconfig/docker-storage-setup is overwritten over atomic upgrade
with STORAGE_DRIVER=overlay2. Make sure service still runs over reboot
and exits with status code 0.

Signed-off-by: Vivek Goyal <[email protected]>
atomic upgrades overwrite /etc/sysconfig/docker-storage-setup with
new driver STORAGE_DRIVER=overlay2. Make sure old thin pool can still
be reset, despite this change.

Signed-off-by: Vivek Goyal <[email protected]>
@rhvgoyal
Copy link
Collaborator Author

cc @cgwalters @dustymabe @rhatdan

@cgwalters
Copy link
Member

The unhappiness with how ostree handles this in the commit messages aside, this all looks good to me.

(Remember, terminology gets tricky here because today you can do e.g. rpm-ostree override replace ./container-storage-setup.x86_64.rpm - and that still has the same semantics around %posttrans. It's RPM - but handled in a new way. Among other advantages over yum for example, it's still fully transactional. Hence, I tend to use the term "classic" or "yum-managed" systems, and "ostree"/rpm-ostree (or atomic) for the latter.)

@rhatdan
Copy link
Member

rhatdan commented May 22, 2018

LGTM. but this is a lot of changes.

@rhatdan rhatdan merged commit bc81071 into projectatomic:master May 22, 2018
@rhvgoyal
Copy link
Collaborator Author

@cgwalters I don't mind new way of doing things as long as it is well published what one can do. Problem is that we are used to do things rpm way and suddenly come to know that what we can do in rpm, is not compatible with rpm-ostree stuff. And suddenly we are hacking to code to work it around after users have reported issues.

To me this all should have been analyzed and addressed when we were switching driver to overlay2. I am not sure why did we let it go and did not think more about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants