Skip to content
This repository has been archived by the owner on Mar 28, 2018. It is now read-only.

Devicemapper storage #997

Merged
merged 23 commits into from
Jun 22, 2017
Merged

Devicemapper storage #997

merged 23 commits into from
Jun 22, 2017

Conversation

amshinde
Copy link
Contributor

Check for devicemapper storage and use the underlying block device instead of overlay file system.

@amshinde amshinde force-pushed the devicemapper-storage branch 6 times, most recently from f984063 to 34ec45f Compare June 18, 2017 18:01
@amshinde
Copy link
Contributor Author

Will be adding in more unit tests later.

src/util.c Outdated
// Legacy naming scheme for virtio block devices
// Reference : https://github.com/torvalds/linux/blob/master/drivers/block/virtio_blk.c#L478
static int
virtblk_name_format(char *prefix, int index, char *buf, int buflen)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if prefix or buf are NULL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added checks for these

src/util.c Outdated
}

// Legacy naming scheme for virtio block devices
// Reference : https://github.com/torvalds/linux/blob/master/drivers/block/virtio_blk.c#L478
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - I see that you've copied this verbatim (with a very minor tweak to appease a compiler warning?) Don't we atleast need a copyright attribution here for the FSF?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@as per the discussion on this, I have updated the reference to include the git sha

Copy link

@grahamwhaley grahamwhaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice series @amshinde A few minor comments.

@@ -13,10 +13,6 @@ memory-backend-file,id=mem0,mem-path=@IMAGE@,size=@SIZE@
@KERNEL@

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit message s/diectory/directory/

src/mount.c Outdated
struct cc_oci_mount *m;

if (! config ) {
return false;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning 'false' in a func that returns a pointer feels odd - would a 'NULL' be more appropriate? @jodh-intel , do you have a preferred way in this codebase?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, as I see we do a NULL at the error: exit - so, probably change this one to NULL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this needs to be NULL. Fixing this.

src/mount.c Outdated
goto error;
}

g_warning("Rootfs bind mounting done");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silly question - should this be a warning? Seems we did what we wanted to - do we have the option of an info or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a debug message, will fix it

src/pod.c Outdated
@@ -32,6 +32,7 @@
#include "process.h"
#include "proxy.h"
#include "state.h"
#include "mount.h"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query: @jodh-intel do we tend to keep the includes alphabetical?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the order here

@@ -22,6 +22,7 @@
#include <errno.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit header:

  • missing space 'container.The'
  • missing space 'above.The'
  • missing space 'passthis'
  • fractured line '^name$'

Question: in the path '{runtime_dir}/cid/workload/cid/rootfs' - I wonder if there is a reason for the two cid parts - maybe it has something to do with pods, maybe I'll find a note in the code. Noting here before I forget.

src/mount.c Outdated
}

if (stat(path, &cur_stat) == -1) {
return NULL;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a g_warning for this one?

src/mount.c Outdated
*/

file = setmntent(PROC_MOUNTS_FILE, "r");
if ( !file) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bracket whitespace inconsistency?

@@ -229,6 +229,11 @@ cc_oci_get_config_and_state (gchar **config_file,
(*state)->procsock_path,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ' , ' whitespace in commit message, and this one is 'tests' and the previous one is 'test' :-)

src/util.c Outdated
@@ -740,6 +745,55 @@ get_random_bytes(uint num) {
return buf;
}

// Legacy naming scheme for virtio block devices

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note, these are the only C++ style comments in this file ;-)

src/util.c Outdated
char *p;
int unit;

p = end - 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, there is a mini mind melting algo - if we'd written it, I'd be asking for an explanatory comment - but, as it is a clone of the 'standard', that's fine.

@chavafg
Copy link
Contributor

chavafg commented Jun 19, 2017

qa-failed

Rejected with PullApprove

@chavafg
Copy link
Contributor

chavafg commented Jun 19, 2017

Got this functional test error:

not ok 3 exec false
# (from function `verify_runtime_dirs' in file tests/functional/common.bash, line 161,
#  from function `testcontainer' in file tests/functional/common.bash, line 227,
#  in test file tests/functional/exec.bats, line 53)
#   `testcontainer "$container_id" "running"' failed
# Running: timeout 5 /home/cloud/go/src/github.com/01org/cc-oci-runtime/tests/functional/../../cc-oci-runtime  --debug --root /tmp/tmp.DexA6eLXsi-cor-test --global-log /run/cc-oci-runtime/cc-oci-runtime.log --hypervisor-log-dir /tmp/tmp.JwlAyuE8r2-cor-test --shim-path "/home/cloud/go/src/github.com/01org/cc-oci-runtime/cc-shim" --proxy-socket-path "/home/cloud/go/src/github.com/01org/cc-oci-runtime/tests/functional/proxy.sock" run -d --bundle /var/lib/oci/bundle tests_id
# this is a test
# README
# README.rst
# common.bash
# common.bash.in
# data
# exec.bats
# help.bats
# kill.bats
# pause.bats
# proxy.sock
# run-functional-tests.sh
# run-functional-tests.sh.in
# start.bats
# state.bats
# version.bats
# README
# README.rst
# common.bash
# common.bash.in
# data
# exec.bats
# help.bats
# kill.bats
# pause.bats
# proxy.sock
# run-functional-tests.sh
# run-functional-tests.sh.in
# start.bats
# state.bats
# version.bats
# exit code: 0
# expected exit code: 0
# Running: /home/cloud/go/src/github.com/01org/cc-oci-runtime/tests/functional/../../cc-oci-runtime  --debug --root /tmp/tmp.DexA6eLXsi-cor-test --global-log /run/cc-oci-runtime/cc-oci-runtime.log --hypervisor-log-dir /tmp/tmp.JwlAyuE8r2-cor-test --shim-path "/home/cloud/go/src/github.com/01org/cc-oci-runtime/cc-shim" --proxy-socket-path "/home/cloud/go/src/github.com/01org/cc-oci-runtime/tests/functional/proxy.sock" list --format table | grep tests_id | grep running
# exit code: 0
# expected exit code: 0
# SUCCESS
# Test logs: /tmp/tmp.DexA6eLXsi-cor-test
# Global log file: /run/cc-oci-runtime/cc-oci-runtime.log
# Hypervisor log dir: /tmp/tmp.JwlAyuE8r2-cor-test
# bats status: 0
# bats output (lines array contains 7 lines):
# lines[0]: 'console.sock'
# lines[1]: 'ga-ctl.sock'
# lines[2]: 'ga-tty.sock'
# lines[3]: 'hypervisor.sock'
# lines[4]: 'process.sock'
# lines[5]: 'state.json'
# lines[6]: 'workload'
# BUNDLE_DIR: /var/lib/oci/bundle
# ROOTFS_DIR: /var/lib/oci/bundle/rootfs

Seems like lines[6]: 'workload' is the issue:

[ "${lines[6]}" = "" ] 

from https://github.com/01org/cc-oci-runtime/blob/master/tests/functional/common.bash.in#L161

@dlespiau
Copy link
Contributor

dlespiau commented Jun 19, 2017

This is quite a large change. Should we open master to 2.2.x and branch out a 2.1.x stable branch?

@grahamwhaley
Copy link

@dlespiau ack from me - good idea on the version bump

@jcvenegas
Copy link
Contributor

Running checkcommits version 0.0.1 (commit 2f18bd36aedcb97213dc794be2d5fb1306f9015e)
Detected TravisCI Environment
Found 21 commits between commit db9196b and branch master
Checking commit 4e96669
Checking commit e9ddabc
Checking commit 867e277
Checking commit 93b13a3
Checking commit 56bc092
Checking commit 33c8aba
Checking commit 00189f1
Checking commit 2035ae6
Checking commit 9b66cf2
Checking commit 9097610
Checking commit f97694e
Commit f97694e: single-line "Signed-off-by" body not permitted

Instead of having static options for the 9pfs directory
in hypervisor.args file, pass them dynamically.
We want to have the ability to change the storage options.

Signed-off-by: Archana Shinde <[email protected]>
@chavafg
Copy link
Contributor

chavafg commented Jun 19, 2017

qa-failed

@amshinde amshinde requested a review from sameo June 19, 2017 21:06
@chavafg
Copy link
Contributor

chavafg commented Jun 19, 2017

qa-failed

if (! volume) {
/* bind mount container rootfs. We do this for regular
* container as well for a pod sandbox.
*/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you're also bind mounting the rootfs for pod containers. I don't think this is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change that to include cc_pod_is_vm() check. Although this bind-mounting change with the added check doesnt seem to play well with crio. Still taking a look at this.

* \return \c Newly allocated buffer on success, NULL otherwise.
**/
uint8_t *
get_random_bytes(uint num) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You really want to make that one a wrapper on top of getramdom.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sameo Seems like this is new system call introduced in 3.17, would not be available in older kernels. The system call itself was added very recently in glibc: https://lwn.net/Articles/711013/.
I think reading from /dev/urandom is fairly standard way to get random data.

}

fsmap_arr = json_array_new();
for (l = config->oci.mounts; l && l->data; l = g_slist_next (l)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is only handling volumes, or both volumes and rootfs ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the volumes. The oci.mounts should contain only volumes right? config->rootfs_mounts should contain just the rootfs mounts if I have understood correctly.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For pods, that's correct. I see you changed that logic with that PR, and that sounds reasonable to me.

src/mount.c Outdated
return false;
}

if (! cc_is_devicemapper(major, minor)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make that one a little more generic, and have a cc_is_blockdevice() that could eventually also support other block device based graph drivers ?

src/mount.c Outdated
* \return \c true on success, else \c false.
*/
gboolean
cc_oci_check_for_devicemapper(struct cc_oci_config *config)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make that one a little more generic and call it e.g. cc_oci_rootfs_is_block_device()

src/util.c Outdated
@@ -740,6 +745,65 @@ get_random_bytes(uint num) {
return buf;
}

// Legacy naming scheme for virtio block devices
// Reference : https://github.com/torvalds/linux/blob/master/drivers/block/virtio_blk.c#L478
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific file happens to not have any explicit copyright owners, so we should not add any attribution to our file.
For people to check who wrote this code, please replace that comment with:

// Reference: linux/blob/master/drivers/block/virtio_blk.c @ <SHA1>

From there a git log will tell who wrote the code before we copied it. Additionally, potential future copyright owners will not be able to claim ownership of this code.

@sameo
Copy link

sameo commented Jun 20, 2017

@dlespiau The version bump makes sense, this is a significant change. As for the branch, opening a stable branch means this is something we'll have to maintain. I'd rather not be doing this.

@amshinde amshinde force-pushed the devicemapper-storage branch 3 times, most recently from e460237 to 05fc7bb Compare June 21, 2017 02:17
@amshinde
Copy link
Contributor Author

@sameo I have addressed your other review comments.

@sameo
Copy link

sameo commented Jun 21, 2017

LGTM

Approved with PullApprove

@chavafg
Copy link
Contributor

chavafg commented Jun 21, 2017

qa-failed

1 similar comment
@chavafg
Copy link
Contributor

chavafg commented Jun 21, 2017

qa-failed

@chavafg
Copy link
Contributor

chavafg commented Jun 21, 2017

qa-passed

Approved with PullApprove

@jcvenegas
Copy link
Contributor

@jodh-intel could you review this PR again ?

These will be needed to unmount the rootfs while stopping
the container.

Signed-off-by: Archana Shinde <[email protected]>
Now that we bind mount the container rootfs to a host directory,
pass that to hyperstart.

Signed-off-by: Archana Shinde <[email protected]>
This array tells hyperstart where to mount the mounts passed
in the shared directory.

Signed-off-by: Archana Shinde <[email protected]>
In case of a standalone container, unmount the rootfs from
the shared hyperstart directory.

Signed-off-by: Archana Shinde <[email protected]>
We perform the mounts on the hyperstart shared directory
on the host namespace. So do not try to join the mount
namespace while unmounting.

Signed-off-by: Archana Shinde <[email protected]>
We need to update the rootfs mount path and the workload dir
from the state file that were saved while creating the container.

Signed-off-by: Archana Shinde <[email protected]>
This function returns the major and minor numbers of the
underlying device on which the file resides.

Signed-off-by: Archana Shinde <[email protected]>
The function checks whether a device identified by its
major and minor numbers is a devicemapper block device.

Signed-off-by: Archana Shinde <[email protected]>
Given a path, the function returns the mount point to which
the path belongs. This is required for getting the device name
to which a file system path belongs.

Signed-off-by: Archana Shinde <[email protected]>
Given a mountpoint, the function returns the device name that
has been mounted at the mount point and the file system type
of the mount.

Signed-off-by: Archana Shinde <[email protected]>
If a block device is found, we store the device name of the block
device found. We also store the file system type and index.
We are currently using an incrementing counter for the index.
When we add support for CRIO with hot plugging drives, so that
containers with block drives can be added after the VM is started,
we should store the global index in a file and retrieve
it from there.

The index is needed to derive the drive name that we pass to qemu.
We need to pass the drive name and file system type to hyperstart.

Signed-off-by: Archana Shinde <[email protected]>
These will be needed while starting a container.

Signed-off-by: Archana Shinde <[email protected]>
We check for block storage for standalone container.
If block device is detected, we pass that to qemu.
Skip bind-mounting the rootfs in that case.

Signed-off-by: Archana Shinde <[email protected]>
For hyperstart to use the drive passed on qemu command line,
we pass the drive name and the file system type of the block
device. The drive name needs to be passed as image field
in the newcontainer json.
The drive name is deduced based on the index at which the drive is
passed to qemu.

Signed-off-by: Archana Shinde <[email protected]>
We no longer check for namespace and return true while
unmounting. Change test according to this.

Signed-off-by: Archana Shinde <[email protected]>
Now that workload_dir is required, fix the current state tests.

Signed-off-by: Archana Shinde <[email protected]>
The runtime directory should now include "workloads"
directory. Fix functional tests to address this.

Signed-off-by: Archana Shinde <[email protected]>
Copy link
Contributor

@jcvenegas jcvenegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed to update hyperstart (clearcontainers/hyperstart@2846708) in cc image before merge this change

docker cp is not expected to work while using block device
instead of the overlay file system.
Skip docker cp test in case of devicemapper storage driver.

Signed-off-by: Archana Shinde <[email protected]>
@chavafg
Copy link
Contributor

chavafg commented Jun 21, 2017

qa-passed

@jcvenegas
Copy link
Contributor

just tested with hyperstart changes, tested that now dnf works with devicemapper, after the image is updated this.

@jcvenegas jcvenegas mentioned this pull request Jun 22, 2017
25 tasks
@chavafg
Copy link
Contributor

chavafg commented Jun 22, 2017

qa-passed

@chavafg
Copy link
Contributor

chavafg commented Jun 22, 2017

I have tested this on Ubuntu, Fedora, Centos and ClearLinux with both overlay and devicemapper as the docker storage driver and no regressions found.

lgtm

Approved with PullApprove

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants