Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove lockc-runc-wrapper #93

Merged
merged 11 commits into from
Nov 19, 2021
Merged

Conversation

vadorovsky
Copy link
Member

@vadorovsky vadorovsky commented Nov 4, 2021

This change consists of several changes which overall allow to remove
lockc-runc-wrapper and any necessity of configuring containerd for
integration with lockc:

  • Using fanotify to monitor runc processes and register containers.
  • Using uprobes for registering containers and processes in BPF maps.
    Doing BPF operations from the wrapper had the weird issue - by
    registering the wrapper process, we were in fact preventing the same
    lockc-runc-wrapper process from doing any more BPF map modifications.
    Using uprobes has also an another advantage - it will allow rootless
    containers to work.
  • Allowing access to /run directory inside container filesystem. It's
    used both by popular applications (like nginx) and by kubelet to store
    the network namespace (/run/netns).
  • Adding containerd v2 CRI-related cgroup mount directories as allowed.
  • Proper cleanup of old containers and processes.

Majority of uprobes code on the userspace side comes from
bpfcontain-rs.

Fixes: #94
Fixes: #52
Fixes: #92
Signed-off-by: Michal Rostecki [email protected]
Signed-off-by: William Findlay [email protected]

Copy link
Collaborator

@willfindlay willfindlay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Just a few general requests from me, but nothing prohibitive.

lockc/src/bpf/lockc.bpf.c Show resolved Hide resolved
lockc/src/uprobe_ext.rs Outdated Show resolved Hide resolved
lockc/src/uprobe_ext.rs Show resolved Hide resolved
@willfindlay
Copy link
Collaborator

Hmm, CI appears to be failing? Have you bumped the libbpf-rs version lately?

@vadorovsky
Copy link
Member Author

I can't reproduce the CI failure locally with the same dapper cargo build command, weird. :/ But indeed it looks as some internal problem with libbpf-rs and libbpf-sys.

@vadorovsky vadorovsky force-pushed the fix-dir branch 2 times, most recently from f811483 to 1aba170 Compare November 5, 2021 09:18
@willfindlay
Copy link
Collaborator

I can't reproduce the CI failure locally with the same dapper cargo build command, weird. :/ But indeed it looks as some internal problem with libbpf-rs and libbpf-sys.

I've also had similar problems where libbpf-rs fails to build that were basically about a mismatch between libbpf-sys and the system libbpf. Are you using the vendored libbpf or system libbpf? I believe it's gated with a feature flag in libbpf-rs.

@vadorovsky vadorovsky force-pushed the fix-dir branch 2 times, most recently from dc2110b to b18968b Compare November 8, 2021 15:53
@vadorovsky vadorovsky changed the title bpf: Fix problems with container starts Remove lockc-runc-wrapper Nov 8, 2021
@vadorovsky vadorovsky force-pushed the fix-dir branch 12 times, most recently from 8e8f873 to 2c6b259 Compare November 11, 2021 08:28
@vadorovsky
Copy link
Member Author

@willfindlay @mjura @flavio It's ready for review. Sorry that it took me so long and that this pull request is so huge., but I wanted to make sure that everything works and is documented.

Reviewing it per commit should be easier than looking at the full diff.

Copy link
Collaborator

@mjura mjura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's merge this change and then add more features

Copy link
Collaborator

@flavio flavio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I don't have enough knowledge to understand the low level parts.

I left some minor comments

contrib/terraform/libvirt/cloud-init/common.tpl Outdated Show resolved Hide resolved
docs/src/architecture.md Outdated Show resolved Hide resolved
docs/src/build/README.md Outdated Show resolved Hide resolved
docs/src/build/README.md Outdated Show resolved Hide resolved
docs/src/install/docker.md Outdated Show resolved Hide resolved
docs/src/install/docker.md Outdated Show resolved Hide resolved
docs/src/install/kubernetes.md Outdated Show resolved Hide resolved
docs/src/terraform/libvirt.md Outdated Show resolved Hide resolved
lockc/Cargo.toml Show resolved Hide resolved
lockc/src/bin/lockcd.rs Show resolved Hide resolved
Copy link
Collaborator

@nickgerace nickgerace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR! This is one the biggest ones I've seen in awhile lol.

password: ${{ secrets.GITHUB_TOKEN }}
-
name: Build and push development container image
if: ${{ startsWith(github.ref, 'refs/heads/') }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this conditional necessary? I think it's for tags only, which would make it necessary, but I am trying to discern what this does 😅 . Probably my own misunderstanding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjura ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will clarify with him later.

Comment on lines +10 to +46
if [ -z "${TR_MASTER_IPS}" ] || [ -z "${TR_USERNAME}" ]; then
error '$TR_MASTER_IPS $TR_USERNAME must be specified'
exit 1
fi

sleep 5

CILIUM_VERSION=$(curl -s https://api.github.com/repos/cilium/cilium/releases/latest | jq -r '.tag_name' | sed -e 's/^v//')

info "### Run following commands to bootstrap Kubernetes cluster:\\n"

i=0
for MASTER in $TR_MASTER_IPS; do
if [ $i -eq "0" ]; then
ssh -o 'StrictHostKeyChecking no' -l ${TR_USERNAME} ${MASTER} /bin/bash <<-EOF
set -eux
sudo kubeadm init --cri-socket /run/containerd/containerd.sock --control-plane-endpoint ${MASTER}:6443 --upload-certs | tee kubeadm-init.log
mkdir -p /home/${TR_USERNAME}/.kube
sudo cp /etc/kubernetes/admin.conf /home/${TR_USERNAME}/.kube/config
sudo chown ${TR_USERNAME}:users /home/${TR_USERNAME}/.kube/config
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version ${CILIUM_VERSION} --namespace kube-system
EOF

export KUBEADM_MASTER_JOIN=`ssh -o 'StrictHostKeyChecking no' -l ${TR_USERNAME} ${MASTER} tail -n12 kubeadm-init.log | head -n3`
export KUBEADM_WORKER_JOIN=`ssh -o 'StrictHostKeyChecking no' -l ${TR_USERNAME} ${MASTER} tail -n2 kubeadm-init.log`
else
ssh -o 'StrictHostKeyChecking no' -l ${TR_USERNAME} ${MASTER} /bin/bash <<-EOF
set -eux
sudo ${KUBEADM_MASTER_JOIN}
mkdir -p /home/${TR_USERNAME}/.kube
sudo cp /etc/kubernetes/admin.conf /home/${TR_USERNAME}/.kube/config
sudo chown ${TR_USERNAME}:users /home/${TR_USERNAME}/.kube/config
EOF
fi
((++i))
done
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this get executed in cloud-init for terraform? Curious what happens when it exits with a non-zero status code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is set -eux everywhere, any non-zero status code in any line should fail the script and then the terraform provisioner should fail as well. Deployment should stop with an error.

Copy link
Member Author

@vadorovsky vadorovsky Nov 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, I'm not proud of that script, I would be happy if some more devops-oriented person could help with having some proper salt formulas or anything which would hurt eyes less. Or I might eventually do that in future.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for development-only right? If so, I think this is more than fine enough for now. Looks good to me!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's only for development. On all the other clusters, lockc is going to be installed with helm.

Comment on lines +4 to +5
with Docker. In order to do that, we need to install `lockcd` binary and a
systemd unit for it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: is there a lockcd rpm?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, but I'm thinking about it. I would wait for #49 though - removing libbpf and LLVM dependency will make packaging much easier. I already started working on aya rewrite, I'm hoping to have some PR next week.

Comment on lines +8 to +11
let log_level = match env::var("LOCKC_DEBUG") {
Ok(_) => LevelFilter::Debug,
Err(_) => LevelFilter::Info,
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is a boolean setter, not a level?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was assuming it's a level. Was my assumption wrong?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ask because I think you always set the enum to either Debug or Info based on whether or not that ENV's value is non-empty or exists. Thus, it seems like it behaves like a bool (true/false for debug mode)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exposing it as a bool env variable was my intention.

Comment on lines 84 to 100
if annotations.contains_key(ANNOTATION_CONTAINERD_LOG_DIRECTORY) {
// containerd doesn't expose k8s namespaces directly. They have
// to be parsed from the log directory path, where the first
// part of the filename is the namespace.
let log_directory = &annotations[ANNOTATION_CONTAINERD_LOG_DIRECTORY];
debug!(
"detected k8s+containerd container with log directory {}",
log_directory
);
let log_path = std::path::PathBuf::from(log_directory);
let file_name = log_path.file_name().unwrap().to_str().unwrap();
let mut splitter = file_name.split('_');
let namespace = splitter.next().unwrap().to_string();

return Ok((ContainerType::KubernetesContainerd, namespace));
// containerd
} else if annotations.contains_key(ANNOTATION_CONTAINERD_SANDBOX_ID) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For readability, it could be a good idea to split the contains key checks into another function and then return an enum (or other primitive) to match off of. This is purely something to think about and I'd be fine with keeping it as is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I will try

}
}

Ok((ContainerType::Unknown, String::from("")))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Containing an Option<String> as value None instead of empty String could be better depending on your preference

}
}
ContainerAction::Create => {
let container_id = container_id_o.unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we ok with the risk of unwrapping here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every runc create and runc delete command has to come with container ID. I never got any panic here. But on the other hand, maybe returning some descriptive error could be better here, just in case...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever you think is best! I usually like to wrap an error or return it since panic traces can be a bit harder to deal with

check_uprobe_ret(ret)?;
}
ContainerAction::Delete => {
let container_id = container_id_o.unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

lockc/src/runc.rs Outdated Show resolved Hide resolved
vadorovsky and others added 11 commits November 19, 2021 01:42
This change consists of several changes which overall allow to remove
lockc-runc-wrapper and any necessity of configuring containerd for
integration with lockc:

* Using fanotify to monitor runc processes and register containers.
* Using uprobes for registering containers and processes in BPF maps.
  Doing BPF operations from the wrapper had the weird issue - by
  registering the wrapper process, we were in fact preventing the same
  lockc-runc-wrapper process from doing any more BPF map modifications.
  Using uprobes has also an another advantage - it will allow rootless
  containers to work.
* Allowing access to /run directory inside container filesystem. It's
  used both by popular applications (like nginx) and by kubelet to store
  the network namespace (/run/netns).
* Adding containerd v2 CRI-related cgroup mount directories as allowed.
* Proper cleanup of old containers and processes.

Majority of uprobes code on the userspace side comes from
bpfcontain-rs.

Fixes: lockc-project#94
Fixes: lockc-project#52
Fixes: lockc-project#92
Signed-off-by: Michal Rostecki <[email protected]>
Signed-off-by: William Findlay <[email protected]>
We've observed issues and long timeouts while building the VM image with
guestfs. Using just terraform for provisioning seems to be faster.

Signed-off-by: Michal Rostecki <[email protected]>
Signed-off-by: Michal Jura <[email protected]>
This change adds a multi-stage Dockerfile which is used both for
building the project and for the final VM image.

Signed-off-by: Michal Rostecki <[email protected]>
Signed-off-by: Michal Jura <[email protected]>
Signed-off-by: Michal Rostecki <[email protected]>
Enable buildx and everything necessary for plugins to work.

Signed-off-by: Michal Rostecki <[email protected]>
To make templating of unit files working with DESTDIR, we need to
differentiate between

* full bindir, unitdir, sysconfdir - which contains DESTDIR prefix for
  installation of target files; DESTDIR is something using by RPM
  packers or producing tarballs to move targets somewhere else in the
  filesystem before making a package
* relative bindir, unitdir, sysconfdir - which doesn't contain DESTDIR
  prefix and is used to template values like `{{ bindir }}` into real
  values; those should be relative towards DESTDIR

Signed-off-by: Michal Rostecki <[email protected]>
Add a new subcommand `bintar` which produces a .tar.gz archive with
lockcd binary and systemd unit. It's going to be used as a release
artifact and as a way of distributing lockc to virtual machines when
deploying without Kubernetes.

Signed-off-by: Michal Rostecki <[email protected]>
To make them easier to grep.

Signed-off-by: Michal Rostecki <[email protected]>
That should fix problems with showing images in Github Pages.

Signed-off-by: Michal Rostecki <[email protected]>
Signed-off-by: Michal Rostecki <[email protected]>
@vadorovsky
Copy link
Member Author

Comments should be addressed now. Thanks everyone for review!

@vadorovsky vadorovsky merged commit 77bf7e6 into lockc-project:main Nov 19, 2021
@vadorovsky vadorovsky deleted the fix-dir branch November 19, 2021 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants