This repository has been archived by the owner on May 12, 2021. It is now read-only.

network configuration hotplug support #113

Closed
egernst opened this issue Mar 23, 2018 · 55 comments
Assignees
Labels
feature New functionality

Comments

@egernst
Member

egernst commented Mar 23, 2018

Continuation of discussion from containers/virtcontainers#241

cc/ @mcastelino @amshinde @sboeuf @bergwolf @laijs @guangxuli @miaoyq

@egernst
Member Author

egernst commented Mar 23, 2018

cc/ @WeiZhang555

@sboeuf

sboeuf commented Mar 23, 2018

Ok I'll try to summarize the status of the previous discussion here.

We want to be able to support docker network connect. This command is not part of the OCI lifecycle, meaning it does not call into our kata-runtime to perform this operation.
Instead, it simply adds the new network to the network namespace of the container/pod. For this reason, we need a process running inside the network namespace to detect any change to the interfaces and routes in that namespace. That's the first part of what we need.
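
As an illustration of the detection part, here is a minimal sketch in Go that compares two snapshots of the interface names visible in the current network namespace. It only uses the standard `net` package; the helper names `snapshot` and `diffNames` are hypothetical and not part of any Kata API, and a real monitor would have to run inside the container's network namespace and also track routes.

```go
package main

import (
	"fmt"
	"net"
	"sort"
)

// snapshot returns the names of the interfaces visible in the network
// namespace of the calling process.
func snapshot() ([]string, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(ifaces))
	for _, i := range ifaces {
		names = append(names, i.Name)
	}
	return names, nil
}

// toSet converts a slice of names into a set for membership checks.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

// diffNames reports the names present only in cur (added) and the names
// present only in prev (removed). Both results are sorted.
func diffNames(prev, cur []string) (added, removed []string) {
	inPrev, inCur := toSet(prev), toSet(cur)
	for n := range inCur {
		if !inPrev[n] {
			added = append(added, n)
		}
	}
	for n := range inPrev {
		if !inCur[n] {
			removed = append(removed, n)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	// Hand-written snapshots standing in for "before" and "after" a
	// docker network connect; the interface names are made up.
	added, removed := diffNames([]string{"lo", "eth0"}, []string{"lo", "eth0", "eth1"})
	fmt.Println("added:", added, "removed:", removed)

	if names, err := snapshot(); err == nil {
		fmt.Println("interfaces in this namespace:", names)
	}
}
```

Comparing snapshots like this implies polling on some interval, which is the simplest thing to reason about but not the most efficient way to notice a change.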

Now the second part is the API itself, which does not currently allow us to add a new interface or route. This is already part of the scope as you can see here (also referenced here). This means someone could already start working on this if this is important for you guys (@guangxuli @miaoyq).

Now, to get back to the detection part, I think we should make sure this can be disabled in case we don't need it (k8s case) since this extra binary (even if this is a reexec from the library itself) will increase the memory footprint for each pod. Since this is only a problem that we'll face with Docker (one container per pod), the kata-shim representing the container process could handle this directly without pulling the whole virtcontainers package if we were exposing a special command at the CLI level (something like kata-runtime add-network ...).

@bergwolf @laijs @WeiZhang555 I'd like your input on this!

@WeiZhang555
Member

@sboeuf This sounds feasible.

Now, to get back to the detection part, I think we should make sure this can be disabled in case we don't need it (k8s case) ...

I like it :)

the kata-shim representing the container process could handle this directly without pulling the whole virtcontainers package if we were exposing a special command at the CLI level (something like kata-runtime add-network ...).

kata-shim works as a "detector" (optionally), and when a new interface is added to its namespace, it invokes kata-runtime add-network to add the new interface to qemu. Am I right? If so, this sounds quite good.

And a CNI plugin can also make use of kata-runtime add-network. Everything makes sense, I like your whole idea 👍

@WeiZhang555
Member

@sboeuf

the kata-shim representing the container process could handle this directly without pulling the whole virtcontainers package if we were exposing a special command at the CLI level (something like kata-runtime add-network ...).

I just realized there is a problem here that runv once ran into. A kata-shim represents one container process (init or exec), so we need to make sure the detector still works after any one kata-shim exits.
Here are some ways we could do it:

  1. The first kata-shim works as the detector and sends kata add-network once a NIC is detected: this is not workable, because when the container exits along with the first kata-shim, the pod can still have other containers running, and docker network connect would stop working. That is, unless we have some kind of election mechanism to make sure a new detector is elected once the old network detector dies.
  2. Every kata-shim works as a detector, so that one kata-shim exiting won't disable CNM network hotplug: this is resource-inefficient and requires synchronization among all kata-shim processes.
  3. Do this inside kata-proxy. Since kata-proxy always runs alongside the VM, it is a rational place for the network detector: the problem is that with vsock we don't have a kata-proxy.

@amshinde
Member

If we need to support scenarios other than docker, we really need to have a separate lightweight process monitoring the network namespace for network changes. This would then invoke the kata-runtime library directly with a call such as kata-runtime add-interface or kata-runtime update-route.

@sboeuf

sboeuf commented Mar 29, 2018

@WeiZhang555

  1. The first kata-shim works as the detector and sends kata add-network once a NIC is detected: this is not workable, because when the container exits along with the first kata-shim, the pod can still have other containers running, and docker network connect would stop working. That is, unless we have some kind of election mechanism to make sure a new detector is elected once the old network detector dies.

I agree this would not work for cases other than Docker (only one container per pod), and this could be solved by introducing a different binary for this purpose only. But my question is: do we need to support cases other than Docker here? Are we expecting CRI cases to be able to hotplug additional networks that our shim should also detect?

  2. Every kata-shim works as a detector, so that one kata-shim exiting won't disable CNM network hotplug: this is resource-inefficient and requires synchronization among all kata-shim processes.

If we're handling only the Docker case, the only shim that needs to monitor is the one representing the container process. We are sure this shim won't terminate before the container does, meaning it won't miss anything if a docker network connect happens during the container's lifetime.

  3. Do this inside kata-proxy. Since kata-proxy always runs alongside the VM, it is a rational place for the network detector: the problem is that with vsock we don't have a kata-proxy.

You've answered this yourself: this is not a suitable design, as we always have to take into account that we might not have a proxy process.

@kata-containers/runtime

To summarize a bit, here are two questions I have and each of them has two options:

  • How do we want the "monitoring" process (shim or separate process) to call into the add-network capability of Kata?

    • Either we reexec from virtcontainers itself, meaning the process will be able to call directly into the Kata API sandbox.AddNetwork(). This involves a bigger memory footprint, since the monitoring process will be as large as the runtime process, but in this case the process stays alive for the whole pod lifetime.
    • Or the monitoring process calls into the runtime path .../.../kata-runtime add-network name=... ip=..., which would allow for a much simpler and more lightweight process (probably written in C) that would only need to know the runtime path. The drawback of this approach is that we would be spawning a new process from the monitoring process itself.
  • Do we expect cases other than Docker to have a monitoring process running in the network namespace?

    • If we don't, let's go with the specific implementation and handle this directly through the kata-shim. No overhead, and minimal changes from a virtcontainers design perspective.
    • If we do, then the only option IMO is a dedicated monitoring process (I don't want to sync kata-shims together so that they know who is monitoring). This process should be as tiny as possible, which is why I was thinking about C code for this. It would also involve a few more changes in virtcontainers to properly spawn/handle this monitoring process.
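
To make the second invocation option more concrete, here is a rough sketch of how a monitoring process might build the command line for the proposed add-network subcommand. It is written in Go only for consistency with the runtime (the comment above suggests C for a real lightweight monitor). The subcommand and its name=/ip= arguments are taken from the proposal above and did not exist at the time of this discussion; the function name `addNetworkCmd`, the placeholder path, and the sample values are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os/exec"
)

// addNetworkCmd builds, but does not run, the command a monitoring process
// could spawn when it detects a new interface. The "add-network" subcommand
// and the "name=" / "ip=" argument format follow the proposal in this thread
// and are hypothetical.
func addNetworkCmd(runtimePath, name, ip string) *exec.Cmd {
	return exec.Command(runtimePath, "add-network", "name="+name, "ip="+ip)
}

func main() {
	// Placeholder path and values; a real monitor would receive the runtime
	// path from whoever spawned it.
	cmd := addNetworkCmd("/path/to/kata-runtime", "eth1", "172.18.0.2/16")
	fmt.Println(cmd.Args)

	// A real monitor would call cmd.Run() here and handle the error.
}
```

The trade-off discussed above is visible in the sketch: the monitor needs nothing but the runtime path, at the cost of spawning a new process for every detected change.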

@kata-containers/runtime We really need some input here before we make a decision and go ahead with the implementation.

@bergwolf
Member

Speaking of network monitoring, is the monitor process going to busy-poll, or is there some notification mechanism that would let docker network connect notify the runtime that there is a network change?

@WeiZhang555
Member

@bergwolf

github.com/vishvananda/netlink has a good subscribe and notification mechanism for this, and you already have this, though it's obsolete :) Check this: https://github.com/hyperhq/runv/blob/master/cli/network.go#L501-L536

@sboeuf

Do we expect other cases than Docker to have a monitoring process running in the network namespace ?

Hold on, we might be diverging on what the "Docker case" is. If we are planning to support the libnetwork (CNM) model, don't we need to support "docker run --net container:xxxx" to share a network namespace between two containers? In other words, don't we want to support composing a pod from Docker's client? This was the missing part of CC initially in my opinion, and my hope is we can enhance and support this later.

@egernst
Member Author

egernst commented Mar 30, 2018

@WeiZhang555 -- I think we should discuss this in architecture committee on Monday.

Hold on, we might be diverging on what the "Docker case" is. If we are planning to support the libnetwork (CNM) model, don't we need to support "docker run --net container:xxxx" to share a network namespace between two containers? In other words, don't we want to support composing a pod from Docker's client? This was the missing part of CC initially in my opinion, and my hope is we can enhance and support this later.

Regarding docker network connect: this could be achieved in CC (this is how I did multi-interface early setup for a VNF-like chain in Docker), but the original veths would be unconfigured in the namespace, so connectivity of the original container would be lost if you were connecting to a runc container's network with a Clear Container. In other words, I didn't use this for shared-netns between containers, but as a workaround for not being able to do hotplug! AFAIU, if we moved to something like a TC-based solution, this should work. @mcastelino -- agreed? I think this is separate from the hot-plug discussion (such as network connect).

@sboeuf

sboeuf commented Mar 30, 2018

@egernst @WeiZhang555 yes I think this is a separate discussion (not related to detecting a network being hotplugged without notice) since this can be tied to an OCI lifecycle event (docker run translating into kata-runtime create + start in that case).

@WeiZhang555
Member

@egernst @sboeuf Yes, we can discuss this in Monday's arch meeting.

@egernst egernst added the release-gating Release must wait for this to be resolved before release label Apr 2, 2018
@egernst
Member Author

egernst commented Apr 2, 2018

@guangxuli @miaoyq - can you confirm you are planning on adding this feature?

caoruidong added a commit to caoruidong/runtime that referenced this issue May 7, 2018
Fixes kata-containers#113
Refactor generate interface and route, add network hotplug interface

Signed-off-by: Ruidong Cao <[email protected]>
caoruidong added a commit to caoruidong/runtime that referenced this issue Aug 2, 2018
add UTs for network hotplug related fuctions

Fixes kata-containers#113

Signed-off-by: Ruidong Cao <[email protected]>
@WeiZhang555
Member

@bergwolf can you add your great diagram to some document? I think it will be helpful to everyone who wants to play with our CNI solution.

@grahamwhaley
Contributor

If you have some 'source' for the diagram (that is, something like SVG, and not just PNG or JPG), then that would be excellent ;-) iirc, you may have created the diagram via some website? If there is no 'source' then maybe add a footnote referencing the site. Basically, it would be good if others could later edit the diagram etc. if need be ;-)

@jodh-intel
Contributor

@egernst egernst added networking feature New functionality labels Aug 16, 2018
@WeiZhang555
Member

WeiZhang555 commented Aug 17, 2018

@grahamwhaley I think @bergwolf already gave the source: LINK

:-)

@grahamwhaley
Contributor

Ah, I see @WeiZhang555 @bergwolf - so, the input is a plain text file? If so, can we add that plain text file to the repo as the image source alongside the png to include in the markdown doc?
@jcvenegas - is that the same tool/site you used to make some flow/state diagrams btw? Let's ensure we converge on one tool and text source format for flow (UML etc.) diagrams if so.
thx!

@bergwolf
Member

10 participants