This repository has been archived by the owner on May 12, 2021. It is now read-only.

Network configuration hotplug support #161

Closed
egernst opened this issue Apr 2, 2018 · 15 comments

egernst commented Apr 2, 2018

From @mcastelino on May 9, 2017 21:13

Currently, network interfaces are auto-detected at container/pod launch. This does not support use cases like docker network connect, where network interfaces are added to the container dynamically.
That command is currently the only way to attach multiple network interfaces to a Docker container (outside of swarm).

Firewall rules and routes may also be added to the container after it is created.

Note: The runtime is a passive component, so hotplug needs to be implemented by an active component running in the network namespace of the container/pod.

To implement network hotplug support:

  • After the initial network auto-detection, the shim needs to monitor the namespace for changes to the network setup (see the monitoring sketch after this list).
    This includes:
    - interface creation/configuration
    - route add/delete
    - firewall rules setup
  • QEMU supports QMP-based device hotplug.
  • The (hyperstart) agent inside the VM will need to receive the updated configuration.
    - Hence the shim will need an interface into the CC proxy to send the updated configuration.
    Note: This is different from today, where all setup flows through runtime -> proxy -> hyperstart.
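
A minimal sketch of what such namespace monitoring could look like, assuming the watcher already runs inside the pod's network namespace (as the shim does) and uses the github.com/vishvananda/netlink package; firewall-rule monitoring and the forwarding of updates to the runtime/proxy are omitted:

package netmon

import (
	"fmt"

	"github.com/vishvananda/netlink"
	"golang.org/x/sys/unix"
)

// watchNetns subscribes to link and route updates in the current network
// namespace and reports them; a real shim would forward these updates so the
// VM configuration can be refreshed.
func watchNetns(done <-chan struct{}) error {
	linkCh := make(chan netlink.LinkUpdate)
	routeCh := make(chan netlink.RouteUpdate)

	if err := netlink.LinkSubscribe(linkCh, done); err != nil {
		return err
	}
	if err := netlink.RouteSubscribe(routeCh, done); err != nil {
		return err
	}

	for {
		select {
		case lu := <-linkCh:
			switch lu.Header.Type {
			case unix.RTM_NEWLINK:
				fmt.Printf("interface added/updated: %s\n", lu.Link.Attrs().Name)
			case unix.RTM_DELLINK:
				fmt.Printf("interface removed: %s\n", lu.Link.Attrs().Name)
			}
		case ru := <-routeCh:
			fmt.Printf("route update (type %d): %+v\n", ru.Type, ru.Route)
		case <-done:
			return nil
		}
	}
}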

The first step of this implementation would not be dynamic network hotplug itself, but performing all device attachments to QEMU via QMP (see the sketch below). This will allow any device to be hotplugged in the future.
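
For the QMP-based attach itself, the exchange boils down to a netdev_add followed by a device_add on the QMP monitor socket. A rough, illustrative sketch follows; the socket path, the net1/virtio-net1 IDs and the tap name are assumptions, error replies are not inspected and asynchronous QMP events are ignored for brevity:

package qmphotplug

import (
	"encoding/json"
	"net"
)

// qmpCommand is the generic QMP request shape: {"execute": ..., "arguments": {...}}.
type qmpCommand struct {
	Execute   string                 `json:"execute"`
	Arguments map[string]interface{} `json:"arguments,omitempty"`
}

// hotplugTap connects to a QMP monitor socket and attaches an existing tap
// device to the guest as a virtio-net-pci NIC.
func hotplugTap(qmpSocket, tapName string) error {
	conn, err := net.Dial("unix", qmpSocket)
	if err != nil {
		return err
	}
	defer conn.Close()

	dec := json.NewDecoder(conn)
	enc := json.NewEncoder(conn)

	// Consume the QMP greeting and negotiate capabilities before issuing commands.
	var reply map[string]interface{}
	if err := dec.Decode(&reply); err != nil {
		return err
	}
	if err := enc.Encode(qmpCommand{Execute: "qmp_capabilities"}); err != nil {
		return err
	}
	if err := dec.Decode(&reply); err != nil {
		return err
	}

	cmds := []qmpCommand{
		// Host side: register the existing tap interface as a netdev backend.
		{Execute: "netdev_add", Arguments: map[string]interface{}{
			"type": "tap", "id": "net1", "ifname": tapName,
			"script": "no", "downscript": "no",
		}},
		// Guest side: expose the backend as a hotplugged virtio-net PCI device.
		{Execute: "device_add", Arguments: map[string]interface{}{
			"driver": "virtio-net-pci", "netdev": "net1", "id": "virtio-net1",
		}},
	}

	for _, c := range cmds {
		if err := enc.Encode(c); err != nil {
			return err
		}
		if err := dec.Decode(&reply); err != nil {
			return err
		}
	}

	return nil
}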

Copied from original issue: containers/virtcontainers#241


egernst commented Apr 2, 2018

From @miaoyq on February 12, 2018 2:19

@mcastelino Is there a concrete plan for this feature?

In the case of k8s + docker + cc, we also need network configuration hotplug support. I have created pods successfully via k8s + docker + cc locally, but setting up the pod network failed, because we cannot configure the network of a running VM created by cc.

Looking forward to this feature.


egernst commented Apr 2, 2018

From @guangxuli on March 6, 2018 9:57

@mcastelino Do we have any plan for this feature? If you need some extra help, feel free to contact us. :)


egernst commented Apr 2, 2018

From @mcastelino on March 6, 2018 17:10

@guangxuli @amshinde is planning to add this feature.


egernst commented Apr 2, 2018

@guangxuli -- is this something that you can take a look at implementing? This would help enhance virtcontainers greatly and we'd appreciate your contribution here. @mcastelino @amshinde and I can help review and guide as needed. WDYT?


egernst commented Apr 2, 2018

From @sboeuf on March 6, 2018 19:2

@guangxuli it'd be so great if you could contribute to such a feature. From a high-level perspective, I think we need to extend the API in api.go, since the need for a new network is not tied to a pod state step or a container state step. Something like:

func AddNetwork(podID string) error {
	if podID == "" {
		return errNeedPodID
	}

	lockFile, err := rwLockPod(podID)
	if err != nil {
		return err
	}
	defer unlockPod(lockFile)

	p, err := fetchPod(podID)
	if err != nil {
		return err
	}

	// Add the new network
	return p.addNetwork()
}

And you will need to appropriately implement:

func (p *Pod) addNetwork() error {
        ...
}

so that it will call into the hypervisor interface to hotplug some new network interface into the existing network namespace.
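
A rough sketch of what that could look like; the helper and field names below (scanNetworkNS, hotplugNetDevice, networkNS) are purely illustrative, not existing virtcontainers APIs:

// Illustrative only: detect endpoints that appeared in the pod's network
// namespace and hotplug each of them into the VM through the hypervisor.
func (p *Pod) addNetwork() error {
	// Hypothetical helper: rescan the namespace and return endpoints the pod
	// does not yet know about.
	newEndpoints, err := scanNetworkNS(p.networkNS)
	if err != nil {
		return err
	}

	for _, endpoint := range newEndpoints {
		// Hypothetical hypervisor hook: attach the tap/macvtap device backing
		// this endpoint to the running VM (via QMP in the QEMU case).
		if err := p.hypervisor.hotplugNetDevice(endpoint); err != nil {
			return err
		}
	}

	return nil
}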


egernst commented Apr 2, 2018

From @amshinde on March 6, 2018 21:40

After taking a closer look at this and discussing it with @sboeuf, I think the following design could be a possible solution:

  1. Have the shim monitor the network namespace for any changes in the network. (The shim would have to maintain an initial state of the network namespace.)
  2. If a new network interface is added, have the shim invoke the runtime, passing in the container ID and the new interface name, rather than communicating with the agent directly (which would require the shim to do a lot of heavy lifting).
  3. The runtime would then fetch the state for the container and invoke the appropriate virtcontainers API, such as AddNic. (We need to introduce new virtcontainers API calls such as AddNic/DeleteNic and AddRoute/DeleteRoute; see the sketch below.)
  4. Virtcontainers would then handle hotplugging the tap/macvtap interface into the VM (or setting up a mirroring rule in the tc case).

@sameo @mcastelino What do you think of the above approach?
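
For step 3, the new entry point could mirror the AddNetwork example above. A sketch follows; the AddNic name comes from the proposal, while p.addNic is an illustrative helper in the same spirit as p.addNetwork:

// Sketch of the proposed API from step 3; p.addNic is illustrative.
func AddNic(podID, ifName string) error {
	if podID == "" {
		return errNeedPodID
	}

	lockFile, err := rwLockPod(podID)
	if err != nil {
		return err
	}
	defer unlockPod(lockFile)

	p, err := fetchPod(podID)
	if err != nil {
		return err
	}

	// Hotplug the named interface (tap/macvtap, or a tc mirror) into the VM.
	return p.addNic(ifName)
}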


egernst commented Apr 2, 2018

From @sboeuf on March 6, 2018 21:47

Thanks for summarizing our discussion here @amshinde.
This is really the best thing we could come up with since the creation of a new network interface is not triggered by a regular call into the runtime...


egernst commented Apr 2, 2018

From @amshinde on March 6, 2018 21:52

As a side note, we need to eventually move towards hot plugging all network devices including at startup : containers/virtcontainers#665


egernst commented Apr 2, 2018

From @mcastelino on March 6, 2018 21:55

@amshinde We need the ns monitoring to handle "connect", right? We can still move to a hotplug model for the actual network connection first. This may help parallelize some of the flow in the runtime.


egernst commented Apr 2, 2018

From @sboeuf on March 6, 2018 22:21

@mcastelino It'd be great to have hotplug done for the simple case of adding our first interface when creating the pod as well, but that is a separate issue IMO. The problem with adding additional network interfaces is that the request does not come from a direct call through the runtime; there is no command that docker network connect translates to. That's why you need some sort of monitoring to detect the addition of a new interface. Once an addition is detected, we cannot let the shim handle things itself, since that would put a lot of knowledge and complexity inside the shim, while it can easily be handled by the runtime. That's why an extra command, cc-runtime add-interface, could be useful: it would be called from the shim, leaving the complexity/implementation inside virtcontainers.
@sameo WDYT ?
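
A minimal sketch of what such an add-interface subcommand could look like, as a fragment of the runtime's CLI. It assumes the github.com/urfave/cli package already used by cc-runtime and a virtcontainers-level AddNic API like the one sketched earlier; the command name, argument layout and the getPodID helper are illustrative:

import (
	"fmt"

	vc "github.com/containers/virtcontainers"
	"github.com/urfave/cli"
)

// Illustrative "add-interface" subcommand the shim could invoke once it
// detects a new interface in the pod's network namespace.
var addInterfaceCommand = cli.Command{
	Name:      "add-interface",
	Usage:     "hotplug a network interface into an existing pod",
	ArgsUsage: "<container-id> <interface-name>",
	Action: func(context *cli.Context) error {
		containerID := context.Args().Get(0)
		ifName := context.Args().Get(1)
		if containerID == "" || ifName == "" {
			return fmt.Errorf("usage: add-interface <container-id> <interface-name>")
		}

		// Hypothetical lookup: map the container ID back to its pod.
		podID, err := getPodID(containerID)
		if err != nil {
			return err
		}

		// Hand the actual hotplug work to virtcontainers (AddNic sketched earlier).
		return vc.AddNic(podID, ifName)
	},
}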


egernst commented Apr 2, 2018

From @guangxuli on March 7, 2018 2:0

@mcastelino @egernst @sboeuf @amshinde thanks for all your responses. Okay, we will dig into the implementation details of the shim/runtime, etc. in a few days. TBH, we aren't familiar with the code and process details :) but even so we will do our best to drive this feature and participate in the implementation.
@miaoyq FYI.


egernst commented Apr 2, 2018

From @mcastelino on March 7, 2018 2:13

@sboeuf yes, the single biggest challenge will be the fact that docker network connect does not have an associated OCI lifecycle event, hence we will need an active component to monitor the network connection. However, as this is a pod-level event, should the shim be the one monitoring it? If the shim is the one monitoring it, this becomes very Docker-centric. Ideally the "sandbox" should be the entity monitoring for pod-level events.


egernst commented Apr 2, 2018

From @miaoyq on March 7, 2018 3:13

If the shim is the one monitoring it, this becomes very Docker-centric. Ideally the "sandbox" should be the entity monitoring for pod-level events.

@mcastelino We could create the shim process with a parameter that tells it whether or not to monitor the network connection. Only the shim that belongs to the first container of the pod would monitor the network connection.


egernst commented Apr 2, 2018

From @sboeuf on March 7, 2018 16:41

@mcastelino you're right, I forgot that we're falling into a container semantic instead of a pod semantic here. This will work well for the Docker case, but it would not work for the k8s case.
@miaoyq The solution you're proposing does not really work, because we cannot assume that the first container will last the entire lifespan of the pod.

Seems like we need a dedicated watcher (a separate process) to monitor the network namespace at the pod level.


egernst commented Apr 2, 2018

Closing as duplicate (copied over for reference). Handled by #113.

@egernst egernst closed this as completed Apr 2, 2018
@egernst egernst removed the backlog label Apr 2, 2018
zklei pushed a commit to zklei/runtime that referenced this issue Jun 13, 2019
…internal-error

main: Display full stacktrace on internal error