This repository has been archived by the owner on Apr 3, 2018. It is now read-only.

Network configuration hotplug support #241

Closed
mcastelino opened this issue May 9, 2017 · 15 comments

@mcastelino
Collaborator

Currently, network interfaces are auto-detected at container/pod launch. This does not support use cases like docker network connect, where network interfaces are added to the container dynamically. docker network connect is currently the only way to attach multiple network interfaces to a Docker container (outside of swarm).

Firewall rules and routes may also be added to the container after creation.

Note: The runtime is a passive component. Hence hotplug needs to be implemented by an active component running in the network namespace of the container/POD.

To implement network hotplug support:

  • After network auto-detection, the shim needs to monitor the namespace for changes to the network setup (see the monitoring sketch after this list).
    This includes:
    - interface creation/configuration
    - route add/delete
    - firewall rules setup
  • QEMU supports QMP-based device hotplug.
  • The (hyperstart) agent inside the VM will need to receive the updated configuration.
    - Hence the shim will need an interface into the CC proxy to send the updated configuration.
    Note: This is different from today, where all setup goes through runtime -> proxy -> hyperstart.
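
For illustration only, here is a minimal sketch of how such a shim-side watcher could subscribe to link and route changes inside the pod's network namespace, assuming the github.com/vishvananda/netlink and github.com/vishvananda/netns packages (the namespace path and printed output are placeholders, not part of any existing shim; firewall rule changes are not visible through rtnetlink and would need a separate mechanism):

package main

import (
	"fmt"

	"github.com/vishvananda/netlink"
	"github.com/vishvananda/netns"
)

// watchNetns subscribes to interface and route events inside the given
// network namespace and reports them. A real shim would translate these
// events into hotplug/agent updates instead of printing them.
func watchNetns(nsPath string) error {
	handle, err := netns.GetFromPath(nsPath)
	if err != nil {
		return err
	}
	defer handle.Close()

	linkCh := make(chan netlink.LinkUpdate)
	routeCh := make(chan netlink.RouteUpdate)
	done := make(chan struct{})
	defer close(done)

	if err := netlink.LinkSubscribeAt(handle, linkCh, done); err != nil {
		return err
	}
	if err := netlink.RouteSubscribeAt(handle, routeCh, done); err != nil {
		return err
	}

	for {
		select {
		case u := <-linkCh:
			// An interface was added, removed or reconfigured.
			fmt.Printf("link update: %s\n", u.Link.Attrs().Name)
		case r := <-routeCh:
			// A route was added or deleted.
			fmt.Printf("route update: %+v\n", r.Route)
		}
	}
}

func main() {
	// Path is illustrative; the shim already knows the pod's netns.
	if err := watchNetns("/var/run/netns/example"); err != nil {
		fmt.Println(err)
	}
}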

The first step in this implementation would not be dynamic network hotplug itself, but rather performing all device attaches to QEMU via QMP. This will allow any device to be hotplugged in the future.
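
As a rough sketch of what "attach via QMP" means in practice (the socket path, netdev/device IDs and tap name below are made up, and a real implementation would go through the runtime's QMP layer rather than raw JSON):

package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net"
)

// qmpCommand is the generic QMP request shape.
type qmpCommand struct {
	Execute   string                 `json:"execute"`
	Arguments map[string]interface{} `json:"arguments,omitempty"`
}

// send writes one QMP command and reads one reply line. For this sketch we
// assume the next line is the reply; a real client must handle asynchronous
// events interleaved with replies.
func send(conn net.Conn, r *bufio.Reader, cmd qmpCommand) error {
	data, err := json.Marshal(cmd)
	if err != nil {
		return err
	}
	if _, err := conn.Write(append(data, '\n')); err != nil {
		return err
	}
	resp, err := r.ReadString('\n')
	if err != nil {
		return err
	}
	fmt.Print(resp)
	return nil
}

func main() {
	conn, err := net.Dial("unix", "/run/vm/qmp.sock") // illustrative path
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	r := bufio.NewReader(conn)

	// QMP sends a greeting banner on connect; consume it.
	if _, err := r.ReadString('\n'); err != nil {
		panic(err)
	}

	for _, c := range []qmpCommand{
		// Capability negotiation must come first.
		{Execute: "qmp_capabilities"},
		// Create the host-side tap backend.
		{Execute: "netdev_add", Arguments: map[string]interface{}{
			"type": "tap", "id": "netdev1", "ifname": "tap1",
			"script": "no", "downscript": "no",
		}},
		// Hotplug a virtio-net device bound to that backend.
		{Execute: "device_add", Arguments: map[string]interface{}{
			"driver": "virtio-net-pci", "netdev": "netdev1", "id": "nic1",
		}},
	} {
		if err := send(conn, r, c); err != nil {
			panic(err)
		}
	}
}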

@miaoyq
Contributor

miaoyq commented Feb 12, 2018

@mcastelino Is there a concrete plan for this feature yet?

In the k8s + docker + cc case, we also need network configuration hotplug support. I have created pods successfully via k8s + docker + cc locally, but setting up the pod network failed, because we can't configure the network of a running VM created by cc.

Looking forward to this feature.

@guangxuli

@mcastelino Do we have any plan for this feature? If you need some extra help, feel free to contact us. :)

@mcastelino
Collaborator Author

@guangxuli @amshinde is planning to add this feature.

@egernst
Collaborator

egernst commented Mar 6, 2018

@guangxuli -- is this something that you can take a look at implementing? This would help enhance virtcontainers greatly and we'd appreciate your contribution here. @mcastelino @amshinde and I can help review and guide as needed. WDYT?

@sboeuf
Collaborator

sboeuf commented Mar 6, 2018

@guangxuli it'd be great if you could contribute such a feature. From a high-level perspective, I think we need to extend the API in api.go, since the need for a new network is not tied to a pod state step or a container state step. Something like:

func AddNetwork(podID string) error {
	if podID == "" {
		return errNeedPodID
	}

	lockFile, err := rwLockPod(podID)
	if err != nil {
		return err
	}
	defer unlockPod(lockFile)

	p, err := fetchPod(podID)
	if err != nil {
		return err
	}

	// Add the new network
	return p.addNetwork()
}

And you will need to appropriately implement:

func (p *Pod) addNetwork() error {
	...
}

so that it will call into the hypervisor interface to hotplug some new network interface into the existing network namespace.
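
For illustration only, one possible shape of that method (the helpers scanNetworkNamespace, hotplugNetDevice and storeNetwork are hypothetical names, not existing virtcontainers APIs):

// Hypothetical sketch: discover endpoints that appeared in the pod's
// network namespace after startup and ask the hypervisor to hotplug them.
func (p *Pod) addNetwork() error {
	// Find interfaces present in the namespace but not yet plugged
	// into the VM (names below are placeholders).
	newEndpoints, err := p.scanNetworkNamespace()
	if err != nil {
		return err
	}

	for _, ep := range newEndpoints {
		// Hotplug the corresponding tap/macvtap device, e.g. via
		// QMP netdev_add/device_add in the QEMU case.
		if err := p.hypervisor.hotplugNetDevice(ep); err != nil {
			return err
		}
	}

	// Persist the updated network state for the pod.
	return p.storeNetwork()
}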

@amshinde
Collaborator

amshinde commented Mar 6, 2018

After taking a closer look at this and discussing this with @sboeuf, I think the following design could be a possible solution:

  1. Have the shim monitor the network namespace for any changes in the network. (The shim would have to maintain an initial state of the network namespace.)
  2. When a new network interface is added, instead of having the shim communicate with the agent directly (which would mean the shim doing a lot of heavy lifting), have the shim invoke the runtime, passing in the container ID and the new network interface name.
  3. The runtime would then fetch the state for the container and invoke the appropriate virtcontainers API such as AddNic. (We need to introduce new virtcontainers API calls such as AddNic/DeleteNic and AddRoute/DeleteRoute; see the rough sketch after this list.)
  4. Virtcontainers would then handle hotplugging the tap/macvtap interface into the VM (or setting up a mirroring rule in the tc case).
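
As a purely illustrative sketch of step 3, an AddNic entry point could mirror the AddNetwork example earlier in this thread (the helper names and the p.addNic method are hypothetical, not an agreed API):

func AddNic(podID, ifaceName string) error {
	if podID == "" {
		return errNeedPodID
	}

	lockFile, err := rwLockPod(podID)
	if err != nil {
		return err
	}
	defer unlockPod(lockFile)

	p, err := fetchPod(podID)
	if err != nil {
		return err
	}

	// Hotplug the named interface (tap/macvtap, or a tc mirror rule)
	// into the running VM.
	return p.addNic(ifaceName)
}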

@sameo @mcastelino What do you think of the above approach?

@sboeuf
Collaborator

sboeuf commented Mar 6, 2018

Thanks for summarizing our discussion here @amshinde.
This is really the best thing we could come up with since the creation of a new network interface is not triggered by a regular call into the runtime...

@amshinde
Collaborator

amshinde commented Mar 6, 2018

As a side note, we eventually need to move towards hot-plugging all network devices, including at startup: #665

@mcastelino
Collaborator Author

@amshinde We need the namespace monitoring to handle "connect", right? We can still move to a hotplug model for the actual network connection first; this may help parallelize some of the flow in the runtime.

@sboeuf
Collaborator

sboeuf commented Mar 6, 2018

@mcastelino It'd be great to have hotplug done also for the simple case of adding our first interface when creating the pod, but that is a separate issue IMO. The problem with additional network interfaces is that they do not come from a direct call through the runtime; there is no command that docker network connect translates into. That's why you need some sort of monitoring to detect the addition of a new interface. Once this is detected, we cannot let the shim handle things itself, since that would put a lot of knowledge and complexity inside the shim, while this can easily be handled by the runtime. That's why an extra command, cc-runtime add-interface, could be useful to call from the shim, leaving the complexity/implementation inside virtcontainers.
@sameo WDYT ?
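
For illustration, the shim side of that proposal could be as small as the sketch below (the add-interface subcommand is the one proposed above and does not exist today; package and function names are placeholders):

package shim

import (
	"fmt"
	"os/exec"
)

// notifyRuntime invokes the proposed "cc-runtime add-interface" subcommand
// with the container ID and the newly detected interface name, leaving all
// hotplug logic to the runtime/virtcontainers.
func notifyRuntime(containerID, ifaceName string) error {
	cmd := exec.Command("cc-runtime", "add-interface", containerID, ifaceName)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("cc-runtime add-interface failed: %v: %s", err, out)
	}
	return nil
}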

@guangxuli

guangxuli commented Mar 7, 2018

@mcastelino @egernst @sboeuf @amshinde thanks for all your responses. Okay, we will dig into the details of the shim/runtime implementation, etc. in a few days. TBH, we aren't familiar with the code details yet, :) but even so we will do our best to drive this feature and participate in the implementation.
@miaoyq FYI.

@mcastelino
Collaborator Author

@sboeuf Yes, the single biggest challenge will be the fact that docker network connect does not have an associated OCI lifecycle event, hence we will need an active component to monitor the network connection. However, as this is a pod-level event, should the shim be the one monitoring it? If the shim is the one monitoring it, this becomes very Docker-centric. Ideally the "sandbox" should be the entity monitoring for pod-level events.

@miaoyq
Contributor

miaoyq commented Mar 7, 2018

If the shim is the one monitoring it, this becomes very Docker-centric. Ideally the "sandbox" should be the entity monitoring for pod-level events.

@mcastelino We could create the shim process with a parameter that tells it whether to monitor the network connection or not. Only the shim that belongs to the first container of the pod would monitor the network connection.

@sboeuf
Collaborator

sboeuf commented Mar 7, 2018

@mcastelino you're right, I forgot that we're falling into container semantics instead of pod semantics here. This will work well for the Docker case, but it would not work for the k8s case.
@miaoyq The solution you're proposing does not really work, because we cannot assume that the first container will last the entire lifespan of the pod.

It seems we need a dedicated watcher (a separate process) to monitor the network namespace at the pod level.

@egernst
Collaborator

egernst commented Apr 2, 2018

This issue was moved to kata-containers/runtime#161
