fix: kill the task processes when cleaning up stale task
The bug was triggered by a `containerd` crash (restart): in this case the
runner receives an error as if the process had exited.
The runner tries to restart the container, but as the container is still
running, the attempt to delete the task fails.

With this change, Talos always tries to kill the running container and
waits for the container to terminate before deleting the task.

The error message when the bug was triggered looks like:

```
service[kubelet](Waiting): Error running Containerd(kubelet), going to restart forever: failed to clean up task "kubelet": task must be stopped before deletion: running: failed precondition
```

Signed-off-by: Andrey Smirnov <[email protected]>
smira committed Sep 12, 2022
1 parent 14a79e3 commit 3a67c42
Showing 1 changed file with 19 additions and 1 deletion.
20 changes: 19 additions & 1 deletion internal/app/machined/pkg/system/runner/containerd/containerd.go
@@ -124,7 +124,7 @@ func (c *containerdRunner) Close() error {

 // Run implements runner.Runner interface
 //
-//nolint:gocyclo
+//nolint:gocyclo,cyclop
 func (c *containerdRunner) Run(eventSink events.Recorder) error {
 	defer close(c.stopped)

@@ -137,6 +137,24 @@ func (c *containerdRunner) Run(eventSink events.Recorder) error {
 	// attempt to clean up a task if it already exists
 	task, err = c.container.Task(c.ctx, nil)
 	if err == nil {
+		var s <-chan containerd.ExitStatus
+
+		s, err = task.Wait(c.ctx)
+		if err != nil {
+			return fmt.Errorf("failed to wait for the task %q: %w", c.args.ID, err)
+		}
+
+		err = task.Kill(c.ctx, syscall.SIGKILL, containerd.WithKillAll)
+		if err != nil && !errdefs.IsNotFound(err) {
+			return fmt.Errorf("failed to kill the task %q: %w", c.args.ID, err)
+		}
+
+		select {
+		case <-s:
+		case <-c.stop:
+			return nil
+		}
+
 		if _, err = task.Delete(c.ctx); err != nil {
 			return fmt.Errorf("failed to clean up task %q: %w", c.args.ID, err)
 		}