-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
daemon.ContainerStop(): fix for a negative timeout #36874
Conversation
Created for the sake of CI, will add more later. |
Codecov Report
@@ Coverage Diff @@
## master #36874 +/- ##
=========================================
Coverage ? 34.97%
=========================================
Files ? 614
Lines ? 45641
Branches ? 0
=========================================
Hits ? 15965
Misses ? 27587
Partials ? 2089 |
integration/container/stop_test.go
Outdated
// a timeout works as documented, i.e. in case of negative timeout | ||
// waiting is not limited (issue #35311). | ||
func TestStopContainerWithTimeout(t *testing.T) { | ||
t.Parallel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes for a flakey test whennused with the main integration daemon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please elaborate why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it is running in the main daemon.
The test setup takes a snapshot of the current state and then at the end of the test cleans up anything that was not in the original state. But if there are multiple tests doing this cleanup they will conflict with each other, clean up stuff they shouldn't be, etc.
integration/container/stop_test.go
Outdated
} | ||
|
||
for _, d := range testData { | ||
name := container.WithName(t.Name() + "_" + strconv.Itoa(d.timeout)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use "t.Run" so the different cases run as separate tests. This makes it easiee to know which case failed (if there is a failure)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks for suggestion
The unit test is checking that setting of non-default StopTimeout works, but it checked the value of StopSignal instead. Amazingly, the test was working since the default StopSignal is SIGTERM, which has the numeric value of 15. Fixes: commit e66d210 ("Add config parameter to change ...") Signed-off-by: Kir Kolyshkin <[email protected]>
OK this is no longer WIP. CI failure in experimental is flaky test |
@cpuguy83 can you PTAL? All your comments were acted upon |
integration/container/stop_test.go
Outdated
d := d | ||
name := t.Name() + "_" + strconv.Itoa(d.timeout) | ||
t.Run(name, func(t *testing.T) { | ||
t.Parallel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think each one of these is treated as a separate test, so they all get grouped together at the end of the test run with the rest of the parallel tests. So I'm not sure this is any safer than before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. Surely I might be wrong, but according to the golang pkg/testing documentation, such tests are only run in parallel with each other:
Subtests can also be used to control parallelism. A parent test will only complete once all of its subtests complete. In this example, all tests are run in parallel with each other, and only with each other, regardless of other top-level tests that may be defined:func TestGroupedParallel(t *testing.T) { for _, tc := range tests { tc := tc // capture range variable t.Run(tc.Name, func(t *testing.T) { t.Parallel() ... }) } }
Also, I got the idea from similar tests like this one:
moby/integration/container/create_test.go
Lines 41 to 46 in fe2d3a1
for _, tc := range testCases { | |
tc := tc | |
t.Run(tc.doc, func(t *testing.T) { | |
t.Parallel() | |
_, err := client.ContainerCreate(context.Background(), | |
&container.Config{Image: tc.image}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cpuguy83 PTAL ^^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this does seem to work.
1. As daemon.ContainerStop() documentation says, > If a negative number of seconds is given, ContainerStop > will wait for a graceful termination. but since commit cfdf84d (PR moby#32237) this is no longer the case. This happens because `context.WithTimeout(ctx, timeout)` is implemented as `WithDeadline(ctx, time.Now().Add(timeout))`, resulting in a deadline which is in the past. To fix, don't use WithDeadline() if the timeout is negative. 2. Add a test case to validate the correct behavior and as a means to prevent a similar regression in the future. 3. Fix/improve daemon.ContainerStop() and client.ContainerStop() description for clarity and completeness. 4. Fix/improve DefaultStopTimeout description. Fixes: cfdf84d ("Update Container Wait") Signed-off-by: Kir Kolyshkin <[email protected]>
Minor change:
- name := t.Name() + "_" + strconv.Itoa(d.timeout)
- t.Run(name, func(t *testing.T) {
+ t.Run(strconv.Itoa(d.timeout), func(t *testing.T) {
t.Parallel()
- id := container.Run(t, ctx, client, testCmd, container.WithName(name))
+ id := container.Run(t, ctx, client, testCmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
CI failure in TestAPISwarmLeaderElection is #32673 |
not sure why I had it as [WIP] -- it's definitely not ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐯
16 days since last CI run... but this is probably fine. |
As daemon.ContainerStop() documentation says,
but since commit cfdf84d (PR #32237) this is no longer the case.
This happens because
context.WithTimeout(ctx, timeout)
is implementedas
WithDeadline(ctx, time.Now().Add(timeout))
, resulting in a deadlinewhich is in the past.
To fix, don't use WithDeadline() if the timeout is negative.
A test case is added to validate the correct behavior and as a means
to prevent a similar regression in the future.
This is a replacement of #35418, and should fix #35311.