Skip to content
This repository has been archived by the owner on May 24, 2022. It is now read-only.

[test_runner] Unify wait routines #122

Closed
wants to merge 9 commits into from
Closed

[test_runner] Unify wait routines #122

wants to merge 9 commits into from

Conversation

marcoguerri
Copy link
Contributor

The pipeline code was affected by some duplication in the shutdown
control path. This patch removes the duplication, and introduces
only two wait routines in the shutdown control path: waitTargets
and waitControl.

The logic of the test runner is not modular enough to accomodate the
need for implementing a cleanup pipeline. This revision refactors the
test runner so that the concept of pipeline can be re-used. The eventual
goal is to deploy two pipelines, one for testing and one for cleanup.
This revision further simplifies the routing logic and removes the
dependency of the pipeline over the list of targets. The latter is
important to implement pipelines that might have a variable number
of targets coming in that cannot be predicted beforehand.
After extracting the pipeline logic, the routing logic was still part
of the pipeline and very monolithic. This patch extracts the routing
logic into a dedicated structure and splits it into routeIn logic
and routeOut logic, which makes it unit-testable.
Pipeline and routing logic are now split into different files.
Unit tests for the routing logic have been added.
This patch adds support for implementing a cleanup pipeline, which runs
cleanup steps, after the main test pipeline. One of the requirements of
ConTest has always been that it should support a "cleanup" phase, and
this adds support for it.

The cleanup pipeline is described in the same was test steps have been
described so far: and additional "cleanup" section in the test descriptor
is deserialized to implement a pipeline which runs the required cleanup
steps.
The interface of TestSteps so far has dictated that targets should be
forwarded to two different channels for success and failure. This makes
several parts of ConTest logic relatively complicated. In order to
simplify the framework, this braking change reduces the number out
output channels that a test step needs to handle to only one. The object
returned by the step might include error information for targets which
failed in the test step.
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 16, 2020
@codecov
Copy link

codecov bot commented Jun 16, 2020

Codecov Report

Merging #122 (87f1507) into master (cc4a4dc) will increase coverage by 0.98%.
The diff coverage is 71.34%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #122      +/-   ##
==========================================
+ Coverage   63.14%   64.12%   +0.98%     
==========================================
  Files          51       58       +7     
  Lines        2846     3097     +251     
==========================================
+ Hits         1797     1986     +189     
- Misses        813      865      +52     
- Partials      236      246      +10     
Flag Coverage Δ
integration 62.95% <71.12%> (+1.09%) ⬆️
unittests 20.29% <16.38%> (+4.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/cerrors/cerrors.go 0.00% <0.00%> (-66.67%) ⬇️
pkg/jobmanager/jobmanager.go 66.95% <ø> (+6.14%) ⬆️
pkg/jobmanager/status.go 0.00% <0.00%> (ø)
pkg/pluginregistry/errors.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/target/target.go 100.00% <ø> (ø)
plugins/targetlocker/inmemory/inmemory.go 66.32% <ø> (ø)
plugins/targetlocker/noop/noop.go 78.57% <ø> (ø)
plugins/teststeps/echo/echo.go 0.00% <0.00%> (ø)
pkg/runner/job_runner.go 48.12% <20.00%> (ø)
pkg/job/job.go 56.52% <23.07%> (-43.48%) ⬇️
... and 33 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc4a4dc...c53e426. Read the comment docs.

The pipeline code was affected by some duplication in the shutdown
control path. This patch removes the duplication, and introduces
only two wait routines in the shutdown control path: waitTargets
and waitControl.
Copy link
Contributor

@xaionaro xaionaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll continue later.

Comment on lines -2 to -4
storage2345,10.10.10.123
machinelearning1234,192.168.123.231
uselesshost10,10.0.0.10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity:

Why have you removed this lines? :)

Comment on lines -23 to -39
},
{
"TargetManagerName": "CSVFileTargetManager",
"TargetManagerAcquireParameters": {
"FileURI": "hosts02.csv",
"MinNumberDevices": 2,
"MaxNumberDevices": 4,
"HostPrefixes": [
]
},
"TargetManagerReleaseParameters": {
},
"TestFetcherName": "URI",
"TestFetcherFetchParameters": {
"TestName": "RackProvisioning",
"URI": "test_samples/randecho.json"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity:

Why have you removed this lines? :)

}
}
]
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extreme nitpicking:

An extra space.

// error. Termination is signalled via terminate channel.
func (p *pipeline) waitTargets(terminate <-chan struct{}, completedCh chan<- *target.Target) error {
// emitStepEvent emits a failure event if a step fail
func (p *pipeline) emitStepEvent(result *stepResult) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I interpret this name differently from what it does: I interpret that it sends an event on either success or failure, while actually it sends an event only on failure. BTW, why we do not send successes? To reduce amount of rows in DB?

completedTargetError error
)
// waitControl reads results coming from result channels (for steps and routing blocks)
// until a timeout occurrs. The error handling is different depending on whether
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking:

A typo: occurrsoccurs.

Further simplification in the test runner. Routing blocks can become
responsible for waiting for targets directly, so that the pipeline
itself can just wait on routing blocks and their results.
@marcoguerri
Copy link
Contributor Author

marcoguerri commented Jan 4, 2021

This was supposed to be closed for the reason mentioned in #207

@marcoguerri marcoguerri closed this Jan 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants