-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: improve workflow builtin actions #3105
Feat: improve workflow builtin actions #3105
Conversation
Codecov ReportAttention: Patch coverage is Additional details and impacted files📢 Thoughts on this report? Let us know! |
4b3f302
to
b5f9fbe
Compare
} else { | ||
log_file.log_info("=> sub-operation not yet launched").await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log entry has been removed, because confusing when the sub-operation is actually a builtin operation.
// As not finalized, the builtin state is sent back | ||
// to the builtin operation actor for further processing. | ||
let builtin_state = new_state.clone(); | ||
Ok(self.builtin_command_dispatcher.send(builtin_state).await?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be remove in the future. Indeed, all the builtin-operation actors are currently dealing with 2 steps while things are actually done in a single step (the schedule state for restart and software update; the executing state for log_upload, config_snapshot and config_update).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config actor still does it in two steps, where the tedgeUrl
generation happens in the scheduled
state and the rest in executing
, although I can't remember why we chose to do it that way.
.with_status(CommandStatus::Executing); | ||
.with_status(CommandStatus::Successful); | ||
software_box.send(software_list_response.into()).await?; | ||
|
||
mqtt_box | ||
.assert_received([MqttMessage::new( | ||
&Topic::new_unchecked("te/device/main///cmd/software_list/1234"), | ||
r#"{"status":"executing"}"#, | ||
r#"{"status":"successful"}"#, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 tests has been updated, watching for a successful status and no more an executing status.
Indeed, publishing the executing state is done by the workflow engine according to the workflow definition and no more by the internal actor.
/// Trigger a built-in operation | ||
/// | ||
/// ```toml | ||
/// operation = "<builtin:operation-name>" | ||
/// on_exec = "<state>" | ||
/// ``` | ||
BuiltInOperation(OperationName, ExecHandlers), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a first step, I introduced a specific BuiltInOperation
case along the existing Operation
case.
I will consider to the merge the two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We decided to go against this idea: a builtin operation can only be invoked from the same operation.
Robot Results
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@didier-wenzek I've played around with the proposal and it works well. I've been able to create a custom log_upload
handler with a conditional pre-processing step and everything worked as expected.
So I'm happy with the direction of the PR.
i.e. there is no `te/+/+/+/+/cmd/builtin:<operation-name>/+` command topic. | ||
- The builtin versions are only internal triggered by the agent from a customized workflow. | ||
- On a fresh install of %%te%%, the `<operation-name>` is simply a copy of `builtin:<operation-name>`. | ||
- This copy is not materialized on the file system, but created in memory by the agent when no customized version is found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Materializing them as done for devie_profile
would be a good step for users to discover and extend those built-in operation workflows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I will push a follow up PR for that.
- By convention, this file is named after the operation name, as in `/etc/tedge/operations/software_update.toml`. | ||
However, this is not mandatory: the operation is determined by the `operation` property, e.g. `operation = "software_update"`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, but I've always wondered why we didn't tie it to the workflow name, as that would have easily prevented conflicting workflow definitions as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not. In practice, this what we do. I will push a follow up PR to discuss this.
- A user-provided workflow can invoke any builtin operations as well as the custom versions. | ||
- `operation = "<operation-name>"` triggers the custom-version of the operation workflow. | ||
- `operation = "builtin:<operation-name>"` triggers the operation ignoring any customization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A user might wonder when/why they should use one over the other. So, we might wanna highlight that only the former can be used from workflows of other operations and the latter is limited to the workflow of the same type only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now clearer with 6b37ae3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The feature is working as expected. I just have one remark regarding the decision to deprecate the old builtin action syntax, which might still be simpler for the end-user.
pub fn adapt_builtin_request(&self, command_state: GenericCommandState) -> GenericCommandState { | ||
match self { | ||
OperationAction::BuiltInOperation(_, _) => { | ||
command_state.update(GenericStateUpdate::scheduled()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One might wonder why not GenericStateUpdate::init_payload()
here, although at the impl level it makes no difference (since the builtin actors react to both init
and scheduled
messages the same way).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The status of the builtin request could be set to init, yes.
However, splitting hairs:
GenericStateUpdate::init_payload()
cannot be used here as this is an empty payload that still needs to be populated with the actual user request.- All the builtin commands do no nothing on the init state, simply proceeding to executing.
To simplify this, we can refactor all the operation actors to have a single do-it action.
// As not finalized, the builtin state is sent back | ||
// to the builtin operation actor for further processing. | ||
let builtin_state = new_state.clone(); | ||
Ok(self.builtin_command_dispatcher.send(builtin_state).await?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config actor still does it in two steps, where the tedgeUrl
generation happens in the scheduled
state and the rest in executing
, although I can't remember why we chose to do it that way.
on_success = "postprocess" # on_success & on_error can be customized for builtin actions | ||
|
||
[postprocess] | ||
script = "/usr/bin/log_upload.sh ${.payload.status}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I understand that we want to deprecate action = "builtin"
in favour of the newly proposed operation = "builtin:<operation_name>"
style, seeing the simplicity of this syntax:
[executing]
action = "builtin"
on_success = "postprocess" # on_success & on_error can be customized for builtin actions
[postprocess]
script = "/usr/bin/log_upload.sh ${.payload.status}"
on_success = "successful"
...makes me wonder if this is still a better choice, during this API exploration phase, as the new version contains the additional boilerplate (the await_operation
state), as follows:
[executing]
operation = "builtin:software_update" # trigger the built-in behavior for software update
on_exec = "await_operation"
[await_operation]
action = "await-operation-completion" # awaiting the builtin operation to complete
on_success = "postprocess"
[postprocess]
script = "/usr/bin/log_upload.sh ${.payload.status}"
on_success = "successful"
From the user's perspective, the await_operation
state addition doesn't add much value. And since we have plans to get rid of the await
states in the future anyway, I'm wondering if it's better not to deprecate the old syntax to keep things simpler for now.
Since even the decision to overload the operation
keyword for builtin operations might also get revisited while we introduce more granular builtin actions like download
, file-copy
etc, I'm wondering if we can keep the simpler action = "builtin"
syntax itself for another release, and introduce the operation = "builtin:<operation_name>"
style only once we're fully confident that it is the way to go?
For this PR, nothing changes except hiding the newly added operation syntax from the toml_config
level and keeping the newly added await handlers for the old builtin
action. With that, the user still gets the new capability to add post-processing steps to a builtin action with minimal config changes as well. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully follow you.
action = "builtin"
is not deprecated yet: this is still working, but discouraged in the documentation.operation = "builtin:<operation_name>
is really nicer from a user perspective. This is a personal point of view, but with some evidence: the documentation has been far simpler to write and there is less traps in using it.- We are now in a position to make implicit the
action = "await-operation-completion"
, removing some burden to the user. - I see smaller-grain actions like
download
,file-copy
... as a nice idea we really have to work on. But, I don't see any those as full-fledged operations (with an MQTT topic, capabilities, retain messages, states ...). Hence, no risk of confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant was, since action = "builtin"
is still supported (although to-be deprecated), there are currently two ways for the user to attach post-processing logic to builtin operation workflows:
- The recommended approach with
operation = "builtin:<operation>
as follows:
[executing]
operation = "builtin:log_upload"
on_exec = "await_operation"
[await_operation]
action = "await-operation-completion"
on_success = "postprocess"
[postprocess]
script = "/usr/bin/log_upload.sh ${.payload.status}"
on_success = "successful"
- The discouraged/legacy approach of
action = "builtin"
, with the newly supported await handlers:
[executing]
action = "builtin"
on_success = "postprocess"
[postprocess]
script = "/usr/bin/log_upload.sh ${.payload.status}"
on_success = "successful"
From the external user's perspective, the second one with just 2 states looks much simpler, as the additional await_operation
state in the first approach is just boilerplate which adds no value to the user (unless I'm missing something here). The only additional thing that they get by using operation
is the input
and input_script
support, which is also less useful here IMO, as there is little scope for input manipulation when the main operation and the builtin operation are of the same type (since the same input can be passed as-is).
Since both syntaxes are providing the same functionality and the latter being minimal, I was just wondering why even expose the new operation = "builtin:<operation>
API, when the existing action = "builtin"
API (with the newly added support for await actions) does the same job with less boilerplate. Or is there some additional benefit with the operation
approach that I'm missing?
PS: Talking purely about the external facing TOML level API/syntax here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the external user's perspective, the second one with just 2 states looks much simpler, as the additional
await_operation
state in the first approach is just boilerplate which adds no value to the user (unless I'm missing something here).
Yes, you are missing something. 3 steps are already required in the second case. Indeed, the scheduled
step must be invoked as well as the executing step.
The only additional thing that they get by using
operation
is theinput
andinput_script
support, which is also less useful here IMO, as there is little scope for input manipulation when the main operation and the builtin operation are of the same type (since the same input can be passed as-is).
This was the plan when starting this PR, but has been abandoned meantime. As a builtin:<operation>
can only apply to the current operation there is no point to tune the input.
Or is there some additional benefit with the
operation
approach that I'm missing?
Using action = "builtin"
is really error prone unless you stick to a strict template.
96517ea
to
79cf19b
Compare
Oops, I have to rename this commit f27bf17 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question about the tests and small nits, overall LGTM.
|
||
```toml | ||
operation = "software_update" # an operation for which tedge-agent provides an implementation | ||
["init"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unnecessary quotes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
TomlOperationAction::Operation(operation) => match operation.strip_prefix("builtin:") { | ||
None => { | ||
let handlers = TryInto::<ExecHandlers>::try_into((input.handlers, defaults))?; | ||
let input_script = input.input_script; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: reads shorter + jump to definition
in IDE actually jumps to our try_from
definition, instead of a stub try_into
impl in stdlib
let input_script = input.input_script; | |
let handlers = ExecHandlers::try_from((input.handlers, defaults))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed clearer!
Fixed by 7ad7db4
} | ||
Some(builtin_operation_name) => { | ||
let handlers = TryInto::<ExecHandlers>::try_into((input.handlers, defaults))?; | ||
Ok(OperationAction::BuiltInOperation( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok(OperationAction::BuiltInOperation( | |
let handlers = ExecHandlers::try_from((input.handlers, defaults))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by 7ad7db4
@@ -173,6 +181,17 @@ impl TryFrom<(TomlOperationState, DefaultHandlers)> for OperationAction { | |||
handlers, cmd_output, | |||
)) | |||
} | |||
"builtin" => { | |||
let exec_handlers = TryInto::<ExecHandlers>::try_into(( | |||
input.handlers.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input.handlers.clone(), | |
let exec_handlers = ExecHandlers::try_from(( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by 7ad7db4
ExecHandlers::builtin_default(), | ||
))?; | ||
let await_handlers = TryInto::<AwaitHandlers>::try_into(( | ||
input.handlers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input.handlers, | |
let await_handlers = AwaitHandlers::try_from(( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by 7ad7db4
Should Contain ${software_list[0]} "tedge" | ||
Should Contain ${software_list[0]} "postprocess" | ||
Should Contain ${software_list[0]} "done" | ||
Execute Command tedge mqtt pub --retain te/device/main///cmd/software_list/robot-456 '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: The agent is supposed to clean-up the command, shouldn't we check that it does instead of manually cleaning in the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No the agent do nothing once the command is successful or failed. This is the requester who is supposed to clean-up the command, once processed the final outcome.
This commit introduces no changes to the user. This is only a preparation step to replace builtin actions, which usage was very restricted and specific, with builtin operations that can be used as any other operations. Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
The new name is more appropriate, now that these handlers are used not only for background scripts but also to trigger a sub operation or a builtin action. Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
Signed-off-by: Didier Wenzek <[email protected]>
79cf19b
to
7ad7db4
Compare
This is an alternative PR to improve the way builtin operations are customized.
Proposed changes
Improve the definition of operation workflows, so one can trigger a builtin operation the same way as for a sub operation. The aim is to make things easier to understand with less concepts, more consistent (while a single way to trigger an operation) and more flexible (ability to define specific state on success and on error).
OperationAction::BuiltinOperation
BuiltIn
action withon_success
,on_error
andon_exec
handlers.builtin:<operation>
builtin
keyword can still be used for backward compatibilityLet the user execute any builtin operation from any workflowAbility to build the builtin operation input from the calling operation as for sub operationsAbility to inject excerpts of the builtin operation output into the calling operation as for sub operationsTypes of changes
Paste Link to the issue
#3014
Checklist
cargo fmt
as mentioned in CODING_GUIDELINEScargo clippy
as mentioned in CODING_GUIDELINESFurther comments