-
Notifications
You must be signed in to change notification settings - Fork 578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix update execution message discarded. refs #8616 #8627
Fix update execution message discarded. refs #8616 #8627
Conversation
Colleagues, any security objections? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try to summarize to see if I understand correctly:
In order to schedule the execute a command on another node, this command has to be sent from a parent node to be accepted. This command references some config object which is then asynchronously updated with the result. However, this update performs a different permission check: the object must be accessible from the zone that sends the update. You can execute the command on any object though.
So a possible workaround should be to create a dummy host/service that is in the zone of the endpoint executing the command.
This PR would now introduce a new relationship between endpoint and host objects if they share the same name, allowing this update nonetheless. An alternative to this could probably be to store the command endpoint with the pending execution result and use this for update permission checks.
Hi @julianbrost, thanks for your feedback, we like the idea of storing the command endpoint in the execution object and we have implemented it. It can be tested with the same test above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please squash your commits.
3552c5d
to
76c2e76
Compare
76c2e76
to
102f14f
Compare
102f14f
to
97b0d14
Compare
Seems like I've overlooked something when suggesting to change the permission check in this way: The permission check would be fine, but the object on which the execution is updated won't exist on intermediate satellites, so this won't help. This would be a general problem when decoupling the execution endpoint from the object's zone completely. Endpoint names in the following examples refer to this cluster structure:
So in the most extreme case, the API would currently accept a command for the host object This could be fixed but would need major changes in how the results are returned (would have to be forwarded up the cluster without relying on any local objects). But I think this isn't even a use-case that you'd need but you only have this discrepancy in the object's zone because you're using the command endpoint feature. So I see two possibilities how this could be fixed without completely overthrowing how the feature works:
So please let me know what you think about this and which options would fit your needs so that we can figure out together how to proceed. |
4534b0c
to
3c1835d
Compare
Hi @julianbrost sorry for the delay, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @julianbrost sorry for the delay,
Well, I think I can't complain about that and have to say sorry for the delayed review myself.
@cla-bot check |
3c1835d
to
18bc625
Compare
C'mon Julian, you've even not resolved your threads not to mention not approved. |
18bc625
to
0eb5fec
Compare
0eb5fec
to
730730c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Describe the big picture. Then vs. now. At best with diagrams. The can be hand drawn and photographed if you don't wanna deal with digital tools.
|
730730c
to
912fdb9
Compare
There's already enough information in the description and comment history.
lib/icinga/clusterevents.cpp
Outdated
return Empty; | ||
} | ||
|
||
if (origin->FromZone && !origin->FromZone->CanAccessObject(command_endpoint->GetZone())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apropos, I consider zoneA->CanAccessObject(zoneB) kinda confusing. Doesn’t this -according to CanAccessObject() code- query for a child/parent relationship?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of
origin->FromZone->CanAccessObject(command_endpoint->GetZone())
is it better to use
command_endpoint->GetZone()->isChildOf(origin->FromZone)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is what you wanna check for, yes. IMAO it's more descriptive. But let's Julian cross-check first..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the return path for the response. The request was sent to the endpoint command_endpoint
so this is supposed to check whether the response is coming from (directly or indirectly) that endpoint. That should be checked by both variants but with IsChildOf()
that might be a bit more obvious (especially given that CanAccessObject()
would also do GetZone()
internally, but that wouldn't do the same, greetings to #8740).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now.
@Al2Klimov Do you want to have another look at this before merging? The change is basically storing the execution endpoint in the slot for the result (where the deadline is already stored for example) and then changes to the message forwarding and related permission checks to actually make it work.
"Zone '" + endpointZone->GetName() + "' cannot access checkable '" + checkable->GetName() + "'." | ||
); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK this makes sense to me...
<< ": Endpoint " << params->Get("endpoint") << " does not exist"; | ||
return Empty; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far so good...
@@ -1303,7 +1377,7 @@ Value ClusterEvents::UpdateExecutionsAPIHandler(const MessageOrigin::Ptr& origin | |||
updateMessage->Set("method", "event::UpdateExecutions"); | |||
updateMessage->Set("params", params); | |||
|
|||
listener->RelayMessage(origin, Zone::GetLocalZone(), updateMessage, true); | |||
listener->RelayMessage(origin, checkable, updateMessage, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
return Empty; | ||
} | ||
|
||
if (origin->FromZone && !command_endpoint->GetZone()->IsChildOf(origin->FromZone)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only problem I see are messages from the own zone, but then -I guess- !!origin->FromZone wouldn’t be true. (JsonRpcConnection::MessageHandler())
executedMessage->Set("method", "event::ExecutedCommand"); | ||
executedMessage->Set("params", executedParams); | ||
|
||
listener->RelayMessage(nullptr, nullptr, executedMessage, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh? nullptr, nullptr? Not...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I haven't put much thought into these parameters here as the existing event::ExecutedCommand
message already used the same ones.
However, are you sure about origin? This is sending a response that originates here, so with the origin of the incoming message, that might prevent the response from being sent back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I tested, I agree with Julian, using origin, checkable
the message is not sent back and the execution remains in the pending state
"a9086343-4da7-44ae-8fd6-5f16641cbd5e" = {
deadline = 1681388550.850034
endpoint = "agent"
pending = true
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. 😅 🐘
@@ -1303,7 +1377,7 @@ Value ClusterEvents::UpdateExecutionsAPIHandler(const MessageOrigin::Ptr& origin | |||
updateMessage->Set("method", "event::UpdateExecutions"); | |||
updateMessage->Set("params", params); | |||
|
|||
listener->RelayMessage(origin, Zone::GetLocalZone(), updateMessage, true); | |||
listener->RelayMessage(origin, checkable, updateMessage, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... origin, checkable? Not even origin? Please explain.
lib/icinga/clusterevents.cpp
Outdated
Host::Ptr host = Host::GetByName(params->Get("host")); | ||
if (!host) { | ||
Log(LogWarning, "ClusterEvents") | ||
<< "Discarding 'execute command' message " << executionUuid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of indentation...
lib/icinga/clusterevents.cpp
Outdated
checkableName += "!" + params->Get("service"); | ||
|
||
Log(LogWarning, "ClusterEvents") | ||
<< "Discarding 'execute command' message " << executionUuid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... wouldn’t hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear: I asked you for splitting commits by things of what you changed since master (which you did by now). IMAO this doesn’t contradict my previous squash request which targeted fixes of the changeset itself. Apropos: Please squash not everything, but just the indentation fixes into the respective commit(s?) the misindentation comes from. So you get two commits again, but both of them w/o misindentation – and you're finally (😅) done.
…but checkable has command_endpoint specified
dd114c4
to
c5c1792
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only whitespace changes since my last review, so that one is pretty much still valid.
Hi, this is our proposal for fixes #8616
Configuration for test
/etc/icinga2/conf.d/test.conf
Use this configuration on master, satellite and agent
Master
/etc/icinga2/zones.conf
Satellite
/etc/icinga2/zones.conf
Agent
/etc/icinga2/zones.conf
Test before the fix
Request
Response
Execution status
Master logs
Satellite logs
Agent logs
Test with the fix
Request
Response
Exectuion status*
Master logs
Satellite logs
Aagent logs