-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] Should the OMRS check the originator of a Purge/Delete event when processing them? #6627
Comments
If Atlas issues a delete or update event for an element homed by another repository (ie it is a reference copy to atlas) then the Atlas repository connector is in violation of the protocol and the fix needs to be applied there. Atlas needs to maintain the correct home information in the headers of the elements of the reference copies it stores in its repository and only send events for the element where it is the home repository. It is can not do that, it should not store reference copies. So the next question is how much can the rest of the cohort guard against a rogue member? Because we use a pub/sub event system (ie kafka), it is not possible to verify the system that produced an event is the correct one. However it is possible to test that the header of the event is consistent with the header of the element. It we don't do that already, it is worth adding that check, creating an audit log record to record the repository in error and ignoring the event. This should be done for all events. When it comes down to the delete behaviour of the receiving repository connector, the implementation of a repository connector is at liberty to delete a reference copy from their repository at any time. This has no impact on the cohort because the home repository will continue to return the element on a federated query. Only local users of the repository are affected. However, if a repository connector is called to delete a reference copy they must do it so that the element is no longer returned on a federated query. If they receive a call to delete a reference copy and it turns out to be for an element that they are the home for then the repository should report an error on the audit log and not delete the element. The OMRS does this check before calling the delete for the reference copy so if the repository connector does see this problem, it suggests it is not managing its header information correctly. |
This enhancement may be of interest ... #5402 It is for a repository connector that wraps a repository connector for a repository that can not handle reference copies but wishes to store metadata from other repositories. It maintains the correct headers and events for the cohort. |
Thank you, Mandy @mandy-chessell. After reading #5402, I know you proposed the smart proxy event mapper to prevent the third-party repository from publishing its changes towards reference copy, which is a protocol violation, to the Cohort. The design is quite good. But I am still wondering about bellow two points:
|
Hello @Kelukin I agree with you on the originator checking. I am planning to add additional validation around event management to look for protocol violators of this nature. The fix will cause ADMIN audit log messages to be produced by each member of the cohort when they detect the problem. This category of audit log messages are typically routed to their operators because action is required. In this case, the action will recommend that the offender is removed from the cohort until it is fixed. This is because it threatens the integrity of the cohort's metadata (as you reported). So Atlas still needs to be fixed - and the smart proxy is just a way of doing this. The egeria community created a change to Atlas to support reference copies but the Atlas community did not want it - so if they are still of the same mind and you don't want to create an Atlas fork to add your own fix, then the smart proxy is an option. The alternative is to remove the saving of reference copies from the Atlas repository connector (this is what we have in the Egeria community's version of the Atlas connector). The smart proxy itself is not implemented. It is documented as an option that is potentially available for third party technology that cannot be changed. We are waiting for someone to request it before implementing. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions. |
Signed-off-by: Mandy Chessell <[email protected]>
I have added a fix for this - but can not test with Apache Atlas |
Is there an existing issue for this?
Please describe the new behavior that that will improve Egeria
The current implementation for processing purge/delete events might bring risks.
Different from the
processUpdatedEntityEvent
, the implementations ofprocessPurgedEntityEvent
andprocessDeletedEntityEvent
will not check the originator of the event. They delete the entity instance from the storage. Hence, when one of the cohort members sends thePURGED_ENTITY_EVENT
to the cohort, other OMRS joined the same cohort will delete their stored instance, including the entity instance's home repository.Since Egeria supports the third party to develop their repository service in the repository-proxy mode to join the cohort, it might bring security issues to the other cohort member's data. For example, one Atlas that joined the cohort might publish a
PURGED_ENTITY_EVENT
event when its user deletes an entity instance sourced from other cohort members.From my perspective, the Egeria should add an originator check when processing a Purge/Delete event. Thanks.
Alternatives
No response
Any Further Information?
No response
Would you be prepared to be assigned this issue to work on?
The text was updated successfully, but these errors were encountered: