This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Enable Manual Webhooks in Request Execution [#1228] #1285

Merged
eastandwestwind merged 16 commits into main from fidesops_1228_manual_webhook_execution
Sep 13, 2022

Conversation

@pattisdr pattisdr commented Sep 9, 2022

👉 Note that new scopes have been added here.

Purpose

Enable the new "Manual Webhook" Connector. In short, if manual webhooks are configured, the privacy request will be in a state of requires_input until data has been supplied for each webhook. Once the privacy request is resumed, the graph will run as usual and any manually uploaded data will be directly added to the final data package. The data is not used as part of the graph and is not filtered by data category. All uploaded data is passed to the user as-is.

Changes

  • Add four new API endpoints
    • Get all access manual webhooks: GET /access_manual_webhook
    • Upload data for a manual webhook on a given privacy request: PATCH /privacy-request/{privacy_request_id}/access_manual_webhook/{connection_key}
      • There is no separate webhook key: each ConnectionConfig of type manual_webhook has exactly one AccessManualWebhook, so connection_key refers to the ConnectionConfig.
    • Resume a privacy request in requires_input state once all webhook data is uploaded: POST /privacy-request/{privacy_request_id}/resume_from_requires_input
    • Get manually uploaded data for a given webhook: GET /privacy-request/{privacy_request_id}/access_manual_webhook/{connection_key}
  • Update privacy request execution to accommodate new manual webhooks. Once all data has been uploaded, proceed with request execution, and then take the manual data and add it directly to the final upload package.
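As a rough illustration of how the four endpoints above relate, the sketch below builds the method and path for each one. The paths are copied from the list above; the helper name `access_manual_webhook_paths` and the dictionary keys are hypothetical, and the host and any API prefix are deliberately left out because the PR description does not state them.

```python
from typing import Dict, Tuple
from urllib.parse import quote


def access_manual_webhook_paths(
    privacy_request_id: str, connection_key: str
) -> Dict[str, Tuple[str, str]]:
    """Return (HTTP method, path) pairs for the manual webhook endpoints.

    Hypothetical helper for illustration only; it is not part of fidesops.
    """
    # Percent-encode the identifiers so they are safe as URL path segments.
    pr = quote(privacy_request_id, safe="")
    key = quote(connection_key, safe="")

    # Upload and view share one path and differ only in the HTTP method.
    webhook_data = f"/privacy-request/{pr}/access_manual_webhook/{key}"

    return {
        "list_webhooks": ("GET", "/access_manual_webhook"),
        "upload_data": ("PATCH", webhook_data),
        "view_data": ("GET", webhook_data),
        "resume": ("POST", f"/privacy-request/{pr}/resume_from_requires_input"),
    }
```

A caller would typically upload data for every enabled webhook with the PATCH endpoint and only then call the resume endpoint.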

Checklist

  • Update CHANGELOG.md file
    • Merge in main so the most recent CHANGELOG.md file is being appended to
    • Add description within the Unreleased section in an appropriate category. Add a new category from the list at the top of the file if the needed one isn't already there.
    • Add a link to this PR at the end of the description with the PR number as the text. example: #1
  • Applicable documentation updated (guides, quickstart, postman collections, tutorial, fidesdemo, database diagram).
  • If docs updated (select one):
    • documentation complete, or draft/outline provided (tag docs-team to complete/review on this branch)
    • documentation issue created (tag docs-team to complete issue separately)
  • Good unit test/integration test coverage
  • This PR contains a DB migration. If checked, the reviewer should confirm with the author that the down_revision correctly references the previous migration before merging
  • The Run Unsafe PR Checks label has been applied, and checks have passed, if this PR touches any external services

Ticket

Fixes #1228

…a manual webhook for a given privacy request with "requires_input" status.
…ebhook.

- Add new scopes for the endpoints to upload/view manual data for webhooks.
- Enforce that at least one field is added when defining a manual webhook, and add a fallback if no fields were defined.
…tus once all input has been added. None of the fields are required, but a key for each manual webhook still needs to exist in the cache to proceed.

As part of request execution, check whether data has been uploaded (data can be empty) for all manual webhooks. If so, we proceed with request execution; otherwise, we put the PrivacyRequest in "requires_input" status and exit.

Also adds the manual data uploaded directly to the packet we upload to the user at the very end.
@pattisdr pattisdr changed the title from "Enable Manual Webhook Request Execution [#1228]" to "Enable Manual Webhooks in Request Execution [#1228]" Sep 9, 2022
@pattisdr pattisdr marked this pull request as ready for review September 9, 2022 14:48
Comment on lines 269 to 273
manual_data, proceed = get_access_manual_webhook_inputs(
    session, privacy_request
)
if not proceed:
    return
Contributor Author

This is the primary change of this PR - this puts a PrivacyRequest in requires_input status if we don't have every manual webhook confirmed. If we do have confirmation from all our manual webhooks, we retrieve any manually uploaded data, and this is passed onto the upload_access_results function and data is combined with the automatically retrieved data at the end.
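The gating behavior described in this comment can be sketched without a database or cache. The version below is an assumption-laden stand-in: `collect_manual_inputs` is a hypothetical name, a plain dict plays the role of the cache, and a missing key plays the role of the MissingManualWebhookData error. It only illustrates the "all webhooks confirmed, or stop" rule, not the real fidesops implementation.

```python
from typing import Any, Dict, List, Optional, Tuple

ManualData = Dict[str, List[Dict[str, Optional[Any]]]]


def collect_manual_inputs(
    enabled_webhook_keys: List[str],
    cache: Dict[str, Dict[str, Optional[Any]]],
) -> Tuple[ManualData, bool]:
    """Return (manual data gathered so far, whether execution may proceed).

    Hypothetical stand-in: `cache` maps a connection key to the fields
    uploaded for that webhook on one privacy request.
    """
    manual_inputs: ManualData = {}
    for key in enabled_webhook_keys:
        if key not in cache:
            # No upload recorded for this webhook: the caller should put the
            # request in requires_input status and stop.
            return manual_inputs, False
        # An empty dict still counts as "uploaded"; fields are optional.
        # The single-element list matches the row format of graph results.
        manual_inputs[key] = [cache[key]]
    return manual_inputs, True
```

Note that the check is on the presence of a key, not on the data being non-empty, which mirrors the commit message's point that data can be empty.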

Comment on lines 65 to 86
def get_access_manual_webhook_inputs(
    db: Session, privacy_request: PrivacyRequest
) -> Tuple[Dict[str, List[Dict[str, Optional[Any]]]], bool]:
    """Retrieves manually uploaded data for all AccessManualWebhooks and formats in a way
    to match automatically retrieved data. Also returns if execution should proceed.

    This data will be uploaded to the user as-is, without filtering.
    """
    manual_inputs: Dict[str, List[Dict[str, Optional[Any]]]] = {}

    try:
        for manual_webhook in AccessManualWebhook.get_enabled(db):
            manual_inputs[manual_webhook.connection_config.key] = [
                privacy_request.get_manual_webhook_input(manual_webhook)
            ]
    except (MissingManualWebhookData, ValidationError) as exc:
        logger.info(exc)
        privacy_request.status = PrivacyRequestStatus.requires_input
        privacy_request.save(db)
        return manual_inputs, False

    return manual_inputs, True
Contributor Author

Normally the access results have keys in the format "dataset:collection", so here I am using the manual webhook's ConnectionConfig key instead, since there is no "dataset" or "collection" defined for the webhook results. I think an overlap with existing keys is unlikely since they are mashups of dataset and collection names.

As an aside, since ConnectionConfigs and AccessManualWebhooks have a 1:1 relationship, I am not giving a separate identifier to the AccessManualWebhook, and I am using its ConnectionConfig key everywhere. So the question is, do we want them to be separate? We could store a separate AccessManualWebhook key and use that here, and on the UI it could be surfaced as connection_identifier, like we use for saas configs. I didn't think we needed this extra identifier, but I can see an opposing argument.
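To make the key scheme in this comment concrete, here is a small sketch of combining graph results (keyed "dataset:collection") with manual webhook data (keyed by ConnectionConfig key). The function name `combine_results` and the sample keys are hypothetical, and raising on a key collision is an assumption added for illustration; the PR only states that an overlap is unlikely.

```python
from typing import Any, Dict, List, Optional

Rows = List[Dict[str, Optional[Any]]]


def combine_results(
    access_results: Dict[str, Rows], manual_data: Dict[str, Rows]
) -> Dict[str, Rows]:
    """Merge graph results and manual webhook data into one package.

    Hypothetical helper: the collision check is an illustrative assumption,
    not behavior confirmed by the PR.
    """
    overlap = sorted(set(access_results) & set(manual_data))
    if overlap:
        raise ValueError(f"Key collision between graph and manual results: {overlap}")
    # Manual data is added as-is, without filtering by data category.
    return {**access_results, **manual_data}
```

Because graph keys contain a colon-separated dataset and collection name, a connection key would have to take that same shape to collide.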

Contributor Author

Discussed with Sean, seems okay for now to have this 1:1 mapping and use the connection_config key as the identifier for the entire webhook.

)
PRIVACY_REQUEST_REVIEW = "privacy-request:review"
PRIVACY_REQUEST_UPLOAD_DATA = "privacy-request:upload_data"
PRIVACY_REQUEST_VIEW_DATA = "privacy-request:view_data"
Contributor

do we also need to add new scopes in fideslib at this point?

Contributor Author

@sanders41 are we keeping fideslib scopes synced with fidesops scopes with the fides unification work, or is it okay that this is the most current list?

Contributor

I have been trying to keep them in sync, but it won't break if they get out of sync for now.

@eastandwestwind
Contributor

Nice work on this @pattisdr ! Learning some new things around pydantic models and types. Just a couple Qs to address.

Tagging @conceptualshark also to review docs.

…webhook_execution

# Conflicts:
#	CHANGELOG.md
#	src/fidesops/ops/service/privacy_request/request_runner_service.py
#	tests/ops/service/privacy_request/request_runner_service_test.py
@pattisdr
Contributor Author

Thanks for your comments @eastandwestwind, back to you!

@eastandwestwind eastandwestwind left a comment

Appreciate all the detail on these changes, @pattisdr !

class ManualWebhookResults(BaseSchema):
    """Represents manual webhook data retrieved from the cache and whether privacy request execution should continue"""

    manual_data: Dict[str, List[Dict[str, Optional[Any]]]]
Contributor

thanks!

@conceptualshark conceptualshark left a comment

This looks great! Thank you for the work on the docs!

@pattisdr
Contributor Author

Thanks for taking a look @conceptualshark!

@eastandwestwind eastandwestwind merged commit df4834a into main Sep 13, 2022
@eastandwestwind eastandwestwind deleted the fidesops_1228_manual_webhook_execution branch September 13, 2022 15:07
sanders41 pushed a commit that referenced this pull request Sep 22, 2022
* Add a method to cache data supplied for a manual webhook on a particular privacy request.

* Add an endpoint to retrieve all enabled access manual webhooks.

* Add an endpoint for uploading manual data corresponding to fields in a manual webhook for a given privacy request with "requires_input" status.

* Add an endpoint to view data manually uploaded for an access manual webhook.

- Add new scopes for the endpoints to upload/view manual data for webhooks.
- Enforce that at least one field is added when defining a manual webhook, and add a fallback if no fields were defined.

* Add an endpoint to resume a privacy request from "requires_input" status once all input has been added. None of the fields are required, but a key for each manual webhook still needs to exist in the cache to proceed.

As part of request execution, check whether data has been uploaded (data can be empty) for all manual webhooks. If so, we proceed with request execution; otherwise, we put the PrivacyRequest in "requires_input" status and exit.

Also adds the manual data uploaded directly to the packet we upload to the user at the very end.

* Update postman collection.

* Fix request_id query param in existing postman request.

* Include additional details about how to resume a "requires_input" privacy request when getting its status.

* Add docs and update changelog.

* Upload new ERD diagram.

* Don't put a privacy request in requires_input state if this policy only has erasure rules.

* Respond to CR!

* Update manual_webhooks.md
Development

Successfully merging this pull request may close these issues.

Backend - Add manual webhook execution
4 participants